summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2012-12-03netfilter: nf_nat: Handle routing changes in MASQUERADE targetJozsef Kadlecsik
When the route changes (backup default route, VPNs) which affect a masqueraded target, the packets were sent out with the outdated source address. The patch addresses the issue by comparing the outgoing interface directly with the masqueraded interface in the nat table. Events are inefficient in this case, because it'd require adding route events to the network core and then scanning the whole conntrack table and re-checking the route for all entry. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-12-03netfilter: ctnetlink: nla_policy updatesFlorian Westphal
Add stricter checking for a few attributes. Note that these changes don't fix any bug in the current code base. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-12-03netfilter: kill support for per-af queue backendsFlorian Westphal
We used to have several queueing backends, but nowadays only nfnetlink_queue remains. In light of this there doesn't seem to be a good reason to support per-af registering -- just hook up nfnetlink_queue on module load and remove it on unload. This means that the userspace BIND/UNBIND_PF commands are now obsolete; the kernel will ignore them. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-12-03netfilter: ctnetlink: dump entries from the dying and unconfirmed listsPablo Neira Ayuso
This patch adds a new operation to dump the content of the dying and unconfirmed lists. Under some situations, the global conntrack counter can be inconsistent with the number of entries that we can dump from the conntrack table. The way to resolve this is to allow dumping the content of the unconfirmed and dying lists, so far it was not possible to look at its content. This provides some extra instrumentation to resolve problematic situations in which anyone suspects memory leaks. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-12-03netfilter: nf_conntrack: improve nf_conn object traceabilityPablo Neira Ayuso
This patch modifies the conntrack subsystem so that all existing allocated conntrack objects can be found in any of the following places: * the hash table, this is the typical place for alive conntrack objects. * the unconfirmed list, this is the place for newly created conntrack objects that are still traversing the stack. * the dying list, this is where you can find conntrack objects that are dying or that should die anytime soon (eg. once the destroy event is delivered to the conntrackd daemon). Thus, we make sure that we follow the track for all existing conntrack objects. This patch, together with some extension of the ctnetlink interface to dump the content of the dying and unconfirmed lists, will help in case to debug suspected nf_conn object leaks. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-12-03netfilter: ipset: Increase the number of maximal sets automaticallyJozsef Kadlecsik
The max number of sets was hardcoded at kernel cofiguration time and could only be modified via a module parameter. The patch adds the support of increasing the max number of sets automatically, as needed. The array of sets is incremented by 64 new slots if we run out of empty slots. The absolute limit for the maximal number of sets is limited by 65534. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-12-02bna: remove useless calls to memset().Cyril Roelandt
These calls are followed by calls to memcpy() on the same memory area, so they can safely be removed. Signed-off-by: Cyril Roelandt <tipecaml@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-02tcp: don't abort splice() after small transfersWilly Tarreau
TCP coalescing added a regression in splice(socket->pipe) performance, for some workloads because of the way tcp_read_sock() is implemented. The reason for this is the break when (offset + 1 != skb->len). As we released the socket lock, this condition is possible if TCP stack added a fragment to the skb, which can happen with TCP coalescing. So let's go back to the beginning of the loop when this happens, to give a chance to splice more frags per system call. Doing so fixes the issue and makes GRO 10% faster than LRO on CPU-bound splice() workloads instead of the opposite. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-02net: fix sparse endianness warnings on sock_commonEric Dumazet
# make C=2 CF=-D__CHECK_ENDIAN__ net/ipv4/inet_hashtables.o ... net/ipv4/inet_hashtables.c:242:7: warning: restricted __portpair degrades to integer net/ipv4/inet_hashtables.c:242:7: warning: restricted __addrpair degrades to integer ... Move __portpair/__addrpair from include/net/inet_hashtables.h to include/net/sock.h where we need them in struct sock_common Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ling Ma <ling.ma.program@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-02bnx2x: Correct PFC disablementBarak Witkowski
bnx2x driver could only have enabled pfc via usage of dcbnl; now, it can also correctly disable it. Signed-off-by: Barak Witkowski <barak@broadcom.com> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-02bnx2x: fix 'Ethtool -A' when autonegYuval Mintz
When configuring pauses using 'ethtool -A', the requested values have effect when used together with autoneg (up to this point, when configured for autoneg, driver ignored requested pause configuration) Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-02bnx2x: prevent DCB if disabled in nvramBarak Witkowski
Signed-off-by: Barak Witkowski <barak@broadcom.com> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-02bnx2x: Handle a rarely missed interruptYaniv Rosner
A rare case of no link due to a missed interrupt may occur due to a race condition between acknowledging the IGU via the BAR and restoring the NIG interrupt mask via the GRC. To solve it, we wait for the IGU ack command to finish prior to restoring the NIG interrupt mask. Signed-off-by: Yaniv Rosner <yaniv.rosner@broadcom.com> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-02bnx2x: mask CPL_OF interruptYuval Mintz
Unmasked interrupt caused "FATAL HW block attention set2 0x20" messages to erroneously appear, as the associated interrupt is fully recoverable. Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-02bnx2x: IGU parse error cause probe failureBarak Witkowski
If IGU parse error is encountered during the probing process, the error propagates and the probe gracefully fails (until now, such errors were ignored, later causing mischief). Signed-off-by: Barak Witkowski <barak@broadcom.com> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-02bnx2x: Ext. config accessed only on non-E1x.Yuval Mintz
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-02bnx2x: nvram enables dropless flow controlYuval Mintz
It is now possible to enable dropless flow control via nvram. Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-02bnx2x: Correct advertised speed/duplexYuval Mintz
If link is down due to management (and not due to actual phy link being lost), driver should still behave as if the link is down; Querying via ethtool about speed/duplex state should result in 'UNKNOWN' (same behaviour as when link is actually down). Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-02bnx2x: Filter packets on FCoE ringsDmitry Kravkov
Whenever bnx2x fails to transmit a packet due to a full Tx ring, if the ring size is zero (indicating an FCoE ring) driver filters the packet out and gracefully continues. Driver also gathers statistics on such filtered packets. Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-02bnx2x: Management can control PFC/ETSBarak Witkowski
If configured for PFC/ETS by management, configure chip regardless of the presence of a remote peer which supports DCBX. Signed-off-by: Barak Witkowski <barak@broadcom.com> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-02bnx2x: parity recovery flow enhancementBarak Witkowski
Parity recovery was enhanced in order to handle a few more corner cases. Signed-off-by: Barak Witkowski <barak@broadcom.com> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-02bnx2x: revised and corrected SPIO accessYuval Mintz
Changed naming convention of SPIO macros, and prevented access to invalid SPIOs. Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-02net/mlx4_en: Set number of rx/tx channels using ethtoolAmir Vadai
Add support to changing number of rx/tx channels using ethtool ('ethtool -[lL]'). Where the number of tx channels specified in ethtool is the number of rings per user priority - not total number of tx rings. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-02net/mlx4_en: Fix TX moderation info loss after set_ringparam is calledAmir Vadai
We need to re-set tx moderation information after calling set_ringparam else default tx moderation will be used. Also avoid related code duplication, by putting it in a utility function. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-02MAINTAINERS: Add Mellanox ethernet driver - mlx4_enAmir Vadai
Set mlx4_en maintainer to Amir Vadai instead of Yevgeny Petrilin. Signed-off-by: Amir Vadai <amirv@mellanox.com> Cc: Yevgeny Petrilin <yevgenyp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-01Merge git://git.infradead.org/users/dwmw2/atmDavid S. Miller
David Woodhouse says: ==================== This is the result of pulling on the thread started by Krzysztof Mazur's original patch 'pppoatm: don't send frames to destroyed vcc'. Various problems in the pppoatm and br2684 code are solved, some of which were easily triggered and would panic the kernel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-02solos-pci: remove list_vccs() debugging functionDavid Woodhouse
No idea why we've gone so long dumping a list of VCCs with vci==0 on every ->open() call... Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2012-12-02solos-pci: use GFP_KERNEL where possible, not GFP_ATOMICDavid Woodhouse
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2012-12-02solos-pci: clean up pclose() functionDavid Woodhouse
- Flush pending TX skbs from the queue rather than waiting for them all to complete (suggested by Krzysztof Mazur <krzysiek@podlesie.net>). - Clear ATM_VF_ADDR only when the PKT_PCLOSE packet has been submitted. - Don't clear ATM_VF_READY at all — vcc_destroy_socket() does that for us. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2012-12-02pppoatm: optimise PPP channel wakeups after sock_owned_by_user()David Woodhouse
We don't need to schedule the wakeup tasklet on *every* unlock; only if we actually blocked the channel in the first place. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Acked-by: Krzysztof Mazur <krzysiek@podlesie.net>
2012-12-02br2684: allow assign only on a connected socketKrzysztof Mazur
The br2684 does not check if used vcc is in connected state, causing potential Oops in pppoatm_send() when vcc->send() is called on not fully connected socket. Now br2684 can be assigned only on connected sockets; otherwise -EINVAL error is returned. Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2012-12-02solos-pci: Fix leak of skb received for unknown vccNathan Williams
... and ensure that the next skb is set up for RX in the DMA case. Signed-off-by: Nathan Williams <nathan@traverse.com.au> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2012-12-02br2684: fix module_put() raceDavid Woodhouse
The br2684 code used module_put() during unassignment from vcc with hope that we have BKL. This assumption is no longer true. Now owner field in atmvcc is used to move this module_put() to vcc_destroy_socket(). Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Acked-by: Krzysztof Mazur <krzysiek@podlesie.net>
2012-12-02pppoatm: fix missing wakeup in pppoatm_send()David Woodhouse
Now that we can return zero from pppoatm_send() for reasons *other* than the queue being full, that means we can't depend on a subsequent call to pppoatm_pop() waking the queue, and we might leave it stalled indefinitely. Use the ->release_cb() callback to wake the queue after the sock is unlocked. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Acked-by: Krzysztof Mazur <krzysiek@podlesie.net>
2012-12-02br2684: don't send frames on not-ready vccDavid Woodhouse
Avoid submitting packets to a vcc which is being closed. Things go badly wrong when the ->pop method gets later called after everything's been torn down. Use the ATM socket lock for synchronisation with vcc_destroy_socket(), which clears the ATM_VF_READY bit under the same lock. Otherwise, we could end up submitting a packet to the device driver even after its ->ops->close method has been called. And it could call the vcc's ->pop method after the protocol has been shut down. Which leads to a panic. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Acked-by: Krzysztof Mazur <krzysiek@podlesie.net>
2012-12-02atm: add release_cb() callback to vccDavid Woodhouse
The immediate use case for this is that it will allow us to ensure that a pppoatm queue is woken after it has to drop a packet due to the sock being locked. Note that 'release_cb' is called when the socket is *unlocked*. This is not to be confused with vcc_release() — which probably ought to be called vcc_close(). Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Acked-by: Krzysztof Mazur <krzysiek@podlesie.net>
2012-12-02solos-pci: wait for pending TX to complete when releasing vccDavid Woodhouse
We should no longer be calling the old pop routine for the vcc, after vcc_release() has completed. Make sure we wait for any pending TX skbs to complete, by waiting for our own PKT_PCLOSE control skb to be sent. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2012-12-01qlcnic: remove duplicated include from qlcnic_sysfs.cWei Yongjun
Remove duplicated include. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Sony Chacko <sony.chacko@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-01myri10ge: fix incorrect use of ntohs()Andrew Gallatin
1b4c44e6369dbbafd113f1e00b406f1eda5ab5b2 incorrectly used ntohs() rather than htons() in myri10ge_vlan_rx(). Thanks to Fengguang Wu, Yuanhan Liu's kernel-build tester for pointing out this bug. Signed-off-by: Andrew Gallatin <gallatin@myri.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-01ipv6: unify logic evaluating inet6_dev's accept_ra propertyShmulik Ladkani
As of 026359b [ipv6: Send ICMPv6 RSes only when RAs are accepted], the logic determining whether to send Router Solicitations is identical to the logic determining whether kernel accepts Router Advertisements. However the condition itself is repeated in several code locations. Unify it by introducing 'ipv6_accept_ra()' accessor. Also, simplify the condition expression, making it more readable. No semantic change. Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-01tcp: change default tcp hash sizeEric Dumazet
As time passed, available memory increased faster than number of concurrent tcp sockets. As a result, a machine with 4GB of ram gets a hash table with 524288 slots, using 8388608 bytes of memory. Lets change that by a 16x factor (one slot for 128 KB of ram) Even if a small machine needs a _lot_ of sockets, tcp lookups are now very efficient, using one cache line per socket. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-01Merge branch 'for-davem' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next Ben Hutchings says: ==================== 1. More workarounds for TX queue flush failures that can occur during interface reconfiguration. 2. Fix spurious failure of a firmware request running during a system clock change, e.g. ntpd started at the same time as driver load. 3. Fix inconsistent statistics after a firmware upgrade. 4. Fix a variable (non-)initialisation in offline self-test that can make it more disruptive than intended. 5. Fix a race that can (at least) cause an assertion failure. 6. Miscellaneous cleanup. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-01Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next Jeff Kirsher says: ==================== This series contains updates to ixgbe, igb and e1000e. Majority of the changes are against igb. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-01ixgbe: Do not parse past IP header on fragments beyond the firstAlexander Duyck
This change makes it so that only the first fragment in a series of fragments will have the L4 header pulled. Previously we were always pulling the L4 header as well and in the case of UDP this can harm performance since only the first fragment will have the header, the rest just contain data which should be left in the paged portion of the packet. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Tested-by: Marcus Dennis <marcusx.e.dennis@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-12-01e1000e: cosmetic cleanup of commentsBruce Allan
Update comments to conform to the preferred style for networking code as described in ./Documentation/CodingStyle and checked for in the recently added checkpatch NETWORKING_BLOCK_COMMENT_STYLE test. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-12-01igb: Fix SerDes autoneg flow control.Carolyn Wyborny
This patch enables flow control to be set in SerDes autoneg mode. This is done the way it is done for copper, but relies on a different set of register/bit checks since this is all done within the MAC registers. Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-12-01igb: Unset sigdetect for SERDES loopback on 82580 and i350Carolyn Wyborny
This patch unsets the sigdetect bit for SERDES loopback tests on 82580 and i350 parts. The loopback test can fail on these parts without this setting. Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-12-01igb: Workaround for global device reset problem on 82580.Carolyn Wyborny
Due to a hw errata, the global device reset doesn't always work on 82580 devices. This patch works around the problem not trying to do a global device reset on these devices. Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-12-01igb: Refactoring of i210 file.Carolyn Wyborny
This patch refactors the functions in e1000_i210.c in order to remove need for prototypes. Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-12-01igb: Acquire, release semaphore for writing each EEPROM pageAkeem G. Abodunrin
This patch allows software acquires and releases NVM resource for writing each EEPROM page, instead of holding semaphore for the whole data block which is too long and could trigger write fails on unpredictable addresses. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>