summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-06-29batman-adv: Fix use-after-free/double-free of tt_req_nodeSven Eckelmann
The tt_req_node is added and removed from a list inside a spinlock. But the locking is sometimes removed even when the object is still referenced and will be used later via this reference. For example batadv_send_tt_request can create a new tt_req_node (including add to a list) and later re-acquires the lock to remove it from the list and to free it. But at this time another context could have already removed this tt_req_node from the list and freed it. CPU#0 batadv_batman_skb_recv from net_device 0 -> batadv_iv_ogm_receive -> batadv_iv_ogm_process -> batadv_iv_ogm_process_per_outif -> batadv_tvlv_ogm_receive -> batadv_tvlv_ogm_receive -> batadv_tvlv_containers_process -> batadv_tvlv_call_handler -> batadv_tt_tvlv_ogm_handler_v1 -> batadv_tt_update_orig -> batadv_send_tt_request -> batadv_tt_req_node_new spin_lock(...) allocates new tt_req_node and adds it to list spin_unlock(...) return tt_req_node CPU#1 batadv_batman_skb_recv from net_device 1 -> batadv_recv_unicast_tvlv -> batadv_tvlv_containers_process -> batadv_tvlv_call_handler -> batadv_tt_tvlv_unicast_handler_v1 -> batadv_handle_tt_response spin_lock(...) tt_req_node gets removed from list and is freed spin_unlock(...) CPU#0 <- returned to batadv_send_tt_request spin_lock(...) tt_req_node gets removed from list and is freed MEMORY CORRUPTION/SEGFAULT/... spin_unlock(...) This can only be solved via reference counting to allow multiple contexts to handle the list manipulation while making sure that only the last context holding a reference will free the object. Fixes: a73105b8d4c7 ("batman-adv: improved client announcement mechanism") Signed-off-by: Sven Eckelmann <sven@narfation.org> Tested-by: Martin Weinelt <martin@darmstadt.freifunk.net> Tested-by: Amadeus Alfa <amadeus@chemnitz.freifunk.net> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29batman-adv: replace WARN with rate limited output on non-existing VLANSimon Wunderlich
If a VLAN tagged frame is received and the corresponding VLAN is not configured on the soft interface, it will splat a WARN on every packet received. This is a quite annoying behaviour for some scenarios, e.g. if bat0 is bridged with eth0, and there are arbitrary VLAN tagged frames from Ethernet coming in without having any VLAN configuration on bat0. The code should probably create vlan objects on the fly and transparently transport these VLAN-tagged Ethernet frames, but until this is done, at least the WARN splat should be replaced by a rate limited output. Fixes: 354136bcc3c4 ("batman-adv: fix kernel crash due to missing NULL checks") Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29net: phy: Manage fixed PHY address space using IDAFlorian Fainelli
If we have a system which uses fixed PHY devices and calls fixed_phy_register() then fixed_phy_unregister() we can exhaust the number of fixed PHYs available after a while, since we keep incrementing the variable phy_fixed_addr, but we never decrement it. This patch fixes that by converting the fixed PHY allocation to using IDA, which takes care of the allocation/dealloaction of the PHY addresses for us. Fixes: a75951217472 ("net: phy: extend fixed driver with fixed_phy_register()") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28NFS: Fix another OPEN_DOWNGRADE bugTrond Myklebust
Olga Kornievskaia reports that the following test fails to trigger an OPEN_DOWNGRADE on the wire, and only triggers the final CLOSE. fd0 = open(foo, RDRW) -- should be open on the wire for "both" fd1 = open(foo, RDONLY) -- should be open on the wire for "read" close(fd0) -- should trigger an open_downgrade read(fd1) close(fd1) The issue is that we're missing a check for whether or not the current state transitioned from an O_RDWR state as opposed to having transitioned from a combination of O_RDONLY and O_WRONLY. Reported-by: Olga Kornievskaia <aglo@umich.edu> Fixes: cd9288ffaea4 ("NFSv4: Fix another bug in the close/open_downgrade code") Cc: stable@vger.kernel.org # 2.6.33+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-06-28audit: move audit_get_tty to reduce scope and kabi changesRichard Guy Briggs
The only users of audit_get_tty and audit_put_tty are internal to audit, so move it out of include/linux/audit.h to kernel.h and create a proper function rather than inlining it. This also reduces kABI changes. Suggested-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Richard Guy Briggs <rgb@redhat.com> [PM: line wrapped description] Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-28audit: move calcs after alloc and check when logging set loginuidRichard Guy Briggs
Move the calculations of values after the allocation in case the allocation fails. This avoids wasting effort in the rare case that it fails, but more importantly saves us extra logic to release the tty ref. Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-28Merge branch 'for-4.7-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata fixes from Tejun Heo: "Two trivial fixes - one for a bug in the allocation failure path and the other a compiler warning fix" * 'for-4.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: ata: sata_mv: fix mis-conversion in mv_write_cached_reg() ata: fix return value check in ahci_seattle_get_port_info()
2016-06-28Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid Pull HID fix from Jiri Kosina: "Regression fix for multitouch palm rejection from Allen Hung" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: HID: multitouch: enable palm rejection for Windows Precision Touchpad Revert "HID: multitouch: enable palm rejection if device implements confidence usage"
2016-06-28sock_diag: do not broadcast raw socket destructionWillem de Bruijn
Diag intends to broadcast tcp_sk and udp_sk socket destruction. Testing sk->sk_protocol for IPPROTO_TCP/IPPROTO_UDP alone is not sufficient for this. Raw sockets can have the same type. Add a test for sk->sk_type. Fixes: eb4cb008529c ("sock_diag: define destruction multicast groups") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28connector: fix out-of-order cn_proc netlink message deliveryAaron Campbell
The proc connector messages include a sequence number, allowing userspace programs to detect lost messages. However, performing this detection is currently more difficult than necessary, since netlink messages can be delivered to the application out-of-order. To fix this, leave pre-emption disabled during cn_netlink_send(), and use GFP_NOWAIT. The following was written as a test case. Building the kernel w/ make -j32 proved a reliable way to generate out-of-order cn_proc messages. int main(int argc, char *argv[]) { static uint32_t last_seq[CPU_SETSIZE], seq; int cpu, fd; struct sockaddr_nl sa; struct __attribute__((aligned(NLMSG_ALIGNTO))) { struct nlmsghdr nl_hdr; struct __attribute__((__packed__)) { struct cn_msg cn_msg; struct proc_event cn_proc; }; } rmsg; struct __attribute__((aligned(NLMSG_ALIGNTO))) { struct nlmsghdr nl_hdr; struct __attribute__((__packed__)) { struct cn_msg cn_msg; enum proc_cn_mcast_op cn_mcast; }; } smsg; fd = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_CONNECTOR); if (fd < 0) { perror("socket"); } sa.nl_family = AF_NETLINK; sa.nl_groups = CN_IDX_PROC; sa.nl_pid = getpid(); if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) { perror("bind"); } memset(&smsg, 0, sizeof(smsg)); smsg.nl_hdr.nlmsg_len = sizeof(smsg); smsg.nl_hdr.nlmsg_pid = getpid(); smsg.nl_hdr.nlmsg_type = NLMSG_DONE; smsg.cn_msg.id.idx = CN_IDX_PROC; smsg.cn_msg.id.val = CN_VAL_PROC; smsg.cn_msg.len = sizeof(enum proc_cn_mcast_op); smsg.cn_mcast = PROC_CN_MCAST_LISTEN; if (send(fd, &smsg, sizeof(smsg), 0) != sizeof(smsg)) { perror("send"); } while (recv(fd, &rmsg, sizeof(rmsg), 0) == sizeof(rmsg)) { cpu = rmsg.cn_proc.cpu; if (cpu < 0) { continue; } seq = rmsg.cn_msg.seq; if ((last_seq[cpu] != 0) && (seq != last_seq[cpu] + 1)) { printf("out-of-order seq=%d on cpu=%d\n", seq, cpu); } last_seq[cpu] = seq; } /* NOTREACHED */ perror("recv"); return -1; } Signed-off-by: Aaron Campbell <aaron@monkey.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28Bridge: Fix ipv6 mc snooping if bridge has no ipv6 addressdaniel
The bridge is falsly dropping ipv6 mulitcast packets if there is: 1. No ipv6 address assigned on the brigde. 2. No external mld querier present. 3. The internal querier enabled. When the bridge fails to build mld queries, because it has no ipv6 address, it slilently returns, but keeps the local querier enabled. This specific case causes confusing packet loss. Ipv6 multicast snooping can only work if: a) An external querier is present OR b) The bridge has an ipv6 address an is capable of sending own queries Otherwise it has to forward/flood the ipv6 multicast traffic, because snooping cannot work. This patch fixes the issue by adding a flag to the bridge struct that indicates that there is currently no ipv6 address assinged to the bridge and returns a false state for the local querier in __br_multicast_querier_exists(). Special thanks to Linus Lüssing. Fixes: d1d81d4c3dd8 ("bridge: check return value of ipv6_dev_get_saddr()") Signed-off-by: Daniel Danzberger <daniel@dd-wrt.com> Acked-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28HID: multitouch: enable palm rejection for Windows Precision TouchpadAllen Hung
The usage Confidence is mandary to Windows Precision Touchpad devices. If it is examined in input_mapping on a WIndows Precision Touchpad, a new add quirk MT_QUIRK_CONFIDENCE desgned for such devices will be applied to the device. A touch with the confidence bit is not set is determined as invalid. Tested on Dell XPS13 9343 Cc: stable@vger.kernel.org # v4.5+ Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Tested-by: Andy Lutomirski <luto@kernel.org> # XPS 13 9350, BIOS 1.4.3 Signed-off-by: Allen Hung <allen_hung@dell.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2016-06-28Revert "HID: multitouch: enable palm rejection if device implements ↵Allen Hung
confidence usage" This reverts commit 25a84db15b3f ("HID: multitouch: enable palm rejection if device implements confidence usage") The commit enables palm rejection for Win8 Precision Touchpad devices but the quirk MT_QUIRK_VALID_IS_CONFIDENCE it is using is not working very properly. This quirk is originally designed for some WIn7 touchscreens. Use of this for a Win8 Precision Touchpad will cause unexpected pointer jumping problem. Cc: stable@vger.kernel.org # v4.5+ Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Tested-by: Andy Lutomirski <luto@kernel.org> # XPS 13 9350, BIOS 1.4.3 Signed-off-by: Allen Hung <allen_hung@dell.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2016-06-28mac80211: Fix mesh estab_plinks counting in STA removal caseJouni Malinen
If a user space program (e.g., wpa_supplicant) deletes a STA entry that is currently in NL80211_PLINK_ESTAB state, the number of established plinks counter was not decremented and this could result in rejecting new plink establishment before really hitting the real maximum plink limit. For !user_mpm case, this decrementation is handled by mesh_plink_deactive(). Fix this by decrementing estab_plinks on STA deletion (mesh_sta_cleanup() gets called from there) so that the counter has a correct value and the Beacon frame advertisement in Mesh Configuration element shows the proper value for capability to accept additional peers. Cc: stable@vger.kernel.org Signed-off-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2016-06-28net/mlx5: use mlx5_buf_alloc_node instead of mlx5_buf_alloc in mlx5_wq_ll_createWang Sheng-Hui
Commit 311c7c71c9bb ("net/mlx5e: Allocate DMA coherent memory on reader NUMA node") introduced mlx5_*_alloc_node() but missed changing some calling and warn messages. This patch introduces 2 changes: * Use mlx5_buf_alloc_node() instead of mlx5_buf_alloc() in mlx5_wq_ll_create() * Update the failure warn messages with _node postfix for mlx5_*_alloc function names Fixes: 311c7c71c9bb ("net/mlx5e: Allocate DMA coherent memory on reader NUMA node") Signed-off-by: Wang Sheng-Hui <shhuiw@foxmail.com> Acked-By: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28Merge branch 'bgmac-fixes'David S. Miller
Florian Fainelli says: ==================== net: bgmac: Random fixes This patch series fixes a few issues spotted by code inspection and actual testing. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28net: bgmac: Remove superflous netif_carrier_on()Florian Fainelli
bgmac_open() calls phy_start() to initialize the PHY state machine, which will set the interface's carrier state accordingly, no need to force that as this could be conflicting with the PHY state determined by PHYLIB. Fixes: dd4544f05469 ("bgmac: driver for GBit MAC core on BCMA bus") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28net: bgmac: Start transmit queue in bgmac_openFlorian Fainelli
The driver does not start the transmit queue in bgmac_open(). If the queue was stopped prior to closing then re-opening the interface, we would never be able to wake-up again. Fixes: dd4544f05469 ("bgmac: driver for GBit MAC core on BCMA bus") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28net: bgmac: Fix SOF bit checkingFlorian Fainelli
We are checking for the Start of Frame bit in the ctl1 word, while this bit is set in the ctl0 word instead. Read the ctl0 word and update the check to verify that. Fixes: 9cde94506eac ("bgmac: implement scatter/gather support") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28bonding: fix 802.3ad aggregator reselectionJay Vosburgh
Since commit 7bb11dc9f59d ("bonding: unify all places where actor-oper key needs to be updated."), the logic in bonding to handle selection between multiple aggregators has not functioned. This affects only configurations wherein the bonding slaves connect to two discrete aggregators (e.g., two independent switches, each with LACP enabled), thus creating two separate aggregation groups within a single bond. The cause is a change in 7bb11dc9f59d to no longer set AD_PORT_BEGIN on a port after a link state change, which would cause the port to be reselected for attachment to an aggregator as if were newly added to the bond. We cannot restore the prior behavior, as it contradicts IEEE 802.1AX 5.4.12, which requires ports that "become inoperable" (lose carrier, setting port_enabled=false as per 802.1AX 5.4.7) to remain selected (i.e., assigned to the aggregator). As the port now remains selected, the aggregator selection logic is not invoked. A side effect of this change is that aggregators in bonding will now contain ports that are link down. The aggregator selection logic does not currently handle this situation correctly, causing incorrect aggregator selection. This patch makes two changes to repair the aggregator selection logic in bonding to function as documented and within the confines of the standard: First, the aggregator selection and related logic now utilizes the number of active ports per aggregator, not the number of selected ports (as some selected ports may be down). The ad_select "bandwidth" and "count" options only consider ports that are link up. Second, on any carrier state change of any slave, the aggregator selection logic is explicitly called to insure the correct aggregator is active. Reported-by: Veli-Matti Lintu <veli-matti.lintu@opinsys.fi> Fixes: 7bb11dc9f59d ("bonding: unify all places where actor-oper key needs to be updated.") Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28ipmr/ip6mr: Initialize the last assert time of mfc entries.Tom Goff
This fixes wrong-interface signaling on 32-bit platforms for entries created when jiffies > 2^31 + MFC_ASSERT_THRESH. Signed-off-by: Tom Goff <thomas.goff@ll.mit.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28s390: fix test_fp_ctl inline assembly contraintsMartin Schwidefsky
The test_fp_ctl function is used to test if a given value is a valid floating-point control. The inline assembly in test_fp_ctl uses an incorrect constraint for the 'orig_fpc' variable. If the compiler chooses the same register for 'fpc' and 'orig_fpc' the test_fp_ctl() function always returns true. This allows user space to trigger kernel oopses with invalid floating-point control values on the signal stack. This problem has been introduced with git commit 4725c86055f5bbdcdf "s390: fix save and restore of the floating-point-control register" Cc: stable@vger.kernel.org # v3.13+ Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-06-28Revert "s390/kdump: Clear subchannel ID to signal non-CCW/SCSI IPL"Michael Holzheu
This reverts commit 852ffd0f4e23248b47531058e531066a988434b5. There are use cases where an intermediate boot kernel (1) uses kexec to boot the final production kernel (2). For this scenario we should provide the original boot information to the production kernel (2). Therefore clearing the boot information during kexec() should not be done. Cc: stable@vger.kernel.org # v3.17+ Reported-by: Steffen Maier <maier@linux.vnet.ibm.com> Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-06-28arc: unwind: warn only once if DW2_UNWIND is disabledAlexey Brodkin
If CONFIG_ARC_DW2_UNWIND is disabled every time arc_unwind_core() gets called following message gets printed in debug console: ----------------->8--------------- CONFIG_ARC_DW2_UNWIND needs to be enabled ----------------->8--------------- That message makes sense if user indeed wants to see a backtrace or get nice function call-graphs in perf but what if user disabled unwinder for the purpose? Why pollute his debug console? So instead we'll warn user about possibly missing feature once and let him decide if that was what he or she really wanted. Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: stable@vger.kernel.org Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-06-28ARC: unwind: ensure that .debug_frame is generated (vs. .eh_frame)Vineet Gupta
With recent binutils update to support dwarf CFI pseudo-ops in gas, we now get .eh_frame vs. .debug_frame. Although the call frame info is exactly the same in both, the CIE differs, which the current kernel unwinder can't cope with. This broke both the kernel unwinder as well as loadable modules (latter because of a new unhandled relo R_ARC_32_PCREL from .rela.eh_frame in the module loader) The ideal solution would be to switch unwinder to .eh_frame. For now however we can make do by just ensureing .debug_frame is generated by removing -fasynchronous-unwind-tables .eh_frame generated with -gdwarf-2 -fasynchronous-unwind-tables .debug_frame generated with -gdwarf-2 Fixes STAR 9001058196 Cc: stable@vger.kernel.org Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-06-27Merge tag 'for-v4.7-rc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply Pull power supply fixes from Sebastian Reichel. * tag 'for-v4.7-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: power_supply: tps65217-charger: Fix NULL deref during property export power_supply: power_supply_read_temp only if use_cnt > 0
2016-06-27Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: vmmouse - remove port reservation Input: elantech - add more IC body types to the list Input: wacom_w8001 - ignore invalid pen data packets Input: wacom_w8001 - w8001_MAX_LENGTH should be 13 Input: xpad - fix oops when attaching an unknown Xbox One gamepad MAINTAINERS: add Pali Rohár as reviewer of ALPS PS/2 touchpad driver Input: add HDMI CEC specific keycodes Input: add BUS_CEC type Input: xpad - fix rumble on Xbox One controllers with 2015 firmware
2016-06-27Merge branch 'rc-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull kbuild regression fix from Michal Marek: "The problem is that commit 9c8fa9bc08f6 ("kbuild: fix if_change and friends to consider argument order") fixed a potential missed rebuild, but this results in unnnecessary rebuilds with the packaging targets. Which is still more correct than the previous logic, but also very annoying" * 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: kbuild: Initialize exported variables
2016-06-27Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifs fixes from Steve French: "Various small cifs/smb3 fixes, include some for stable, and some from the recent SMB3 test event" * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: File names with trailing period or space need special case conversion Fix reconnect to not defer smb3 session reconnect long after socket reconnect cifs: check hash calculating succeeded cifs: dynamic allocation of ntlmssp blob cifs: use CIFS_MAX_DOMAINNAME_LEN when converting the domain name cifs: stuff the fl_owner into "pid" field in the lock request
2016-06-27Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "This fixes the following issues: - Missing length check for user-space GETALG request - Bogus memmove length in ux500 driver - Incorrect priority setting for vmx driver - Incorrect ABI selection for vmx driver" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: user - re-add size check for CRYPTO_MSG_GETALG crypto: ux500 - memmove the right size crypto: vmx - Increase priority of aes-cbc cipher crypto: vmx - Fix ABI detection
2016-06-27vsock: make listener child lock ordering explicitStefan Hajnoczi
There are several places where the listener and pending or accept queue child sockets are accessed at the same time. Lockdep is unhappy that two locks from the same class are held. Tell lockdep that it is safe and document the lock ordering. Originally Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> sent a similar patch asking whether this is safe. I have audited the code and also covered the vsock_pending_work() function. Suggested-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-27ipv6: enforce egress device match in per table nexthop lookupsPaolo Abeni
with the commit 8c14586fc320 ("net: ipv6: Use passed in table for nexthop lookups"), net hop lookup is first performed on route creation in the passed-in table. However device match is not enforced in table lookup, so the found route can be later discarded due to egress device mismatch and no global lookup will be performed. This cause the following to fail: ip link add dummy1 type dummy ip link add dummy2 type dummy ip link set dummy1 up ip link set dummy2 up ip route add 2001:db8:8086::/48 dev dummy1 metric 20 ip route add 2001:db8:d34d::/64 via 2001:db8:8086::2 dev dummy1 metric 20 ip route add 2001:db8:8086::/48 dev dummy2 metric 21 ip route add 2001:db8:d34d::/64 via 2001:db8:8086::2 dev dummy2 metric 21 RTNETLINK answers: No route to host This change fixes the issue enforcing device lookup in ip6_nh_lookup_table() v1->v2: updated commit message title Fixes: 8c14586fc320 ("net: ipv6: Use passed in table for nexthop lookups") Reported-and-tested-by: Beniamino Galvani <bgalvani@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-27Merge tag 'linux-can-fixes-for-4.7-20160623' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2016-06-23 this is a pull request of 3 patches for the upcoming linux-4.7 release. The first two patches are by Oliver Hartkopp fixing oopes in the generic CAN device netlink handling. Jimmy Assarsson's patch for the kvaser_usb driver adds support for more devices by adding their USB product ids. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-27KVM: nVMX: VMX instructions: fix segment checks when L1 is in long mode.Quentin Casasnovas
I couldn't get Xen to boot a L2 HVM when it was nested under KVM - it was getting a GP(0) on a rather unspecial vmread from Xen: (XEN) ----[ Xen-4.7.0-rc x86_64 debug=n Not tainted ]---- (XEN) CPU: 1 (XEN) RIP: e008:[<ffff82d0801e629e>] vmx_get_segment_register+0x14e/0x450 (XEN) RFLAGS: 0000000000010202 CONTEXT: hypervisor (d1v0) (XEN) rax: ffff82d0801e6288 rbx: ffff83003ffbfb7c rcx: fffffffffffab928 (XEN) rdx: 0000000000000000 rsi: 0000000000000000 rdi: ffff83000bdd0000 (XEN) rbp: ffff83000bdd0000 rsp: ffff83003ffbfab0 r8: ffff830038813910 (XEN) r9: ffff83003faf3958 r10: 0000000a3b9f7640 r11: ffff83003f82d418 (XEN) r12: 0000000000000000 r13: ffff83003ffbffff r14: 0000000000004802 (XEN) r15: 0000000000000008 cr0: 0000000080050033 cr4: 00000000001526e0 (XEN) cr3: 000000003fc79000 cr2: 0000000000000000 (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008 (XEN) Xen code around <ffff82d0801e629e> (vmx_get_segment_register+0x14e/0x450): (XEN) 00 00 41 be 02 48 00 00 <44> 0f 78 74 24 08 0f 86 38 56 00 00 b8 08 68 00 (XEN) Xen stack trace from rsp=ffff83003ffbfab0: ... (XEN) Xen call trace: (XEN) [<ffff82d0801e629e>] vmx_get_segment_register+0x14e/0x450 (XEN) [<ffff82d0801f3695>] get_page_from_gfn_p2m+0x165/0x300 (XEN) [<ffff82d0801bfe32>] hvmemul_get_seg_reg+0x52/0x60 (XEN) [<ffff82d0801bfe93>] hvm_emulate_prepare+0x53/0x70 (XEN) [<ffff82d0801ccacb>] handle_mmio+0x2b/0xd0 (XEN) [<ffff82d0801be591>] emulate.c#_hvm_emulate_one+0x111/0x2c0 (XEN) [<ffff82d0801cd6a4>] handle_hvm_io_completion+0x274/0x2a0 (XEN) [<ffff82d0801f334a>] __get_gfn_type_access+0xfa/0x270 (XEN) [<ffff82d08012f3bb>] timer.c#add_entry+0x4b/0xb0 (XEN) [<ffff82d08012f80c>] timer.c#remove_entry+0x7c/0x90 (XEN) [<ffff82d0801c8433>] hvm_do_resume+0x23/0x140 (XEN) [<ffff82d0801e4fe7>] vmx_do_resume+0xa7/0x140 (XEN) [<ffff82d080164aeb>] context_switch+0x13b/0xe40 (XEN) [<ffff82d080128e6e>] schedule.c#schedule+0x22e/0x570 (XEN) [<ffff82d08012c0cc>] softirq.c#__do_softirq+0x5c/0x90 (XEN) [<ffff82d0801602c5>] domain.c#idle_loop+0x25/0x50 (XEN) (XEN) (XEN) **************************************** (XEN) Panic on CPU 1: (XEN) GENERAL PROTECTION FAULT (XEN) [error_code=0000] (XEN) **************************************** Tracing my host KVM showed it was the one injecting the GP(0) when emulating the VMREAD and checking the destination segment permissions in get_vmx_mem_address(): 3) | vmx_handle_exit() { 3) | handle_vmread() { 3) | nested_vmx_check_permission() { 3) | vmx_get_segment() { 3) 0.074 us | vmx_read_guest_seg_base(); 3) 0.065 us | vmx_read_guest_seg_selector(); 3) 0.066 us | vmx_read_guest_seg_ar(); 3) 1.636 us | } 3) 0.058 us | vmx_get_rflags(); 3) 0.062 us | vmx_read_guest_seg_ar(); 3) 3.469 us | } 3) | vmx_get_cs_db_l_bits() { 3) 0.058 us | vmx_read_guest_seg_ar(); 3) 0.662 us | } 3) | get_vmx_mem_address() { 3) 0.068 us | vmx_cache_reg(); 3) | vmx_get_segment() { 3) 0.074 us | vmx_read_guest_seg_base(); 3) 0.068 us | vmx_read_guest_seg_selector(); 3) 0.071 us | vmx_read_guest_seg_ar(); 3) 1.756 us | } 3) | kvm_queue_exception_e() { 3) 0.066 us | kvm_multiple_exception(); 3) 0.684 us | } 3) 4.085 us | } 3) 9.833 us | } 3) + 10.366 us | } Cross-checking the KVM/VMX VMREAD emulation code with the Intel Software Developper Manual Volume 3C - "VMREAD - Read Field from Virtual-Machine Control Structure", I found that we're enforcing that the destination operand is NOT located in a read-only data segment or any code segment when the L1 is in long mode - BUT that check should only happen when it is in protected mode. Shuffling the code a bit to make our emulation follow the specification allows me to boot a Xen dom0 in a nested KVM and start HVM L2 guests without problems. Fixes: f9eb4af67c9d ("KVM: nVMX: VMX instructions: add checks for #GP/#SS exceptions") Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Eugene Korenevsky <ekorenevsky@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-27KVM: LAPIC: cap __delay at lapic_timer_advance_nsMarcelo Tosatti
The host timer which emulates the guest LAPIC TSC deadline timer has its expiration diminished by lapic_timer_advance_ns nanoseconds. Therefore if, at wait_lapic_expire, a difference larger than lapic_timer_advance_ns is encountered, delay at most lapic_timer_advance_ns. This fixes a problem where the guest can cause the host to delay for large amounts of time. Reported-by: Alan Jenkins <alan.christopher.jenkins@gmail.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-27KVM: x86: move nsec_to_cycles from x86.c to x86.hMarcelo Tosatti
Move the inline function nsec_to_cycles from x86.c to x86.h, as the next patch uses it from lapic.c. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-27pvclock: Get rid of __pvclock_read_cycles in function pvclock_read_flagsMinfei Huang
There is a generic function __pvclock_read_cycles to be used to get both flags and cycles. For function pvclock_read_flags, it's useless to get cycles value. To make this function be more effective, get this variable flags directly in function. Signed-off-by: Minfei Huang <mnghuan@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-27pvclock: Cleanup to remove function pvclock_get_nsec_offsetMinfei Huang
Function __pvclock_read_cycles is short enough, so there is no need to have another function pvclock_get_nsec_offset to calculate tsc delta. It's better to combine it into function __pvclock_read_cycles. Remove useless variables in function __pvclock_read_cycles. Signed-off-by: Minfei Huang <mnghuan@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-27pvclock: Add CPU barriers to get correct version valueMinfei Huang
Protocol for the "version" fields is: hypervisor raises it (making it uneven) before it starts updating the fields and raises it again (making it even) when it is done. Thus the guest can make sure the time values it got are consistent by checking the version before and after reading them. Add CPU barries after getting version value just like what function vread_pvclock does, because all of callees in this function is inline. Fixes: 502dfeff239e8313bfbe906ca0a1a6827ac8481b Cc: stable@vger.kernel.org Signed-off-by: Minfei Huang <mnghuan@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-27make nfs_atomic_open() call d_drop() on all ->open_context() errors.Al Viro
In "NFSv4: Move dentry instantiation into the NFSv4-specific atomic open code" unconditional d_drop() after the ->open_context() had been removed. It had been correct for success cases (there ->open_context() itself had been doing dcache manipulations), but not for error ones. Only one of those (ENOENT) got a compensatory d_drop() added in that commit, but in fact it should've been done for all errors. As it is, the case of O_CREAT non-exclusive open on a hashed negative dentry racing with e.g. symlink creation from another client ended up with ->open_context() getting an error and proceeding to call nfs_lookup(). On a hashed dentry, which would've instantly triggered BUG_ON() in d_materialise_unique() (or, these days, its equivalent in d_splice_alias()). Cc: stable@vger.kernel.org # v3.10+ Tested-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-06-27KVM: arm/arm64: Stop leaking vcpu pid referencesJames Morse
kvm provides kvm_vcpu_uninit(), which amongst other things, releases the last reference to the struct pid of the task that was last running the vcpu. On arm64 built with CONFIG_DEBUG_KMEMLEAK, starting a guest with kvmtool, then killing it with SIGKILL results (after some considerable time) in: > cat /sys/kernel/debug/kmemleak > unreferenced object 0xffff80007d5ea080 (size 128): > comm "lkvm", pid 2025, jiffies 4294942645 (age 1107.776s) > hex dump (first 32 bytes): > 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > backtrace: > [<ffff8000001b30ec>] create_object+0xfc/0x278 > [<ffff80000071da34>] kmemleak_alloc+0x34/0x70 > [<ffff80000019fa2c>] kmem_cache_alloc+0x16c/0x1d8 > [<ffff8000000d0474>] alloc_pid+0x34/0x4d0 > [<ffff8000000b5674>] copy_process.isra.6+0x79c/0x1338 > [<ffff8000000b633c>] _do_fork+0x74/0x320 > [<ffff8000000b66b0>] SyS_clone+0x18/0x20 > [<ffff800000085cb0>] el0_svc_naked+0x24/0x28 > [<ffffffffffffffff>] 0xffffffffffffffff On x86 kvm_vcpu_uninit() is called on the path from kvm_arch_destroy_vm(), on arm no equivalent call is made. Add the call to kvm_arch_vcpu_free(). Signed-off-by: James Morse <james.morse@arm.com> Fixes: 749cf76c5a36 ("KVM: ARM: Initial skeleton to compile KVM support") Cc: <stable@vger.kernel.org> # 3.10+ Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-06-27arm64: KVM: fix build with CONFIG_ARM_PMU disabledSudeep Holla
When CONFIG_ARM_PMU is disabled, we get the following build error: arch/arm64/kvm/sys_regs.c: In function 'pmu_counter_idx_valid': arch/arm64/kvm/sys_regs.c:564:27: error: 'ARMV8_PMU_CYCLE_IDX' undeclared (first use in this function) if (idx >= val && idx != ARMV8_PMU_CYCLE_IDX) ^ arch/arm64/kvm/sys_regs.c:564:27: note: each undeclared identifier is reported only once for each function it appears in arch/arm64/kvm/sys_regs.c: In function 'access_pmu_evcntr': arch/arm64/kvm/sys_regs.c:592:10: error: 'ARMV8_PMU_CYCLE_IDX' undeclared (first use in this function) idx = ARMV8_PMU_CYCLE_IDX; ^ arch/arm64/kvm/sys_regs.c: In function 'access_pmu_evtyper': arch/arm64/kvm/sys_regs.c:638:14: error: 'ARMV8_PMU_CYCLE_IDX' undeclared (first use in this function) if (idx == ARMV8_PMU_CYCLE_IDX) ^ arch/arm64/kvm/hyp/switch.c:86:15: error: 'ARMV8_PMU_USERENR_MASK' undeclared (first use in this function) write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0); This patch fixes the build with CONFIG_ARM_PMU disabled. Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-06-27net: smsc911x: Fix bug where PHY interrupts are overwritten by 0Jeremy Linton
By default, mdiobus_alloc() sets the PHYs to polling mode, but a pointer size memcpy means that a couple IRQs end up being overwritten with a value of 0. This means that PHY_POLL is disabled and results in unpredictable behavior depending on the PHY's location on the MDIO bus. Remove that memcpy and the now unused phy_irq member to force the SMSC911x PHYs into polling mode 100% of the time. Fixes: e7f4dc3536a4 ("mdio: Move allocation of interrupts into core") Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-26Linux 4.7-rc5Linus Torvalds
2016-06-26Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two straightforward fixes. One is a concurrency issue only affecting SAS connected SATA drives, but which could hang the storage subsystem if it triggers (because the outstanding command count on error never goes back to zero) and the other is a NO_TAG fallout from the switch to hostwide tags which causes the system to crash on module insertion (we've checked carefully and only the 53c700 family of drivers is vulnerable to this issue)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: 53c700: fix BUG on untagged commands scsi: fix race between simultaneous decrements of ->host_failed
2016-06-25Merge branch 'for-linus-4.7-part2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes part 2 from Chris Mason: "This has one patch from Omar to bring iterate_shared back to btrfs. We have a tree of work we queue up for directory items and it doesn't lend itself well to shared access. While we're cleaning it up, Omar has changed things to use an exclusive lock when there are delayed items" * 'for-linus-4.7-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: fix ->iterate_shared() by upgrading i_rwsem for delayed nodes
2016-06-25Merge branch 'for-linus-4.7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "I have a two part pull this time because one of the patches Dave Sterba collected needed to be against v4.7-rc2 or higher (we used rc4). I try to make my for-linus-xx branch testable on top of the last major so we can hand fixes to people on the list more easily, so I've split this pull in two. This first part has some fixes and two performance improvements that we've been testing for some time. Josef's two performance fixes are most notable. The transid tracking patch makes a big improvement on pretty much every workload" * 'for-linus-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: Force stripesize to the value of sectorsize btrfs: fix disk_i_size update bug when fallocate() fails Btrfs: fix error handling in map_private_extent_buffer Btrfs: fix error return code in btrfs_init_test_fs() Btrfs: don't do nocow check unless we have to btrfs: fix deadlock in delayed_ref_async_start Btrfs: track transid for delayed ref flushing
2016-06-25Merge tag 'sound-4.7-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Again pretty calm weeks: we've had only a few trivial / stable HD-audio fixes in addition to a possible race fix for snd-dummy driver spotted by syzkaller" * tag 'sound-4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: dummy: Fix a use-after-free at closing ALSA: hda / realtek - add two more Thinkpad IDs (5050,5053) for tpt460 fixup ALSA: hda - Fix the headset mic jack detection on Dell machine ALSA: hda/tegra: iomem fixups for sparse warnings ALSA: hdac_regmap - fix the register access for runtime PM
2016-06-25Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 kprobe fix from Thomas Gleixner: "A single fix clearing the TF bit when a fault is single stepped" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kprobes/x86: Clear TF bit in fault on single-stepping
2016-06-25Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Thomas Gleixner: "A couple of scheduler fixes: - force watchdog reset while processing sysrq-w - fix a deadlock when enabling trace events in the scheduler - fixes to the throttled next buddy logic - fixes for the average accounting (missing serialization and underflow handling) - allow kernel threads for fallback to online but not active cpus" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/core: Allow kthreads to fall back to online && !active cpus sched/fair: Do not announce throttled next buddy in dequeue_task_fair() sched/fair: Initialize throttle_count for new task-groups lazily sched/fair: Fix cfs_rq avg tracking underflow kernel/sysrq, watchdog, sched/core: Reset watchdog on all CPUs while processing sysrq-w sched/debug: Fix deadlock when enabling sched events sched/fair: Fix post_init_entity_util_avg() serialization