Age | Commit message | Author
2014-08-13 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net | David S. Miller
Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2014-08-12 This series contains updates to i40e and e1000e. Lucas provides a fix for i40e to resolve a compile issue where a header was missing in the #includes. Wei Yongjun provides a fix for i40e to resolve a sparse warning, where a non-static function should be static. Julia Lawall provides a fix for i40e which was found using Coccinelle, where there was a typo in the name of the type given to sizeof(). Rickard Strandqvist provides a fix for i40e to replace the use of strncpy() with strlcpy() to avoid strings that lack null termination. Jean Sacren provides two e1000e fixes, first is a comment fix and second removes an excessive space character in a debug message. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 | Merge branch 'xen-netback-synchronization' | David S. Miller
Wei Liu says: ==================== xen-netback: synchronisation between core driver and netback The zero-copy netback has far more interactions with core network driver than the old copying backend. One significant thing is that netback now relies on a callback from core driver to correctly release resources. However correct synchronisation between core driver and netback is missing. Currently netback relies on a loop to wait for core driver to release resources. This is proven not enough and erroneous recently, partly due to code structure, partly due to missing synchronisation. Short-lived domains like OpenMirage unikernels can easily trigger race in backend, rendering backend unresponsive. This patch series aims to solve this issue by introducing proper synchronisation between core driver and netback. Changes in v4: * avoid using wait queue * remove dedicated loop for netif_napi_del * remove unnecessary check on callback Change in v3: improve commit message in patch 1 Change in v2: fix Zoltan's email address in commit message ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 | xen-netback: remove loop waiting function | Wei Liu
The original implementation relies on a loop to check if all inflight packets are freed. Now we have proper reference counting, there's no need to use loop anymore. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 | xen-netback: don't stop dealloc kthread too early | Wei Liu
Reference count the number of packets in host stack, so that we don't stop the deallocation thread too early. If not, we can end up with xenvif_free permanently waiting for deallocation thread to unmap grefs. Reported-by: Thomas Leonard <talex5@gmail.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
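The reference-counting idea in this commit message can be sketched in userspace C. This is only an illustration under stated assumptions: `struct queue_sketch` and all function names are hypothetical, and C11 atomics stand in for whatever primitives the driver actually uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-queue structure; not the xen-netback layout. */
struct queue_sketch {
    atomic_int inflight_packets; /* packets handed to the host stack, not yet released */
};

/* Called when a packet is handed to the host stack. */
static void inflight_inc(struct queue_sketch *q)
{
    atomic_fetch_add(&q->inflight_packets, 1);
}

/* Called from the release callback once the stack is done with the packet. */
static void inflight_dec(struct queue_sketch *q)
{
    int prev = atomic_fetch_sub(&q->inflight_packets, 1);
    assert(prev > 0); /* a release without a matching hand-off would be a bug */
}

/* The deallocation thread may stop only when asked to AND nothing is in flight. */
static bool dealloc_may_stop(struct queue_sketch *q, bool stop_requested)
{
    return stop_requested && atomic_load(&q->inflight_packets) == 0;
}
```

With a counter like this, the teardown path never waits on a thread that has already exited while packets were still outstanding.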
2014-08-13 | xen-netback: move NAPI add/remove calls | Wei Liu
Originally netif_napi_add was in xenvif_init_queue and netif_napi_del was in xenvif_deinit_queue, while kthreads were handled in xenvif_connect and xenvif_disconnect. Move netif_napi_add and netif_napi_del to xenvif_connect and xenvif_disconnect so that they reside together with kthread operations. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 | Merge branch 'xen-netback-debugfs' | David S. Miller
Wei Liu says: ==================== xen-netback: fix debugfs code This small series fixes two problems in xen-netback debugfs code. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 | xen-netback: fix debugfs entry creation | Wei Liu
The original code is bogus. The function gets called in a loop which leaks entries created in previous rounds. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Zoltan Kiss <zoltan.kiss@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 | xen-netback: fix debugfs write length check | Wei Liu
Enlarge buffer size and check input length properly, so that we don't misuse -ENOSPC. Note that command like "kickXXXX" is still allowed, that's one patch for another day if we really want to be very strict on this. Reported-by: SeeChen Ng <seechen81@gmail.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Zoltan Kiss <zoltan.kiss@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
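A length check of the kind described here might look like the sketch below. The buffer size, the function name `copy_command`, and the choice to return -ENOSPC only when the input genuinely cannot fit are all assumptions made for illustration; the patch itself is not shown in this log.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define CMD_BUF_SIZE 32 /* hypothetical size, chosen for illustration */

/*
 * Copy a caller-supplied command into buf and NUL-terminate it.
 * Returns the number of bytes consumed, or -ENOSPC when the input
 * (plus the terminating NUL) cannot fit in the buffer.
 */
static long copy_command(char *buf, size_t buf_size,
                         const char *input, size_t count)
{
    assert(buf != NULL && input != NULL);

    /* ">=" rather than ">" leaves room for the terminating NUL. */
    if (buf_size == 0 || count >= buf_size)
        return -ENOSPC;

    memcpy(buf, input, count);
    buf[count] = '\0';
    return (long)count;
}
```

As the commit message notes, a prefix match on the copied string would still accept input such as "kickXXXX"; the sketch does not try to address that.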
2014-08-13 | net-timestamp: fix missing tcp fragmentation cases | Willem de Bruijn
Bytestream timestamps are correlated with a single byte in the skbuff, recorded in skb_shinfo(skb)->tskey. When fragmenting skbuffs, ensure that the tskey is set for the fragment in which the tskey falls (seqno <= tskey < end_seqno). The original implementation did not address fragmentation in tcp_fragment or tso_fragment. Add code to inspect the sequence numbers and move both tskey and the relevant tx_flags if necessary. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
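The range test (seqno <= tskey < end_seqno) has to survive 32-bit sequence number wraparound. A minimal sketch, with hypothetical helper names rather than the kernel's own, could be:

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "a comes before b" for 32-bit sequence numbers. */
static bool seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* True when seqno <= tskey < end_seqno, modulo 2^32. */
static bool tskey_in_fragment(uint32_t tskey, uint32_t seqno, uint32_t end_seqno)
{
    return !seq_before(tskey, seqno) && seq_before(tskey, end_seqno);
}
```

When a buffer is split, whichever half satisfies this test for the recorded key is the one that should carry the key and the related transmit flags.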
2014-08-13 | net-timestamp: fix missing ACK timestamp | Willem de Bruijn
ACK timestamps are generated in tcp_clean_rtx_queue. The TSO datapath can break out early, causing the timestamp code to be skipped. Move the code up before the break. Reported-by: David S. Miller <davem@davemloft.net> Also fix a boundary condition: tp->snd_una is the next unacknowledged byte and between() tests are inclusive (a <= b <= c), so generate an ACK timestamp if (prior_snd_una <= tskey <= tp->snd_una - 1). Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
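The boundary condition can be checked with a small userspace sketch. The helper names are hypothetical, and the explicit guard for "nothing newly acknowledged" is an assumption added here so that the inclusive range is never computed for an empty interval.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when low <= x <= high, modulo 2^32 (inclusive on both ends). */
static bool seq_between_inclusive(uint32_t x, uint32_t low, uint32_t high)
{
    return (uint32_t)(high - low) >= (uint32_t)(x - low);
}

/*
 * snd_una is the next unacknowledged byte, so the last byte covered by
 * this ACK is snd_una - 1. The key is acknowledged when
 * prior_snd_una <= tskey <= snd_una - 1.
 */
static bool ack_covers_tskey(uint32_t prior_snd_una, uint32_t snd_una,
                             uint32_t tskey)
{
    if (snd_una == prior_snd_una)
        return false; /* assumption: an ACK that advances nothing covers nothing */
    return seq_between_inclusive(tskey, prior_snd_una, snd_una - 1);
}
```

Testing against snd_una itself instead of snd_una - 1 would report a timestamp for a byte that has not been acknowledged yet.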
2014-08-13 | drivers/net/irda/donauboe.c: convert to module_pci_driver | Libo Chen
Signed-off-by: Libo Chen <libo.chen@huawei.com> Cc: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 | irda: Fix rd_frame control field initialization in irlap_send_rd_frame() | Maks Naumov
Signed-off-by: Maks Naumov <maksqwe1@ukr.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 | libcxgbi/cxgb4i : Fix ipv6 build failure caught with randconfig | Anish Bhatt
Previous guard of IS_ENABLED(CONFIG_IPV6) is not sufficient when cxgbi drivers are built into kernel but ipv6 is not. v2: Use Kconfig to disable compiling cxgbi built into kernel when ipv6 is compiled as a module Fixes: e81fbf6cd652 ("libcxgbi:cxgb4i Guard ipv6 code with a config check") Fixes: fc8d0590d914 ("libcxgbi: Add ipv6 api to driver") Signed-off-by: Anish Bhatt <anish@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 | tg3: fix return value in tg3_get_stats64 | Govindarajulu Varadarajan
When tp->hw_stats is 0, tg3_get_stats64 should display previously recorded stats. So it returns &tp->net_stats_prev. But the caller, dev_get_stats, ignores the return value. Fix this by assigning tp->net_stats_prev to stats and returning stats. Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> Acked-by: Prashant Sreedharan <prashant@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
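The bug pattern described here, returning a pointer that the caller ignores instead of filling the caller's buffer, can be reproduced in a few lines. The structures and function names below are hypothetical stand-ins, not the tg3 driver's definitions.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the driver's structures. */
struct stats_sketch {
    unsigned long long rx_packets;
    unsigned long long tx_packets;
};

struct dev_sketch {
    const struct stats_sketch *hw_stats; /* NULL while hardware stats are unavailable */
    struct stats_sketch stats_prev;      /* last recorded snapshot */
};

/* Buggy shape: when hw_stats is NULL, the caller's buffer is left untouched. */
static struct stats_sketch *get_stats_buggy(struct dev_sketch *dev,
                                            struct stats_sketch *out)
{
    if (!dev->hw_stats)
        return &dev->stats_prev; /* useless if the caller ignores the return value */
    *out = *dev->hw_stats;
    return out;
}

/* Fixed shape: always fill the caller's buffer, then return it. */
static struct stats_sketch *get_stats_fixed(struct dev_sketch *dev,
                                            struct stats_sketch *out)
{
    if (!dev->hw_stats) {
        *out = dev->stats_prev;
        return out;
    }
    *out = *dev->hw_stats;
    return out;
}
```

A caller that reads only `out` sees zeros with the first version and the previously recorded values with the second.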
2014-08-13 | sunvnet: Schedule maybe_tx_wakeup() as a tasklet from ldc_rx path | Sowmini Varadhan
At the tail of vnet_event(), if we hit the maybe_tx_wakeup() condition, we try to take the netif_tx_lock() in the recv-interrupt-context and can deadlock with dev_watchdog(). vnet_event() should schedule maybe_tx_wakeup() as a tasklet to avoid this deadlock Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 | sunvnet: Do not spin in an infinite loop when vio_ldc_send() returns EAGAIN | Sowmini Varadhan
ldc_rx -> vnet_rx -> .. -> vnet_walk_rx->vnet_send_ack should not spin into an infinite loop waiting EAGAIN to lift. The sender could have sent us a burst, and gone to lunch without doing any more ldc_read()'s. That should not cause the receiver to loop infinitely till soft-lockup kicks in. Similarly __vnet_tx_trigger should only loop on EAGAIN a finite number of times. The caller (vnet_start_xmit()) already has code to reset the dring state and bail on errors from __vnet_tx_trigger Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Raghuram Kothakota <raghuram.kothakota@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
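Looping on EAGAIN only a finite number of times can be sketched as follows. The retry cap, the callback type, and the `flaky_send` test double are hypothetical; the driver's real limit and any delay between attempts are not stated in this log.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_SEND_RETRIES 10 /* hypothetical cap, chosen for illustration */

typedef int (*send_fn)(void *ctx);

/* Retry while the sender reports -EAGAIN, but give up after a fixed number of attempts. */
static int send_with_bounded_retry(send_fn send, void *ctx)
{
    int err = -EAGAIN;

    assert(send != NULL);
    for (int attempt = 0; attempt < MAX_SEND_RETRIES; attempt++) {
        err = send(ctx);
        if (err != -EAGAIN)
            break; /* success or a hard error: stop retrying */
    }
    return err; /* still -EAGAIN if the peer never drained its queue */
}

/* Test double: reports -EAGAIN until the counter behind ctx reaches zero. */
static int flaky_send(void *ctx)
{
    int *remaining_failures = ctx;

    if (*remaining_failures > 0) {
        (*remaining_failures)--;
        return -EAGAIN;
    }
    return 0;
}
```

The caller then decides what to do with a persistent -EAGAIN, much as the commit message describes vnet_start_xmit() resetting the dring state and bailing out.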
2014-08-13 | sunvnet: Do not ask for an ACK for every dring transmit | Sowmini Varadhan
No need to ask for an ack with every vnet_start_xmit(): the single ACK with DRING_STOPPED is sufficient for the protocol, and we free the sk_buff in vnet_start_xmit itself, so we don't need an ACK back. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Raghuram Kothakota <raghuram.kothakota@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 | lec: Fix bug introduced by b67bfe0d42cac56c512dd5da4b1b347a23f4b70a | chas williams - CONTRACTOR
b67bfe0d42cac56c512dd5da4b1b347a23f4b70a (hlist: drop the node parameter from iterators) dropped the node parameter from iterators which lec_tbl_walk() was using to iterate the list. Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 | atm/svc: Fix blocking in wait loop | chas williams - CONTRACTOR
One should not call blocking primitives inside a wait loop, since both require task_struct::state to sleep, so the inner will destroy the outer state. sigd_enq() will possibly sleep for alloc_skb(). Move sigd_enq() before prepare_to_wait() to avoid sleeping while waiting interruptibly. You do not actually need to call sigd_enq() after the initial prepare_to_wait() because we test the termination condition before calling schedule(). Based on suggestions from Peter Zijlstra. Signed-off-by: Chas Williams <chas@cmf.n4rl.navy.mil> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 | myri10ge: check for DMA mapping errors | Stanislaw Gruszka
On IOMMU systems DMA mapping can fail, we need to check for that possibility. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 | openvswitch: Fix memory leak in ovs_vport_alloc() error path | Christoph Jaeger
ovs_vport_alloc() bails out without freeing the memory 'vport' points to. Picked up by Coverity - CID 1230503. Fixes: 5cd667b0a4 ("openvswitch: Allow each vport to have an array of 'port_id's.") Signed-off-by: Christoph Jaeger <cj@linux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
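The general shape of such an error-path leak, and of its fix, is shown below in plain userspace C. Everything here is a hypothetical model: the structure, the tracked allocator, and the simulated failure flag exist only to make the leak observable, and none of it is taken from the openvswitch code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

static int live_allocs; /* number of tracked allocations not yet freed */

static void *tracked_alloc(size_t size, bool simulate_failure)
{
    void *p = simulate_failure ? NULL : calloc(1, size);

    if (p)
        live_allocs++;
    return p;
}

static void tracked_free(void *p)
{
    if (p) {
        free(p);
        live_allocs--;
    }
}

/* Hypothetical object with a secondary array that is set up after the main allocation. */
struct vport_sketch {
    unsigned int *port_ids;
    size_t n_ids;
};

static struct vport_sketch *vport_alloc_sketch(size_t n_ids, bool fail_ids)
{
    struct vport_sketch *vport = tracked_alloc(sizeof(*vport), false);

    if (!vport)
        return NULL;

    vport->port_ids = tracked_alloc(n_ids * sizeof(*vport->port_ids), fail_ids);
    if (!vport->port_ids) {
        tracked_free(vport); /* without this line the outer object would leak */
        return NULL;
    }
    vport->n_ids = n_ids;
    return vport;
}

static void vport_free_sketch(struct vport_sketch *vport)
{
    if (!vport)
        return;
    tracked_free(vport->port_ids);
    tracked_free(vport);
}
```

Removing the `tracked_free(vport)` call in the failure branch leaves `live_allocs` at 1 after a failed allocation, which is the kind of leak a static analyzer flags.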
2014-08-13 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net | Linus Torvalds
Pull networking fixes from David Miller: "Several networking final fixes and tidies for the merge window:
1) Changes during the merge window unintentionally took away the ability to build bluetooth modular, fix from Geert Uytterhoeven.
2) Several phy_node reference count bug fixes from Uwe Kleine-König.
3) Fix ucc_geth build failures, also from Uwe Kleine-König.
4) Fix klog false positives when netlink messages go to network taps, by properly resetting the network header. Fix from Daniel Borkmann.
5) Sizing estimate of VF netlink messages is too small, from Jiri Benc.
6) New APM X-Gene SoC ethernet driver, from Iyappan Subramanian.
7) VLAN untagging is erroneously dependent upon whether the VLAN module is loaded or not, but there are generic dependencies that matter wrt what can be expected as the SKB enters the stack. Make the basic untagging generic code, and do it unconditionally. From Vlad Yasevich.
8) xen-netfront only has so many slots in its transmit queue so linearize packets that have too many frags. From Zoltan Kiss.
9) Fix suspend/resume PHY handling in bcmgenet driver, from Florian Fainelli" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (55 commits) net: bcmgenet: correctly resume adapter from Wake-on-LAN net: bcmgenet: update UMAC_CMD only when link is detected net: bcmgenet: correctly suspend and resume PHY device net: bcmgenet: request and enable main clock earlier net: ethernet: myricom: myri10ge: myri10ge.c: Cleaning up missing null-terminate after strncpy call xen-netfront: Fix handling packets on compound pages with skb_linearize net: fec: Support phys probed from devicetree and fixed-link smsc: replace WARN_ON() with WARN_ON_SMP() xen-netback: Don't deschedule NAPI when carrier off net: ethernet: qlogic: qlcnic: Remove duplicate object file from Makefile wan: wanxl: Remove typedefs from struct names m68k/atari: EtherNEC - ethernet support (ne) net: ethernet: ti: cpmac.c: Cleaning up missing null-terminate after strncpy call hdlc: Remove typedefs from struct names airo_cs: Remove typedef local_info_t atmel: Remove typedef atmel_priv_ioctl com20020_cs: Remove typedef com20020_dev_t ethernet: amd: Remove typedef local_info_t net: Always untag vlan-tagged traffic on input. drivers: net: Add APM X-Gene SoC ethernet driver support. ...
2014-08-13 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc | Linus Torvalds
Pull Sparc fixes from David Miller: "Sparc bug fixes, one of which was preventing successful SMP boots with mainline" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc64: Fix pcr_ops initialization and usage bugs. sparc64: Do not disable interrupts in nmi_cpu_busy() sparc: Hook up seccomp and getrandom system calls. sparc: fix decimal printf format specifiers prefixed with 0x
2014-08-13 | Merge branch 'x86-apic-for-linus' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/apic updates from Thomas Gleixner: "This is a major overhaul to the x86 apic subsystem consisting of the following parts: - Remove obsolete APIC driver abstractions (David Rientjes) - Use the irqdomain facilities to dynamically allocate IRQs for IOAPICs. This is a prerequisite to enable IOAPIC hotplug support, and it also frees up wasted vectors (Jiang Liu) - Misc fixlets. Despite the hiccup in Ingo's previous pull request - caused by the missing fixup for the suspend/resume issue reported by Borislav - I strongly recommend that this update finds its way into 3.17. Some history for you: This is preparatory work for physical IOAPIC hotplug. The first attempt to support this was done by Yinghai and I shot it down because it just added another layer of obscurity and complexity to the already existing mess without tackling the underlying shortcomings of the current implementation. After quite some on- and offlist discussions, I requested that the design of this functionality must use generic infrastructure, i.e. irq domains, which provide all the mechanisms to dynamically map linux interrupt numbers to physical interrupts. Jiang picked up the idea and did a great job of consolidating the existing interfaces to manage the x86 (IOAPIC) interrupt system by utilizing irq domains. The testing in tip, Linux-next and inside of Intel on various machines did not unearth any oddities until Borislav exposed it to one of his oddball machines. The issue was resolved quickly, but unfortunately the fix fell through the cracks and did not hit the tip tree before Ingo sent the pull request. Not entirely Ingo's fault, I also assumed that the fix was already merged when Ingo asked me whether he could send it. Nevertheless this work has a proper design, has undergone several rounds of review and the final fallout after applying it to tip and integrating it into Linux-next has been more than moderate.
It's the ground work not only for IOAPIC hotplug, it will also allow us to move the lowlevel vector allocation into the irqdomain hierarchy, which will benefit other architectures as well. Patches are posted already, but they are on hold for two weeks, see below. I really appreciate the competence and responsiveness Jiang has shown in the course of this endeavour. So I'm sure that any fallout of this will be addressed in a timely manner. FYI, I'm vanishing for 2 weeks into my annual kids summer camp kitchen duty^Wvacation, while you folks are drooling at KS/LinuxCon :) But HPA will have a look at the hopefully zero fallout until I'm back" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits) x86, irq, PCI: Keep IRQ assignment for PCI devices during suspend/hibernation x86/apic/vsmp: Make is_vsmp_box() static x86, apic: Remove enable_apic_mode callback x86, apic: Remove setup_portio_remap callback x86, apic: Remove multi_timer_check callback x86, apic: Replace noop_check_apicid_used x86, apic: Remove check_apicid_present callback x86, apic: Remove mps_oem_check callback x86, apic: Remove smp_callin_clear_local_apic callback x86, apic: Replace trampoline physical addresses with defaults x86, apic: Remove x86_32_numa_cpu_node callback x86: intel-mid: Use the new io_apic interfaces x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box() x86, irq: Clean up irqdomain transition code x86, irq, devicetree: Release IOAPIC pin when PCI device is disabled x86, irq, SFI: Release IOAPIC pin when PCI device is disabled x86, irq, mpparse: Release IOAPIC pin when PCI device is disabled x86, irq, ACPI: Release IOAPIC pin when PCI device is disabled x86, irq: Introduce helper functions to release IOAPIC pin x86, irq: Simplify the way to handle ISA IRQ ...
2014-08-13 | Merge branch 'x86-efi-for-linus' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/efi fixes from Peter Anvin: "Two EFI-related Kconfig changes, which happen to touch immediately adjacent lines in Kconfig and thus collapse to a single patch" * 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/efi: Enforce CONFIG_RELOCATABLE for EFI boot stub x86/efi: Fix 3DNow optimization build failure in EFI stub
2014-08-13 | Merge branch 'x86-xsave-for-linus' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/xsave changes from Peter Anvin: "This is a patchset to support the XSAVES instruction required to support context switch of supervisor-only features in upcoming silicon. This patchset missed the 3.16 merge window, which is why it is based on 3.15-rc7" * 'x86-xsave-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, xsave: Add forgotten inline annotation x86/xsaves: Clean up code in xstate offsets computation in xsave area x86/xsave: Make it clear that the XSAVE macros use (%edi)/(%rdi) Define kernel API to get address of each state in xsave area x86/xsaves: Enable xsaves/xrstors x86/xsaves: Call booting time xsaves and xrstors in setup_init_fpu_buf x86/xsaves: Save xstate to task's xsave area in __save_fpu during booting time x86/xsaves: Add xsaves and xrstors support for booting time x86/xsaves: Clear reserved bits in xsave header x86/xsaves: Use xsave/xrstor for saving and restoring user space context x86/xsaves: Use xsaves/xrstors for context switch x86/xsaves: Use xsaves/xrstors to save and restore xsave area x86/xsaves: Define a macro for handling xsave/xrstor instruction fault x86/xsaves: Define macros for xsave instructions x86/xsaves: Change compacted format xsave area header x86/alternative: Add alternative_input_2 to support alternative with two features and input x86/xsaves: Add a kernel parameter noxsaves to disable xsaves/xrstors
2014-08-13 | Merge tag 'metag-for-v3.17' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag Pull metag architecture updates from James Hogan: "Just a couple of minor static analysis fixes, removal of a NULL check that should never happen, and fix an error check where an unsigned value was being checked to see if it was negative" * tag 'metag-for-v3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: metag: cachepart: Fix failure check metag: hugetlbpage: Remove null pointer checks that could never happen
2014-08-13 | Merge tag 'nfs-for-3.17-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs | Linus Torvalds
Pull NFS client updates from Trond Myklebust: "Highlights include: - stable fix for a bug in nfs3_list_one_acl() - speed up NFS path walks by supporting LOOKUP_RCU - more read/write code cleanups - pNFS fixes for layout return on close - fixes for the RCU handling in the rpcsec_gss code - more NFS/RDMA fixes" * tag 'nfs-for-3.17-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (79 commits) nfs: reject changes to resvport and sharecache during remount NFS: Avoid infinite loop when RELEASE_LOCKOWNER getting expired error SUNRPC: remove all refcounting of groupinfo from rpcauth_lookupcred NFS: fix two problems in lookup_revalidate in RCU-walk NFS: allow lockless access to access_cache NFS: teach nfs_lookup_verify_inode to handle LOOKUP_RCU NFS: teach nfs_neg_need_reval to understand LOOKUP_RCU NFS: support RCU_WALK in nfs_permission() sunrpc/auth: allow lockless (rcu) lookup of credential cache. NFS: prepare for RCU-walk support but pushing tests later in code. NFS: nfs4_lookup_revalidate: only evaluate parent if it will be used. NFS: add checks for returned value of try_module_get() nfs: clear_request_commit while holding i_lock pnfs: add pnfs_put_lseg_async pnfs: find swapped pages on pnfs commit lists too nfs: fix comment and add warn_on for PG_INODE_REF nfs: check wait_on_bit_lock err in page_group_lock sunrpc: remove "ec" argument from encrypt_v2 operation sunrpc: clean up sparse endianness warnings in gss_krb5_wrap.c sunrpc: clean up sparse endianness warnings in gss_krb5_seal.c ...
2014-08-13 | Merge tag 'xfs-for-linus-3.17-rc1' of git://oss.sgi.com/xfs/xfs | Linus Torvalds
Pull xfs update from Dave Chinner: "This update contains: - conversion of the XFS core to pass negative error numbers - restructuring of core XFS code that is shared with userspace to fs/xfs/libxfs - introduction of sysfs interface for XFS - bulkstat refactoring - demand driven speculative preallocation removal - XFS now always requires 64 bit sectors to be configured - metadata verifier changes to ensure CRCs are calculated during log recovery - various minor code cleanups - miscellaneous bug fixes The diffstat is kind of noisy because of the restructuring of the code to make kernel/userspace code sharing simpler, along with the XFS wide change to use the standard negative error return convention (at last!)" * tag 'xfs-for-linus-3.17-rc1' of git://oss.sgi.com/xfs/xfs: (45 commits) xfs: fix coccinelle warnings xfs: flush both inodes in xfs_swap_extents xfs: fix swapext ilock deadlock xfs: kill xfs_vnode.h xfs: kill VN_MAPPED xfs: kill VN_CACHED xfs: kill VN_DIRTY() xfs: dquot recovery needs verifiers xfs: quotacheck leaves dquot buffers without verifiers xfs: ensure verifiers are attached to recovered buffers xfs: catch buffers written without verifiers attached xfs: avoid false quotacheck after unclean shutdown xfs: fix rounding error of fiemap length parameter xfs: introduce xfs_bulkstat_ag_ichunk xfs: require 64-bit sector_t xfs: fix uflags detection at xfs_fs_rm_xquota xfs: remove XFS_IS_OQUOTA_ON macros xfs: tidy up xfs_set_inode32 xfs: allow inode allocations in post-growfs disk space xfs: mark xfs_qm_quotacheck as static ...
2014-08-13 | Merge branch 'for_linus' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota, reiserfs, UDF updates from Jan Kara: "Scalability improvements for quota, a few reiserfs fixes, and couple of misc cleanups (udf, ext2)" * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: reiserfs: Fix use after free in journal teardown reiserfs: fix corruption introduced by balance_leaf refactor udf: avoid redundant memcpy when writing data in ICB fs/udf: re-use hex_asc_upper_{hi,lo} macros fs/quota: kernel-doc warning fixes udf: use linux/uaccess.h fs/ext2/super.c: Drop memory allocation cast quota: remove dqptr_sem quota: simplify remove_inode_dquot_ref() quota: avoid unnecessary dqget()/dqput() calls quota: protect Q_GETFMT by dqonoff_mutex
2014-08-13 | Merge branch 'for-linus' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph updates from Sage Weil: "There is a lot of refactoring and hardening of the libceph and rbd code here from Ilya that fix various smaller bugs, and a few more important fixes with clone overlap. The main fix is a critical change to the request_fn handling to not sleep that was exposed by the recent mutex changes (which will also go to the 3.16 stable series). Yan Zheng has several fixes in here for CephFS fixing ACL handling, time stamps, and request resends when the MDS restarts. Finally, there are a few cleanups from Himangi Saraogi based on Coccinelle" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (39 commits) libceph: set last_piece in ceph_msg_data_pages_cursor_init() correctly rbd: remove extra newlines from rbd_warn() messages rbd: allocate img_request with GFP_NOIO instead GFP_ATOMIC rbd: rework rbd_request_fn() ceph: fix kick_requests() ceph: fix append mode write ceph: fix sizeof(struct tYpO *) typo ceph: remove redundant memset(0) rbd: take snap_id into account when reading in parent info rbd: do not read in parent info before snap context rbd: update mapping size only on refresh rbd: harden rbd_dev_refresh() and callers a bit rbd: split rbd_dev_spec_update() into two functions rbd: remove unnecessary asserts in rbd_dev_image_probe() rbd: introduce rbd_dev_header_info() rbd: show the entire chain of parent images ceph: replace comma with a semicolon rbd: use rbd_segment_name_free() instead of kfree() ceph: check zero length in ceph_sync_read() ceph: reset r_resend_mds after receiving -ESTALE ...
2014-08-13 | Merge tag 'upstream-3.17-rc1' of git://git.infradead.org/linux-ubifs | Linus Torvalds
Pull UBI/UBIFS changes from Artem Bityutskiy: "No significant changes, mostly small fixes here and there. The more important fixes are: - UBI deleted list items while iterating the list with 'list_for_each_entry' - The UBI block driver did not work properly with very large UBI volumes" * tag 'upstream-3.17-rc1' of git://git.infradead.org/linux-ubifs: (21 commits) UBIFS: Add log overlap assertions Revert "UBIFS: add a log overlap assertion" UBI: bugfix in ubi_wl_flush() UBI: block: Avoid disk size integer overflow UBI: block: Set disk_capacity out of the mutex UBI: block: Make ubiblock_resize return something UBIFS: add a log overlap assertion UBIFS: remove unnecessary check UBIFS: remove mst_mutex UBIFS: kernel-doc warning fix UBI: init_volumes: Ignore volumes with no LEBs UBIFS: replace seq_printf by seq_puts UBIFS: replace count*size kzalloc by kcalloc UBIFS: kernel-doc warning fix UBIFS: fix error path in create_default_filesystem() UBIFS: fix spelling of "scanned" UBIFS: fix some comments UBIFS: remove useless @ecc in struct ubifs_scan_leb UBIFS: remove useless statements UBIFS: Add missing break statements in dbg_chk_pnode() ...
2014-08-13 | powerpc/thp: Add tracepoints to track hugepage invalidate | Aneesh Kumar K.V
Add tracepoints to track hugepage invalidation. This helps us debug difficult-to-track bugs. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 | powerpc/mm: Use read barrier when creating real_pte | Aneesh Kumar K.V
On ppc64 we support 4K hash pte with 64K page size. That requires us to track the hash pte slot information on a per 4k basis. We do that by storing the slot details in the second half of pte page. The pte bit _PAGE_COMBO is used to indicate whether the second half needs to be looked at while building real_pte. We need to use a read memory barrier while doing that so that the load of hidx is not reordered w.r.t. the _PAGE_COMBO check. On the store side we already do a lwsync in __hash_page_4K. CC: <stable@vger.kernel.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 | powerpc/thp: Use ACCESS_ONCE when loading pmdp | Aneesh Kumar K.V
We would get wrong results if the compiler recomputed old_pmd. Avoid that by using ACCESS_ONCE. CC: <stable@vger.kernel.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
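The effect of ACCESS_ONCE can be illustrated with a volatile cast, which forces the compiler to perform exactly one load instead of re-reading the location later. The macro and function names below are hypothetical, and `__typeof__` is a GCC/Clang extension; this is a userspace sketch, not the powerpc code.

```c
#include <stdint.h>

/* Sketch of an ACCESS_ONCE-style macro: read x through a volatile lvalue. */
#define ACCESS_ONCE_SKETCH(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical helper: take a single snapshot of the entry behind pmdp. */
static uint64_t load_pmd_once(uint64_t *pmdp)
{
    /* One volatile load; every later use works on this local copy. */
    uint64_t old_pmd = ACCESS_ONCE_SKETCH(*pmdp);

    return old_pmd;
}
```

Without the volatile access the compiler is free to drop the local copy and reload `*pmdp` at each use, so two uses of old_pmd could observe different values if another CPU updates the entry in between.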
2014-08-13 | powerpc/thp: Invalidate with vpn in loop | Aneesh Kumar K.V
As per ISA, for 4k base page size we compare 14..65 bits of VA specified with the entry_VA in tlb. That implies we need to make sure we do a tlbie with all the possible 4k va we used to access the 16MB hugepage. With 64k base page size we compare 14..57 bits of VA. Hence we cannot ignore the lower 24 bits of va while doing tlbie. We also cannot tlb invalidate a 16MB entry with just one tlbie instruction because we don't track which va was used to instantiate the tlb entry. CC: <stable@vger.kernel.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 | powerpc/thp: Handle combo pages in invalidate | Aneesh Kumar K.V
If we changed base page size of the segment, either via sub_page_protect or via remap_4k_pfn, we do a demote_segment which doesn't flush the hash table entries. We do a lazy hash page table flush for all mapped pages in the demoted segment. This happens when we handle hash page fault for these pages. We use _PAGE_COMBO bit along with _PAGE_HASHPTE to indicate whether a pte is backed by 4K hash pte. If we find _PAGE_COMBO not set on the pte, that implies that we could possibly have older 64K hash pte entries in the hash page table and we need to invalidate those entries. Use _PAGE_COMBO to determine the page size with which we should invalidate the hash table entries on unmap. CC: <stable@vger.kernel.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 | powerpc/thp: Invalidate old 64K based hash page mapping before insert of 4k pte | Aneesh Kumar K.V
If we changed base page size of the segment, either via sub_page_protect or via remap_4k_pfn, we do a demote_segment which doesn't flush the hash table entries. We do a lazy hash page table flush for all mapped pages in the demoted segment. This happens when we handle hash page fault for these pages. We use _PAGE_COMBO bit along with _PAGE_HASHPTE to indicate whether a pte is backed by 4K hash pte. If we find _PAGE_COMBO not set on the pte, that implies that we could possibly have older 64K hash pte entries in the hash page table and we need to invalidate those entries. Handle this correctly for 16M pages CC: <stable@vger.kernel.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 | powerpc/thp: Don't recompute vsid and ssize in loop on invalidate | Aneesh Kumar K.V
The segment identifier and segment size will remain the same in the loop, so we can compute them outside it. We also change the hugepage_invalidate interface so that we can use it in a later patch. CC: <stable@vger.kernel.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 | powerpc/thp: Add write barrier after updating the valid bit | Aneesh Kumar K.V
With hugepages, we store the hpte valid information in the pte page whose address is stored in the second half of the PMD. Use a write barrier to make sure clearing pmd busy bit and updating hpte valid info are ordered properly. CC: <stable@vger.kernel.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 | powerpc: reorder per-cpu NUMA information's initialization | Nishanth Aravamudan
There is an issue currently where NUMA information is used on powerpc (and possibly ia64) before it has been read from the device-tree, which leads to large slab consumption with CONFIG_SLUB and memoryless nodes. NUMA powerpc non-boot CPU's cpu_to_node/cpu_to_mem is only accurate after start_secondary(), similar to ia64, which is invoked via smp_init(). Commit 6ee0578b4daae ("workqueue: mark init_workqueues() as early_initcall()") made init_workqueues() be invoked via do_pre_smp_initcalls(), which is obviously before the secondary processors are online. Additionally, the following commits changed init_workqueues() to use cpu_to_node to determine the node to use for kthread_create_on_node: bce903809ab3f ("workqueue: add wq_numa_tbl_len and wq_numa_possible_cpumask[]") f3f90ad469342 ("workqueue: determine NUMA node of workers accourding to the allowed cpumask") Therefore, when init_workqueues() runs, it sees all CPUs as being on Node 0. On LPARs or KVM guests where Node 0 is memoryless, this leads to a high number of slab deactivations (http://www.spinics.net/lists/linux-mm/msg67489.html). Fix this by initializing the powerpc-specific CPU<->node/local memory node mapping as early as possible, which on powerpc is do_init_bootmem(). Currently that function initializes the mapping for the boot CPU, but we extend it to setup the mapping for all possible CPUs. Then, in smp_prepare_cpus(), we can correspondingly set the per-cpu values for all possible CPUs. That ensures that before the early_initcalls run (and really as early as possible), the per-cpu NUMA mapping is accurate. 
While testing memoryless nodes on PowerKVM guests with a fix to the workqueue logic to use cpu_to_mem() instead of cpu_to_node(), with a guest topology of:

    available: 2 nodes (0-1)
    node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
    node 0 size: 0 MB
    node 0 free: 0 MB
    node 1 cpus: 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
    node 1 size: 16336 MB
    node 1 free: 15329 MB
    node distances:
    node   0   1
      0:  10  40
      1:  40  10

the slab consumption decreases from

    Slab:       932416 kB
    SUnreclaim: 902336 kB

to

    Slab:       395264 kB
    SUnreclaim: 359424 kB

And we see a corresponding increase in the slab efficiency from

    slab             mem used   objs active   slabs active
    ------------------------------------------------------------
    kmalloc-16384    337 MB     11.28%        100.00%
    task_struct      288 MB      9.93%        100.00%

to

    slab             mem used   objs active   slabs active
    ------------------------------------------------------------
    kmalloc-16384     37 MB     100.00%       100.00%
    task_struct       31 MB     100.00%       100.00%

Powerpc didn't support memoryless nodes until recently (64bb80d87f01 "powerpc/numa: Enable CONFIG_HAVE_MEMORYLESS_NODES" and 8c272261194d "powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"). Those commits also helped improve memory consumption in these kinds of environments. Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13powerpc/perf/hv-24x7: Use kmem_cache_freeHimangi Saraogi
Free memory allocated with kmem_cache_zalloc using kmem_cache_free rather than kfree. The Coccinelle semantic patch that makes this change is as follows:

    // <smpl>
    @@
    expression x,E,c;
    @@

     x = \(kmem_cache_alloc\|kmem_cache_zalloc\|kmem_cache_alloc_node\)(c,...)
     ... when != x = E
         when != &x
    ?-kfree(x)
    +kmem_cache_free(c,x)
    // </smpl>

Signed-off-by: Himangi Saraogi <himangi774@gmail.com> Acked-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13powerpc/pseries/hvcserver: Fix endian issue in hvcs_get_partner_infoThomas Falcon
A buffer returned by H_VTERM_PARTNER_INFO contains device information in big endian format, causing problems for little endian architectures. This patch converts the values read from the buffer to CPU endianness. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
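The kind of conversion involved can be shown with a portable user-space sketch. The helper name and the buffer layout below are hypothetical. In the kernel, the be32_to_cpu() family of helpers performs this conversion, but the sketch decodes byte by byte so that it behaves the same on any host.

```c
#include <stdint.h>

/*
 * Decode a 32-bit big-endian field from a raw buffer.
 * Assembling the value from individual bytes gives the same result on
 * little endian and big endian hosts, unlike a direct pointer cast.
 */
static uint32_t sketch_be32_to_host(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}
```

Reading the same four bytes through a `uint32_t *` cast would yield a byte-swapped value on a little endian host, which is the class of bug the commit addresses.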
2014-08-13powerpc: Hard disable interrupts in xmonAnton Blanchard
xmon only soft disables interrupts. This seems like a bad idea - we certainly don't want decrementer and PMU exceptions going off when we are debugging something inside xmon. This issue was uncovered when the hard lockup detector went off inside xmon. To ensure we won't get a spurious hard lockup warning, I also call touch_nmi_watchdog() when exiting xmon. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13powerpc: remove duplicate definition of TEXASR_FSNishanth Aravamudan
It appears that commits 7f06f21d40a6 ("powerpc/tm: Add checking to treclaim/trechkpt") and e4e38121507a ("KVM: PPC: Book3S HV: Add transactional memory support") both added definitions of TEXASR_FS. Remove one of them. At the same time, fix the alignment of the remaining definition (should be tab-separated like the rest of the #defines). Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13powerpc/pseries: Avoid deadlock on removing ddwGavin Shan
Function remove_ddw() can be called from of_reconfig_notifier, where it may remove the dynamic DMA window property; removing the property invokes of_reconfig_notifier again. This leads to a deadlock, as the following backtrace shows. The patch fixes the issue by deferring the release of the dynamic DMA window property until the device node is released. ============================================= [ INFO: possible recursive locking detected ] 3.16.0+ #428 Tainted: G W --------------------------------------------- drmgr/2273 is trying to acquire lock: ((of_reconfig_chain).rwsem){.+.+..}, at: [<c000000000091890>] \ .__blocking_notifier_call_chain+0x40/0x78 but task is already holding lock: ((of_reconfig_chain).rwsem){.+.+..}, at: [<c000000000091890>] \ .__blocking_notifier_call_chain+0x40/0x78 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock((of_reconfig_chain).rwsem); lock((of_reconfig_chain).rwsem); *** DEADLOCK *** May be due to missing lock nesting notation 2 locks held by drmgr/2273: #0: (sb_writers#4){.+.+.+}, at: [<c0000000001cbe70>] \ .vfs_write+0xb0/0x1f8 #1: ((of_reconfig_chain).rwsem){.+.+..}, at: [<c000000000091890>] \ .__blocking_notifier_call_chain+0x40/0x78 stack backtrace: CPU: 17 PID: 2273 Comm: drmgr Tainted: G W 3.16.0+ #428 Call Trace: [c0000000137e7000] [c000000000013d9c] .show_stack+0x88/0x148 (unreliable) [c0000000137e70b0] [c00000000083cd34] .dump_stack+0x7c/0x9c [c0000000137e7130] [c0000000000b8afc] .__lock_acquire+0x128c/0x1c68 [c0000000137e7280] [c0000000000b9a4c] .lock_acquire+0xe8/0x104 [c0000000137e7350] [c00000000083588c] .down_read+0x4c/0x90 [c0000000137e73e0] [c000000000091890] .__blocking_notifier_call_chain+0x40/0x78 [c0000000137e7490] [c000000000091900] .blocking_notifier_call_chain+0x38/0x48 [c0000000137e7520] [c000000000682a28] .of_reconfig_notify+0x34/0x5c [c0000000137e75b0] [c000000000682a9c] .of_property_notify+0x4c/0x54 [c0000000137e7650] [c000000000682bf0] .of_remove_property+0x30/0xd4 
[c0000000137e76f0] [c000000000052a44] .remove_ddw+0x144/0x168 [c0000000137e7790] [c000000000053204] .iommu_reconfig_notifier+0x30/0xe0 [c0000000137e7820] [c00000000009137c] .notifier_call_chain+0x6c/0xb4 [c0000000137e78c0] [c0000000000918ac] .__blocking_notifier_call_chain+0x5c/0x78 [c0000000137e7970] [c000000000091900] .blocking_notifier_call_chain+0x38/0x48 [c0000000137e7a00] [c000000000682a28] .of_reconfig_notify+0x34/0x5c [c0000000137e7a90] [c000000000682e14] .of_detach_node+0x44/0x1fc [c0000000137e7b40] [c0000000000518e4] .ofdt_write+0x3ac/0x688 [c0000000137e7c20] [c000000000238430] .proc_reg_write+0xb8/0xd4 [c0000000137e7cd0] [c0000000001cbeac] .vfs_write+0xec/0x1f8 [c0000000137e7d70] [c0000000001cc3b0] .SyS_write+0x58/0xa0 [c0000000137e7e30] [c00000000000a064] syscall_exit+0x0/0x98 Cc: stable@vger.kernel.org Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13powerpc/pseries: Failure on removing device nodeGavin Shan
While running the command "drmgr -c phb -r -s 'PHB 528'", the following backtrace appeared because the target device node was not marked with OF_DETACHED by of_detach_node(). This was caused by an error returned from the memory hotplug related reconfig notifier when CONFIG_MEMORY_HOTREMOVE is disabled. The patch fixes it. ERROR: Bad of_node_put() on /pci@800000020000210/ethernet@0 CPU: 14 PID: 2252 Comm: drmgr Tainted: G W 3.16.0+ #427 Call Trace: [c000000012a776a0] [c000000000013d9c] .show_stack+0x88/0x148 (unreliable) [c000000012a77750] [c00000000083cd34] .dump_stack+0x7c/0x9c [c000000012a777d0] [c0000000006807c4] .of_node_release+0x58/0xe0 [c000000012a77860] [c00000000038a7d0] .kobject_release+0x174/0x1b8 [c000000012a77900] [c00000000038a884] .kobject_put+0x70/0x78 [c000000012a77980] [c000000000681680] .of_node_put+0x28/0x34 [c000000012a77a00] [c000000000681ea8] .__of_get_next_child+0x64/0x70 [c000000012a77a90] [c000000000682138] .of_find_node_by_path+0x1b8/0x20c [c000000012a77b40] [c000000000051840] .ofdt_write+0x308/0x688 [c000000012a77c20] [c000000000238430] .proc_reg_write+0xb8/0xd4 [c000000012a77cd0] [c0000000001cbeac] .vfs_write+0xec/0x1f8 [c000000012a77d70] [c0000000001cc3b0] .SyS_write+0x58/0xa0 [c000000012a77e30] [c00000000000a064] syscall_exit+0x0/0x98 Cc: stable@vger.kernel.org Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13powerpc/boot: Use correct zlib types for comparisonBenjamin Herrenschmidt
Avoids this warning: arch/powerpc/boot/gunzip_util.c:118:9: warning: comparison of distinct pointer types lacks a cast Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13powerpc/powernv: Interface to register/unregister opal dump regionVasant Hegde
The PowerNV platform can capture a host memory region when the system crashes (because of a host or firmware failure). A new OPAL API registers and unregisters the memory regions to be captured on a crash. This patch adds support for the new API. In addition, we register the kernel log buffer at boot time and unregister it before doing kexec. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13printk: Add function to return log buffer address and sizeVasant Hegde
Platforms like IBM Power Systems support service processor assisted dump, which provides an interface to add memory regions to be captured when the system crashes. During initialization or at run time we can add kernel memory regions to be collected. Presently there is no way to get the log buffer base address and size. This patch adds functions that return the log buffer address and size. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Andrew Morton <akpm@linux-foundation.org>
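The shape of such an interface can be sketched in user-space C: a buffer private to one file, exposed through two read-only accessors so that platform code can register the region without reaching into the variables directly. All names and the buffer size here are hypothetical and chosen for illustration; they are not the functions added by the patch.

```c
#include <stddef.h>

/* Hypothetical 4 KiB log buffer, private to this translation unit. */
static char sketch_log_storage[4096];
static char *sketch_log_buf = sketch_log_storage;
static unsigned int sketch_log_buf_len = sizeof(sketch_log_storage);

/* Return the base address of the log buffer. */
char *sketch_log_buf_addr_get(void)
{
    return sketch_log_buf;
}

/* Return the size of the log buffer in bytes. */
unsigned int sketch_log_buf_len_get(void)
{
    return sketch_log_buf_len;
}
```

A caller that registers a dump region would pass the returned address and length to the platform's registration routine, and would need to query again if the buffer can be reallocated after boot.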