Age | Commit message | Author
2015-09-29 | net: Rename IFF_VRF_MASTER to IFF_L3MDEV_MASTER | David Ahern
Rename IFF_VRF_MASTER to IFF_L3MDEV_MASTER and update the name of the netif_is_vrf and netif_index_is_vrf macros. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | Merge branch 'listener-refactoring-preparations' | David S. Miller
Eric Dumazet says: ==================== tcp: listener refactoring preparations This patch series makes changes to TCP/DCCP stacks so that we can switch listener code to lockless mode. This is done by marking const the listener socket in all appropriate paths. FastOpen code had to be changed to not dynamically allocate a very small structure to make code simpler for following changes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | tcp: prepare fastopen code for upcoming listener changes | Eric Dumazet
While auditing TCP stack for upcoming 'lockless' listener changes, I found I had to change fastopen_init_queue() to properly init the object before publishing it. Otherwise another CPU could try to lock the spinlock before it gets properly initialized. Instead of adding appropriate barriers, just remove dynamic memory allocations: - Structure is 28 bytes on 64bit arches. Using additional 8 bytes for holding a pointer seems overkill. - Two listeners can share same cache line and performance would suffer. If we really want to save a few bytes, we would instead dynamically allocate whole struct request_sock_queue in the future. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | tcp: constify tcp_syn_flood_action() socket argument | Eric Dumazet
tcp_syn_flood_action() will soon be called with unlocked socket. In order to avoid SYN flood warning being emitted multiple times, use xchg(). Extend max_qlen_log and synflood_warned fields in struct listen_sock to u32 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
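The warn-once idea behind using xchg() can be sketched in userspace C. This is only an illustration, assuming C11 atomics as a stand-in for the kernel's xchg(); `struct listener_sketch` and `should_warn_once()` are hypothetical names, not kernel symbols.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the listener's synflood_warned field. */
struct listener_sketch {
    atomic_uint synflood_warned;
};

/*
 * Returns true only for the caller that flips the flag from 0 to 1.
 * atomic_exchange() plays the role the commit gives to xchg(): the old
 * value is read and the new one stored in a single atomic step, so two
 * racing callers cannot both observe 0.
 */
static bool should_warn_once(struct listener_sketch *l)
{
    return atomic_exchange(&l->synflood_warned, 1u) == 0u;
}
```

A caller would emit the warning only when `should_warn_once()` returns true, so no socket lock is needed to keep the message from repeating.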
2015-09-29 | tcp: constify tcp_v{4|6}_route_req() sock argument | Eric Dumazet
These functions do not change the listener socket. Goal is to make sure tcp_conn_request() is not messing with listener in a racy way. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | tcp: cookie_init_sequence() cleanups | Eric Dumazet
Some common IPv4/IPv6 code can be factorized. Also constify cookie_init_sequence() socket argument. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | tcp/dccp: constify syn_recv_sock() method sock argument | Eric Dumazet
We'll soon no longer hold listener socket lock, these functions do not modify the socket in any way. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | tcp: constify tcp_create_openreq_child() socket argument | Eric Dumazet
This method does not touch the listener socket. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | dccp: constify dccp_create_openreq_child() sock argument | Eric Dumazet
socket no longer needs to be read/write Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | net: constify sk_gfp_atomic() sock argument | Eric Dumazet
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | inet: constify __inet_inherit_port() sock argument | Eric Dumazet
socket is not touched, make it const. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | inet: constify inet_csk_route_child_sock() socket argument | Eric Dumazet
The socket points to the (shared) listener. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | dccp: use inet6_csk_route_req() helper | Eric Dumazet
Before changing dccp_v6_request_recv_sock() sock argument to const, we need to get rid of security_sk_classify_flow(), and it seems doable by reusing inet6_csk_route_req() helper. We need to add a proto parameter to inet6_csk_route_req(), not assume it is TCP. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | tcp: remove tcp_rcv_state_process() tcp_hdr argument | Eric Dumazet
Factorize code to get tcp header from skb. It makes no sense to duplicate code in callers. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | tcp: remove unused len argument from tcp_rcv_state_process() | Eric Dumazet
Once we realize tcp_rcv_synsent_state_process() does not use its 'len' argument and we get rid of it, then it becomes clear this argument is no longer used in tcp_rcv_state_process() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | tcp/dccp: constify send_synack and send_reset socket argument | Eric Dumazet
None of these functions need to change the socket, make it const. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | skbuff: Fix skb checksum partial check. | Pravin B Shelar
Earlier patch 6ae459bda tried to detect a void checksum partial skb by comparing pull length to checksum offset. But it does not work for all cases since checksum-offset depends on updates to skb->data. Following patch fixes it by validating checksum start offset after skb-data pointer is updated. Negative value of checksum offset start means there is no need to checksum. Fixes: 6ae459bda ("skbuff: Fix skb checksum flag on skb pull") Reported-by: Andrew Vagin <avagin@odin.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | Merge branch 'ipv4-routing-cleanups' | David S. Miller
Alexander Duyck says: ==================== Minor IPv4 routing cleanups These patches just contain some minor cleanups to address a few minor issues. The first and the third mostly just improve readability. The second patch should improve the performance for multicast destination addresses that do not have a localhost source IP address by avoiding some unnecessary dereferences. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | net: Remove martian_source_keep_err goto label | David Ahern
err is initialized to -EINVAL when it is declared. It is not reset until fib_lookup which is well after the 3 users of the martian_source jump. So resetting err to -EINVAL at martian_source label is not needed. Removing that line obviates the need for the martian_source_keep_err label so delete it. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | net: Swap ordering of tests in ip_route_input_mc | Alexander Duyck
This patch just swaps the ordering of one of the conditional tests in ip_route_input_mc. Specifically it swaps the testing for the source address to see if it is loopback, and the test to see if we allow a loopback source address. The reason for swapping these two tests is because it is much faster to test if an address is loopback than it is to dereference several pointers to get at the net structure to see if the use of loopback is allowed. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
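The effect of putting the cheap test first can be sketched as follows. This is a userspace illustration only: `loopback_source_allowed()` is a hypothetical stand-in for the pointer chasing needed to reach the per-namespace setting, and the counter exists purely to make the short-circuit visible.

```c
#include <stdbool.h>
#include <stdint.h>

/* Counts how often the "expensive" lookup runs; illustration only. */
static int costly_lookups;

/* True for 127.0.0.0/8, with the address given in host byte order. */
static bool is_loopback(uint32_t addr)
{
    return (addr & 0xff000000u) == 0x7f000000u;
}

/* Hypothetical stand-in for dereferencing several pointers to a setting. */
static bool loopback_source_allowed(void)
{
    costly_lookups++;
    return false;
}

/*
 * Cheap test first: && short-circuits, so the costly lookup only runs
 * when the source address really is a loopback address.
 */
static bool reject_source(uint32_t saddr)
{
    return is_loopback(saddr) && !loopback_source_allowed();
}
```

For the common case of a non-loopback source, `reject_source()` returns without ever touching the setting.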
2015-09-29 | net/ipv4: Pass proto as u8 instead of u16 in ip_check_mc_rcu | Alexander Duyck
This patch updates ip_check_mc_rcu so that protocol is passed as a u8 instead of a u16. The motivation is just to avoid any unneeded type transitions since some systems will require an instruction to zero extend a u8 field to a u16. Also it makes it a bit more readable as to the fact that protocol is a u8 so there are no byte ordering changes needed to pass it. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set | David Ahern
Wolfgang reported that IPv6 stack is ignoring oif in output route lookups:

With ipv6, ip -6 route get always returns the specific route.

    $ ip -6 r
    2001:db8:e2::1 dev enp2s0 proto kernel metric 256
    2001:db8:e2::/64 dev enp2s0 metric 1024
    2001:db8:e3::1 dev enp3s0 proto kernel metric 256
    2001:db8:e3::/64 dev enp3s0 metric 1024
    fe80::/64 dev enp3s0 proto kernel metric 256
    default via 2001:db8:e3::255 dev enp3s0 metric 1024

    $ ip -6 r get 2001:db8:e2::100
    2001:db8:e2::100 from :: dev enp2s0 src 2001:db8:e3::1 metric 0 cache

    $ ip -6 r get 2001:db8:e2::100 oif enp3s0
    2001:db8:e2::100 from :: dev enp2s0 src 2001:db8:e3::1 metric 0 cache

The stack does consider the oif but a mismatch in rt6_device_match is not considered fatal because RT6_LOOKUP_F_IFACE is not set in the flags. Cc: Wolfgang Nothdurft <netdev@linux-dude.de> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | RESEND: [PATCH v3 net-next] sky2: use random address if EEPROM is bad | Liviu Dudau
On some embedded systems the EEPROM does not contain a valid MAC address. In that case it is better to fallback to a generated mac address and let init scripts fix the value later. Reported-by: Liviu Dudau <Liviu.Dudau@arm.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> [Changed handcoded setup to use eth_hw_addr_random() and to save new address into HW] Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | netpoll: Drop budget parameter from NAPI polling call hierarchy | Alexander Duyck
For some reason we were carrying the budget value around between the various calls to napi->poll. If for example one of the drivers called had a bug in which it returned a non-zero value for work this could result in the budget value becoming negative. Rather than carry around a value of budget that is 0 or less we can instead just loop through and pass 0 to each napi->poll call. If any driver returns a value for work done that is non-zero then we can report that driver and continue rather than allowing a bad actor to make the budget value negative and pass that negative value to napi->poll. Note, the only actual change here is that instead of letting budget become negative we are keeping it at 0 regardless of the value returned for work since it should not be possible for the polling routine to do any actual work with a budget of 0. So if the polling routine returns a non-0 value we are just reporting it and continuing with a budget of 0 rather than letting that work value be subtracted from the budget of 0. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
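The behaviour described above, always passing 0 and merely reporting a routine that claims to have done work, might look like this in a userspace sketch. The `poll_fn` type and all function names here are hypothetical and do not mirror the netpoll code.

```c
#include <stddef.h>

/* Hypothetical poll callback: takes a budget, returns the work done. */
typedef int (*poll_fn)(int budget);

/* How often any poll routine saw a budget other than 0. */
static int nonzero_budget_calls;

/* Well-behaved routine: does no work when the budget is 0. */
static int good_poll(int budget)
{
    if (budget != 0)
        nonzero_budget_calls++;
    return 0;
}

/* Misbehaving routine: claims work even with a budget of 0. */
static int buggy_poll(int budget)
{
    if (budget != 0)
        nonzero_budget_calls++;
    return 3;
}

/*
 * Calls every routine with a budget of exactly 0. A non-zero return is
 * counted (the real code would report the driver) and the loop goes on;
 * nothing is subtracted, so no later routine sees a negative budget.
 */
static int poll_all_with_zero_budget(const poll_fn *polls, size_t n)
{
    int offenders = 0;

    for (size_t i = 0; i < n; i++) {
        int work = polls[i](0);

        if (work != 0)
            offenders++;
    }
    return offenders;
}
```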
2015-09-29 | net sysfs: Print link speed as signed integer | Alexander Stein
Otherwise 4294967295 (MBit/s) (-1) will be printed when there is no link. Documentation/ABI/testing/sysfs-class-net does not state if this shall be signed or unsigned. Also remove the now unused variable fmt_udec. Signed-off-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
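The difference between the two formats is easy to reproduce in userspace. The sketch below assumes a 32-bit `unsigned int` and a two's-complement conversion to `int` (what common compilers do); the function names are hypothetical and unrelated to the sysfs code.

```c
#include <stdio.h>
#include <stdint.h>

/* Unsigned format: an all-ones "no link" value prints as 4294967295. */
static int format_speed_unsigned(char *buf, size_t len, uint32_t speed)
{
    return snprintf(buf, len, "%u", (unsigned int)speed);
}

/* Signed format: the same bit pattern prints as -1. */
static int format_speed_signed(char *buf, size_t len, uint32_t speed)
{
    /* Assumes the usual two's-complement conversion to int. */
    return snprintf(buf, len, "%d", (int)speed);
}
```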
2015-09-29 | bna: fix error handling | Andrzej Hajda
Several functions can return negative value in case of error, so their return type should be fixed as well as type of variables to which this value is assigned. The problem has been detected using proposed semantic patch scripts/coccinelle/tests/assign_signed_to_unsigned.cocci [1]. [1]: http://permalink.gmane.org/gmane.linux.kernel/2046107 Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
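The class of bug the semantic patch looks for can be shown with a small userspace sketch. `do_step()` and both wrappers are hypothetical and not taken from the bna driver.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper that reports failure as a negative errno value. */
static int do_step(bool fail)
{
    return fail ? -EINVAL : 0;
}

/*
 * Buggy shape: an unsigned result turns -EINVAL into a huge positive
 * number, so a later "result < 0" check can never fire.
 */
static unsigned int step_result_unsigned(bool fail)
{
    return (unsigned int)do_step(fail);
}

/* Fixed shape: a signed result keeps the sign and the error is visible. */
static int step_result_signed(bool fail)
{
    return do_step(fail);
}
```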
2015-09-29 | Merge branch 'af_unix_MSG_PEEK' | David S. Miller
Aaron Conole says: ==================== af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag This patch set implements a bugfix for kernel.org bugzilla #12323, allowing MSG_PEEK to return all queued data on the unix domain socket, not just the data contained in a single SKB. This is the v3 version of this patch, which includes a suggested modification by Eric Dumazet to convert the unix_sk() conversion macro to a static inline function. These patches are independent and can be applied separately. This set was tested over a 24-hour period, utilizing a loop continually executing the bugzilla issue attached python code. It was instrumented with a pr_err_once() ([ 13.798683] unix: went there at least one time). v2->v3: - Added Eric Dumazet's suggestion for #define to static inline - Fixed an issue calling unix_state_lock() with an invalid argument v3->v4: - Eliminated an XXX comment - Changed from goto unlock to explicit unix_state_unlock() and break ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag | Aaron Conole
AF_UNIX sockets now return multiple skbs from recv() when MSG_PEEK flag is set. This is referenced in kernel bugzilla #12323 @ https://bugzilla.kernel.org/show_bug.cgi?id=12323 As described both in the BZ and lkml thread @ http://lkml.org/lkml/2008/1/8/444 calling recv() with MSG_PEEK on an AF_UNIX socket only reads a single skb, where the desired effect is to return as much skb data as has been queued, until hitting the recv buffer size (whichever comes first). The modified MSG_PEEK path will now move to the next skb in the tree and jump to the again: label, rather than following the natural loop structure. This requires duplicating some of the loop head actions. This was tested using the Python socketpair code attached to the bugzilla issue. Signed-off-by: Aaron Conole <aconole@bytheb.org> Signed-off-by: David S. Miller <davem@davemloft.net>
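The intended peek semantics, returning data from every queued buffer until the caller's buffer is full, can be sketched in userspace. This is only a model: `struct chunk` is a hypothetical stand-in for a queued skb, and the loop does not reproduce the kernel's `again:` control flow.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a queued skb: some bytes plus a link. */
struct chunk {
    const char *data;
    size_t len;
    struct chunk *next;
};

/*
 * Copies queued bytes into buf without consuming them, moving from one
 * chunk to the next until the queue is exhausted or buf is full,
 * whichever comes first. Returns the number of bytes copied.
 */
static size_t peek_queue(const struct chunk *head, char *buf, size_t buflen)
{
    size_t copied = 0;

    for (const struct chunk *c = head; c && copied < buflen; c = c->next) {
        size_t n = c->len;

        if (n > buflen - copied)
            n = buflen - copied;
        memcpy(buf + copied, c->data, n);
        copied += n;
    }
    return copied;
}
```

With two queued chunks "hello" and "world", an 8-byte buffer receives "hellowor" and the queue itself is left untouched.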
2015-09-29 | af_unix: Convert the unix_sk macro to an inline function for type safety | Aaron Conole
As suggested by Eric Dumazet this change replaces the #define with a static inline function to enjoy complaints by the compiler when misusing the API. Signed-off-by: Aaron Conole <aconole@bytheb.org> Signed-off-by: David S. Miller <davem@davemloft.net>
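Why an inline function gives better diagnostics than a cast macro can be sketched with two hypothetical structures. The names below only imitate the shape of the conversion; they are not the kernel definitions.

```c
/* Hypothetical base and derived structures; the base is the first member. */
struct sock_sketch {
    int id;
};

struct unix_sock_sketch {
    struct sock_sketch sk;
    int peer;
};

/* Macro form: the cast silently accepts any pointer type, even an int *. */
#define unix_sk_macro(ptr) ((struct unix_sock_sketch *)(ptr))

/*
 * Function form: the parameter type is checked, so passing an unrelated
 * pointer makes the compiler complain instead of compiling quietly.
 */
static inline struct unix_sock_sketch *unix_sk_fn(struct sock_sketch *sk)
{
    return (struct unix_sock_sketch *)sk;
}
```

Both forms yield the same pointer for a valid argument; the difference shows up only at compile time, when the argument has the wrong type.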
2015-09-29 | bridge: vlan: add per-vlan struct and move to rhashtables | Nikolay Aleksandrov
This patch changes the bridge vlan implementation to use rhashtables instead of bitmaps. The main motivation behind this change is that we need extensible per-vlan structures (both per-port and global) so more advanced features can be introduced and the vlan support can be extended. I've tried to break this up but the moment net_port_vlans is changed and the whole API goes away, thus this is a larger patch. A few short goals of this patch are: - Extensible per-vlan structs stored in rhashtables and a sorted list - Keep user-visible behaviour (compressed vlans etc) - Keep fastpath ingress/egress logic the same (optimizations to come later) Here's a brief list of some of the new features we'd like to introduce: - per-vlan counters - vlan ingress/egress mapping - per-vlan igmp configuration - vlan priorities - avoid fdb entries replication (e.g. local fdb scaling issues) The structure is kept single for both global and per-port entries so to avoid code duplication where possible and also because we'll soon introduce "port0 / aka bridge as port" which should simplify things further (thanks to Vlad for the suggestion!). Now we have per-vlan global rhashtable (bridge-wide) and per-vlan port rhashtable, if an entry is added to a port it'll get a pointer to its global context so it can be quickly accessed later. There's also a sorted vlan list which is used for stable walks and some user-visible behaviour such as the vlan ranges, also for error paths. VLANs are stored in a "vlan group" which currently contains the rhashtable, sorted vlan list and the number of "real" vlan entries. A good side-effect of this change is that it resembles how hw keeps per-vlan data. One important note after this change is that if a VLAN is being looked up in the bridge's rhashtable for filtering purposes (or to check if it's an existing usable entry, not just a global context) then the new helper br_vlan_should_use() needs to be used if the vlan is found. 
In case the lookup is done only with a port's vlan group, then this check can be skipped. Things tested so far: - basic vlan ingress/egress - pvids - untagged vlans - undef CONFIG_BRIDGE_VLAN_FILTERING - adding/deleting vlans in different scenarios (with/without global ctx, while transmitting traffic, in ranges etc) - loading/removing the module while having/adding/deleting vlans - extracting bridge vlan information (user ABI), compressed requests - adding/deleting fdbs on vlans - bridge mac change, promisc mode - default pvid change - kmemleak ON during the whole time Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | Merge branch 'mvneta_percpu_irq' | David S. Miller
Gregory CLEMENT says: ==================== net: mvneta: Switch to per-CPU irq and make rxq_def useful As stated in the first version: "this patchset reworks the Marvell neta driver in order to really support its per-CPU interrupts, instead of faking them as SPI, and allow the use of any RX queue instead of the hardcoded RX queue 0 that we have currently." Following the review which has been done, Maxime started adding the CPU hotplug support. I continued his work a few weeks ago and here is the result. Since the 1st version the main change is this CPU hotplug support. In order to validate it I powered the CPUs up and down while running iperf. I ran the tests for hours: the kernel didn't crash and the network interfaces were still usable. Of course it impacted the performance, but continuously powering the CPUs down and up is not something we usually do. I also reorganized the series: the first 3 patches should go through the irq subsystem, whereas the other 4 should go to the network subsystem. However, there is a runtime dependency between the two parts: patch 5 depends on patch 3 to be able to use the percpu irq. Thanks, Gregory PS: Thanks to Willy who gave me some pointers on how to deal with the NAPI. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | net: mvneta: Statically assign queues to CPUs | Maxime Ripard
Since the switch to per-CPU interrupts, we lost the ability to set which CPU was going to receive our RX interrupt, which was now only the CPU on which the mvneta_open function was run. We can now assign our queues to their respective CPUs, and make sure only this CPU is going to handle our traffic. This also paves the road to be able to change that at runtime, and later on to support RSS. [gregory.clement@free-electrons.com]: hardened the CPU hotplug support. Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | net: mvneta: Allow different queues | Maxime Ripard
The mvneta driver allows changing the default RX queue through the rxq_def kernel parameter. However, the current code doesn't allow any value but 0. It is actively checked for in the driver's probe because the driver makes a number of assumptions and takes a number of shortcuts in order to just use that RX queue. Remove these limitations in order to be able to specify any available queue. Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | net: mvneta: Handle per-cpu interrupts | Maxime Ripard
Now that our interrupt controller is allowing us to use per-CPU interrupts, actually use it in the mvneta driver. This involves obviously reworking the driver to have a CPU-local NAPI structure, and report for incoming packet using that structure. Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | net: mvneta: Fix CPU_MAP registers initialisation | Maxime Ripard
The CPU_MAP register is duplicated for each CPU, each instance being at a different address. However, the code so far was using CONFIG_NR_CPUS to initialise the CPU_MAP registers for each CPU, while the SoCs embed at most 4 CPUs. This is especially an issue with multi_v7_defconfig, where CONFIG_NR_CPUS is currently set to 16, resulting in writes to registers that are not CPU_MAP. Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit") Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Cc: <stable@vger.kernel.org> # v3.8+ Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | irqchip: armada-370-xp: Rework per-cpu interrupts handling | Maxime Ripard
The MPIC driver currently has a list of interrupts to handle as per-cpu. Since the timer, fabric and neta interrupts were the only per-cpu interrupts in the system, we can now remove the switch and just check for the hardware irq number to determine whether a given interrupt is per-cpu or not. Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | irq: Export per-cpu irq allocation and de-allocation functions | Maxime Ripard
Some drivers might use the per-cpu interrupts and still might be built as a module. Export request_percpu_irq and free_percpu_irq to these users, which also makes it consistent with enable/disable_percpu_irq, which were already exported. Reported-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | genirq: Fix the documentation of request_percpu_irq | Maxime Ripard
The documentation of request_percpu_irq is confusing and suggests that the interrupt is not enabled at all, while it is actually enabled on the local CPU. Clarify that. Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 | bridge: Pass net into br_validate_ipv4 and br_validate_ipv6 | Eric W. Biederman
The network namespace is easily available in state->net so use it. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-29 | ipv6: Pass struct net into ip6_route_me_harder | Eric W. Biederman
Don't make ip6_route_me_harder guess which network namespace it is routing in, pass the network namespace in. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-29 | ipv4: Pass struct net into ip_route_me_harder | Eric W. Biederman
Don't make ip_route_me_harder guess which network namespace it is routing in, pass the network namespace in. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-29 | netfilter: ipt_SYNPROXY: Pass snet into synproxy_send_tcp | Eric W. Biederman
ip6t_SYNPROXY already does this and this is needed so that we have a struct net that can be passed down into ip_route_me_harder, so that ip_route_me_harder can stop guessing its context. Along the way pass snet into synproxy_send_client_synack as this is the only caller of synproxy_send_tcp that is not passed snet already. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-29 | netfilter: Push struct net down into nf_afinfo.reroute | Eric W. Biederman
The network namespace is needed when routing a packet. Stop making nf_afinfo.reroute guess which network namespace is the proper namespace to route the packet in. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-29 | ipv4: Push struct net down into nf_send_reset | Eric W. Biederman
This is needed so struct net can be pushed down into ip_route_me_harder. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-29 | UBI: return ENOSPC if no enough space available | shengyong
UBI: attaching mtd1 to ubi0 UBI: scanning is finished UBI error: init_volumes: not enough PEBs, required 706, available 686 UBI error: ubi_wl_init: no enough physical eraseblocks (-20, need 1) UBI error: ubi_attach_mtd_dev: failed to attach mtd1, error -12 <= NOT ENOMEM UBI error: ubi_init: cannot attach mtd1 If available PEBs are not enough when initializing volumes, return -ENOSPC directly. If available PEBs are not enough when initializing WL, return -ENOSPC instead of -ENOMEM. Cc: stable@vger.kernel.org Signed-off-by: Sheng Yong <shengyong1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Reviewed-by: David Gstir <david@sigma-star.at>
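The error-code choice can be illustrated with a tiny userspace check. `check_pebs()` is a hypothetical helper rather than a UBI function; the numbers in the usage note come from the log above.

```c
#include <errno.h>

/*
 * Hypothetical helper: report a shortage of physical eraseblocks as
 * "no space left" (-ENOSPC) instead of "out of memory" (-ENOMEM).
 */
static int check_pebs(int required, int available)
{
    return available < required ? -ENOSPC : 0;
}
```

With the values from the log, `check_pebs(706, 686)` yields `-ENOSPC`, which describes the actual problem better than an out-of-memory error would.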
2015-09-29 | UBI: Validate data_size | Richard Weinberger
Make sure that data_size is less than LEB size. Otherwise a handcrafted UBI image is able to trigger an out of bounds memory access in ubi_compare_lebs(). Cc: stable@vger.kernel.org Signed-off-by: Richard Weinberger <richard@nod.at> Reviewed-by: David Gstir <david@sigma-star.at>
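A bounds check of this kind could be sketched as below. The helper name is hypothetical, and accepting a data_size exactly equal to the LEB size is an assumption of this sketch; the commit message only says the value must not exceed what a LEB can hold.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical validation step: reject a data_size that cannot fit in
 * one LEB before any code uses it as a read length.
 * Assumption: data_size == leb_size is still treated as valid.
 */
static int validate_data_size(uint32_t data_size, uint32_t leb_size)
{
    return data_size > leb_size ? -EINVAL : 0;
}
```

Rejecting the value up front means a handcrafted image fails during attach rather than driving an out-of-bounds access later.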
2015-09-29 | UBIFS: Kill unneeded locking in ubifs_init_security | Richard Weinberger
Fixes the following lockdep splat: [ 1.244527] ============================================= [ 1.245193] [ INFO: possible recursive locking detected ] [ 1.245193] 4.2.0-rc1+ #37 Not tainted [ 1.245193] --------------------------------------------- [ 1.245193] cp/742 is trying to acquire lock: [ 1.245193] (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [<ffffffff812b3f69>] ubifs_init_security+0x29/0xb0 [ 1.245193] [ 1.245193] but task is already holding lock: [ 1.245193] (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [<ffffffff81198e7f>] path_openat+0x3af/0x1280 [ 1.245193] [ 1.245193] other info that might help us debug this: [ 1.245193] Possible unsafe locking scenario: [ 1.245193] [ 1.245193] CPU0 [ 1.245193] ---- [ 1.245193] lock(&sb->s_type->i_mutex_key#9); [ 1.245193] lock(&sb->s_type->i_mutex_key#9); [ 1.245193] [ 1.245193] *** DEADLOCK *** [ 1.245193] [ 1.245193] May be due to missing lock nesting notation [ 1.245193] [ 1.245193] 2 locks held by cp/742: [ 1.245193] #0: (sb_writers#5){.+.+.+}, at: [<ffffffff811ad37f>] mnt_want_write+0x1f/0x50 [ 1.245193] #1: (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [<ffffffff81198e7f>] path_openat+0x3af/0x1280 [ 1.245193] [ 1.245193] stack backtrace: [ 1.245193] CPU: 2 PID: 742 Comm: cp Not tainted 4.2.0-rc1+ #37 [ 1.245193] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140816_022509-build35 04/01/2014 [ 1.245193] ffffffff8252d530 ffff88007b023a38 ffffffff814f6f49 ffffffff810b56c5 [ 1.245193] ffff88007c30cc80 ffff88007b023af8 ffffffff810a150d ffff88007b023a68 [ 1.245193] 000000008101302a ffff880000000000 00000008f447e23f ffffffff8252d500 [ 1.245193] Call Trace: [ 1.245193] [<ffffffff814f6f49>] dump_stack+0x4c/0x65 [ 1.245193] [<ffffffff810b56c5>] ? console_unlock+0x1c5/0x510 [ 1.245193] [<ffffffff810a150d>] __lock_acquire+0x1a6d/0x1ea0 [ 1.245193] [<ffffffff8109fa78>] ? __lock_is_held+0x58/0x80 [ 1.245193] [<ffffffff810a1a93>] lock_acquire+0xd3/0x270 [ 1.245193] [<ffffffff812b3f69>] ? 
ubifs_init_security+0x29/0xb0 [ 1.245193] [<ffffffff814fc83b>] mutex_lock_nested+0x6b/0x3a0 [ 1.245193] [<ffffffff812b3f69>] ? ubifs_init_security+0x29/0xb0 [ 1.245193] [<ffffffff812b3f69>] ? ubifs_init_security+0x29/0xb0 [ 1.245193] [<ffffffff812b3f69>] ubifs_init_security+0x29/0xb0 [ 1.245193] [<ffffffff8128e286>] ubifs_create+0xa6/0x1f0 [ 1.245193] [<ffffffff81198e7f>] ? path_openat+0x3af/0x1280 [ 1.245193] [<ffffffff81195d15>] vfs_create+0x95/0xc0 [ 1.245193] [<ffffffff8119929c>] path_openat+0x7cc/0x1280 [ 1.245193] [<ffffffff8109ffe3>] ? __lock_acquire+0x543/0x1ea0 [ 1.245193] [<ffffffff81088f20>] ? sched_clock_cpu+0x90/0xc0 [ 1.245193] [<ffffffff81088c00>] ? calc_global_load_tick+0x60/0x90 [ 1.245193] [<ffffffff81088f20>] ? sched_clock_cpu+0x90/0xc0 [ 1.245193] [<ffffffff811a9cef>] ? __alloc_fd+0xaf/0x180 [ 1.245193] [<ffffffff8119ac55>] do_filp_open+0x75/0xd0 [ 1.245193] [<ffffffff814ffd86>] ? _raw_spin_unlock+0x26/0x40 [ 1.245193] [<ffffffff811a9cef>] ? __alloc_fd+0xaf/0x180 [ 1.245193] [<ffffffff81189bd9>] do_sys_open+0x129/0x200 [ 1.245193] [<ffffffff81189cc9>] SyS_open+0x19/0x20 [ 1.245193] [<ffffffff81500717>] entry_SYSCALL_64_fastpath+0x12/0x6f While the lockdep splat is a false positive, because path_openat holds i_mutex of the parent directory and ubifs_init_security() tries to acquire i_mutex of a new inode, it reveals that taking i_mutex in ubifs_init_security() is in vain because it is only being called in the inode allocation path and therefore nobody else can see the inode yet. Cc: stable@vger.kernel.org # 3.20- Reported-and-tested-by: Boris Brezillon <boris.brezillon@free-electrons.com> Reviewed-and-tested-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: dedekind1@gmail.com
2015-09-29 | mmc: pxamci: fix card detect with slot-gpio API | Robert Jarzmik
Move pxamci to mmc slot-gpio API to fix interrupt request. It fixes the case where the card detection is on a gpio expander, on I2C for example on zylonite board. In this case, the card detect nested interrupt is called from a threaded interrupt. The request_irq() fails, because a hard irq cannot be a nested interrupt from a threaded interrupt (see __setup_irq()). This was tested on zylonite and mioa701 boards. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Cc: Petr Cvek <petr.cvek@tul.cz> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-09-29 | mmc: sunxi: Fix clk-delay settings | Hans de Goede
In recent allwinner kernel sources the mmc clk-delay settings have been slightly tweaked, and for sun9i they are completely different than what we are using. This commit brings us in sync with what allwinner does, fixing problems accessing sdcards on some A33 devices (and likely others). For pre sun9i hardware this makes the following changes: - At 400 kHz change the sample delay from 7 to 0 (introduced in A31 sdk) - At 50 MHz change the sample delay from 5 to 4 (introduced in A23 sdk) This also drops the clk-delay calculation for clocks > 50 MHz, we do not need this as we've: mmc->f_max = 50000000, and the delays in the old code were not correct (at 100 MHz the delay must be a multiple of 60, at 200 MHz a multiple of 120). Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-09-29 | mmc: core: Don't return an error for CD/WP GPIOs when GPIOLIB is unset | Ulf Hansson
When CONFIG_GPIOLIB is unset, its stubs will return -ENOSYS. That means when the mmc core parses DT for CD/WP GPIOs via mmc_of_parse(), -ENOSYS is propagated to the caller. Typically this means that the mmc host driver fails to probe. As the CD/WP GPIOs are already treated as optional, let's extend that to cover the case when CONFIG_GPIOLIB is unset. Reported-by: Michal Simek <michal.simek@xilinx.com> Fixes: 16b23787fc70 ("mmc: sdhci-of-arasan: Call OF parsing for MMC") Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Tested-by: Michal Simek <michal.simek@xilinx.com> Acked-by: Venu Byravarasu <vbyravarasu@nvidia.com>
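Treating a missing GPIO library like a missing GPIO could be sketched as a small result filter. The helper below is hypothetical and not the mmc core code; in particular, using -ENOENT as the "GPIO not described" code is an assumption of this sketch.

```c
#include <errno.h>

/*
 * Hypothetical filter for the result of an optional GPIO lookup.
 * Assumption: -ENOENT means "no such GPIO described"; -ENOSYS means
 * the GPIO library is compiled out. Both are tolerated as "absent".
 */
static int optional_gpio_result(int ret)
{
    if (ret == -ENOENT || ret == -ENOSYS)
        return 0;   /* optional resource absent: not an error */
    return ret;     /* success (0) or a real error passes through */
}
```

A probe path written this way keeps failing on genuine errors but no longer aborts just because GPIO support is not built in.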