path: root/drivers
2020-12-19 | clk: bcm: dvp: drop a variable that is assigned to only | Uwe Kleine-König
The third parameter to devm_platform_get_and_ioremap_resource() is used only to provide the used resource. As this variable isn't used afterwards, switch to the function devm_platform_ioremap_resource() which doesn't provide this output parameter. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20201120132121.2678997-1-u.kleine-koenig@pengutronix.de Reviewed-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2020-12-19 | rtc: pcf2127: only use watchdog when explicitly available | Uwe Kleine-König
Most boards using the pcf2127 chip (in my bubble) don't make use of the watchdog functionality and the respective output is not connected. The effect on such a board is that there is a watchdog device provided that doesn't work. So only register the watchdog if the device tree has a "reset-source" property. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> [RV: s/has-watchdog/reset-source/] Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20201218101054.25416-3-rasmus.villemoes@prevas.dk
2020-12-18 | vdpa: Use simpler version of ida allocation | Parav Pandit
vdpa doesn't have any specific need to define start and end range of the device index. Hence use the simpler version of the ida allocator. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Eli Cohen <elic@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20201112064005.349268-3-parav@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | vhost scsi: fix error return code in vhost_scsi_set_endpoint() | Zhang Changzhong
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 25b98b64e284 ("vhost scsi: alloc cmds per vq instead of session") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Link: https://lore.kernel.org/r/1607071411-33484-1-git-send-email-zhangchangzhong@huawei.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
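The bug class described here can be sketched in plain userspace C. This is a minimal illustration and not the vhost code: `setup_queue()`, `set_endpoint_buggy()` and `set_endpoint_fixed()` are hypothetical names, and `-ENOMEM` is only an assumed example error.

```c
#include <errno.h>

/* Hypothetical stand-in for a setup step that can fail. */
int setup_queue(int should_fail)
{
    return should_fail ? -ENOMEM : 0;
}

/* Buggy shape: the error path returns the stale "ret", which is still 0. */
int set_endpoint_buggy(int should_fail)
{
    int ret = 0;

    if (setup_queue(should_fail) != 0)
        goto undo;          /* ret was never updated */
    return 0;
undo:
    return ret;             /* reports success although setup failed */
}

/* Fixed shape: capture the negative error code before jumping to cleanup. */
int set_endpoint_fixed(int should_fail)
{
    int ret;

    ret = setup_queue(should_fail);
    if (ret)
        goto undo;
    return 0;
undo:
    return ret;
}
```

With a failing setup step, the buggy variant reports 0 to the caller while the fixed variant propagates the negative code.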
2020-12-18 | virtio_ring: Fix two use after free bugs | Dan Carpenter
The "vq" struct is added to the "vdev->vqs" list prematurely. If we encounter an error later in the function then the "vq" is freed, but since it is still on the list that could lead to a use after free bug. Fixes: cbeedb72b97a ("virtio_ring: allocate desc state for split ring separately") Reported-by: Robert Buhren <robert.buhren@sect.tu-berlin.de> Reported-by: Felicitas Hetzelt <file@sect.tu-berlin.de> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/X8pGaG/zkI3jk8mk@mwanda Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
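One common way to avoid this kind of bug is to publish an object on a shared list only after every step that can fail has succeeded. The sketch below shows that ordering in userspace C; the types `struct dev` and `struct vq`, the function `create_vq()` and the `fail_late` switch are all hypothetical and do not mirror the virtio_ring code.

```c
#include <stdlib.h>

struct vq {
    struct vq *next;
    int *state;
};

struct dev {
    struct vq *vqs;   /* singly linked list of published queues */
};

/*
 * Create a queue. The queue is linked into d->vqs only after all
 * fallible steps have succeeded, so the error path never frees an
 * object that is still reachable from the list.
 */
struct vq *create_vq(struct dev *d, int fail_late)
{
    struct vq *q = calloc(1, sizeof(*q));

    if (!q)
        return NULL;

    /* Simulated late failure: the state allocation does not succeed. */
    q->state = fail_late ? NULL : calloc(4, sizeof(int));
    if (!q->state) {
        free(q);      /* safe: q was never added to d->vqs */
        return NULL;
    }

    /* Publish last. */
    q->next = d->vqs;
    d->vqs = q;
    return q;
}
```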
2020-12-18 | virtio_net: Fix error code in probe() | Dan Carpenter
Set a negative error code instead of returning success if the MTU has been changed to something invalid. Fixes: fe36cbe0671e ("virtio_net: clear MTU when out of range") Reported-by: Robert Buhren <robert.buhren@sect.tu-berlin.de> Reported-by: Felicitas Hetzelt <file@sect.tu-berlin.de> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/X8pGVJSeeCdII1Ys@mwanda Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2020-12-18 | virtio_ring: Cut and paste bugs in vring_create_virtqueue_packed() | Dan Carpenter
There is a copy and paste bug in the error handling of this code and it uses "ring_dma_addr" three times instead of "device_event_dma_addr" and "driver_event_dma_addr". Fixes: 1ce9e6055fa0 (" virtio_ring: introduce packed ring support") Reported-by: Robert Buhren <robert.buhren@sect.tu-berlin.de> Reported-by: Felicitas Hetzelt <file@sect.tu-berlin.de> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/X8pGRJlEzyn+04u2@mwanda Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2020-12-18 | vdpa/mlx5: Use write memory barrier after updating CQ index | Eli Cohen
Make sure to put a DMA write memory barrier after updating the CQ consumer index so the hardware knows that there are available CQE slots in the queue. Failure to do this can cause the RX doorbell record to be updated before the CQ consumer index, resulting in CQ overrun. Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20201209140004.15892-1-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | vdpa: split vdpasim to core and net modules | Max Gurtovoy
Introduce new vdpa_sim_net and vdpa_sim (core) drivers. This is a preparation for adding a vdpa simulator module for block devices. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> [sgarzare: various cleanups/fixes] Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20201215144256.155342-19-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | vdpa_sim: split vdpasim_virtqueue's iov field in out_iov and in_iov | Stefano Garzarella
vringh_getdesc_iotlb() manages 2 iovs for writable and readable descriptors. This is very useful for the block device, where for each request we have both types of descriptor. Let's split the vdpasim_virtqueue's iov field into out_iov and in_iov to use them with vringh_getdesc_iotlb(). We are using VIRTIO terminology for "out" (readable by the device) and "in" (writable by the device) descriptors. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20201215144256.155342-18-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | vdpa_sim: make vdpasim->buffer size configurable | Stefano Garzarella
Allow each device to specify the size of the buffer allocated in vdpa_sim. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20201215144256.155342-17-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | vdpa_sim: use kvmalloc to allocate vdpasim->buffer | Stefano Garzarella
The next patch will make the buffer size configurable from each device. Since the buffer could be larger than a page, we use kvmalloc() instead of kmalloc(). Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20201215144256.155342-16-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | vdpa_sim: set vringh notify callback | Stefano Garzarella
Instead of calling the vq callback directly, we can leverage the vringh_notify() function, adding vdpasim_vq_notify() and setting it in the vringh notify callback. Suggested-by: Jason Wang <jasowang@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20201215144256.155342-15-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | vdpa_sim: add set_config callback in vdpasim_dev_attr | Stefano Garzarella
The set_config callback can be used by the device to parse the config structure modified by the driver. The callback will be invoked, if set, in vdpasim_set_config() after copying bytes from caller buffer into vdpasim->config buffer. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20201215144256.155342-14-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
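The "copy first, then invoke the optional callback" ordering can be sketched in userspace C as follows. This is an illustration under assumptions: `struct sim`, `struct sim_attr`, `sim_set_config()`, `record_first_byte()` and the 16-byte config size are hypothetical and are not the vdpa_sim definitions.

```c
#include <stddef.h>
#include <string.h>

struct sim;

/* Hypothetical per-device attributes holding an optional callback. */
struct sim_attr {
    void (*set_config)(struct sim *s, const void *config);
};

struct sim {
    struct sim_attr attr;
    unsigned char config[16];   /* assumed size, for illustration only */
    int parsed;                 /* written by the example callback */
};

/* Copy the caller's bytes into the config buffer, then notify the device. */
void sim_set_config(struct sim *s, size_t offset, const void *buf, size_t len)
{
    if (offset > sizeof(s->config) || len > sizeof(s->config) - offset)
        return;                 /* out of bounds: ignore the request */

    memcpy(s->config + offset, buf, len);

    if (s->attr.set_config)     /* the callback is optional */
        s->attr.set_config(s, s->config);
}

/* Example device callback: "parse" the config by recording its first byte. */
void record_first_byte(struct sim *s, const void *config)
{
    s->parsed = ((const unsigned char *)config)[0];
}
```

A device type that does not need to react to config writes simply leaves the pointer unset, and the copy still happens.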
2020-12-18 | vdpa_sim: add get_config callback in vdpasim_dev_attr | Stefano Garzarella
The get_config callback can be used by the device to fill the config structure. The callback will be invoked in vdpasim_get_config() before copying bytes into the caller's buffer. Move vDPA-net config updates from vdpasim_set_features() to the new vdpasim_net_get_config() callback. This is safe since in vdpa_get_config() we already check that .set_features() callback is called before .get_config(). Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20201215144256.155342-13-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | vdpa_sim: make 'config' generic and usable for any device type | Stefano Garzarella
Add a new 'config_size' attribute in 'vdpasim_dev_attr' and allocate 'config' dynamically to support any device type. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20201215144256.155342-12-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | vdpa_sim: store parsed MAC address in a buffer | Stefano Garzarella
As preparation for the next patches, we store the MAC address, parsed during the vdpasim_create(), in a buffer that will be used to fill 'config' together with other configurations. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20201215144256.155342-11-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | vdpa_sim: add work_fn in vdpasim_dev_attr | Stefano Garzarella
Rename vdpasim_work() to vdpasim_net_work() and add it to the vdpasim_dev_attr structure. Co-developed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20201215144256.155342-10-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | vdpa_sim: add supported_features field in vdpasim_dev_attr | Stefano Garzarella
Introduce a new VDPASIM_FEATURES macro with the generic features supported by the vDPA simulator, and VDPASIM_NET_FEATURES macro with vDPA-net features. Add 'supported_features' field in vdpasim_dev_attr, to allow devices to specify their features. Co-developed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20201215144256.155342-9-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | vdpa_sim: add device id field in vdpasim_dev_attr | Stefano Garzarella
Remove VDPASIM_DEVICE_ID macro and add 'id' field in vdpasim_dev_attr, that will be returned by vdpasim_get_device_id(). Use VIRTIO_ID_NET for vDPA-net simulator device id. Co-developed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20201215144256.155342-8-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | vdpa_sim: add struct vdpasim_dev_attr for device attributes | Stefano Garzarella
vdpasim_dev_attr will contain device specific attributes. We start by moving the number of virtqueues (i.e. nvqs) to vdpasim_dev_attr. vdpasim_create() creates a new vDPA simulator following the device attributes defined in the vdpasim_dev_attr parameter. Co-developed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20201215144256.155342-7-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | vdpa_sim: rename vdpasim_config_ops variables | Stefano Garzarella
These variables store generic callbacks used by the vDPA simulator core, so we can remove the 'net' word in their names. Co-developed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20201215144256.155342-6-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | vdpa_sim: make IOTLB entries limit configurable | Stefano Garzarella
Some devices may require a higher limit for the number of IOTLB entries, so let's make it configurable through a module parameter. By default, it's initialized with the current limit (2048). Suggested-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20201215144256.155342-5-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | vdpa_sim: remove hard-coded virtq count | Max Gurtovoy
Add a new attribute that will define the number of virt queues to be created for the vdpasim device. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> [sgarzare: replace kmalloc_array() with kcalloc()] Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20201215144256.155342-4-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | vdpa_sim: remove unnecessary headers inclusion | Stefano Garzarella
Some headers are not necessary, so let's remove them to do some cleaning. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20201215144256.155342-3-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | vdpa: remove unnecessary 'default n' in Kconfig entries | Stefano Garzarella
'default n' is not necessary since it is already the default when nothing is specified. Suggested-by: Jason Wang <jasowang@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20201215144256.155342-2-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | vdpa: ifcvf: Use dma_set_mask_and_coherent to simplify code | Christophe JAILLET
'pci_set_dma_mask()' + 'pci_set_consistent_dma_mask()' can be replaced by an equivalent 'dma_set_mask_and_coherent()' which is much less verbose. While at it, fix a typo (s/confiugration/configuration) Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/20201129125434.1462638-1-christophe.jaillet@wanadoo.fr Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2020-12-18 | vhost_vdpa: switch to vmemdup_user() | Tian Tao
Replace opencoded alloc and copy with vmemdup_user() Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Link: https://lore.kernel.org/r/1605057288-60400-1-git-send-email-tiantao6@hisilicon.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2020-12-18 | virtio-mem: Big Block Mode (BBM) - safe memory hotunplug | David Hildenbrand
Let's add a safe mechanism to unplug memory, avoiding long/endless loops when trying to offline memory - similar to SBM. Fake-offline all memory (via alloc_contig_range()) before trying to offline+remove it. Use this mode as default, but allow enabling the other mode explicitly (which could give better memory hotunplug guarantees in some environments). The "unsafe" mode can be enabled e.g., via virtio_mem.bbm_safe_unplug=0 on the cmdline. Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-30-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | virtio-mem: Big Block Mode (BBM) - basic memory hotunplug | David Hildenbrand
Let's try to unplug completely offline big blocks first. Then, (if enabled via unplug_offline) try to offline and remove whole big blocks. No locking necessary - we can deal with concurrent onlining/offlining just fine. Note1: This is sub-optimal and might be dangerous in some environments: we could end up in an infinite loop when offlining (e.g., long-term pinnings), similar to DIMMs. We'll introduce safe memory hotunplug via fake-offlining next, and use this basic mode only when explicitly enabled. Note2: Without ZONE_MOVABLE, memory unplug will be extremely unreliable with bigger block sizes. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-29-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | virtio-mem: allow to force Big Block Mode (BBM) and set the big block size | David Hildenbrand
Let's allow to force BBM, even if subblocks would be possible. Take care of properly calculating the first big block id, because the start address might no longer be aligned to the big block size. Also, allow to manually configure the size of Big Blocks. Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-27-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
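The alignment concern can be made concrete with a small calculation. The sketch below assumes that the first usable big block is the first one lying entirely at or above the device start address, so the start is rounded up to the next big block boundary; that rounding rule and the name `first_bb_id()` are assumptions for illustration and are not quoted from the driver.

```c
#include <stdint.h>

/*
 * Index of the first big block that starts at or after "start".
 * Assumes bb_size is non-zero and start + bb_size - 1 does not overflow.
 */
uint64_t first_bb_id(uint64_t start, uint64_t bb_size)
{
    return (start + bb_size - 1) / bb_size;
}
```

With a 1 GiB big block size, a region starting exactly at 4 GiB begins at big block 4, while a region starting at 4 GiB + 128 MiB can only use big block 5 onwards.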
2020-12-18 | virtio-mem: Big Block Mode (BBM) memory hotplug | David Hildenbrand
Currently, we do not support device block sizes that exceed the Linux memory block size. For example, having a device block size of 1 GiB (e.g., gigantic pages in the hypervisor) won't work with 128 MiB Linux memory blocks. Let's implement Big Block Mode (BBM), whereby we add/remove at least one Linux memory block at a time. With a 1 GiB device block size, a Big Block (BB) will cover 8 Linux memory blocks. We'll keep registering the online_page_callback machinery, it will be used for safe memory hotunplug in BBM next. Note: BBM is properly prepared for variable-sized Linux memory blocks that we might see in the future. So we won't care how many Linux memory blocks a big block actually spans, and how the memory notifier is called. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-26-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
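The "8 Linux memory blocks" figure follows directly from the two sizes quoted above. A minimal sketch of the arithmetic, where the helper name `mbs_per_bb()` is hypothetical:

```c
#include <stdint.h>

/*
 * Number of Linux memory blocks covered by one big block.
 * Assumes memory_block_size is non-zero and divides device_block_size.
 */
uint64_t mbs_per_bb(uint64_t device_block_size, uint64_t memory_block_size)
{
    return device_block_size / memory_block_size;
}
```

For a 1 GiB device block size and 128 MiB memory blocks this gives 1024 MiB / 128 MiB = 8.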
2020-12-18 | virtio-mem: factor out adding/removing memory from Linux | David Hildenbrand
Let's use wrappers for the low-level functions that dev_dbg/dev_warn and work on addr + size, such that we can reuse them for adding/removing in other granularity. We only warn when adding memory failed, because that's something to pay attention to. We won't warn when removing failed, we'll reuse that in racy context soon (and we do have proper BUG_ON() statements in the current cases where it must never happen). Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-25-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | virtio-mem: memory notifier callbacks are specific to Sub Block Mode (SBM) | David Hildenbrand
Let's rename accordingly. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-24-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | virito-mem: existing (un)plug functions are specific to Sub Block Mode (SBM) | David Hildenbrand
Let's rename them accordingly. virtio_mem_plug_request() and virtio_mem_unplug_request() will be handled separately. Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-23-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | virtio-mem: memory block ids are specific to Sub Block Mode (SBM) | David Hildenbrand
Let's move first_mb_id/next_mb_id/last_usable_mb_id accordingly. Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-22-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | virtio-mem: nb_sb_per_mb and subblock_size are specific to Sub Block Mode (SBM) | David Hildenbrand
Let's rename to "sbs_per_mb" and "sb_size" and move accordingly. Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-21-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | virito-mem: subblock states are specific to Sub Block Mode (SBM) | David Hildenbrand
Let's rename and move accordingly. While at it, rename sb_bitmap to "sb_states". Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-20-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | virtio-mem: memory block states are specific to Sub Block Mode (SBM) | David Hildenbrand
Let's use a new "sbm" sub-struct to hold SBM-specific state and rename + move applicable definitions, functions, and variables (related to memory block states). While at it:
- Drop the "_STATE" part from memory block states
- Rename "nb_mb_state" to "mb_count"
- "set_mb_state" / "get_mb_state" vs. "mb_set_state" / "mb_get_state"
- Don't use lengthy "enum virtio_mem_smb_mb_state", simply use "uint8_t"
Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-19-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | virito-mem: document Sub Block Mode (SBM) | David Hildenbrand
Let's add some documentation for the current mode - Sub Block Mode (SBM) - to prepare for a new mode - Big Block Mode (BBM). Follow-up patches will properly factor out the existing Sub Block Mode (SBM) and implement Big Block Mode (BBM). Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-18-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | virtio-mem: generalize handling when memory is getting onlined deferred | David Hildenbrand
We don't want to add too much memory when it's not getting onlined immediately, to avoid running OOM. Generalize the handling, to avoid making use of memory block states. Use a threshold of 1 GiB for now. Properly adjust the offline size when adding/removing memory. As we are not always protected by a lock when touching the offline size, use an atomic64_t. We don't care about races (e.g., someone offlining memory while we are adding more), only about consistent values. (1 GiB needs a memmap of ~16MiB - which sounds reasonable even for setups with little boot memory and (possibly) one virtio-mem device per node) We don't want to retrigger when onlining is caused immediately by our action (e.g., adding memory which immediately gets onlined), so use a flag to indicate if the workqueue is active and use that as an indicator whether to trigger a retry. This will also be especially relevant for Big Block Mode (BBM), whereby we might re-online memory in case offlining of another memory block failed. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-17-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
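The "~16MiB" memmap estimate can be reproduced with a short calculation. The two constants below are assumptions that the commit message does not state: 4 KiB base pages and a 64-byte struct page, which are typical on x86-64. The helper name `memmap_bytes()` is hypothetical.

```c
#include <stdint.h>

/* Assumed values, not taken from the commit message. */
#define ASSUMED_PAGE_SIZE        4096ULL  /* 4 KiB base pages */
#define ASSUMED_STRUCT_PAGE_SIZE 64ULL    /* bytes of memmap per page */

/* Bytes of memmap needed to describe "memory_bytes" of RAM. */
uint64_t memmap_bytes(uint64_t memory_bytes)
{
    return (memory_bytes / ASSUMED_PAGE_SIZE) * ASSUMED_STRUCT_PAGE_SIZE;
}
```

Under these assumptions, 1 GiB contains 262144 pages, and 262144 * 64 bytes is 16 MiB.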
2020-12-18 | virtio-mem: don't always trigger the workqueue when offlining memory | David Hildenbrand
Let's trigger from offlining code only when we're not allowed to unplug online memory. Handle the other case (memmap possibly freeing up another memory block) when actually removing memory. We now also properly handle the case when removing already offline memory blocks via virtio_mem_mb_remove(). When removing via virtio_mem_remove(), when unloading the driver, virtio_mem_retry() is a NOP and safe to use. While at it, move retry handling when offlining out of virtio_mem_notify_offline(), to share it with Big Block Mode (BBM) soon. This is a preparation for Big Block Mode (BBM), whereby we can see some temporary offlining of memory blocks without actually making progress. Imagine you have a Big Block that spans two Linux memory blocks. Assume the first Linux memory block has no unmovable data on it. When we would call offline_and_remove_memory() on the big block, we would
1. Try to offline the first block. Works, notifiers triggered. virtio_mem_retry() called.
2. Try to offline the second block. Does not work.
3. Re-online first block.
4. Exit to main loop, exit workqueue.
5. Retry immediately (due to virtio_mem_retry()), go to 1.
The result is endless retries. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-16-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
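The retry loop described in this entry can be modelled with a toy simulation. This is only a sketch of the control flow: `count_passes()` and its parameters are hypothetical, the big block is assumed to span two memory blocks of which the second can never be offlined, and `max_passes` exists solely so that the simulation terminates.

```c
#include <stdbool.h>

/*
 * Count how many workqueue passes run before the work goes idle.
 * retry_on_every_offline models the old behaviour, where any successful
 * offlining of a memory block requests another pass.
 */
int count_passes(bool retry_on_every_offline, int max_passes)
{
    int passes = 0;
    bool retry_pending = true;      /* the initial trigger */

    while (retry_pending && passes < max_passes) {
        retry_pending = false;
        passes++;

        /* The first memory block offlines fine. */
        if (retry_on_every_offline)
            retry_pending = true;   /* notifier requests a retry */

        /*
         * The second memory block cannot be offlined, so the first one is
         * re-onlined and nothing is removed. No memory was freed, so the
         * removal path has no reason to request a retry either.
         */
    }
    return passes;
}
```

In this model the old behaviour runs until the artificial cap is reached, while requesting a retry only on real progress lets the work stop after a single pass.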
2020-12-18 | virtio-mem: drop last_mb_id | David Hildenbrand
No longer used, let's drop it. Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-15-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | virtio-mem: generalize virtio_mem_overlaps_range() | David Hildenbrand
Avoid using memory block ids. While at it, use uint64_t for address/size. This is a preparation for Big Block Mode (BBM). Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-14-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | virtio-mem: generalize virtio_mem_owned_mb() | David Hildenbrand
Avoid using memory block ids. Rename it to virtio_mem_contains_range(). This is a preparation for Big Block Mode (BBM). Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-13-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
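Range checks that work on plain addresses and sizes instead of memory block ids can be sketched as below. The `struct region` type and both function names are hypothetical stand-ins, not the driver's definitions, and the sketch assumes that `addr + size` does not overflow for any range passed in.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the device-managed address range. */
struct region {
    uint64_t addr;
    uint64_t size;
};

/* True if [start, start + size) lies completely inside the region. */
bool region_contains_range(const struct region *r, uint64_t start,
                           uint64_t size)
{
    return start >= r->addr && start + size <= r->addr + r->size;
}

/* True if [start, start + size) shares at least one byte with the region. */
bool region_overlaps_range(const struct region *r, uint64_t start,
                           uint64_t size)
{
    return start < r->addr + r->size && r->addr < start + size;
}
```

Because both checks take a start address and a size, the same helpers work for sub blocks, memory blocks and big blocks alike.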
2020-12-18 | virtio-mem: generalize check for added memory | David Hildenbrand
Let's check by traversing busy system RAM resources instead, to avoid relying on memory block states. Don't use walk_system_ram_range(), as that works on pages and we want to use the bare addresses we have easily at hand. This is a preparation for Big Block Mode (BBM), which won't have memory block states. Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-12-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | virtio-mem: retry fake-offlining via alloc_contig_range() on ZONE_MOVABLE | David Hildenbrand
ZONE_MOVABLE is supposed to give some guarantees, yet alloc_contig_range() isn't prepared to deal properly with some racy cases (e.g., temporary page pinning when exiting processes, PCP). Retry 5 times for now. There is certainly room for improvement in the future. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-11-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
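A bounded retry of a transiently failing operation can be sketched as follows. The names `retry_while_busy()` and `flaky_op()` are hypothetical, and treating only `-EBUSY` as retryable is an assumption made for this illustration; the commit message only states that the operation is retried 5 times.

```c
#include <errno.h>

typedef int (*try_fn)(void *ctx);

/*
 * Call fn up to max_tries times. Stop early on success or on any error
 * other than -EBUSY (assumed here to be the only transient failure).
 */
int retry_while_busy(try_fn fn, void *ctx, int max_tries)
{
    int ret = -EBUSY;

    for (int i = 0; i < max_tries; i++) {
        ret = fn(ctx);
        if (ret != -EBUSY)
            break;
    }
    return ret;
}

/* Hypothetical operation: fails with -EBUSY until the counter hits zero. */
int flaky_op(void *ctx)
{
    int *remaining_failures = ctx;

    if (*remaining_failures > 0) {
        (*remaining_failures)--;
        return -EBUSY;
    }
    return 0;
}
```

With a limit of 5, an operation that is busy three times succeeds on the fourth attempt, while one that stays busy longer gives up and reports `-EBUSY`.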
2020-12-18 | virtio-mem: factor out handling of fake-offline pages in memory notifier | David Hildenbrand
Let's factor out the core pieces and place the implementation next to virtio_mem_fake_offline(). We'll reuse this functionality soon. Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-10-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | virtio-mem: factor out fake-offlining into virtio_mem_fake_offline() | David Hildenbrand
... which now matches virtio_mem_fake_online(). We'll reuse this functionality soon. Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-9-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 | virtio-mem: print debug messages from virtio_mem_send_*_request() | David Hildenbrand
Let's move the existing dev_dbg() into the functions, print if something went wrong, and also print for virtio_mem_send_unplug_all_request(). Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-8-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>