summaryrefslogtreecommitdiff
path: root/drivers/md/dm-integrity.c
AgeCommit message (Collapse)Author
2020-06-05Merge tag 'for-5.8/dm-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mike Snitzer: - The largest change for this cycle is the DM zoned target's metadata version 2 feature that adds support for pairing regular block devices with a zoned device to ease the performance impact associated with finite random zones of zoned device. The changes came in three batches: the first prepared for and then added the ability to pair a single regular block device, the second was a batch of fixes to improve zoned's reclaim heuristic, and the third removed the limitation of only adding a single additional regular block device to allow many devices. Testing has shown linear scaling as more devices are added. - Add new emulated block size (ebs) target that emulates a smaller logical_block_size than a block device supports The primary use-case is to emulate "512e" devices that have 512 byte logical_block_size and 4KB physical_block_size. This is useful to some legacy applications that otherwise wouldn't be able to be used on 4K devices because they depend on issuing IO in 512 byte granularity. - Add discard interfaces to DM bufio. First consumer of the interface is the dm-ebs target that makes heavy use of dm-bufio. - Fix DM crypt's block queue_limits stacking to not truncate logic_block_size. - Add Documentation for DM integrity's status line. - Switch DMDEBUG from a compile time config option to instead use dynamic debug via pr_debug. - Fix DM multipath target's hueristic for how it manages "queue_if_no_path" state internally. DM multipath now avoids disabling "queue_if_no_path" unless it is actually needed (e.g. in response to configure timeout or explicit "fail_if_no_path" message). This fixes reports of spurious -EIO being reported back to userspace application during fault tolerance testing with an NVMe backend. Added various dynamic DMDEBUG messages to assist with debugging queue_if_no_path in the future. - Add a new DM multipath "Historical Service Time" Path Selector. - Fix DM multipath's dm_blk_ioctl() to switch paths on IO error. - Improve DM writecache target performance by using explicit cache flushing for target's single-threaded usecase and a small cleanup to remove unnecessary test in persistent_memory_claim. - Other small cleanups in DM core, dm-persistent-data, and DM integrity. * tag 'for-5.8/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (62 commits) dm crypt: avoid truncating the logical block size dm mpath: add DM device name to Failing/Reinstating path log messages dm mpath: enhance queue_if_no_path debugging dm mpath: restrict queue_if_no_path state machine dm mpath: simplify __must_push_back dm zoned: check superblock location dm zoned: prefer full zones for reclaim dm zoned: select reclaim zone based on device index dm zoned: allocate zone by device index dm zoned: support arbitrary number of devices dm zoned: move random and sequential zones into struct dmz_dev dm zoned: per-device reclaim dm zoned: add metadata pointer to struct dmz_dev dm zoned: add device pointer to struct dm_zone dm zoned: allocate temporary superblock for tertiary devices dm zoned: convert to xarray dm zoned: add a 'reserved' zone flag dm zoned: improve logging messages for reclaim dm zoned: avoid unnecessary device recalulation for secondary superblock dm zoned: add debugging message for reading superblocks ...
2020-05-22block: remove the error_sector argument to blkdev_issue_flushChristoph Hellwig
The argument isn't used by any caller, and drivers don't fill out bi_sector for flush requests either. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-20dm: replace zero-length array with flexible-arrayGustavo A. R. Silva
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] sizeof(flexible-array-member) triggers a warning because flexible array members have incomplete type[1]. There are some instances of code in which the sizeof operator is being incorrectly/erroneously applied to zero-length arrays and the result is zero. Such instances may be hiding some bugs. So, this work (flexible-array member conversions) will also help to get completely rid of those sorts of issues. This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-05-15dm integrity: remove set but not used variablesYueHaibing
Fixes gcc '-Wunused-but-set-variable' warning: drivers/md/dm-integrity.c: In function 'integrity_metadata': drivers/md/dm-integrity.c:1557:12: warning: variable 'save_metadata_offset' set but not used [-Wunused-but-set-variable] drivers/md/dm-integrity.c:1556:12: warning: variable 'save_metadata_block' set but not used [-Wunused-but-set-variable] They are never used, so remove it. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-04-03dm integrity: fix logic bug in integrity tag testingMikulas Patocka
If all the bytes are equal to DISCARD_FILLER, we want to accept the buffer. If any of the bytes are different, we must do thorough tag-by-tag checking. The condition was inverted. Fixes: 84597a44a9d8 ("dm integrity: add optional discard support") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-04-03dm integrity: fix ppc64le warningMike Snitzer
Otherwise: In file included from drivers/md/dm-integrity.c:13: drivers/md/dm-integrity.c: In function 'dm_integrity_status': drivers/md/dm-integrity.c:3061:10: error: format '%llu' expects argument of type 'long long unsigned int', but argument 4 has type 'long int' [-Werror=format=] DMEMIT("%llu %llu", ^~~~~~~~~~~ atomic64_read(&ic->number_of_mismatches), ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ./include/linux/device-mapper.h:550:46: note: in definition of macro 'DMEMIT' 0 : scnprintf(result + sz, maxlen - sz, x)) ^ cc1: all warnings being treated as errors Fixes: 7649194a1636ab5 ("dm integrity: remove sector type casts") Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-03-24dm integrity: improve discard in journal modeMikulas Patocka
When we discard something that is present in the journal, we flush the journal first, so that discarded blocks are not overwritten by the journal content. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-03-24dm integrity: add optional discard supportMikulas Patocka
Add an argument "allow_discards" that enables discard processing on dm-integrity device. Discards are only allowed to devices using internal hash. When a block is discarded the integrity tag is filled with DISCARD_FILLER (0xf6) bytes. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-03-24dm integrity: allow resize of the integrity deviceMikulas Patocka
If the size of the underlying device changes, change the size of the integrity device too. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-03-24dm integrity: factor out get_provided_data_sectors()Mikulas Patocka
Move code to a new function get_provided_data_sectors(). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-03-24dm integrity: don't replay journal data past the end of the deviceMikulas Patocka
Following commits will make it possible to shrink or extend the device. If the device was shrunk, we don't want to replay journal data pointing past the end of the device. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-03-24dm integrity: remove sector type castsMikulas Patocka
Since the commit 72deb455b5ec619ff043c30bc90025aa3de3cdda ("block: remove CONFIG_LBDAF") sector_t is always defined as unsigned long long. Delete the needless type casts in printk and avoids some warnings if DEBUG_PRINT is defined. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-03-24dm integrity: fix a crash with unusually large tag sizeMikulas Patocka
If the user specifies tag size larger than HASH_MAX_DIGESTSIZE, there's a crash in integrity_metadata(). Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-03-24dm integrity: print device name in integrity_metadata() error messageErich Eckner
Similar to f710126cfc89c8df477002a26dee8407eb0b4acd ("dm crypt: print device name in integrity error message"), this message should also better identify the device with the integrity failure. Signed-off-by: Erich Eckner <git@eckner.net> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-03-03dm: bump version of core and various targetsMike Snitzer
Changes made during the 5.6 cycle warrant bumping the version number for DM core and the targets modified by this commit. It should be noted that dm-thin, dm-crypt and dm-raid already had their target version bumped during the 5.6 merge window. Signed-off-by; Mike Snitzer <snitzer@redhat.com>
2020-03-03dm integrity: use dm_bio_record and dm_bio_restoreMike Snitzer
In cases where dec_in_flight() has to requeue the integrity_bio_wait work to transfer the rest of the data, the bio's __bi_remaining might already have been decremented to 0, e.g.: if bio passed to underlying data device was split via blk_queue_split(). Use dm_bio_{record,restore} rather than effectively open-coding them in dm-integrity -- these methods now manage __bi_remaining too. Depends-on: f7f0b057a9c1 ("dm bio record: save/restore bi_end_io and bi_integrity") Reported-by: Daniel Glöckner <dg@emlix.com> Suggested-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-02-27dm: report suspended device during destroyMikulas Patocka
The function dm_suspended returns true if the target is suspended. However, when the target is being suspended during unload, it returns false. An example where this is a problem: the test "!dm_suspended(wc->ti)" in writecache_writeback is not sufficient, because dm_suspended returns zero while writecache_suspend is in progress. As is, without an enhanced dm_suspended, simply switching from flush_workqueue to drain_workqueue still emits warnings: workqueue writecache-writeback: drain_workqueue() isn't complete after 10 tries workqueue writecache-writeback: drain_workqueue() isn't complete after 100 tries workqueue writecache-writeback: drain_workqueue() isn't complete after 200 tries workqueue writecache-writeback: drain_workqueue() isn't complete after 300 tries workqueue writecache-writeback: drain_workqueue() isn't complete after 400 tries writecache_suspend calls flush_workqueue(wc->writeback_wq) - this function flushes the current work. However, the workqueue may re-queue itself and flush_workqueue doesn't wait for re-queued works to finish. Because of this - the function writecache_writeback continues execution after the device was suspended and then concurrently with writecache_dtr, causing a crash in writecache_writeback. We must use drain_workqueue - that waits until the work and all re-queued works finish. As a prereq for switching to drain_workqueue, this commit fixes dm_suspended to return true after the presuspend hook and before the postsuspend hook - just like during a normal suspend. It allows simplifying the dm-integrity and dm-writecache targets so that they don't have to maintain suspended flags on their own. With this change use of drain_workqueue() can be used effectively. This change was tested with the lvm2 testsuite and cryptsetup testsuite and the are no regressions. Fixes: 48debafe4f2f ("dm: add writecache target") Cc: stable@vger.kernel.org # 4.18+ Reported-by: Corey Marthaler <cmarthal@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-02-25dm integrity: fix invalid table returned due to argument count mismatchMikulas Patocka
If the flag SB_FLAG_RECALCULATE is present in the superblock, but it was not specified on the command line (i.e. ic->recalculate_flag is false), dm-integrity would return invalid table line - the reported number of arguments would not match the real number. Fixes: 468dfca38b1a ("dm integrity: add a bitmap mode") Cc: stable@vger.kernel.org # v5.2+ Reported-by: Ondrej Kozina <okozina@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-02-25dm integrity: fix a deadlock due to offloading to an incorrect workqueueMikulas Patocka
If we need to perform synchronous I/O in dm_integrity_map_continue(), we must make sure that we are not in the map function - in order to avoid the deadlock due to bio queuing in generic_make_request. To avoid the deadlock, we offload the request to metadata_wq. However, metadata_wq also processes metadata updates for write requests. If there are too many requests that get offloaded to metadata_wq at the beginning of dm_integrity_map_continue, the workqueue metadata_wq becomes clogged and the system is incapable of processing any metadata updates. This causes a deadlock because all the requests that need to do metadata updates wait for metadata_wq to proceed and metadata_wq waits inside wait_and_add_new_range until some existing request releases its range lock (which doesn't happen because the range lock is released after metadata update). In order to fix the deadlock, we create a new workqueue offload_wq and offload requests to it - so that processing of offload_wq is independent from processing of metadata_wq. Fixes: 7eada909bfd7 ("dm: add integrity target") Cc: stable@vger.kernel.org # v4.12+ Reported-by: Heinz Mauelshagen <heinzm@redhat.com> Tested-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-02-25dm integrity: fix recalculation when moving from journal mode to bitmap modeMikulas Patocka
If we resume a device in bitmap mode and the on-disk format is in journal mode, we must recalculate anything above ic->sb->recalc_sector. Otherwise, there would be non-recalculated blocks which would cause I/O errors. Fixes: 468dfca38b1a ("dm integrity: add a bitmap mode") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-15dm integrity: fix excessive alignment of metadata runsMikulas Patocka
Metadata runs are supposed to be aligned on 4k boundary (so that they work efficiently with disks with 4k sectors). However, there was a programming bug that makes them aligned on 128k boundary instead. The unused space is wasted. Fix this bug by providing a proper 4k alignment. In order to keep existing volumes working, we introduce a new flag SB_FLAG_FIXED_PADDING - when the flag is clear, we calculate the padding the old way. In order to make sure that the old version cannot mount the volume created by the new version, we increase superblock version to 4. Also in order to not break with old integritysetup, we fix alignment only if the parameter "fix_padding" is present when formatting the device. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-09-17block: centralize PI remapping logic to the block layerMax Gurtovoy
Currently t10_pi_prepare/t10_pi_complete functions are called during the NVMe and SCSi layers command preparetion/completion, but their actual place should be the block layer since T10-PI is a general data integrity feature that is used by block storage protocols. Introduce .prepare_fn and .complete_fn callbacks within the integrity profile that each type can implement according to its needs. Suggested-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Suggested-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Fixed to not call queue integrity functions if BLK_DEV_INTEGRITY isn't defined in the config. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-15dm integrity: fix a crash due to BUG_ON in __journal_read_write()Mikulas Patocka
Fix a crash that was introduced by the commit 724376a04d1a. The crash is reported here: https://gitlab.com/cryptsetup/cryptsetup/issues/468 When reading from the integrity device, the function dm_integrity_map_continue calls find_journal_node to find out if the location to read is present in the journal. Then, it calculates how many sectors are consecutively stored in the journal. Then, it locks the range with add_new_range and wait_and_add_new_range. The problem is that during wait_and_add_new_range, we hold no locks (we don't hold ic->endio_wait.lock and we don't hold a range lock), so the journal may change arbitrarily while wait_and_add_new_range sleeps. The code then goes to __journal_read_write and hits BUG_ON(journal_entry_get_sector(je) != logical_sector); because the journal has changed. In order to fix this bug, we need to re-check the journal location after wait_and_add_new_range. We restrict the length to one block in order to not complicate the code too much. Fixes: 724376a04d1a ("dm integrity: implement fair range locks") Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-07-09dm integrity: use kzalloc() instead of kmalloc() + memset()Fuqian Huang
Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-07-09dm integrity: always set version on superblock updateMilan Broz
The new integrity bitmap mode uses the dirty flag. The dirty flag should not be set in older superblock versions. The current code sets it unconditionally, even if the superblock was already formatted without bitmap in older system. Fix this by moving the version check to one common place and check version on every superblock write. Signed-off-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-16Merge tag 'for-5.2/dm-changes-v2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mike Snitzer: - Improve DM snapshot target's scalability by using finer grained locking. Requires some list_bl interface improvements. - Add ability for DM integrity to use a bitmap mode, that tracks regions where data and metadata are out of sync, instead of using a journal. - Improve DM thin provisioning target to not write metadata changes to disk if the thin-pool and associated thin devices are merely activated but not used. This avoids metadata corruption due to concurrent activation of thin devices across different OS instances (e.g. split brain scenarios, which ultimately would be avoided if proper device filters were used -- but not having proper filtering has proven a very common configuration mistake) - Fix missing call to path selector type->end_io in DM multipath. This fixes reported performance problems due to inaccurate path selector IO accounting causing an imbalance of IO (e.g. avoiding issuing IO to particular path due to it seemingly being heavily used). - Fix bug in DM cache metadata's loading of its discard bitset that could lead to all cache blocks being discarded if the very first cache block was discarded (thankfully in practice the first cache block is generally in use; be it FS superblock, partition table, disk label, etc). - Add testing-only DM dust target which simulates a device that has failing sectors and/or read failures. - Fix a DM init error path reference count hang that caused boot hangs if user supplied malformed input on kernel commandline. - Fix a couple issues with DM crypt target's logging being overly verbose or lacking context. - Various other small fixes to DM init, DM multipath, DM zoned, and DM crypt. * tag 'for-5.2/dm-changes-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (42 commits) dm: fix a couple brace coding style issues dm crypt: print device name in integrity error message dm crypt: move detailed message into debug level dm ioctl: fix hang in early create error condition dm integrity: whitespace, coding style and dead code cleanup dm integrity: implement synchronous mode for reboot handling dm integrity: handle machine reboot in bitmap mode dm integrity: add a bitmap mode dm integrity: introduce a function add_new_range_and_wait() dm integrity: allow large ranges to be described dm ingerity: pass size to dm_integrity_alloc_page_list() dm integrity: introduce rw_journal_sectors() dm integrity: update documentation dm integrity: don't report unused options dm integrity: don't check null pointer before kvfree and vfree dm integrity: correctly calculate the size of metadata area dm dust: Make dm_dust_init and dm_dust_exit static dm dust: remove redundant unsigned comparison to less than zero dm mpath: always free attached_handler_name in parse_path() dm init: fix max devices/targets checks ...
2019-05-09dm integrity: whitespace, coding style and dead code cleanupMike Snitzer
Just some things that stood out like a sore thumb. Also, converted some printk(KERN_CRIT, ...) to DMCRIT(...) Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-08dm integrity: implement synchronous mode for reboot handlingMikulas Patocka
Unfortunatelly, there may be bios coming even after the reboot notifier was called. We don't want these bios to make the bitmap dirty again. To address this, implement a synchronous mode - when a bio is about to be terminated, we clean the bitmap and terminate the bio after the clean operation succeeds. This obviously slows down bio processing, but it makes sure that when all bios are finished, the bitmap will be clean. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-08dm integrity: handle machine reboot in bitmap modeMikulas Patocka
When in bitmap mode the bitmap must be cleared when rebooting. This commit adds the reboot hook. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-08dm integrity: add a bitmap modeMikulas Patocka
Introduce an alternate mode of operation where dm-integrity uses a bitmap instead of a journal. If a bit in the bitmap is 1, the corresponding region's data and integrity tags are not synchronized - if the machine crashes, the unsynchronized regions will be recalculated. The bitmap mode is faster than the journal mode, because we don't have to write the data twice, but it is also less reliable, because if data corruption happens when the machine crashes, it may not be detected. Benchmark results for an SSD connected to a SATA300 port, when doing large linear writes with dd: buffered I/O: raw device throughput - 245MB/s dm-integrity with journaling - 120MB/s dm-integrity with bitmap - 238MB/s direct I/O with 1MB block size: raw device throughput - 248MB/s dm-integrity with journaling - 123MB/s dm-integrity with bitmap - 223MB/s For more info see dm-integrity in Documentation/device-mapper/ Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-08dm integrity: introduce a function add_new_range_and_wait()Mikulas Patocka
Introduce a function add_new_range_and_wait() in order to avoid repetitive code. It will be used in the following commit. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-07Merge tag 'for-5.2/block-20190507' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block updates from Jens Axboe: "Nothing major in this series, just fixes and improvements all over the map. This contains: - Series of fixes for sed-opal (David, Jonas) - Fixes and performance tweaks for BFQ (via Paolo) - Set of fixes for bcache (via Coly) - Set of fixes for md (via Song) - Enabling multi-page for passthrough requests (Ming) - Queue release fix series (Ming) - Device notification improvements (Martin) - Propagate underlying device rotational status in loop (Holger) - Removal of mtip32xx trim support, which has been disabled for years (Christoph) - Improvement and cleanup of nvme command handling (Christoph) - Add block SPDX tags (Christoph) - Cleanup/hardening of bio/bvec iteration (Christoph) - A few NVMe pull requests (Christoph) - Removal of CONFIG_LBDAF (Christoph) - Various little fixes here and there" * tag 'for-5.2/block-20190507' of git://git.kernel.dk/linux-block: (164 commits) block: fix mismerge in bvec_advance block: don't drain in-progress dispatch in blk_cleanup_queue() blk-mq: move cancel of hctx->run_work into blk_mq_hw_sysfs_release blk-mq: always free hctx after request queue is freed blk-mq: split blk_mq_alloc_and_init_hctx into two parts blk-mq: free hw queue's resource in hctx's release handler blk-mq: move cancel of requeue_work into blk_mq_release blk-mq: grab .q_usage_counter when queuing request from plug code path block: fix function name in comment nvmet: protect discovery change log event list iteration nvme: mark nvme_core_init and nvme_core_exit static nvme: move command size checks to the core nvme-fabrics: check more command sizes nvme-pci: check more command sizes nvme-pci: remove an unneeded variable initialization nvme-pci: unquiesce admin queue on shutdown nvme-pci: shutdown on timeout during deletion nvme-pci: fix psdt field for single segment sgls nvme-multipath: don't print ANA group state by default nvme-multipath: split bios with the ns_head bio_set before submitting ...
2019-05-07dm integrity: allow large ranges to be describedMikulas Patocka
Change n_sectors data type from unsigned to sector_t. Following commits will need to lock large ranges. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-07dm ingerity: pass size to dm_integrity_alloc_page_list()Mikulas Patocka
Pass size to dm_integrity_alloc_page_list(). This is needed so following commits can pass a size that is different from ic->journal_pages. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-07dm integrity: introduce rw_journal_sectors()Mikulas Patocka
Introduce a function rw_journal_sectors() that takes sector and length as its arguments instead of a section and the number of sections. This functions will be used in further patches. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-07dm integrity: update documentationMikulas Patocka
Update documentation with the "meta_device" parameter and flags. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-07dm integrity: don't report unused optionsMikulas Patocka
If we are not journaling, don't report journaling options in the table status. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-07dm integrity: don't check null pointer before kvfree and vfreeMikulas Patocka
The functions kfree, vfree and kvfree do nothing if we pass a NULL pointer to them. So we don't need to test the pointer for NULL. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-07dm integrity: correctly calculate the size of metadata areaMikulas Patocka
When we use separate devices for data and metadata, dm-integrity would incorrectly calculate the size of the metadata device as if it had 512-byte block size - and it would refuse activation with larger block size and smaller metadata device. Fix this so that it takes actual block size into account, which fixes the following reported issue: https://gitlab.com/cryptsetup/cryptsetup/issues/450 Fixes: 356d9d52e122 ("dm integrity: allow separate metadata device") Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-06Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto update from Herbert Xu: "API: - Add support for AEAD in simd - Add fuzz testing to testmgr - Add panic_on_fail module parameter to testmgr - Use per-CPU struct instead multiple variables in scompress - Change verify API for akcipher Algorithms: - Convert x86 AEAD algorithms over to simd - Forbid 2-key 3DES in FIPS mode - Add EC-RDSA (GOST 34.10) algorithm Drivers: - Set output IV with ctr-aes in crypto4xx - Set output IV in rockchip - Fix potential length overflow with hashing in sun4i-ss - Fix computation error with ctr in vmx - Add SM4 protected keys support in ccree - Remove long-broken mxc-scc driver - Add rfc4106(gcm(aes)) cipher support in cavium/nitrox" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (179 commits) crypto: ccree - use a proper le32 type for le32 val crypto: ccree - remove set but not used variable 'du_size' crypto: ccree - Make cc_sec_disable static crypto: ccree - fix spelling mistake "protedcted" -> "protected" crypto: caam/qi2 - generate hash keys in-place crypto: caam/qi2 - fix DMA mapping of stack memory crypto: caam/qi2 - fix zero-length buffer DMA mapping crypto: stm32/cryp - update to return iv_out crypto: stm32/cryp - remove request mutex protection crypto: stm32/cryp - add weak key check for DES crypto: atmel - remove set but not used variable 'alg_name' crypto: picoxcell - Use dev_get_drvdata() crypto: crypto4xx - get rid of redundant using_sd variable crypto: crypto4xx - use sync skcipher for fallback crypto: crypto4xx - fix cfb and ofb "overran dst buffer" issues crypto: crypto4xx - fix ctr-aes missing output IV crypto: ecrdsa - select ASN1 and OID_REGISTRY for EC-RDSA crypto: ux500 - use ccflags-y instead of CFLAGS_<basename>.o crypto: ccree - handle tee fips error during power management resume crypto: ccree - add function to handle cryptocell tee fips error ...
2019-04-25crypto: shash - remove shash_desc::flagsEric Biggers
The flags field in 'struct shash_desc' never actually does anything. The only ostensibly supported flag is CRYPTO_TFM_REQ_MAY_SLEEP. However, no shash algorithm ever sleeps, making this flag a no-op. With this being the case, inevitably some users who can't sleep wrongly pass MAY_SLEEP. These would all need to be fixed if any shash algorithm actually started sleeping. For example, the shash_ahash_*() functions, which wrap a shash algorithm with the ahash API, pass through MAY_SLEEP from the ahash API to the shash API. However, the shash functions are called under kmap_atomic(), so actually they're assumed to never sleep. Even if it turns out that some users do need preemption points while hashing large buffers, we could easily provide a helper function crypto_shash_update_large() which divides the data into smaller chunks and calls crypto_shash_update() and cond_resched() for each chunk. It's not necessary to have a flag in 'struct shash_desc', nor is it necessary to make individual shash algorithms aware of this at all. Therefore, remove shash_desc::flags, and document that the crypto_shash_*() functions can be called from any context. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-22Merge tag 'v5.1-rc6' into for-5.2/blockJens Axboe
Pull in v5.1-rc6 to resolve two conflicts. One is in BFQ, in just a comment, and is trivial. The other one is a conflict due to a later fix in the bio multi-page work, and needs a bit more care. * tag 'v5.1-rc6': (770 commits) Linux 5.1-rc6 block: make sure that bvec length can't be overflow block: kill all_q_node in request_queue x86/cpu/intel: Lower the "ENERGY_PERF_BIAS: Set to normal" message's log priority coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping mm/kmemleak.c: fix unused-function warning init: initialize jump labels before command line option parsing kernel/watchdog_hld.c: hard lockup message should end with a newline kcov: improve CONFIG_ARCH_HAS_KCOV help text mm: fix inactive list balancing between NUMA nodes and cgroups mm/hotplug: treat CMA pages as unmovable proc: fixup proc-pid-vm test proc: fix map_files test on F29 mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n mm/memory_hotplug: do not unlock after failing to take the device_hotplug_lock mm: swapoff: shmem_unuse() stop eviction without igrab() mm: swapoff: take notice of completion sooner mm: swapoff: remove too limiting SWAP_UNUSE_MAX_TRIES mm: swapoff: shmem_find_swap_entries() filter out other types slab: store tagged freelist for off-slab slabmgmt ... Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-06block: remove CONFIG_LBDAFChristoph Hellwig
Currently support for 64-bit sector_t and blkcnt_t is optional on 32-bit architectures. These types are required to support block device and/or file sizes larger than 2 TiB, and have generally defaulted to on for a long time. Enabling the option only increases the i386 tinyconfig size by 145 bytes, and many data structures already always use 64-bit values for their in-core and on-disk data structures anyway, so there should not be a large change in dynamic memory usage either. Dropping this option removes a somewhat weird non-default config that has cause various bugs or compiler warnings when actually used. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-05dm integrity: fix deadlock with overlapping I/OMikulas Patocka
dm-integrity will deadlock if overlapping I/O is issued to it, the bug was introduced by commit 724376a04d1a ("dm integrity: implement fair range locks"). Users rarely use overlapping I/O so this bug went undetected until now. Fix this bug by correcting, likely cut-n-paste, typos in ranges_overlap() and also remove a flawed ranges_overlap() check in remove_range_unlocked(). This condition could leave unprocessed bios hanging on wait_list forever. Cc: stable@vger.kernel.org # v4.19+ Fixes: 724376a04d1a ("dm integrity: implement fair range locks") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-04-01dm integrity: make dm_integrity_init and dm_integrity_exit staticYueHaibing
Fix sparse warnings: drivers/md/dm-integrity.c:3619:12: warning: symbol 'dm_integrity_init' was not declared. Should it be static? drivers/md/dm-integrity.c:3638:6: warning: symbol 'dm_integrity_exit' was not declared. Should it be static? Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-04-01dm integrity: change memcmp to strncmp in dm_integrity_ctrMikulas Patocka
If the string opt_string is small, the function memcmp can access bytes that are beyond the terminating nul character. In theory, it could cause segfault, if opt_string were located just below some unmapped memory. Change from memcmp to strncmp so that we don't read bytes beyond the end of the string. Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-06dm integrity: limit the rate of error messagesMikulas Patocka
When using dm-integrity underneath md-raid, some tests with raid auto-correction trigger large amounts of integrity failures - and all these failures print an error message. These messages can bring the system to a halt if the system is using serial console. Fix this by limiting the rate of error messages - it improves the speed of raid recovery and avoids the hang. Fixes: 7eada909bfd7a ("dm: add integrity target") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05dm integrity: remove redundant unlikely annotationChengguang Xu
unlikely has already included in IS_ERR(), so just remove redundant unlikely annotation. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-12-28Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc updates from Andrew Morton: - large KASAN update to use arm's "software tag-based mode" - a few misc things - sh updates - ocfs2 updates - just about all of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (167 commits) kernel/fork.c: mark 'stack_vm_area' with __maybe_unused memcg, oom: notify on oom killer invocation from the charge path mm, swap: fix swapoff with KSM pages include/linux/gfp.h: fix typo mm/hmm: fix memremap.h, move dev_page_fault_t callback to hmm hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization memory_hotplug: add missing newlines to debugging output mm: remove __hugepage_set_anon_rmap() include/linux/vmstat.h: remove unused page state adjustment macro mm/page_alloc.c: allow error injection mm: migrate: drop unused argument of migrate_page_move_mapping() blkdev: avoid migration stalls for blkdev pages mm: migrate: provide buffer_migrate_page_norefs() mm: migrate: move migrate_page_lock_buffers() mm: migrate: lock buffers before migrate_page_move_mapping() mm: migration: factor out code to compute expected number of page references mm, page_alloc: enable pcpu_drain with zone capability kmemleak: add config to select auto scan mm/page_alloc.c: don't call kasan_free_pages() at deferred mem init ...
2018-12-28Merge tag 'for-4.21/dm-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mike Snitzer: - Eliminate a couple indirect calls from bio-based DM core. - Fix DM to allow reads that exceed readahead limits by setting io_pages in the backing_dev_info. - A couple code cleanups in request-based DM. - Fix various DM targets to check for device sector overflow if CONFIG_LBDAF is not set. - Use u64 instead of sector_t to store iv_offset in DM crypt; sector_t isn't large enough on 32bit when CONFIG_LBDAF is not set. - Performance fixes to DM's kcopyd and the snapshot target focused on limiting memory use and workqueue stalls. - Fix typos in the integrity and writecache targets. - Log which algorithm is used for dm-crypt's encryption and dm-integrity's hashing. - Fix false -EBUSY errors in DM raid target's handling of check/repair messages. - Fix DM flakey target's corrupt_bio_byte feature to reliably corrupt the Nth byte in a bio's payload. * tag 'for-4.21/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm: do not allow readahead to limit IO size dm raid: fix false -EBUSY when handling check/repair message dm rq: cleanup leftover code from recently removed q->mq_ops branching dm verity: log the hash algorithm implementation dm crypt: log the encryption algorithm implementation dm integrity: fix spelling mistake in workqueue name dm flakey: Properly corrupt multi-page bios. dm: Check for device sector overflow if CONFIG_LBDAF is not set dm crypt: use u64 instead of sector_t to store iv_offset dm kcopyd: Fix bug causing workqueue stalls dm snapshot: Fix excessive memory usage and workqueue stalls dm bufio: update comment in dm-bufio.c dm writecache: fix typo in error msg for creating writecache_flush_thread dm: remove indirect calls from __send_changing_extent_only() dm mpath: only flush workqueue when needed dm rq: remove unused arguments from rq_completed() dm: avoid indirect call in __dm_make_request