summaryrefslogtreecommitdiff
path: root/drivers/pci/pci-sysfs.c
AgeCommit message (Collapse)Author
2012-11-05PCI/PM: Fix proc config reg access for D3cold and bridge suspendingHuang Ying
In https://bugzilla.kernel.org/show_bug.cgi?id=48981 Peter reported that /proc/bus/pci/??/??.? does not work for 3.6. This is because the device configuration space registers are not accessible if the corresponding parent bridge is suspended or the device is put into D3cold state. This is the same as /sys/bus/pci/devices/0000:??:??.?/config access issue. So the function used to solve sysfs issue is used to solve this issue. This patch moves pci_config_pm_runtime_get()/_put() from pci/pci-sysfs.c to pci/pci.c and makes them extern so they can be used by both the sysfs and proc paths. [bhelgaas: changelog, references, reporters] Reference: https://bugzilla.kernel.org/show_bug.cgi?id=48981 Reference: https://bugzilla.kernel.org/show_bug.cgi?id=49031 Reported-by: Forrest Loomis <cybercyst@gmail.com> Reported-by: Peter <lekensteyn@gmail.com> Reported-by: Micael Dias <kam1kaz3@gmail.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> CC: stable@vger.kernel.org # v3.6+
2012-08-21PCI/PM: Fix config reg access for D3cold and bridge suspendingHuang Ying
This patch fixes the following bug: http://marc.info/?l=linux-pci&m=134338059022620&w=2 Where lspci does not work properly if a device and the corresponding parent bridge (such as PCIe port) is suspended. This is because the device configuration space registers will be not accessible if the corresponding parent bridge is suspended or the device is put into D3cold state. To solve the issue, the bridge/PCIe port connected to the device is put into active state before read/write configuration space registers. If the device is in D3cold state, it will be put into active state too. To avoid resume/suspend PCIe port for each configuration register read/write, a small delay is added before the PCIe port to go suspended. Reported-by: Bjorn Mork <bjorn@mork.no> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
2012-06-23Merge branch 'topic/huang-d3cold-v7' into nextBjorn Helgaas
* topic/huang-d3cold-v7: PCI/PM: add PCIe runtime D3cold support PCI: do not call pci_set_power_state with PCI_D3cold PCI/PM: add runtime PM support to PCIe port ACPI/PM: specify lowest allowed state for device sleep state
2012-06-23PCI/PM: add PCIe runtime D3cold supportHuang Ying
This patch adds runtime D3cold support and corresponding ACPI platform support. This patch only enables runtime D3cold support; it does not enable D3cold support during system suspend/hibernate. D3cold is the deepest power saving state for a PCIe device, where its main power is removed. While it is in D3cold, you can't access the device at all, not even its configuration space (which is still accessible in D3hot). Therefore the PCI PM registers can not be used to transition into/out of the D3cold state; that must be done by platform logic such as ACPI _PR3. To support wakeup from D3cold, a system may provide auxiliary power, which allows a device to request wakeup using a Beacon or the sideband WAKE# signal. WAKE# is usually connected to platform logic such as ACPI GPE. This is quite different from other power saving states, where devices request wakeup via a PME message on the PCIe link. Some devices, such as those in plug-in slots, have no direct platform logic. For example, there is usually no ACPI _PR3 for them. D3cold support for these devices can be done via the PCIe Downstream Port leading to the device. When the PCIe port is powered on/off, the device is powered on/off too. Wakeup events from the device will be notified to the corresponding PCIe port. For more information about PCIe D3cold and corresponding ACPI support, please refer to: - PCI Express Base Specification Revision 2.0 - Advanced Configuration and Power Interface Specification Revision 5.0 [bhelgaas: changelog] Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl> Originally-by: Zheng Yan <zheng.z.yan@intel.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-06-20PCI: use __weak consistentlyBjorn Helgaas
Use "__weak" instead of the gcc-specific "__attribute__ ((weak))" Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-04-24vgaarb: Add support for setting the default video device (v2)Matthew Garrett
The default VGA device is a somewhat fluid concept on platforms with multiple GPUs. Add support for setting it so switching code can update things appropriately, and make sure that the sysfs code returns the right device if it's changed. v2: Updated to fix builds when __ARCH_HAS_VGA_DEFAULT_DEVICE is false. Signed-off-by: Matthew Garrett <mjg@redhat.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Acked-by: benh@kernel.crashing.org Cc: airlied@redhat.com Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-02-27PCI: Rename pci_remove_bus_device to pci_stop_and_remove_bus_deviceYinghai Lu
The old pci_remove_bus_device actually did stop and remove. Make the name reflect that to reduce confusion. This patch is done by sed scripts and changes back some incorrect __pci_remove_bus_device changes. Suggested-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14PCI: Make rescan bus increase bridge resource size if neededYinghai Lu
Current rescan will not touch bridge MMIO and IO. Try to reuse pci_assign_unassigned_bridge_resources(bridge) to update bridge resources, if child devices need more resources. Only do that for bridges whose children are all removed already; i.e. don't release resources that could already be in use by drivers on child devices. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-14Merge branch 'for-linus' of git://selinuxproject.org/~jmorris/linux-securityLinus Torvalds
* 'for-linus' of git://selinuxproject.org/~jmorris/linux-security: capabilities: remove __cap_full_set definition security: remove the security_netlink_recv hook as it is equivalent to capable() ptrace: do not audit capability check when outputing /proc/pid/stat capabilities: remove task_ns_* functions capabitlies: ns_capable can use the cap helpers rather than lsm call capabilities: style only - move capable below ns_capable capabilites: introduce new has_ns_capabilities_noaudit capabilities: call has_ns_capability from has_capability capabilities: remove all _real_ interfaces capabilities: introduce security_capable_noaudit capabilities: reverse arguments to security_capable capabilities: remove the task from capable LSM hook entirely selinux: sparse fix: fix several warnings in the security server cod selinux: sparse fix: fix warnings in netlink code selinux: sparse fix: eliminate warnings for selinuxfs selinux: sparse fix: declare selinux_disable() in security.h selinux: sparse fix: move selinux_complete_init selinux: sparse fix: make selinux_secmark_refcount static SELinux: Fix RCU deref check warning in sel_netport_insert() Manually fix up a semantic mis-merge wrt security_netlink_recv(): - the interface was removed in commit fd7784615248 ("security: remove the security_netlink_recv hook as it is equivalent to capable()") - a new user of it appeared in commit a38f7907b926 ("crypto: Add userspace configuration API") causing no automatic merge conflict, but Eric Paris pointed out the issue.
2012-01-05capabilities: reverse arguments to security_capableEric Paris
security_capable takes ns, cred, cap. But the LSM capable() hook takes cred, ns, cap. The capability helper functions also take cred, ns, cap. Rather than flip argument order just to flip it back, leave them alone. Heck, this should be a little faster since argument will be in the right place! Signed-off-by: Eric Paris <eparis@redhat.com>
2011-10-31pci: Fix files needing export.h for EXPORT_SYMBOL/THIS_MODULEPaul Gortmaker
They were implicitly getting it from device.h --> module.h but we want to clean that up. So add the minimal header for these macros. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-05-21PCI/sysfs: move bus cpuaffinity to class dev_attrsYinghai Lu
Requested by Greg KH to fix a race condition in the creating of PCI bus cpuaffinity files. Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-21PCI: add rescan to /sys/.../pci_bus/.../Yinghai Lu
After remove the device from /sys, we have to rescan all or find out the bridge and access /sys../device/rescan there. this patch add /sys/.../pci_bus/.../rescan. So user can rescan more easy. that is more clean and easy to understand. like after remove 0000:c4:00.0, you can rescan 0000:c4 directly. -v2: According to Jesse, use function instead of exposing attr, so could hide #ifdef in header file. also add code to remove rescan file in remove path. -v3: GregKH pointed out that we should use dev_attrs to avoid racing. So add pcibus_attrs and make it to be member of pcibus_attrs. -v4: Change name to pcibus_dev_attrs according to GregKH Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-03-31Fix common misspellingsLucas De Marchi
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-23userns: security: make capabilities relative to the user namespaceSerge E. Hallyn
- Introduce ns_capable to test for a capability in a non-default user namespace. - Teach cap_capable to handle capabilities in a non-default user namespace. The motivation is to get to the unprivileged creation of new namespaces. It looks like this gets us 90% of the way there, with only potential uid confusion issues left. I still need to handle getting all caps after creation but otherwise I think I have a good starter patch that achieves all of your goals. Changelog: 11/05/2010: [serge] add apparmor 12/14/2010: [serge] fix capabilities to created user namespaces Without this, if user serge creates a user_ns, he won't have capabilities to the user_ns he created. THis is because we were first checking whether his effective caps had the caps he needed and returning -EPERM if not, and THEN checking whether he was the creator. Reverse those checks. 12/16/2010: [serge] security_real_capable needs ns argument in !security case 01/11/2011: [serge] add task_ns_capable helper 01/11/2011: [serge] add nsown_capable() helper per Bastian Blank suggestion 02/16/2011: [serge] fix a logic bug: the root user is always creator of init_user_ns, but should not always have capabilities to it! Fix the check in cap_capable(). 02/21/2011: Add the required user_ns parameter to security_capable, fixing a compile failure. 02/23/2011: Convert some macros to functions as per akpm comments. Some couldn't be converted because we can't easily forward-declare them (they are inline if !SECURITY, extern if SECURITY). Add a current_user_ns function so we can use it in capability.h without #including cred.h. Move all forward declarations together to the top of the #ifdef __KERNEL__ section, and use kernel-doc format. 02/23/2011: Per dhowells, clean up comment in cap_capable(). 02/23/2011: Per akpm, remove unreachable 'return -EPERM' in cap_capable. (Original written and signed off by Eric; latest, modified version acked by him) [akpm@linux-foundation.org: fix build] [akpm@linux-foundation.org: export current_user_ns() for ecryptfs] [serge.hallyn@canonical.com: remove unneeded extra argument in selinux's task_has_capability] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: David Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-18Merge branch 'linux-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: label: remove #include of ACPI header to avoid warnings PCI: label: Fix compilation error when CONFIG_ACPI is unset PCI: pre-allocate additional resources to devices only after successful allocation of essential resources. PCI: introduce reset_resource() PCI: data structure agnostic free list function PCI: refactor io size calculation code PCI: do not create quirk I/O regions below PCIBIOS_MIN_IO for ICH PCI hotplug: acpiphp: set current_state to D0 in register_slot PCI: Export ACPI _DSM provided firmware instance number and string name to sysfs PCI: add more checking to ICH region quirks PCI: aer-inject: Override PCIe AER Mask Registers PCI: fix tlan build when CONFIG_PCI is not enabled PCI: remove quirk for pre-production systems PCI: Avoid potential NULL pointer dereference in pci_scan_bridge PCI/lpc: irq and pci_ids patch for Intel DH89xxCC DeviceIDs PCI: sysfs: Fix failure path for addition of "vpd" attribute
2011-02-15pci: use security_capable() when checking capablities during config space readChris Wright
This reintroduces commit 47970b1b which was subsequently reverted as f00eaeea. The original change was broken and caused X startup failures and generally made privileged processes incapable of reading device dependent config space. The normal capable() interface returns true on success, but the LSM interface returns 0 on success. This thinko is now fixed in this patch, and has been confirmed to work properly. So, once again...Eric Paris noted that commit de139a3 ("pci: check caps from sysfs file open to read device dependent config space") caused the capability check to bypass security modules and potentially auditing. Rectify this by calling security_capable() when checking the open file's capabilities for config space reads. Reported-by: Eric Paris <eparis@redhat.com> Tested-by: Dave Young <hidave.darkstar@gmail.com> Acked-by: James Morris <jmorris@namei.org> Cc: Dave Airlie <airlied@gmail.com> Cc: Alex Riesen <raa.lkml@gmail.com> Cc: Sedat Dilek <sedat.dilek@googlemail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: James Morris <jmorris@namei.org>
2011-02-13Revert "pci: use security_capable() when checking capablities during config ↵Linus Torvalds
space read" This reverts commit 47970b1b2aa64464bc0a9543e86361a622ae7c03. It turns out it breaks several distributions. Looks like the stricter selinux checks fail due to selinux policies not being set to allow the access - breaking X, but also lspci. So while the change was clearly the RightThing(tm) to do in theory, in practice we have backwards compatibility issues making it not work. Reported-by: Dave Young <hidave.darkstar@gmail.com> Acked-by: David Airlie <airlied@linux.ie> Acked-by: Alex Riesen <raa.lkml@gmail.com> Cc: Eric Paris <eparis@redhat.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: James Morris <jmorris@namei.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-11pci: use security_capable() when checking capablities during config space readChris Wright
Eric Paris noted that commit de139a3 ("pci: check caps from sysfs file open to read device dependent config space") caused the capability check to bypass security modules and potentially auditing. Rectify this by calling security_capable() when checking the open file's capabilities for config space reads. Reported-by: Eric Paris <eparis@redhat.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: James Morris <jmorris@namei.org>
2011-02-08PCI: sysfs: Fix failure path for addition of "vpd" attributeBen Hutchings
Commit 280c73d ("PCI: centralize the capabilities code in pci-sysfs.c") changed the initialisation of the "rom" and "vpd" attributes, and made the failure path for the "vpd" attribute incorrect. We must free the new attribute structure (attr), but instead we currently free dev->vpd->attr. That will normally be NULL, resulting in a memory leak, but it might be a stale pointer, resulting in a double-free. Found by inspection; compile-tested only. Cc: stable@kernel.org Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-01-14PCI: sysfs: Update ROM to include default owner write accessAlex Williamson
The PCI sysfs ROM interface requires an enabling write to access the ROM image, but the default file mode is 0400. The original proposed patch adding sysfs ROM support was a true read-only interface, with the enabling bit coming in as a feature request. I suspect it was simply an oversight that the file mode didn't get updated to match the API. Acked-by: Chris Wright <chrisw@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-11-16PCI: fix offset check for sysfs mmapped filesDarrick J. Wong
I just loaded 2.6.37-rc2 on my machines, and I noticed that X no longer starts. Running an strace of the X server shows that it's doing this: open("/sys/bus/pci/devices/0000:07:00.0/resource0", O_RDWR) = 10 mmap(NULL, 16777216, PROT_READ|PROT_WRITE, MAP_SHARED, 10, 0) = -1 EINVAL (Invalid argument) This code seems to be asking for a shared read/write mapping of 16MB worth of BAR0 starting at file offset 0, and letting the kernel assign a starting address. Unfortunately, this -EINVAL causes X not to start. Looking into dmesg, there's a complaint like so: process "Xorg" tried to map 0x01000000 bytes at page 0x00000000 on 0000:07:00.0 BAR 0 (start 0x 96000000, size 0x 1000000) ...with the following code in pci_mmap_fits: pci_start = (mmap_api == PCI_MMAP_SYSFS) ? pci_resource_start(pdev, resno) >> PAGE_SHIFT : 0; if (start >= pci_start && start < pci_start + size && start + nr <= pci_start + size) It looks like the logic here is set up such that when the mmap call comes via sysfs, the check in pci_mmap_fits wants vma->vm_pgoff to be between the resource's start and end address, and the end of the vma to be no farther than the end. However, the sysfs PCI resource files always start at offset zero, which means that this test always fails for programs that mmap the sysfs files. Given the comment in the original commit 3b519e4ea618b6943a82931630872907f9ac2c2b, I _think_ the old procfs files require that the file offset be equal to the resource's base address when mmapping. I think what we want here is for pci_start to be 0 when mmap_api == PCI_MMAP_PROCFS. The following patch makes that change, after which the Matrox and Mach64 X drivers work again. Acked-by: Martin Wilck <martin.wilck@ts.fujitsu.com> Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-11-15PCI: sysfs: fix printk warningsRandy Dunlap
Cast pci_resource_start() and pci_resource_len() to u64 for printk. drivers/pci/pci-sysfs.c:753: warning: format '%16Lx' expects type 'long long unsigned int', but argument 9 has type 'resource_size_t' drivers/pci/pci-sysfs.c:753: warning: format '%16Lx' expects type 'long long unsigned int', but argument 10 has type 'resource_size_t' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-11-11PCI: fix size checks for mmap() on /proc/bus/pci filesMartin Wilck
The checks for valid mmaps of PCI resources made through /proc/bus/pci files that were introduced in 9eff02e2042f96fb2aedd02e032eca1c5333d767 have several problems: 1. mmap() calls on /proc/bus/pci files are made with real file offsets > 0, whereas under /sys/bus/pci/devices, the start of the resource corresponds to offset 0. This may lead to false negatives in pci_mmap_fits(), which implicitly assumes the /sys/bus/pci/devices layout. 2. The loop in proc_bus_pci_mmap doesn't skip empty resouces. This leads to false positives, because pci_mmap_fits() doesn't treat empty resources correctly (the calculated size is 1 << (8*sizeof(resource_size_t)-PAGE_SHIFT) in this case!). 3. If a user maps resources with BAR > 0, pci_mmap_fits will emit bogus WARNINGS for the first resources that don't fit until the correct one is found. On many controllers the first 2-4 BARs are used, and the others are empty. In this case, an mmap attempt will first fail on the non-empty BARs (including the "right" BAR because of 1.) and emit bogus WARNINGS because of 3., and finally succeed on the first empty BAR because of 2. This is certainly not the intended behaviour. This patch addresses all 3 issues. Updated with an enum type for the additional parameter for pci_mmap_fits(). Cc: stable@kernel.org Signed-off-by: Martin Wilck <martin.wilck@ts.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-07-30PCI: export SMBIOS provided firmware instance and label to sysfsNarendra K
This patch exports SMBIOS provided firmware instance and label of onboard PCI devices to sysfs. New files are: /sys/bus/pci/devices/.../label which contains the firmware name for the device in question, and /sys/bus/pci/devices/.../index which contains the firmware device type instance for the given device. Signed-off-by: Jordan Hargrave <jordan_hargrave@dell.com> Signed-off-by: Narendra K <narendra_k@dell.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-07-30PCI: Allow read/write access to sysfs I/O port resourcesAlex Williamson
PCI sysfs resource files currently only allow mmap'ing. On x86 this works fine for memory backed BARs, but doesn't work at all for I/O port backed BARs. Add read/write to I/O port PCI sysfs resource files to allow userspace access to these device regions. Acked-by: Chris Wright <chrisw@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-07-30PCI: pci-sysfs: remove casts from void*Kulikov Vasiliy
Remove unnesessary casts from void*. Signed-off-by: Kulikov Vasiliy <segooon@gmail.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-06-11Revert "PCI: create function symlinks in /sys/bus/pci/slots/N/"Jesse Barnes
This reverts commit 75568f8094eb0333e9c2109b23cbc8b82d318a3c. Since they're just a convenience anyway, remove these symlinks since they're causing duplicate filename errors in the wild. Acked-by: Alex Chiang <achiang@canonical.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-05-21Merge branch 'linux-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (36 commits) PCI: hotplug: pciehp: Removed check for hotplug of display devices PCI: read memory ranges out of Broadcom CNB20LE host bridge PCI: Allow manual resource allocation for PCI hotplug bridges x86/PCI: make ACPI MCFG reserved error messages ACPI specific PCI hotplug: Use kmemdup PM/PCI: Update PCI power management documentation PCI: output FW warning in pci_read/write_vpd PCI: fix typos pci_device_dis/enable to pci_dis/enable_device in comments PCI quirks: disable msi on AMD rs4xx internal gfx bridges PCI: Disable MSI for MCP55 on P5N32-E SLI x86/PCI: irq and pci_ids patch for additional Intel Cougar Point DeviceIDs PCI: aerdrv: trivial cleanup for aerdrv_core.c PCI: aerdrv: trivial cleanup for aerdrv.c PCI: aerdrv: introduce default_downstream_reset_link PCI: aerdrv: rework find_aer_service PCI: aerdrv: remove is_downstream PCI: aerdrv: remove magical ROOT_ERR_STATUS_MASKS PCI: aerdrv: redefine PCI_ERR_ROOT_*_SRC PCI: aerdrv: rework do_recovery PCI: aerdrv: rework get_e_source() ...
2010-05-21pci: check caps from sysfs file open to read device dependent config spaceChris Wright
The PCI config space bin_attr read handler has a hardcoded CAP_SYS_ADMIN check to verify privileges before allowing a user to read device dependent config space. This is meant to protect from an unprivileged user potentially locking up the box. When assigning a PCI device directly to a guest with libvirt and KVM, the sysfs config space file is chown'd to the unprivileged user that the KVM guest will run as. The guest needs to have full access to the device's config space since it's responsible for driving the device. However, despite being the owner of the sysfs file, the CAP_SYS_ADMIN check will not allow read access beyond the config header. With this patch we check privileges against the capabilities used when openining the sysfs file. The allows a privileged process to open the file and hand it to an unprivileged process, and the unprivileged process can still read all of the config space. Signed-off-by: Chris Wright <chrisw@sous-sol.org> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-21sysfs: add struct file* to bin_attr callbacksChris Wright
This allows bin_attr->read,write,mmap callbacks to check file specific data (such as inode owner) as part of any privilege validation. Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-11PCI: return correct value when writing to the "reset" attributeMichal Schmidt
A successful write() to the "reset" sysfs attribute should return the number of bytes written, not 0. Otherwise userspace (bash) retries the write over and over again. Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-05-11PCI: create function symlinks in /sys/bus/pci/slots/N/Alex Chiang
Create convenience symlinks in sysfs, linking slots to device functions, and vice versa. These links make it easier for users to figure out which devices actually live in what slots. For example: sapphire:/sys/bus/pci/slots # ls 1 10 2 3 4 5 6 7 8 9 sapphire:/sys/bus/pci/slots # ls -l 3 total 0 -r--r--r-- 1 root root 65536 Aug 18 14:10 address lrwxrwxrwx 1 root root 0 Aug 18 14:10 function0 -> ../../../../devices/pci0000:23/0000:23:01.0 lrwxrwxrwx 1 root root 0 Aug 18 14:10 function1 -> ../../../../devices/pci0000:23/0000:23:01.1 sapphire:/sys/bus/pci/slots # ls -l 3/function0/slot lrwxrwxrwx 1 root root 0 Aug 18 14:13 3/function0/slot -> ../../../bus/pci/slots/3 The original form of this patch was written by Matthew Wilcox, and was enhanced to include links from the sysfs slots/ directory pointing back at the device functions. Cc: willy@linux.intel.com Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-19sysfs: Initialised pci bus legacy_mem field before useMel Gorman
PPC64 is failing to boot the latest mmotm due to an uninitialised pointer in pci_create_legacy_files(). The surprise is that machines boot at all and it would appear to affect current mainline as well. This patch fixes the problem. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07sysfs: fix for thinko with sysfs_bin_attr_init()Stephen Rothwell
After merging the final tree, today's linux-next build (powerpc allyesconfig) failed like this: drivers/pci/pci-sysfs.c: In function 'pci_create_legacy_files': drivers/pci/pci-sysfs.c:645: error: lvalue required as unary '&' operand drivers/pci/pci-sysfs.c:658: error: lvalue required as unary '&' operand Caused by commit "sysfs: Use sysfs_attr_init and sysfs_bin_attr_init on dynamic attributes" interacting with commit "sysfs: Use one lockdep class per sysfs attribute") both from the driver-core tree. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07sysfs: Use sysfs_attr_init and sysfs_bin_attr_init on dynamic attributesEric W. Biederman
These are the non-static sysfs attributes that exist on my test machine. Fix them to use sysfs_attr_init or sysfs_bin_attr_init as appropriate. It simply requires making a sysfs attribute present to see this. So this is a little bit tedious but otherwise not too bad. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-01-04PCI: Check the node argument passed to cpumask_of_nodeDavid John
Commit e0cd516 "PCI: derive nearby CPUs from device's instead of bus' NUMA information" causes an null pointer dereference when reading from the sysfs attributes local_cpu* on Intel machines with no ACPI NUMA proximity info, since dev->numa_node gets set to -1 for all PCI devices, which then gets passed to cpumask_of_node. Add a check to prevent this. Signed-off-by: David John <davidjon@xenontk.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04PCI: show dma_mask bits in /sysYinghai Lu
So we can catch if the driver sets an incorrect dma_mask. Reviewed-by: Grant Grundler <grundler@google.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-11-06PCI: derive nearby CPUs from device's instead of bus' NUMA informationAndreas Herrmann
In case of AMD CPU northbridge functions this NUMA information might differ. Here is an example from a 4-socket system. Currently Linux shows root@hagen:/sys/devices/pci0000:00/0000:00:1a.4# cat numa_node 0 root@hagen:/sys/devices/pci0000:00/0000:00:1a.4# cat local_cpu* 0-3 00000000,0000000f which is not correct for northbridge functions as the local CPUs are those of the same socket. With this patch and a quirk for AMD CPU NB functions Linux can do better and correctly show root@hagen:/sys/devices/pci0000:00/0000:00:1a.4# cat numa_node 2 root@hagen:/sys/devices/pci0000:00/0000:00:1a.4# cat local_cpu* 8-11 00000000,00000f00 Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09PCI: expose function reset capability in sysfsMichael S. Tsirkin
Some devices allow an individual function to be reset without affecting other functions in the same device: that's what pci_reset_function does. For devices that have this support, expose reset attribite in sysfs. This is useful e.g. for virtualization, where a qemu userspace process wants to reset the device when the guest is reset, to emulate machine reboot as closely as possible. Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-04-22docbooks: add/fix PCI kernel-docRandy Dunlap
Add drivers/pci/*.c source files to DocBook/kernel-api.tmpl and update those pci/*.c source files that need kernel-doc fixes. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-04-06PCI: allow PCI core hotplug to remove PCI root busAlex Chiang
There is no reason to prevent removal of root bus devices. A subsequent rescan will find them just fine. Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-04-06PCI: Setup disabled bridges even if buses are addedYuji Shimada
This patch sets up disabled bridges even if buses have already been added. pci_assign_unassigned_resources is called after buses are added. pci_assign_unassigned_resources calls pci_bus_assign_resources. pci_bus_assign_resources calls pci_setup_bridge to configure BARs of bridges. Currently pci_setup_bridge returns immediately if the bus have already been added. So pci_assign_unassigned_resources can't configure BARs of bridges that were added in a disabled state; this patch fixes the issue. On logical hot-add, we need to prevent the kernel from re-initializing bridges that have already been initialized. To achieve this, pci_setup_bridge returns immediately if the bridge have already been enabled. We don't need to check whether the specified bus is a root bus or not. pci_setup_bridge is not called on a root bus, because a root bus does not have a bridge. The patch adds a new helper function, pci_is_enabled. I made the function name similar to pci_is_managed. The codes which use enable_cnt directly are changed to use pci_is_enabled. Acked-by: Alex Chiang <achiang@hp.com> Signed-off-by: Yuji Shimada <shimada-yxb@necst.nec.co.jp> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20PCI: Introduce /sys/bus/pci/devices/.../rescanAlex Chiang
This interface allows the user to force a rescan of the device's parent bus and all subordinate buses, and rediscover devices removed earlier from this part of the device tree. Cc: Trent Piepho <xyzzy@speakeasy.org> Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20PCI: Introduce /sys/bus/pci/devices/.../removeAlex Chiang
This patch adds an attribute named "remove" to a PCI device's sysfs directory. Writing a non-zero value to this attribute will remove the PCI device and any children of it. Trent Piepho wrote the original implementation and documentation. Thanks to Vegard Nossum for testing under kmemcheck and finding locking issues with the sysfs interface. Cc: Trent Piepho <xyzzy@speakeasy.org> Tested-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20PCI: Introduce /sys/bus/pci/rescanAlex Chiang
This interface allows the user to force a rescan of all PCI buses in system, and rediscover devices that have been removed earlier. pci_bus_attrs implementation from Trent Piepho. Thanks to Vegard Nossum for discovering locking issues with the sysfs interface. Cc: Trent Piepho <xyzzy@speakeasy.org> Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20PCI: expose boot VGA device via sysfs.Dave Airlie
X really would like to know which VGA device was considered the boot device by the system. The x86 PCI fixups have support for discovering this but we provide no way to expose it to userspace. This adds a sysfs file per VGA class device which has the value 0 for non the boot device or unknown, and 1 if the VGA device is the boot device. Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19PCI/alpha: pci sysfs resourcesIvan Kokshaysky
This closes http://bugzilla.kernel.org/show_bug.cgi?id=10893 which is a showstopper for X development on alpha. The generic HAVE_PCI_MMAP code (drivers/pci-sysfs.c) is not very useful since we have to deal with three different types of MMIO address spaces: sparse and dense mappings for old ev4/ev5 machines and "normal" 1:1 MMIO space (bwx) for ev56 and later. Also "write combine" mappings are meaningless on alpha - roughly speaking, alpha does write combining, IO reordering and other optimizations by default, unless user splits IO accesses with memory barriers. I think the cleanest way to deal with resource files on alpha is to convert the default no-op pci_create_resource_files() and pci_remove_resource_files() for !HAVE_PCI_MMAP case into __weak functions and override them with alpha specific ones. Another alpha hook is needed for "legacy_" resource files to handle sparse addressing (pci_adjust_legacy_attr). With the "standard" resourceN files on ev56/ev6 libpciaccess works "out of the box". Handling of resourceN_sparse/resourceN_dense files on older machines obviously requires some userland work. Sparse/dense stuff has been tested on sx164 (pca56/pyxis, normally uses bwx IO) with the kernel hacked into "cia compatible" mode. Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-02-04PCI: return error on failure to read PCI ROMsTimothy S. Nelson
This patch makes the ROM reading code return an error to user space if the size of the ROM read is equal to 0. The patch also emits a warnings if the contents of the ROM are invalid, and documents the effects of the "enable" file on ROM reading. Signed-off-by: Timothy S. Nelson <wayland@wayland.id.au> Acked-by: Alex Villacis-Lasso <a_villacis@palosanto.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>