summaryrefslogtreecommitdiff
path: root/fs/sysfs
AgeCommit message (Collapse)Author
2012-05-28Merge tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linuxLinus Torvalds
Pull writeback tree from Wu Fengguang: "Mainly from Jan Kara to avoid iput() in the flusher threads." * tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux: writeback: Avoid iput() from flusher thread vfs: Rename end_writeback() to clear_inode() vfs: Move waiting for inode writeback from end_writeback() to evict_inode() writeback: Refactor writeback_single_inode() writeback: Remove wb->list_lock from writeback_single_inode() writeback: Separate inode requeueing after writeback writeback: Move I_DIRTY_PAGES handling writeback: Move requeueing when I_SYNC set to writeback_sb_inodes() writeback: Move clearing of I_SYNC into inode_sync_complete() writeback: initialize global_dirty_limit fs: remove 8 bytes of padding from struct writeback_control on 64 bit builds mm: page-writeback.c: local functions should not be exposed globally
2012-05-23Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace enhancements from Eric Biederman: "This is a course correction for the user namespace, so that we can reach an inexpensive, maintainable, and reasonably complete implementation. Highlights: - Config guards make it impossible to enable the user namespace and code that has not been converted to be user namespace safe. - Use of the new kuid_t type ensures the if you somehow get past the config guards the kernel will encounter type errors if you enable user namespaces and attempt to compile in code whose permission checks have not been updated to be user namespace safe. - All uids from child user namespaces are mapped into the initial user namespace before they are processed. Removing the need to add an additional check to see if the user namespace of the compared uids remains the same. - With the user namespaces compiled out the performance is as good or better than it is today. - For most operations absolutely nothing changes performance or operationally with the user namespace enabled. - The worst case performance I could come up with was timing 1 billion cache cold stat operations with the user namespace code enabled. This went from 156s to 164s on my laptop (or 156ns to 164ns per stat operation). - (uid_t)-1 and (gid_t)-1 are reserved as an internal error value. Most uid/gid setting system calls treat these value specially anyway so attempting to use -1 as a uid would likely cause entertaining failures in userspace. - If setuid is called with a uid that can not be mapped setuid fails. I have looked at sendmail, login, ssh and every other program I could think of that would call setuid and they all check for and handle the case where setuid fails. - If stat or a similar system call is called from a context in which we can not map a uid we lie and return overflowuid. The LFS experience suggests not lying and returning an error code might be better, but the historical precedent with uids is different and I can not think of anything that would break by lying about a uid we can't map. - Capabilities are localized to the current user namespace making it safe to give the initial user in a user namespace all capabilities. My git tree covers all of the modifications needed to convert the core kernel and enough changes to make a system bootable to runlevel 1." Fix up trivial conflicts due to nearby independent changes in fs/stat.c * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits) userns: Silence silly gcc warning. cred: use correct cred accessor with regards to rcu read lock userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq userns: Convert cgroup permission checks to use uid_eq userns: Convert tmpfs to use kuid and kgid where appropriate userns: Convert sysfs to use kgid/kuid where appropriate userns: Convert sysctl permission checks to use kuid and kgids. userns: Convert proc to use kuid/kgid where appropriate userns: Convert ext4 to user kuid/kgid where appropriate userns: Convert ext3 to use kuid/kgid where appropriate userns: Convert ext2 to use kuid/kgid where appropriate. userns: Convert devpts to use kuid/kgid where appropriate userns: Convert binary formats to use kuid/kgid where appropriate userns: Add negative depends on entries to avoid building code that is userns unsafe userns: signal remove unnecessary map_cred_ns userns: Teach inode_capable to understand inodes whose uids map to other namespaces. userns: Fail exec for suid and sgid binaries with ids outside our user namespace. userns: Convert stat to return values mapped from kuids and kgids userns: Convert user specfied uids and gids in chown into kuids and kgid userns: Use uid_eq gid_eq helpers when comparing kuids and kgids in the vfs ...
2012-05-15userns: Convert sysfs to use kgid/kuid where appropriateEric W. Biederman
Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-05-14sysfs: get rid of some lockdep false positivesAlan Stern
This patch (as1554) fixes a lockdep false-positive report. The problem arises because lockdep is unable to deal with the tree-structured locks created by the device core and sysfs. This particular problem involves a sysfs attribute method that unregisters itself, not from the device it was called for, but from a descendant device. Lockdep doesn't understand the distinction and reports a possible deadlock, even though the operation is safe. This is the sort of thing that would normally be handled by using a nested lock annotation; unfortunately it's not feasible to do that here. There's no sensible way to tell sysfs when attribute removal occurs in the context of a parent attribute method. As a workaround, the patch adds a new flag to struct attribute telling sysfs not to inform lockdep when it acquires a readlock on a sysfs_dirent instance for the attribute. The readlock is still acquired, but lockdep doesn't know about it and hence does not complain about impossible deadlock scenarios. Also added are macros for static initialization of attribute structures with the ignore_lockdep flag set. The three offending attributes in the USB subsystem are converted to use the new macros. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Tejun Heo <tj@kernel.org> CC: Eric W. Biederman <ebiederm@xmission.com> CC: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-05-06vfs: Rename end_writeback() to clear_inode()Jan Kara
After we moved inode_sync_wait() from end_writeback() it doesn't make sense to call the function end_writeback() anymore. Rename it to clear_inode() which well says what the function really does - set I_CLEAR flag. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
2012-05-02sysfs: Removed dup_name entirely in sysfs_renameSasikantha babu
Since no one using "dup_name", removed it completely in sysfs_rename. Signed-off-by: Sasikantha babu <sasikanth.v19@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-04-10sysfs: handle 'parent deleted before child added'Dan Williams
In scsi at least two cases of the parent device being deleted before the child is added have been observed. 1/ scsi is performing async scans and the device is removed prior to the async can thread running (can happen with an in-opportune / unlikely unplug during initial scan). 2/ libsas discovery event running after the parent port has been torn down (this is a bug in libsas). Result in crash signatures like: BUG: unable to handle kernel NULL pointer dereference at 0000000000000098 IP: [<ffffffff8115e100>] sysfs_create_dir+0x32/0xb6 ... Process scsi_scan_8 (pid: 5417, threadinfo ffff88080bd16000, task ffff880801b8a0b0) Stack: 00000000fffffffe ffff880813470628 ffff88080bd17cd0 ffff88080614b7e8 ffff88080b45c108 00000000fffffffe ffff88080bd17d20 ffffffff8125e4a8 ffff88080bd17cf0 ffffffff81075149 ffff88080bd17d30 ffff88080614b7e8 Call Trace: [<ffffffff8125e4a8>] kobject_add_internal+0x120/0x1e3 [<ffffffff81075149>] ? trace_hardirqs_on+0xd/0xf [<ffffffff8125e641>] kobject_add_varg+0x41/0x50 [<ffffffff8125e70b>] kobject_add+0x64/0x66 [<ffffffff8131122b>] device_add+0x12d/0x63a In this scenario the parent is still valid (because we have a reference), but it has been device_del()'d which means its kobj->sd pointer is NULL'd via: device_del()->kobject_del()->sysfs_remove_dir() ...and then sysfs_create_dir() (without this fix) goes ahead and de-references parent_sd via sysfs_ns_type(): return (sd->s_flags & SYSFS_NS_TYPE_MASK) >> SYSFS_NS_TYPE_SHIFT; This scenario is being fixed in scsi/libsas, but if other subsystems present the same ordering the system need not immediately crash. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: James Bottomley <JBottomley@parallels.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-04-10sysfs: Prevent crash on unset sysfs group attributesBruno Prémont
Do not let the kernel crash when a device is registered with sysfs while group attributes are not set (aka NULL). Warn about the offender with some information about the offending device. This would warn instead of trying NULL pointer deref like: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff81152673>] internal_create_group+0x83/0x1a0 PGD 0 Oops: 0000 [#1] SMP CPU 0 Modules linked in: Pid: 1, comm: swapper/0 Not tainted 3.4.0-rc1-x86_64 #3 HP ProLiant DL360 G4 RIP: 0010:[<ffffffff81152673>] [<ffffffff81152673>] internal_create_group+0x83/0x1a0 RSP: 0018:ffff88019485fd70 EFLAGS: 00010202 RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000001 RDX: ffff880192e99908 RSI: ffff880192e99630 RDI: ffffffff81a26c60 RBP: ffff88019485fdc0 R08: 0000000000000000 R09: 0000000000000000 R10: ffff880192e99908 R11: 0000000000000000 R12: ffffffff81a16a00 R13: ffff880192e99908 R14: ffffffff81a16900 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88019bc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000000 CR3: 0000000001a0c000 CR4: 00000000000007f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper/0 (pid: 1, threadinfo ffff88019485e000, task ffff880194878000) Stack: ffff88019485fdd0 ffff880192da9d60 0000000000000000 ffff880192e99908 ffff880192e995d8 0000000000000001 ffffffff81a16a00 ffff880192da9d60 0000000000000000 0000000000000000 ffff88019485fdd0 ffffffff811527be Call Trace: [<ffffffff811527be>] sysfs_create_group+0xe/0x10 [<ffffffff81376ca6>] device_add_groups+0x46/0x80 [<ffffffff81377d3d>] device_add+0x46d/0x6a0 ... Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-04-09sysfs: Update the name hash for an entry after changing the namespaceTom Goff
This is needed to allow renaming network devices that have been moved to another network namespace. Signed-off-by: Tom Goff <thomas.goff@boeing.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-03-21Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile 1 from Al Viro: "This is _not_ all; in particular, Miklos' and Jan's stuff is not there yet." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (64 commits) ext4: initialization of ext4_li_mtx needs to be done earlier debugfs-related mode_t whack-a-mole hfsplus: add an ioctl to bless files hfsplus: change finder_info to u32 hfsplus: initialise userflags qnx4: new helper - try_extent() qnx4: get rid of qnx4_bread/qnx4_getblk take removal of PF_FORKNOEXEC to flush_old_exec() trim includes in inode.c um: uml_dup_mmap() relies on ->mmap_sem being held, but activate_mm() doesn't hold it um: embed ->stub_pages[] into mmu_context gadgetfs: list_for_each_safe() misuse ocfs2: fix leaks on failure exits in module_init ecryptfs: make register_filesystem() the last potential failure exit ntfs: forgets to unregister sysctls on register_filesystem() failure logfs: missing cleanup on register_filesystem() failure jfs: mising cleanup on register_filesystem() failure make configfs_pin_fs() return root dentry on success configfs: configfs_create_dir() has parent dentry in dentry->d_parent configfs: sanitize configfs_create() ...
2012-03-20switch open-coded instances of d_make_root() to new helperAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-08Revert "sysfs: Kill nlink counting."Greg Kroah-Hartman
This reverts commit 524b6c5b39b931311dfe5a2f5abae2f5c9731676. It has shown to break userspace tools, which is not acceptable. Reported-by: Jiri Slaby <jslaby@suse.cz> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-24sysfs: Fix memory leak in sysfs_sd_setsecdata().Masami Ichikawa
This patch fixies follwing two memory leak patterns that reported by kmemleak. sysfs_sd_setsecdata() is called during sys_lsetxattr() operation. It checks sd->s_iattr is NULL or not. Then if it is NULL, it calls sysfs_init_inode_attrs() to allocate memory. That code is this. iattrs = sd->s_iattr; if (!iattrs) iattrs = sysfs_init_inode_attrs(sd); The iattrs recieves sysfs_init_inode_attrs()'s result, but sd->s_iattr doesn't know the address. so it needs to set correct address to sd->s_iattr to free memory in other function. unreferenced object 0xffff880250b73e60 (size 32): comm "systemd", pid 1, jiffies 4294683888 (age 94.553s) hex dump (first 32 bytes): 73 79 73 74 65 6d 5f 75 3a 6f 62 6a 65 63 74 5f system_u:object_ 72 3a 73 79 73 66 73 5f 74 3a 73 30 00 00 00 00 r:sysfs_t:s0.... backtrace: [<ffffffff814cb1d0>] kmemleak_alloc+0x73/0x98 [<ffffffff811270ab>] __kmalloc+0x100/0x12c [<ffffffff8120775a>] context_struct_to_string+0x106/0x210 [<ffffffff81207cc1>] security_sid_to_context_core+0x10b/0x129 [<ffffffff812090ef>] security_sid_to_context+0x10/0x12 [<ffffffff811fb0da>] selinux_inode_getsecurity+0x7d/0xa8 [<ffffffff811fb127>] selinux_inode_getsecctx+0x22/0x2e [<ffffffff811f4d62>] security_inode_getsecctx+0x16/0x18 [<ffffffff81191dad>] sysfs_setxattr+0x96/0x117 [<ffffffff811542f0>] __vfs_setxattr_noperm+0x73/0xd9 [<ffffffff811543d9>] vfs_setxattr+0x83/0xa1 [<ffffffff811544c6>] setxattr+0xcf/0x101 [<ffffffff81154745>] sys_lsetxattr+0x6a/0x8f [<ffffffff814efda9>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff unreferenced object 0xffff88024163c5a0 (size 96): comm "systemd", pid 1, jiffies 4294683888 (age 94.553s) hex dump (first 32 bytes): 00 00 00 00 ed 41 00 00 00 00 00 00 00 00 00 00 .....A.......... 00 00 00 00 00 00 00 00 0c 64 42 4f 00 00 00 00 .........dBO.... backtrace: [<ffffffff814cb1d0>] kmemleak_alloc+0x73/0x98 [<ffffffff81127402>] kmem_cache_alloc_trace+0xc4/0xee [<ffffffff81191cbe>] sysfs_init_inode_attrs+0x2a/0x83 [<ffffffff81191dd6>] sysfs_setxattr+0xbf/0x117 [<ffffffff811542f0>] __vfs_setxattr_noperm+0x73/0xd9 [<ffffffff811543d9>] vfs_setxattr+0x83/0xa1 [<ffffffff811544c6>] setxattr+0xcf/0x101 [<ffffffff81154745>] sys_lsetxattr+0x6a/0x8f [<ffffffff814efda9>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff ` Signed-off-by: Masami Ichikawa <masami256@gmail.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-02Merge 3.3-rc2 into the driver-core-next branch.Greg Kroah-Hartman
This was done to resolve a merge and build problem with the drivers/acpi/processor_driver.c file. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-01-31sysfs: Update the name hash when renaming sysfs entriesEric W. Biederman
This fixes a bug introduced with sysfs name hashes where renaming a network device appears to succeed but silently makes the sysfs files for that network device inaccessible. In at least one configuration this bug has stopped networking from coming up during boot. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Tested-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-24sysfs: change permissions for /sys from 0755 to 0555Vitaly Kuznetsov
There is a misleading difference between /proc and /sys permissions, /proc is 0555 and /sys is 0755. But as it is impossible to create or unlink something in /sys it would be nice to have same permissions. Signed-off-by: Vitaly Kuznetsov <vitty@altlinux.ru> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-24sysfs: Kill nlink counting.Eric W. Biederman
Tracking the number of subdirectories requires an extra field that increases the size of sysfs_dirent. nlinks are not particularly interesting for sysfs and the nlink counts are wrong when network namespaces are involved so stop counting them, and always return nlink == 1. Userspace already knows that directories with nlink == 1 have an nlink count they can't use to count subdirectories. This reduces the size of sysfs_dirent by 8 bytes on 64bit platforms. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-24sysfs: Store the sysfs inode in an unsigned int.Eric W. Biederman
Store the sysfs inode number in an unsided int because ida inode allocator can return at most a 31 bit number, reducing the size of struct sysfs_dirent by 8 bytes on 64bit platforms. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-24sysfs: Reduce s_flags to an unsinged short so it packs well with s_modeEric W. Biederman
On 32bit this reduces sizeof(struct sysfs_dirent) by 2 bytes. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-24sysfs: Add s_hash to sysfs_dirent and order directory entries by hashEric W. Biederman
Compute a 31 bit hash of directory entries (that can fit in a signed 32bit off_t) and index the sysfs directory entries by that hash, replacing the per directory indexes by name and by inode. Because we now only use a single rbtree this reduces the size of sysfs_dirent by 2 pointers. Because we have fewer cases to deal with the code is now simpler. For now I use the simple hash that the dcache uses as that is easy to use and seems simple enough. In addition to makeing the code simpler using a hash for the file position in readdir brings sysfs in line with other filesystems that have non-trivial directory structures. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-24sysfs: Complain bitterly about attempts to remove files from nonexistent ↵Eric W. Biederman
directories. Recently an OOPS was observed from the usb serial io_ti driver when it tried to remove sysfs directories. Upon investigation it turns out this driver was always buggy and that a recent sysfs change had stopped guarding itself against removing attributes from sysfs directories that had already been removed. :( Historically we have been silent about attempting to files from nonexistent sysfs directories and have politely returned error codes. That has resulted in people writing broken code that ignores the error codes. Issue a kernel WARNING and a stack backtrace to make it clear in no uncertain terms that abusing sysfs is not ok, and the callers need to fix their code. This change transforms the io_ti OOPS into a more comprehensible error message and stack backtrace. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Reported-by: Wolfgang Frisch <wfpub@roembden.net> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-03sysfs: propagate umode_tAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03switch sysfs_chmod_file() to umode_tAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03switch ->is_visible() to returning umode_tAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-11-02filesystems: add set_nlink()Miklos Szeredi
Replace remaining direct i_nlink updates with a new set_nlink() updater function. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Tested-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-11-01sysfs: Make sysfs_rename safe with sysfs_dirents in rbtrees.Eric W. Biederman
In sysfs_rename we need to remove the optimization of not calling sysfs_unlink_sibling and sysfs_link_sibling if the renamed parent directory is not changing. This optimization is no longer valid now that sysfs dirents are stored in an rbtree sorted by name. Move the assignment of s_ns before the call of sysfs_link_sibling. With no sysfs_dirent fields changing after the call of sysfs_link_sibling this allows sysfs_link_sibling to take any of the directory entries into account when it builds the rbtrees, and s_ns looks like a prime canidate to be used in the rbtree in the future. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Jiri Slaby <jirislaby@gmail.com> Cc: Greg KH <gregkh@suse.de> Cc: David Miller <davem@davemloft.net> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-25sysfs: Remove support for tagged directories with untagged members (again)Eric W. Biederman
In commit 8a9ea3237e7e ("Merge git://.../davem/net-next") where my sysfs changes from the net tree merged with the sysfs rbtree changes from Mickulas Patocka the conflict resolution failed to preserve the simplified property that was the point of my changes. That is sysfs_find_dirent can now say something is a match if and only s_name and s_ns match what we are looking for, and sysfs_readdir can simply return all of the directory entries where s_ns matches the directory that we should be returning. Now that we are back to exact matches we can tweak sysfs_find_dirent and the name rb_tree to order sysfs_dirents by s_ns s_name and remove the second loop in sysfs_find_dirent. However that change seems a bit much for a conflict resolution so it can come later. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1745 commits) dp83640: free packet queues on remove dp83640: use proper function to free transmit time stamping packets ipv6: Do not use routes from locally generated RAs |PATCH net-next] tg3: add tx_dropped counter be2net: don't create multiple RX/TX rings in multi channel mode be2net: don't create multiple TXQs in BE2 be2net: refactor VF setup/teardown code into be_vf_setup/clear() be2net: add vlan/rx-mode/flow-control config to be_setup() net_sched: cls_flow: use skb_header_pointer() ipv4: avoid useless call of the function check_peer_pmtu TCP: remove TCP_DEBUG net: Fix driver name for mdio-gpio.c ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT rtnetlink: Add missing manual netlink notification in dev_change_net_namespaces ipv4: fix ipsec forward performance regression jme: fix irq storm after suspend/resume route: fix ICMP redirect validation net: hold sock reference while processing tx timestamps tcp: md5: add more const attributes Add ethtool -g support to virtio_net ... Fix up conflicts in: - drivers/net/Kconfig: The split-up generated a trivial conflict with removal of a stale reference to Documentation/networking/net-modules.txt. Remove it from the new location instead. - fs/sysfs/dir.c: Fairly nasty conflicts with the sysfs rb-tree usage, conflicting with Eric Biederman's changes for tagged directories.
2011-10-19sysfs: Reject with a warning invalid uses of tagged directories.Eric W. Biederman
sysfs is a core piece of ifrastructure that many people use and few people have all of the rules in their head on how to use it correctly. Add warnings for people using tagged directories improperly to that any misuses can be caught and diagnosed quickly. A single inexpensive test in sysfs_find_dirent is almost sufficient to catch all possible misuses. An additional warning is needed in sysfs_add_dirent so that we actually fail when attempting to add an untagged dirent in a tagged directory. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19sysfs: Remove support for tagged directories with untagged members.Eric W. Biederman
Now that /sys/class/net/bonding_masters is implemented as a tagged sysfs file we can remove support for untagged files in tagged directories. This change removes any ambiguity of what a NULL namespace value means. A NULL namespace parameter after this patch means that we are talking about an untagged sysfs dirent. This makes the sysfs code much less prone to mistakes when during maintenance. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19sysfs: Implement support for tagged files in sysfs.Eric W. Biederman
Looking up files in sysfs is hard to understand and analyize because we currently allow placing untagged files in tagged directories. In the implementation of that we have two subtly different meanings of NULL. NULL meaning there is no tag on a directory entry and NULL meaning we don't care which namespace the lookup is performed for. This multiple uses of NULL have resulted in subtle bugs (since fixed) in the code. Currently it is only the bonding driver that needs to have an untagged file in a tagged directory. To untagle this mess I am adding support for tagged files to sysfs. Modifying the bonding driver to implement bonding_masters as a tagged file. Registering bonding_masters once for each network namespace. Then I am removing support for untagged entries in tagged sysfs directories. Resulting in code that is much easier to reason about. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-26sysfs: add unsigned long cast to prevent compile warningHeiko Carstens
"sysfs: use rb-tree for inode number lookup" added a new printk which causes a new compile warning on s390 (and few other architectures): fs/sysfs/dir.c: In function 'sysfs_link_sibling': fs/sysfs/dir.c:63:4: warning: format '%lx' expects argument of type 'long unsigned int', but argument 2 has type 'ino_t' [-Wform Add an explicit unsigned long cast since ino_t is an unsigned long on most architectures. Cc: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-22sysfs: use rb-tree for inode number lookupMikulas Patocka
sysfs: use rb-tree for inode number lookup This patch makes sysfs use red-black tree for inode number lookup. Together with a previous patch to use red-black tree for name lookup, this patch makes all sysfs lookups to have O(log n) complexity. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-22sysfs: remove s_sibling hacksMikulas Patocka
sysfs: remove s_sibling hacks s_sibling was used for three different purposes: 1) as a linked list of entries in the directory 2) as a linked list of entries to be deleted 3) as a pointer to "struct completion" This patch removes the hack and introduces new union u which holds pointers for cases 2) and 3). This change is needed for the following patch that removes s_sibling at all and replaces it with a rb tree. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-22sysfs: use rb-tree for name lookupsMikulas Patocka
sysfs: use rb-tree for name lookups Use red-black tree for name lookups. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-22sysfs: count subdirectoriesMikulas Patocka
sysfs: count subdirectories This patch introduces a subdirectory counter for each sysfs directory. Without the patch, sysfs_refresh_inode would walk all entries of the directory to calculate the number of subdirectories. This patch improves time of "ls -la /sys/block" when there are 10000 block devices from 9 seconds to 0.19 seconds. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-07-20->permission() sanitizing: don't pass flags to ->permission()Al Viro
not used by the instances anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20->permission() sanitizing: don't pass flags to generic_permission()Al Viro
redundant; all callers get it duplicated in mask & MAY_NOT_BLOCK and none of them removes that bit. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20kill check_acl callback of generic_permission()Al Viro
its value depends only on inode and does not change; we might as well store it in ->i_op->check_acl and be done with that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-12Delay struct net freeing while there's a sysfs instance refering to itAl Viro
* new refcount in struct net, controlling actual freeing of the memory * new method in kobj_ns_type_operations (->drop_ns()) * ->current_ns() semantics change - it's supposed to be followed by corresponding ->drop_ns(). For struct net in case of CONFIG_NET_NS it bumps the new refcount; net_drop_ns() decrements it and calls net_free() if the last reference has been dropped. Method renamed to ->grab_current_ns(). * old net_free() callers call net_drop_ns() instead. * sysfs_exit_ns() is gone, along with a large part of callchain leading to it; now that the references stored in ->ns[...] stay valid we do not need to hunt them down and replace them with NULL. That fixes problems in sysfs_lookup() and sysfs_readdir(), along with getting rid of sb->s_instances abuse. Note that struct net *shutdown* logics has not changed - net_cleanup() is called exactly when it used to be called. The only thing postponed by having a sysfs instance refering to that struct net is actual freeing of memory occupied by struct net. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-13sysfs: remove "last sysfs file:" line from the oops messagesGreg Kroah-Hartman
On some arches (x86, sh, arm, unicore, powerpc) the oops message would print out the last sysfs file accessed. This was very useful in finding a number of sysfs and driver core bugs in the 2.5 and early 2.6 development days, but it has been a number of years since this file has actually helped in debugging anything that couldn't also be trivially determined from the stack traceback. So it's time to delete the line. This is good as we need all the space we can get for oops messages at times on consoles. Acked-by: Phil Carmody <ext-phil.2.carmody@nokia.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-05-10SYSFS: Fix erroneous comments for sysfs_update_group().Robert P. J. Day
Fix what is clearly a simple copy-and-paste error in commenting the sysfs_update_group() routine. Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-01-20kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERTDavid Rientjes
The meaning of CONFIG_EMBEDDED has long since been obsoleted; the option is used to configure any non-standard kernel with a much larger scope than only small devices. This patch renames the option to CONFIG_EXPERT in init/Kconfig and fixes references to the option throughout the kernel. A new CONFIG_EMBEDDED option is added that automatically selects CONFIG_EXPERT when enabled and can be used in the future to isolate options that should only be considered for embedded systems (RISC architectures, SLOB, etc). Calling the option "EXPERT" more accurately represents its intention: only expert users who understand the impact of the configuration changes they are making should enable it. Reviewed-by: Ingo Molnar <mingo@elte.hu> Acked-by: David Woodhouse <david.woodhouse@intel.com> Signed-off-by: David Rientjes <rientjes@google.com> Cc: Greg KH <gregkh@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jens Axboe <axboe@kernel.dk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Robin Holt <holt@sgi.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-10Merge branch 'driver-core-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 * 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: driver core: Document that device_rename() is only for networking sysfs: remove useless test from sysfs_merge_group driver-core: merge private parts of class and bus driver core: fix whitespace in class_attr_string
2011-01-10headers: kobject.h reduxAlexey Dobriyan
Remove kobject.h from files which don't need it, notably, sched.h and fs.h. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-07fs: provide rcu-walk aware permission i_opsNick Piggin
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: rcu-walk aware d_revalidate methodNick Piggin
Require filesystems be aware of .d_revalidate being called in rcu-walk mode (nd->flags & LOOKUP_RCU). For now do a simple push down, returning -ECHILD from all implementations. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: dcache reduce branches in lookup pathNick Piggin
Reduce some branches and memory accesses in dcache lookup by adding dentry flags to indicate common d_ops are set, rather than having to check them. This saves a pointer memory access (dentry->d_op) in common path lookup situations, and saves another pointer load and branch in cases where we have d_op but not the particular operation. Patched with: git grep -E '[.>]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/' -i Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: change d_delete semanticsNick Piggin
Change d_delete from a dentry deletion notification to a dentry caching advise, more like ->drop_inode. Require it to be constant and idempotent, and not take d_lock. This is how all existing filesystems use the callback anyway. This makes fine grained dentry locking of dput and dentry lru scanning much simpler. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2010-11-29sysfs: remove useless test from sysfs_merge_groupAlan Stern
Dan Carpenter pointed out that the new sysfs_merge_group() and sysfs_unmerge_group() routines requires their grp argument to be non-NULL, because they dereference grp to obtain the list of attributes. Hence it's pointless for the routines to include a test and special-case handling for when grp is NULL. This patch (as1433) removes the unneeded tests. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> CC: Dan Carpenter <error27@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>