Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull openat2 support from Al Viro:
"This is the openat2() series from Aleksa Sarai.
I'm afraid that the rest of namei stuff will have to wait - it got
zero review the last time I'd posted #work.namei, and there had been a
leak in the posted series I'd caught only last weekend. I was going to
repost it on Monday, but the window opened and the odds of getting any
review during that... Oh, well.
Anyway, openat2 part should be ready; that _did_ get sane amount of
review and public testing, so here it comes"
From Aleksa's description of the series:
"For a very long time, extending openat(2) with new features has been
incredibly frustrating. This stems from the fact that openat(2) is
possibly the most famous counter-example to the mantra "don't silently
accept garbage from userspace" -- it doesn't check whether unknown
flags are present[1].
This means that (generally) the addition of new flags to openat(2) has
been fraught with backwards-compatibility issues (O_TMPFILE has to be
defined as __O_TMPFILE|O_DIRECTORY|[O_RDWR or O_WRONLY] to ensure old
kernels gave errors, since it's insecure to silently ignore the
flag[2]). All new security-related flags therefore have a tough road
to being added to openat(2).
Furthermore, the need for some sort of control over VFS's path
resolution (to avoid malicious paths resulting in inadvertent
breakouts) has been a very long-standing desire of many userspace
applications.
This patchset is a revival of Al Viro's old AT_NO_JUMPS[3] patchset
(which was a variant of David Drysdale's O_BENEATH patchset[4] which
was a spin-off of the Capsicum project[5]) with a few additions and
changes made based on the previous discussion within [6] as well as
others I felt were useful.
In line with the conclusions of the original discussion of
AT_NO_JUMPS, the flag has been split up into separate flags. However,
instead of being an openat(2) flag it is provided through a new
syscall openat2(2) which provides several other improvements to the
openat(2) interface (see the patch description for more details). The
following new LOOKUP_* flags are added:
LOOKUP_NO_XDEV:
Blocks all mountpoint crossings (upwards, downwards, or through
absolute links). Absolute pathnames alone in openat(2) do not
trigger this. Magic-link traversal which implies a vfsmount jump is
also blocked (though magic-link jumps on the same vfsmount are
permitted).
LOOKUP_NO_MAGICLINKS:
Blocks resolution through /proc/$pid/fd-style links. This is done
by blocking the usage of nd_jump_link() during resolution in a
filesystem. The term "magic-links" is used to match with the only
reference to these links in Documentation/, but I'm happy to change
the name.
It should be noted that this is different to the scope of
~LOOKUP_FOLLOW in that it applies to all path components. However,
you can do openat2(NO_FOLLOW|NO_MAGICLINKS) on a magic-link and it
will *not* fail (assuming that no parent component was a
magic-link), and you will have an fd for the magic-link.
In order to correctly detect magic-links, the introduction of a new
LOOKUP_MAGICLINK_JUMPED state flag was required.
LOOKUP_BENEATH:
Disallows escapes to outside the starting dirfd's
tree, using techniques such as ".." or absolute links. Absolute
paths in openat(2) are also disallowed.
Conceptually this flag is to ensure you "stay below" a certain
point in the filesystem tree -- but this requires some additional
to protect against various races that would allow escape using
"..".
Currently LOOKUP_BENEATH implies LOOKUP_NO_MAGICLINKS, because it
can trivially beam you around the filesystem (breaking the
protection). In future, there might be similar safety checks done
as in LOOKUP_IN_ROOT, but that requires more discussion.
In addition, two new flags are added that expand on the above ideas:
LOOKUP_NO_SYMLINKS:
Does what it says on the tin. No symlink resolution is allowed at
all, including magic-links. Just as with LOOKUP_NO_MAGICLINKS this
can still be used with NOFOLLOW to open an fd for the symlink as
long as no parent path had a symlink component.
LOOKUP_IN_ROOT:
This is an extension of LOOKUP_BENEATH that, rather than blocking
attempts to move past the root, forces all such movements to be
scoped to the starting point. This provides chroot(2)-like
protection but without the cost of a chroot(2) for each filesystem
operation, as well as being safe against race attacks that
chroot(2) is not.
If a race is detected (as with LOOKUP_BENEATH) then an error is
generated, and similar to LOOKUP_BENEATH it is not permitted to
cross magic-links with LOOKUP_IN_ROOT.
The primary need for this is from container runtimes, which
currently need to do symlink scoping in userspace[7] when opening
paths in a potentially malicious container.
There is a long list of CVEs that could have bene mitigated by
having RESOLVE_THIS_ROOT (such as CVE-2017-1002101,
CVE-2017-1002102, CVE-2018-15664, and CVE-2019-5736, just to name a
few).
In order to make all of the above more usable, I'm working on
libpathrs[8] which is a C-friendly library for safe path resolution.
It features a userspace-emulated backend if the kernel doesn't support
openat2(2). Hopefully we can get userspace to switch to using it, and
thus get openat2(2) support for free once it's ready.
Future work would include implementing things like
RESOLVE_NO_AUTOMOUNT and possibly a RESOLVE_NO_REMOTE (to allow
programs to be sure they don't hit DoSes though stale NFS handles)"
* 'work.openat2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
Documentation: path-lookup: include new LOOKUP flags
selftests: add openat2(2) selftests
open: introduce openat2(2) syscall
namei: LOOKUP_{IN_ROOT,BENEATH}: permit limited ".." resolution
namei: LOOKUP_IN_ROOT: chroot-like scoped resolution
namei: LOOKUP_BENEATH: O_BENEATH-like scoped resolution
namei: LOOKUP_NO_XDEV: block mountpoint crossing
namei: LOOKUP_NO_MAGICLINKS: block magic-link resolution
namei: LOOKUP_NO_SYMLINKS: block symlink resolution
namei: allow set_root() to produce errors
namei: allow nd_jump_link() to produce errors
nsfs: clean-up ns_get_path() signature to return int
namei: only return -ECHILD from follow_dotdot_rcu()
|
|
aa_xattrs_match() is unfortunately calling vfs_getxattr_alloc() from a
context protected by an rcu_read_lock. This can not be done as
vfs_getxattr_alloc() may sleep regardles of the gfp_t value being
passed to it.
Fix this by breaking the rcu_read_lock on the policy search when the
xattr match feature is requested and restarting the search if a policy
changes occur.
Fixes: 8e51f9087f40 ("apparmor: Add support for attaching profiles via xattr, presence and value")
Reported-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
In preparation for LOOKUP_NO_MAGICLINKS, it's necessary to add the
ability for nd_jump_link() to return an error which the corresponding
get_link() caller must propogate back up to the VFS.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
Pull apparmor updates from John Johansen:
"Features:
- increase left match history buffer size to provide improved
conflict resolution in overlapping execution rules.
- switch buffer allocation to use a memory pool and GFP_KERNEL where
possible.
- add compression of policy blobs to reduce memory usage.
Cleanups:
- fix spelling mistake "immutible" -> "immutable"
Bug fixes:
- fix unsigned len comparison in update_for_len macro
- fix sparse warning for type-casting of current->real_cred"
* tag 'apparmor-pr-2019-12-03' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
apparmor: make it so work buffers can be allocated from atomic context
apparmor: reduce rcu_read_lock scope for aa_file_perm mediation
apparmor: fix wrong buffer allocation in aa_new_mount
apparmor: fix unsigned len comparison with less than zero
apparmor: increase left match history buffer size
apparmor: Switch to GFP_KERNEL where possible
apparmor: Use a memory pool instead per-CPU caches
apparmor: Force type-casting of current->real_cred
apparmor: fix spelling mistake "immutible" -> "immutable"
apparmor: fix blob compression when ns is forced on a policy load
apparmor: fix missing ZLIB defines
apparmor: fix blob compression build failure on ppc
apparmor: Initial implementation of raw policy blob compression
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs mount updates from Al Viro:
"The first part of mount updates.
Convert filesystems to use the new mount API"
* 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
mnt_init(): call shmem_init() unconditionally
constify ksys_mount() string arguments
don't bother with registering rootfs
init_rootfs(): don't bother with init_ramfs_fs()
vfs: Convert smackfs to use the new mount API
vfs: Convert selinuxfs to use the new mount API
vfs: Convert securityfs to use the new mount API
vfs: Convert apparmorfs to use the new mount API
vfs: Convert openpromfs to use the new mount API
vfs: Convert xenfs to use the new mount API
vfs: Convert gadgetfs to use the new mount API
vfs: Convert oprofilefs to use the new mount API
vfs: Convert ibmasmfs to use the new mount API
vfs: Convert qib_fs/ipathfs to use the new mount API
vfs: Convert efivarfs to use the new mount API
vfs: Convert configfs to use the new mount API
vfs: Convert binfmt_misc to use the new mount API
convenience helper: get_tree_single()
convenience helper get_tree_nodev()
vfs: Kill sget_userns()
...
|
|
Convert the apparmorfs filesystem to the new internal mount API as the old
one will be obsoleted and removed. This allows greater flexibility in
communication of mount parameters between userspace, the VFS and the
filesystem.
See Documentation/filesystems/mount_api.txt for more information.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: John Johansen <john.johansen@canonical.com>
cc: apparmor@lists.ubuntu.com
cc: linux-security-module@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation version 2 of the license
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 315 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Armijn Hemel <armijn@tjaldur.nl>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190531190115.503150771@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
This adds an initial implementation of raw policy blob compression,
using deflate. Compression level can be controlled via a new sysctl,
"apparmor.rawdata_compression_level", which can be set to a value
between 0 (no compression) and 9 (highest compression).
Signed-off-by: Chris Coulson <chris.coulson@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
symlink body shouldn't be freed without an RCU delay. Switch apparmorfs
to ->destroy_inode() and use of call_rcu(); free both the inode and symlink
body in the callback.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
Pull apparmor fixes from John Johansen:
- fix double when failing to unpack secmark rules in policy
- fix leak of dentry when profile is removed
* tag 'apparmor-pr-2019-03-12' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
apparmor: fix double free when unpack of secmark rules fails
apparmor: delete the dentry in aafs_remove() to avoid a leak
apparmor: Fix warning about unused function apparmor_ipv6_postroute
|
|
Although the apparmorfs dentries are always dropped from the dentry cache
when the usage count drops to zero, there is no guarantee that this will
happen in aafs_remove(), as another thread might still be using it. In
this scenario, this means that the dentry will temporarily continue to
appear in the results of lookups, even after the call to aafs_remove().
In the case of removal of a profile - it also causes simple_rmdir()
on the profile directory to fail, as the directory won't be empty until
the usage counts of all child dentries have decreased to zero. This
results in the dentry for the profile directory leaking and appearing
empty in the file system tree forever.
Signed-off-by: Chris Coulson <chris.coulson@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Historically a lot of these existed because we did not have
a distinction between what was modular code and what was providing
support to modules via EXPORT_SYMBOL and friends. That changed
when we forked out support for the latter into the export.h file.
This means we should be able to reduce the usage of module.h
in code that is obj-y Makefile or bool Kconfig.
The advantage in removing such instances is that module.h itself
sources about 15 other headers; adding significantly to what we feed
cpp, and it can obscure what headers we are effectively using.
Since module.h might have been the implicit source for init.h
(for __init) and for export.h (for EXPORT_SYMBOL) we consider each
instance for the presence of either and replace as needed.
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Mimi Zohar <zohar@linux.ibm.com>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: linux-security-module@vger.kernel.org
Cc: linux-integrity@vger.kernel.org
Cc: keyrings@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
|
|
Trivial fix to clean up an indentation issue, remove space
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Fully initialize the aa_perms struct in profile_query_cb() to avoid the
potential of using an uninitialized struct member's value in a response
to a query from userspace.
Detected by CoverityScan CID#1415126 ("Uninitialized scalar variable")
Fixes: 4f3b3f2d79a4 ("apparmor: add profile permission query ability")
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
Pull apparmor updates from John Johansen:
"Features:
- add base infrastructure for socket mediation. ABI bump and
additional checks to ensure only v8 compliant policy uses socket af
mediation.
- improve and cleanup dfa verification
- improve profile attachment logic
- improve overlapping expression handling
- add the xattr matching to the attachment logic
- improve signal mediation handling with stacked labels
- improve handling of no_new_privs in a label stack
Cleanups and changes:
- use dfa to parse string split
- bounded version of label_parse
- proper line wrap nulldfa.in
- split context out into task and cred naming to better match usage
- simplify code in aafs
Bug fixes:
- fix display of .ns_name for containers
- fix resource audit messages when auditing peer
- fix logging of the existence test for signals
- fix resource audit messages when auditing peer
- fix display of .ns_name for containers
- fix an error code in verify_table_headers()
- fix memory leak on buffer on error exit path
- fix error returns checks by making size a ssize_t"
* tag 'apparmor-pr-2018-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor: (36 commits)
apparmor: fix memory leak on buffer on error exit path
apparmor: fix dangling symlinks to policy rawdata after replacement
apparmor: Fix an error code in verify_table_headers()
apparmor: fix error returns checks by making size a ssize_t
apparmor: update MAINTAINERS file git and wiki locations
apparmor: remove POLICY_MEDIATES_SAFE
apparmor: add base infastructure for socket mediation
apparmor: improve overlapping domain attachment resolution
apparmor: convert attaching profiles via xattrs to use dfa matching
apparmor: Add support for attaching profiles via xattr, presence and value
apparmor: cleanup: simplify code to get ns symlink name
apparmor: cleanup create_aafs() error path
apparmor: dfa split verification of table headers
apparmor: dfa add support for state differential encoding
apparmor: dfa move character match into a macro
apparmor: update domain transitions that are subsets of confinement at nnp
apparmor: move context.h to cred.h
apparmor: move task related defines and fns to task.X files
apparmor: cleanup, drop unused fn __aa_task_is_confined()
apparmor: cleanup fixup description of aa_replace_profiles
...
|
|
Currently on the error exit path the allocated buffer is not free'd
causing a memory leak. Fix this by kfree'ing it.
Detected by CoverityScan, CID#1466876 ("Resource leaks")
Fixes: 1180b4c757aa ("apparmor: fix dangling symlinks to policy rawdata after replacement")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
When policy replacement occurs the symlinks in the profile directory
need to be updated to point to the new rawdata, otherwise once the
old rawdata is removed the symlink becomes broken.
Fix this by dynamically generating the symlink everytime it is read.
These links are used enough that their value needs to be cached and
this way we can avoid needing locking to read and update the link
value.
Fixes: a481f4d917835 ("apparmor: add custom apparmorfs that will be used by policy namespace files")
BugLink: http://bugs.launchpad.net/bugs/1755563
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
The unpack code now makes sure every profile has a dfa so the safe
version of POLICY_MEDIATES is no longer needed.
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
version 2 - Force an abi break. Network mediation will only be
available in v8 abi complaint policy.
Provide a basic mediation of sockets. This is not a full net mediation
but just whether a spcific family of socket can be used by an
application, along with setting up some basic infrastructure for
network mediation to follow.
the user space rule hav the basic form of
NETWORK RULE = [ QUALIFIERS ] 'network' [ DOMAIN ]
[ TYPE | PROTOCOL ]
DOMAIN = ( 'inet' | 'ax25' | 'ipx' | 'appletalk' | 'netrom' |
'bridge' | 'atmpvc' | 'x25' | 'inet6' | 'rose' |
'netbeui' | 'security' | 'key' | 'packet' | 'ash' |
'econet' | 'atmsvc' | 'sna' | 'irda' | 'pppox' |
'wanpipe' | 'bluetooth' | 'netlink' | 'unix' | 'rds' |
'llc' | 'can' | 'tipc' | 'iucv' | 'rxrpc' | 'isdn' |
'phonet' | 'ieee802154' | 'caif' | 'alg' | 'nfc' |
'vsock' | 'mpls' | 'ib' | 'kcm' ) ','
TYPE = ( 'stream' | 'dgram' | 'seqpacket' | 'rdm' | 'raw' |
'packet' )
PROTOCOL = ( 'tcp' | 'udp' | 'icmp' )
eg.
network,
network inet,
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
|
|
This is the mindless scripted replacement of kernel use of POLL*
variables as described by Al, done by this script:
for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
done
with de-mangling cleanups yet to come.
NOTE! On almost all architectures, the EPOLL* constants have the same
values as the POLL* constants do. But they keyword here is "almost".
For various bad reasons they aren't the same, and epoll() doesn't
actually work quite correctly in some cases due to this on Sparc et al.
The next patch from Al will sort out the final differences, and we
should be all done.
Scripted-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Overlapping domain attachments using the current longest left exact
match fail in some simple cases, and with the fix to ensure consistent
behavior by failing unresolvable attachments it becomes important to
do a better job.
eg. under the current match the following are unresolvable where
the alternation is clearly a better match under the most specific
left match rule.
/**
/{bin/,}usr/
Use a counting match that detects when a loop in the state machine is
enter, and return the match count to provide a better specific left
match resolution.
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
This converts profile attachment based on xattrs to a fixed extended
conditional using dfa matching.
This has a couple of advantages
- pattern matching can be used for the xattr match
- xattrs can be optional for an attachment or marked as required
- the xattr attachment conditional will be able to be combined with
other extended conditionals when the flexible extended conditional
work lands.
The xattr fixed extended conditional is appended to the xmatch
conditional. If an xattr attachment is specified the profile xmatch
will be generated regardless of whether there is a pattern match on
the executable name.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
|
|
ns_get_name() is called in only one place and can be folded in.
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Domain transition so far have been largely blocked by no new privs,
unless the transition has been provably a subset of the previous
confinement. There was a couple problems with the previous
implementations,
- transitions that weren't explicitly a stack but resulted in a subset
of confinement were disallowed
- confinement subsets were only calculated from the previous
confinement instead of the confinement being enforced at the time of
no new privs, so transitions would have to get progressively
tighter.
Fix this by detecting and storing a reference to the task's
confinement at the "time" no new privs is set. This reference is then
used to determine whether a transition is a subsystem of the
confinement at the time no new privs was set.
Unfortunately the implementation is less than ideal in that we have to
detect no new privs after the fact when a task attempts a domain
transition. This is adequate for the currently but will not work in a
stacking situation where no new privs could be conceivably be set in
both the "host" and in the container.
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Now that file contexts have been moved into file, and task context
fns() and data have been split from the context, only the cred context
remains in context.h so rename to cred.h to better reflect what it
deals with.
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
These duplicate includes have been found with scripts/checkincludes.pl but
they have been removed manually to avoid removing false positives.
Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
The .ns_name should not be virtualized by the current ns view. It
needs to report the ns base name as that is being used during startup
as part of determining apparmor policy namespace support.
BugLink: http://bugs.launchpad.net/bugs/1746463
Fixes: d9f02d9c237aa ("apparmor: fix display of ns name")
Cc: Stable <stable@vger.kernel.org>
Reported-by: Serge Hallyn <serge@hallyn.com>
Tested-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull poll annotations from Al Viro:
"This introduces a __bitwise type for POLL### bitmap, and propagates
the annotations through the tree. Most of that stuff is as simple as
'make ->poll() instances return __poll_t and do the same to local
variables used to hold the future return value'.
Some of the obvious brainos found in process are fixed (e.g. POLLIN
misspelled as POLL_IN). At that point the amount of sparse warnings is
low and most of them are for genuine bugs - e.g. ->poll() instance
deciding to return -EINVAL instead of a bitmap. I hadn't touched those
in this series - it's large enough as it is.
Another problem it has caught was eventpoll() ABI mess; select.c and
eventpoll.c assumed that corresponding POLL### and EPOLL### were
equal. That's true for some, but not all of them - EPOLL### are
arch-independent, but POLL### are not.
The last commit in this series separates userland POLL### values from
the (now arch-independent) kernel-side ones, converting between them
in the few places where they are copied to/from userland. AFAICS, this
is the least disruptive fix preserving poll(2) ABI and making epoll()
work on all architectures.
As it is, it's simply broken on sparc - try to give it EPOLLWRNORM and
it will trigger only on what would've triggered EPOLLWRBAND on other
architectures. EPOLLWRBAND and EPOLLRDHUP, OTOH, are never triggered
at all on sparc. With this patch they should work consistently on all
architectures"
* 'misc.poll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (37 commits)
make kernel-side POLL... arch-independent
eventpoll: no need to mask the result of epi_item_poll() again
eventpoll: constify struct epoll_event pointers
debugging printk in sg_poll() uses %x to print POLL... bitmap
annotate poll(2) guts
9p: untangle ->poll() mess
->si_band gets POLL... bitmap stored into a user-visible long field
ring_buffer_poll_wait() return value used as return value of ->poll()
the rest of drivers/*: annotate ->poll() instances
media: annotate ->poll() instances
fs: annotate ->poll() instances
ipc, kernel, mm: annotate ->poll() instances
net: annotate ->poll() instances
apparmor: annotate ->poll() instances
tomoyo: annotate ->poll() instances
sound: annotate ->poll() instances
acpi: annotate ->poll() instances
crypto: annotate ->poll() instances
block: annotate ->poll() instances
x86: annotate ->poll() instances
...
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
This is a pure automated search-and-replace of the internal kernel
superblock flags.
The s_flags are now called SB_*, with the names and the values for the
moment mirroring the MS_* flags that they're equivalent to.
Note how the MS_xyz flags are the ones passed to the mount system call,
while the SB_xyz flags are what we then use in sb->s_flags.
The script to do this was:
# places to look in; re security/*: it generally should *not* be
# touched (that stuff parses mount(2) arguments directly), but
# there are two places where we really deal with superblock flags.
FILES="drivers/mtd drivers/staging/lustre fs ipc mm \
include/linux/fs.h include/uapi/linux/bfs_fs.h \
security/apparmor/apparmorfs.c security/apparmor/include/lib.h"
# the list of MS_... constants
SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \
DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \
POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \
I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \
ACTIVE NOUSER"
SED_PROG=
for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done
# we want files that contain at least one of MS_...,
# with fs/namespace.c and fs/pnode.c excluded.
L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c')
for f in $L; do sed -i $f $SED_PROG; done
Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Use mutex_lock_nested to provide lockdep the parent child lock ordering of
the tree.
This fixes the lockdep Warning
[ 305.275177] ============================================
[ 305.275178] WARNING: possible recursive locking detected
[ 305.275179] 4.14.0-rc7+ #320 Not tainted
[ 305.275180] --------------------------------------------
[ 305.275181] apparmor_parser/1339 is trying to acquire lock:
[ 305.275182] (&ns->lock){+.+.}, at: [<ffffffff970544dd>] __aa_create_ns+0x6d/0x1e0
[ 305.275187]
but task is already holding lock:
[ 305.275187] (&ns->lock){+.+.}, at: [<ffffffff97054b5d>] aa_prepare_ns+0x3d/0xd0
[ 305.275190]
other info that might help us debug this:
[ 305.275191] Possible unsafe locking scenario:
[ 305.275192] CPU0
[ 305.275193] ----
[ 305.275193] lock(&ns->lock);
[ 305.275194] lock(&ns->lock);
[ 305.275195]
*** DEADLOCK ***
[ 305.275196] May be due to missing lock nesting notation
[ 305.275198] 2 locks held by apparmor_parser/1339:
[ 305.275198] #0: (sb_writers#10){.+.+}, at: [<ffffffff96e9c6b7>] vfs_write+0x1a7/0x1d0
[ 305.275202] #1: (&ns->lock){+.+.}, at: [<ffffffff97054b5d>] aa_prepare_ns+0x3d/0xd0
[ 305.275205]
stack backtrace:
[ 305.275207] CPU: 1 PID: 1339 Comm: apparmor_parser Not tainted 4.14.0-rc7+ #320
[ 305.275208] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014
[ 305.275209] Call Trace:
[ 305.275212] dump_stack+0x85/0xcb
[ 305.275214] __lock_acquire+0x141c/0x1460
[ 305.275216] ? __aa_create_ns+0x6d/0x1e0
[ 305.275218] ? ___slab_alloc+0x183/0x540
[ 305.275219] ? ___slab_alloc+0x183/0x540
[ 305.275221] lock_acquire+0xed/0x1e0
[ 305.275223] ? lock_acquire+0xed/0x1e0
[ 305.275224] ? __aa_create_ns+0x6d/0x1e0
[ 305.275227] __mutex_lock+0x89/0x920
[ 305.275228] ? __aa_create_ns+0x6d/0x1e0
[ 305.275230] ? trace_hardirqs_on_caller+0x11f/0x190
[ 305.275231] ? __aa_create_ns+0x6d/0x1e0
[ 305.275233] ? __lockdep_init_map+0x57/0x1d0
[ 305.275234] ? lockdep_init_map+0x9/0x10
[ 305.275236] ? __rwlock_init+0x32/0x60
[ 305.275238] mutex_lock_nested+0x1b/0x20
[ 305.275240] ? mutex_lock_nested+0x1b/0x20
[ 305.275241] __aa_create_ns+0x6d/0x1e0
[ 305.275243] aa_prepare_ns+0xc2/0xd0
[ 305.275245] aa_replace_profiles+0x168/0xf30
[ 305.275247] ? __might_fault+0x85/0x90
[ 305.275250] policy_update+0xb9/0x380
[ 305.275252] profile_load+0x7e/0x90
[ 305.275254] __vfs_write+0x28/0x150
[ 305.275256] ? rcu_read_lock_sched_held+0x72/0x80
[ 305.275257] ? rcu_sync_lockdep_assert+0x2f/0x60
[ 305.275259] ? __sb_start_write+0xdc/0x1c0
[ 305.275261] ? vfs_write+0x1a7/0x1d0
[ 305.275262] vfs_write+0xca/0x1d0
[ 305.275264] ? trace_hardirqs_on_caller+0x11f/0x190
[ 305.275266] SyS_write+0x49/0xa0
[ 305.275268] entry_SYSCALL_64_fastpath+0x23/0xc2
[ 305.275271] RIP: 0033:0x7fa6b22e8c74
[ 305.275272] RSP: 002b:00007ffeaaee6288 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 305.275273] RAX: ffffffffffffffda RBX: 00007ffeaaee62a4 RCX: 00007fa6b22e8c74
[ 305.275274] RDX: 0000000000000a51 RSI: 00005566a8198c10 RDI: 0000000000000004
[ 305.275275] RBP: 0000000000000a39 R08: 0000000000000a51 R09: 0000000000000000
[ 305.275276] R10: 0000000000000000 R11: 0000000000000246 R12: 00005566a8198c10
[ 305.275277] R13: 0000000000000004 R14: 00005566a72ecb88 R15: 00005566a72ec3a8
Fixes: 73688d1ed0b8 ("apparmor: refactor prepare_ns() and make usable from different views")
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
This reverts commit 651e28c5537abb39076d3949fb7618536f1d242e.
This caused a regression:
"The specific problem is that dnsmasq refuses to start on openSUSE Leap
42.2. The specific cause is that and attempt to open a PF_LOCAL socket
gets EACCES. This means that networking doesn't function on a system
with a 4.14-rc2 system."
Sadly, the developers involved seemed to be in denial for several weeks
about this, delaying the revert. This has not been a good release for
the security subsystem, and this area needs to change development
practices.
Reported-and-bisected-by: James Bottomley <James.Bottomley@hansenpartnership.com>
Tracked-by: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Seth Arnold <seth.arnold@canonical.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The DAC access permissions for several apparmorfs files are wrong.
.access - needs to be writable by all tasks to perform queries
the others in the set only provide a read fn so should be read only.
With policy namespace virtualization all apparmor needs to control
the permission and visibility checks directly which means DAC
access has to be allowed for all user, group, and other.
BugLink: http://bugs.launchpad.net/bugs/1713103
Fixes: c97204baf840b ("apparmor: rename apparmor file fns and data to indicate use")
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Provide a basic mediation of sockets. This is not a full net mediation
but just whether a spcific family of socket can be used by an
application, along with setting up some basic infrastructure for
network mediation to follow.
the user space rule hav the basic form of
NETWORK RULE = [ QUALIFIERS ] 'network' [ DOMAIN ]
[ TYPE | PROTOCOL ]
DOMAIN = ( 'inet' | 'ax25' | 'ipx' | 'appletalk' | 'netrom' |
'bridge' | 'atmpvc' | 'x25' | 'inet6' | 'rose' |
'netbeui' | 'security' | 'key' | 'packet' | 'ash' |
'econet' | 'atmsvc' | 'sna' | 'irda' | 'pppox' |
'wanpipe' | 'bluetooth' | 'netlink' | 'unix' | 'rds' |
'llc' | 'can' | 'tipc' | 'iucv' | 'rxrpc' | 'isdn' |
'phonet' | 'ieee802154' | 'caif' | 'alg' | 'nfc' |
'vsock' | 'mpls' | 'ib' | 'kcm' ) ','
TYPE = ( 'stream' | 'dgram' | 'seqpacket' | 'rdm' | 'raw' |
'packet' )
PROTOCOL = ( 'tcp' | 'udp' | 'icmp' )
eg.
network,
network inet,
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
|
|
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
|
|
Add basic mount mediation. That allows controlling based on basic
mount parameters. It does not include special mount parameters for
apparmor, super block labeling, or any triggers for apparmor namespace
parameter modifications on pivot root.
default userspace policy rules have the form of
MOUNT RULE = ( MOUNT | REMOUNT | UMOUNT )
MOUNT = [ QUALIFIERS ] 'mount' [ MOUNT CONDITIONS ] [ SOURCE FILEGLOB ]
[ '->' MOUNTPOINT FILEGLOB ]
REMOUNT = [ QUALIFIERS ] 'remount' [ MOUNT CONDITIONS ]
MOUNTPOINT FILEGLOB
UMOUNT = [ QUALIFIERS ] 'umount' [ MOUNT CONDITIONS ] MOUNTPOINT FILEGLOB
MOUNT CONDITIONS = [ ( 'fstype' | 'vfstype' ) ( '=' | 'in' )
MOUNT FSTYPE EXPRESSION ]
[ 'options' ( '=' | 'in' ) MOUNT FLAGS EXPRESSION ]
MOUNT FSTYPE EXPRESSION = ( MOUNT FSTYPE LIST | MOUNT EXPRESSION )
MOUNT FSTYPE LIST = Comma separated list of valid filesystem and
virtual filesystem types (eg ext4, debugfs, etc)
MOUNT FLAGS EXPRESSION = ( MOUNT FLAGS LIST | MOUNT EXPRESSION )
MOUNT FLAGS LIST = Comma separated list of MOUNT FLAGS.
MOUNT FLAGS = ( 'ro' | 'rw' | 'nosuid' | 'suid' | 'nodev' | 'dev' |
'noexec' | 'exec' | 'sync' | 'async' | 'remount' |
'mand' | 'nomand' | 'dirsync' | 'noatime' | 'atime' |
'nodiratime' | 'diratime' | 'bind' | 'rbind' | 'move' |
'verbose' | 'silent' | 'loud' | 'acl' | 'noacl' |
'unbindable' | 'runbindable' | 'private' | 'rprivate' |
'slave' | 'rslave' | 'shared' | 'rshared' |
'relatime' | 'norelatime' | 'iversion' | 'noiversion' |
'strictatime' | 'nouser' | 'user' )
MOUNT EXPRESSION = ( ALPHANUMERIC | AARE ) ...
PIVOT ROOT RULE = [ QUALIFIERS ] pivot_root [ oldroot=OLD PUT FILEGLOB ]
[ NEW ROOT FILEGLOB ]
SOURCE FILEGLOB = FILEGLOB
MOUNTPOINT FILEGLOB = FILEGLOB
eg.
mount,
mount /dev/foo,
mount options=ro /dev/foo -> /mnt/,
mount options in (ro,atime) /dev/foo -> /mnt/,
mount options=ro options=atime,
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
|
|
Add signal mediation where the signal can be mediated based on the
signal, direction, or the label or the peer/target. The signal perms
are verified on a cross check to ensure policy consistency in the case
of incremental policy load/replacement.
The optimization of skipping the cross check when policy is guaranteed
to be consistent (single compile unit) remains to be done.
policy rules have the form of
SIGNAL_RULE = [ QUALIFIERS ] 'signal' [ SIGNAL ACCESS PERMISSIONS ]
[ SIGNAL SET ] [ SIGNAL PEER ]
SIGNAL ACCESS PERMISSIONS = SIGNAL ACCESS | SIGNAL ACCESS LIST
SIGNAL ACCESS LIST = '(' Comma or space separated list of SIGNAL
ACCESS ')'
SIGNAL ACCESS = ( 'r' | 'w' | 'rw' | 'read' | 'write' | 'send' |
'receive' )
SIGNAL SET = 'set' '=' '(' SIGNAL LIST ')'
SIGNAL LIST = Comma or space separated list of SIGNALS
SIGNALS = ( 'hup' | 'int' | 'quit' | 'ill' | 'trap' | 'abrt' |
'bus' | 'fpe' | 'kill' | 'usr1' | 'segv' | 'usr2' |
'pipe' | 'alrm' | 'term' | 'stkflt' | 'chld' | 'cont' |
'stop' | 'stp' | 'ttin' | 'ttou' | 'urg' | 'xcpu' |
'xfsz' | 'vtalrm' | 'prof' | 'winch' | 'io' | 'pwr' |
'sys' | 'emt' | 'exists' | 'rtmin+0' ... 'rtmin+32'
)
SIGNAL PEER = 'peer' '=' AARE
eg.
signal, # allow all signals
signal send set=(hup, kill) peer=foo,
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
|
|
We accidentally forgot to set the error code on this path. It means we
return NULL instead of an error pointer. I looked through a bunch of
callers and I don't think it really causes a big issue, but the
documentation says we're supposed to return error pointers here.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Allow userspace to detect that basic profile policy namespaces are
available.
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Update the user interface to support the stacked change_profile transition.
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Now that the domain label transition is complete advertise it to
userspace.
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Begin the actual switch to using domain labels by storing them on
the context and converting the label to a singular profile where
possible.
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
There are still a few places where profile replacement fails to update
and a stale profile is used for mediation. Fix this by moving to
accessing the current label through a critical section that will
always ensure mediation is using the current label regardless of
whether the tasks cred has been updated or not.
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
The ns name being displayed should go through an ns view lookup.
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
The data being queried isn't always the current profile and a lookup
relative to the current profile should be done.
Signed-off-by: John Johansen <john.johansen@canonical.com>
|