[PATCH 1/2] security, perf: allow further restriction of perf_event_open

Posted by Jeff Vander Stoep 
When kernel.perf_event_paranoid is set to 3 (or greater), disallow
all access to performance events by users without CAP_SYS_ADMIN.

This new level of restriction is intended to reduce the attack
surface of the kernel. Perf is a valuable tool for developers but
is generally unnecessary and unused on production systems. Perf may
open up an attack vector to vulnerable device-specific drivers as
recently demonstrated in CVE-2016-0805, CVE-2016-0819,
CVE-2016-0843, CVE-2016-3768, and CVE-2016-3843. This new level of
restriction allows for a safe default to be set on production systems
while leaving a simple means for developers to grant access [1].

This feature is derived from CONFIG_GRKERNSEC_PERF_HARDEN by Brad
Spengler. It is based on a patch by Ben Hutchings [2]. Ben's patches
have been modified and split up to address on-list feedback.

kernel.perf_event_paranoid=3 is the default on both Debian [2] and
Android [3].

[1] Making perf available to developers on Android:
https://android-review.googlesource.com/#/c/234400/
[2] Original patch by Ben Hutchings:
https://lkml.org/lkml/2016/1/11/587
[3] https://android-review.googlesource.com/#/c/234743/

Signed-off-by: Jeff Vander Stoep <[email protected]>
---
 Documentation/sysctl/kernel.txt | 1 +
 include/linux/perf_event.h      | 5 +++++
 kernel/events/core.c            | 4 ++++
 3 files changed, 10 insertions(+)

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index ffab8b5..fac9798 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -665,6 +665,7 @@ users (without CAP_SYS_ADMIN). The default value is 2.
 >=0: Disallow raw tracepoint access by users without CAP_IOC_LOCK
 >=1: Disallow CPU event access by users without CAP_SYS_ADMIN
 >=2: Disallow kernel profiling by users without CAP_SYS_ADMIN
+>=3: Disallow all event access by users without CAP_SYS_ADMIN
 
 ==============================================================

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 8ed43261..1e2080f 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1156,6 +1156,11 @@ static inline bool perf_paranoid_kernel(void)
 	return sysctl_perf_event_paranoid > 1;
 }
 
+static inline bool perf_paranoid_any(void)
+{
+	return sysctl_perf_event_paranoid > 2;
+}
+
 extern void perf_event_init(void);
 extern void perf_tp_event(u16 event_type, u64 count, void *record,
 			  int entry_size, struct pt_regs *regs,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 356a6c7..52bd100 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -353,6 +353,7 @@ static struct srcu_struct pmus_srcu;
  *   0 - disallow raw tracepoint access for unpriv
  *   1 - disallow cpu events for unpriv
  *   2 - disallow kernel profiling for unpriv
+ *   3 - disallow all unpriv perf event use
  */
 int sysctl_perf_event_paranoid __read_mostly = 2;
 
@@ -9296,6 +9297,9 @@ SYSCALL_DEFINE5(perf_event_open,
 	if (flags & ~PERF_FLAG_ALL)
 		return -EINVAL;
 
+	if (perf_paranoid_any() && !capable(CAP_SYS_ADMIN))
+		return -EACCES;
+
 	err = perf_copy_attr(attr_uptr, &attr);
 	if (err)
 		return err;
--
2.8.0.rc3.226.g39d4020
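
As an illustration of the gate that the second kernel/events/core.c hunk adds, here is a minimal userspace model in C. It is a sketch, not kernel code: model_paranoid_any() and model_perf_event_open_gate() are hypothetical names, and the has_cap_sys_admin flag stands in for the kernel's capable(CAP_SYS_ADMIN) call.

```c
#include <errno.h>
#include <stdbool.h>

/* Userspace stand-in for perf_paranoid_any(): true at level 3 or above. */
static bool model_paranoid_any(int paranoid_level)
{
	return paranoid_level > 2;
}

/*
 * Hypothetical model of the new early check in perf_event_open.
 * Returns -EACCES when the model would reject the call, or 0 when the
 * call would continue on to the checks that already exist.
 */
static int model_perf_event_open_gate(int paranoid_level, bool has_cap_sys_admin)
{
	if (model_paranoid_any(paranoid_level) && !has_cap_sys_admin)
		return -EACCES;
	return 0;
}
```

With these definitions, model_perf_event_open_gate(3, false) yields -EACCES, while level 2 without the capability, or level 3 with it, yields 0.
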
On Wed, Jul 27, 2016 at 7:45 AM, Jeff Vander Stoep <[email protected]> wrote:
> When kernel.perf_event_paranoid is set to 3 (or greater), disallow
> all access to performance events by users without CAP_SYS_ADMIN.
>
> This new level of restriction is intended to reduce the attack
> surface of the kernel. Perf is a valuable tool for developers but
> is generally unnecessary and unused on production systems. Perf may
> open up an attack vector to vulnerable device-specific drivers as
> recently demonstrated in CVE-2016-0805, CVE-2016-0819,
> CVE-2016-0843, CVE-2016-3768, and CVE-2016-3843. This new level of
> restriction allows for a safe default to be set on production systems
> while leaving a simple means for developers to grant access [1].
>
> This feature is derived from CONFIG_GRKERNSEC_PERF_HARDEN by Brad
> Spengler. It is based on a patch by Ben Hutchings [2]. Ben's patches
> have been modified and split up to address on-list feedback.
>
> kernel.perf_event_paranoid=3 is the default on both Debian [2] and
> Android [3].
>
> [1] Making perf available to developers on Android:
> https://android-review.googlesource.com/#/c/234400/
> [2] Original patch by Ben Hutchings:
> https://lkml.org/lkml/2016/1/11/587
> [3] https://android-review.googlesource.com/#/c/234743/
>
> Signed-off-by: Jeff Vander Stoep <[email protected]>

Thanks for splitting this up! It'll be nice to have this delta out of
Debian and Android.

Reviewed-by: Kees Cook <[email protected]>

-Kees

--
Kees Cook
Chrome OS & Brillo Security
On Wed, Jul 27, 2016 at 07:45:46AM -0700, Jeff Vander Stoep wrote:
> When kernel.perf_event_paranoid is set to 3 (or greater), disallow
> all access to performance events by users without CAP_SYS_ADMIN.
>
> This new level of restriction is intended to reduce the attack
> surface of the kernel. Perf is a valuable tool for developers but
> is generally unnecessary and unused on production systems. Perf may
> open up an attack vector to vulnerable device-specific drivers as
> recently demonstrated in CVE-2016-0805, CVE-2016-0819,
> CVE-2016-0843, CVE-2016-3768, and CVE-2016-3843.

We have bugs we fix them, we don't kill complete infrastructure because
of them.

> This new level of
> restriction allows for a safe default to be set on production systems
> while leaving a simple means for developers to grant access [1].

So the problem I have with this is that it will completely inhibit
development of things like JITs that self-profile to re-compile
frequently used code.

I would much rather have an LSM hook where the security stuff can do
more fine grained control of things. Allowing some apps perf usage while
denying others.
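
To make the self-profiling use case concrete, the following C sketch counts the user-space instructions a process retires while running a loop of its own code, by calling perf_event_open() on itself. The helper names are illustrative and not taken from any JIT. With this patch applied and kernel.perf_event_paranoid=3, the open call would be expected to fail with EACCES for a process without CAP_SYS_ADMIN.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* glibc provides no wrapper for perf_event_open, so use syscall(2). */
static int self_counter_open(void)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_INSTRUCTIONS;
	attr.disabled = 1;		/* start stopped; enabled explicitly below */
	attr.exclude_kernel = 1;	/* count user space only */
	attr.exclude_hv = 1;

	/* pid 0 and cpu -1: the calling thread on any CPU; no group, no flags. */
	return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/*
 * Hypothetical helper: run a small busy loop and report how many
 * user-space instructions were counted while it ran.
 * Returns 0 on success, or -1 with errno set (for example EACCES when
 * the sysctl forbids the call, or ENOENT when no hardware counter exists).
 */
static int count_own_instructions(unsigned long iterations, uint64_t *count)
{
	volatile unsigned long sink = 0;
	unsigned long i;
	ssize_t n;
	int saved_errno;
	int fd = self_counter_open();

	if (fd < 0)
		return -1;

	ioctl(fd, PERF_EVENT_IOC_RESET, 0);
	ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
	for (i = 0; i < iterations; i++)
		sink += i;
	ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

	/* With the default read_format, read() returns a single u64 count. */
	n = read(fd, count, sizeof(*count));
	saved_errno = (n < 0) ? errno : EIO;
	close(fd);
	if (n != (ssize_t)sizeof(*count)) {
		errno = saved_errno;
		return -1;
	}
	return 0;
}
```

A runtime that re-compiles hot code could call something like count_own_instructions() around a candidate region and treat a -1 return as "profiling unavailable", falling back to its unprofiled path.
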
Arnaldo Carvalho de Melo
Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
December 13, 2016 05:59AM
On Tue, Aug 02, 2016 at 11:52:43AM +0200, Peter Zijlstra wrote:
> On Wed, Jul 27, 2016 at 07:45:46AM -0700, Jeff Vander Stoep wrote:
> > When kernel.perf_event_paranoid is set to 3 (or greater), disallow
> > all access to performance events by users without CAP_SYS_ADMIN.

> > This new level of restriction is intended to reduce the attack
> > surface of the kernel. Perf is a valuable tool for developers but
> > is generally unnecessary and unused on production systems. Perf may
> > open up an attack vector to vulnerable device-specific drivers as
> > recently demonstrated in CVE-2016-0805, CVE-2016-0819,
> > CVE-2016-0843, CVE-2016-3768, and CVE-2016-3843.

> We have bugs we fix them, we don't kill complete infrastructure because
> of them.

> > This new level of
> > restriction allows for a safe default to be set on production systems
> > while leaving a simple means for developers to grant access [1].

> So the problem I have with this is that it will completely inhibit
> development of things like JITs that self-profile to re-compile
> frequently used code.

Or reimplementing strace with sys_perf_event_open(), speeding it up greatly
by not using ptrace (see 'perf trace', one such attempt). Combining it
with sys_bpf(), which can run unprivileged as well, provides lots of
possibilities for efficient tooling that would be greatly stifled by
such big hammer restrictions :-(

> I would much rather have an LSM hook where the security stuff can do
> more fine grained control of things. Allowing some apps perf usage while
> denying others.

- Arnaldo
On Tue, 2016-08-02 at 11:52 +0200, Peter Zijlstra wrote:
> On Wed, Jul 27, 2016 at 07:45:46AM -0700, Jeff Vander Stoep wrote:
> >
> > When kernel.perf_event_paranoid is set to 3 (or greater), disallow
> > all access to performance events by users without CAP_SYS_ADMIN.
> >
> > This new level of restriction is intended to reduce the attack
> > surface of the kernel. Perf is a valuable tool for developers but
> > is generally unnecessary and unused on production systems. Perf may
> > open up an attack vector to vulnerable device-specific drivers as
> > recently demonstrated in CVE-2016-0805, CVE-2016-0819,
> > CVE-2016-0843, CVE-2016-3768, and CVE-2016-3843.
>
> We have bugs we fix them, we don't kill complete infrastructure because
> of them.

It's still accessible to privileged processes either way. Android still
allows access from unprivileged processes but it can only be enabled via
the debugging shell, which is not enabled by default either.

It isn't even possible to disable the perf events infrastructure via
kernel configuration for every architecture right now. You're forcing
people to have common local privilege escalation and information leak
vulnerabilities for something few people actually use.

This patch is now a requirement for any Android devices with a security
patch level above August 2016. The only thing that not merging it is
going to accomplish is preventing a mainline kernel from ever being used
on Android devices, unless you provide an alternative it can use for the
same use case.

https://source.android.com/security/bulletin/2016-08-01.html

> > This new level of
> > restriction allows for a safe default to be set on production systems
> > while leaving a simple means for developers to grant access [1].
>
> So the problem I have with this is that it will completely inhibit
> development of things like JITs that self-profile to re-compile
> frequently used code.
>
> I would much rather have an LSM hook where the security stuff can do
> more fine grained control of things. Allowing some apps perf usage while
> denying others.

If the only need was controlling access per-process statically, then
using seccomp-bpf works fine. It needs to be dynamic though. I don't
think SELinux could be used to provide the functionality so it would
have to be a whole new LSM. I doubt anyone will implement that when the
necessary functionality is already available. It's already exposed only
for developers using profiling tools until they reboot, so finer grained
control wouldn't accomplish much.
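
For the static per-process case mentioned above, a seccomp-bpf filter can make perf_event_open fail for the calling process and its descendants. The C sketch below is one possible shape, not what Android ships: deny_perf_event_open() is a hypothetical name, the choice of EACCES simply mirrors the patch, and a production filter would also validate seccomp_data.arch, which is omitted here for brevity.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: install a filter that fails perf_event_open with
 * EACCES and allows every other syscall. Returns 0 on success, or -1
 * with errno set if the filter could not be installed.
 * Assumption: the architecture check is skipped in this sketch.
 */
static int deny_perf_event_open(void)
{
	struct sock_filter filter[] = {
		/* Load the syscall number from struct seccomp_data. */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			 offsetof(struct seccomp_data, nr)),
		/* perf_event_open: fall through; anything else: skip one. */
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_perf_event_open, 0, 1),
		BPF_STMT(BPF_RET | BPF_K,
			 SECCOMP_RET_ERRNO | (EACCES & SECCOMP_RET_DATA)),
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};
	struct sock_fprog prog = {
		.len = (unsigned short)(sizeof(filter) / sizeof(filter[0])),
		.filter = filter,
	};

	/* Needed so an unprivileged process may install a filter. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
		return -1;
	return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}
```

Because a seccomp filter cannot be removed once installed, this only covers the static case; it does not provide the kind of toggle that the sysctl gives.
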