diff --git a/en/security/credentials.md b/en/security/credentials.md new file mode 100644 index 0000000..16efe19 --- /dev/null +++ b/en/security/credentials.md @@ -0,0 +1,387 @@ +#CREDENTIALS IN LINUX + +By: David Howells + +Contents: + +* Overview. + +* Types of credentials. + +* File markings. + +* Task credentials. + + - Immutable credentials. + - Accessing task credentials. + - Accessing another task's credentials. + - Altering credentials. + - Managing credentials. + +* Open file credentials. + +* Overriding the VFS's use of credentials. + +##OVERVIEW + +There are several parts to the security check performed by Linux when one object acts upon another: + +1. Objects. + + Objects are things in the system that may be acted upon directly by userspace programs. Linux has a variety of actionable objects, including: + + * Tasks + * Files/inodes + * Sockets + * Message queues + * Shared memory segments + * Semaphores + * Keys + + As a part of the description of all these objects there is a set of credentials. What's in the set depends on the type of object. + +2. Object ownership. + + Amongst the credentials of most objects, there will be a subset that indicates the ownership of that object. This is used for resource accounting and limitation (disk quotas and task rlimits for example). + + In a standard UNIX filesystem, for instance, this will be defined by the UID marked on the inode. + +3. The objective context. + + Also amongst the credentials of those objects, there will be a subset that indicates the 'objective context' of that object. This may or may not be the same set as in (2) - in standard UNIX files, for instance, this is defined by the UID and the GID marked on the inode. + + The objective context is used as part of the security calculation that is carried out when an object is acted upon. + +4. Subjects. + + A subject is an object that is acting upon another object. 
+ + Most of the objects in the system are inactive: they don't act on other objects within the system. Processes/tasks are the obvious exception: they do stuff; they access and manipulate things. + + Objects other than tasks may under some circumstances also be subjects. For instance an open file may send SIGIO to a task using the UID and EUID given to it by a task that called fcntl(F_SETOWN) upon it. In this case, the file struct will have a subjective context too. + +5. The subjective context. + + A subject has an additional interpretation of its credentials. A subset of its credentials forms the 'subjective context'. The subjective context is used as part of the security calculation that is carried out when a subject acts. + + A Linux task, for example, has the FSUID, FSGID and the supplementary group list for when it is acting upon a file - which are quite separate from the real UID and GID that normally form the objective context of the task. + +6. Actions. + + Linux has a number of actions available that a subject may perform upon an object. The set of actions available depends on the nature of the subject and the object. + + Actions include reading, writing, creating and deleting files; forking or signalling and tracing tasks. + +7. Rules, access control lists and security calculations. + + When a subject acts upon an object, a security calculation is made. This involves taking the subjective context, the objective context and the action, and searching one or more sets of rules to see whether the subject is granted or denied permission to act in the desired manner on the object, given those contexts. + + There are two main sources of rules: + + (a) Discretionary access control (DAC): + + Sometimes the object will include sets of rules as part of its description. This is an 'Access Control List' or 'ACL'. A Linux file may supply more than one ACL. 
+ + A traditional UNIX file, for example, includes a permissions mask that is an abbreviated ACL with three fixed classes of subject ('user', 'group' and 'other'), each of which may be granted certain privileges ('read', 'write' and 'execute' - whatever those map to for the object in question). UNIX file permissions do not allow the arbitrary specification of subjects, however, and so are of limited use. + + A Linux file might also sport a POSIX ACL. This is a list of rules that grants various permissions to arbitrary subjects. + + (b) Mandatory access control (MAC): + + The system as a whole may have one or more sets of rules that get applied to all subjects and objects, regardless of their source. `SELinux` and `Smack` are examples of this. + + In the case of SELinux and Smack, each object is given a label as part of its credentials. When an action is requested, they take the subject label, the object label and the action and look for a rule that says that this action is either granted or denied. + +##TYPES OF CREDENTIALS + +The Linux kernel supports the following types of credentials: + + 1. Traditional UNIX credentials. + + Real User ID Real Group ID + + The UID and GID are carried by most, if not all, Linux objects, even + if in some cases it has to be invented (FAT or CIFS files for example, which are derived from Windows). These (mostly) define the objective context of that object, with tasks being slightly different in some cases. + + Effective, Saved and FS User ID + Effective, Saved and FS Group ID + Supplementary groups + + These are additional credentials used by tasks only. Usually, an `EUID/EGID/GROUPS` will be used as the subjective context, and real `UID/GID` will be used as the objective. For tasks, it should be noted that this is not always true. + + 2. Capabilities. + + Set of permitted capabilities + Set of inheritable capabilities + Set of effective capabilities + Capability bounding set + + These are only carried by tasks. 
They indicate superior capabilities granted piecemeal to a task that an ordinary task wouldn't otherwise have. These are manipulated implicitly by changes to the traditional UNIX credentials, but can also be manipulated directly by the `capset()` system call. + + The permitted capabilities are those caps that the process might grant itself to its effective or permitted sets through `capset()`. This inheritable set might also be so constrained. + + The effective capabilities are the ones that a task is actually allowed to make use of itself. + + The inheritable capabilities are the ones that may get passed across `execve()`. + + The bounding set limits the capabilities that may be inherited across `execve()`, especially when a binary is executed that will execute as UID 0. + + 3. Secure management flags (securebits). + + These are only carried by tasks. These govern the way the above credentials are manipulated and inherited over certain operations such as `execve()`. They aren't used directly as objective or subjective credentials. + + 4. Keys and keyrings. + + These are only carried by tasks. They carry and cache security tokens that don't fit into the other standard UNIX credentials. They are for making such things as network filesystem keys available to the file accesses performed by processes, without the necessity of ordinary programs having to know about security details involved. + + Keyrings are a special type of key. They carry sets of other keys and can be searched for the desired key. Each process may subscribe to a number of keyrings: + + Per-thread keyring + Per-process keyring + Per-session keyring + + When a process accesses a key, if not already present, it will normally be cached on one of these keyrings for future accesses to find. + + For more information on using keys, see [Documentation/security/keys.txt](https://www.kernel.org/doc/Documentation/security/keys.txt). + + 5. 
LSM + + The Linux Security Module allows extra controls to be placed over the operations that a task may do. Currently Linux supports several LSM options. + + Some work by labelling the objects in a system and then applying sets of rules (policies) that say what operations a task with one label may do to an object with another label. + + 6. AF_KEY + + This is a socket-based approach to credential management for networking stacks [RFC 2367](https://www.ietf.org/rfc/rfc2367.txt). It isn't discussed by this document as it doesn't interact directly with task and file credentials; rather it keeps system level credentials. + +When a file is opened, part of the opening task's subjective context is recorded in the file struct created. This allows operations using that file struct to use those credentials instead of the subjective context of the task that issued the operation. An example of this would be a file opened on a network filesystem where the credentials of the opened file should be presented to the server, regardless of who is actually doing a read or a write upon it. + +##FILE MARKINGS + +Files on disk or obtained over the network may have annotations that form the objective security context of that file. Depending on the type of filesystem, this may include one or more of the following: + + * UNIX UID, GID, mode; + + * Windows user ID; + + * Access control list; + + * LSM security label; + + * UNIX exec privilege escalation bits (SUID/SGID); + + * File capabilities exec privilege escalation bits. + +These are compared to the task's subjective security context, and certain operations allowed or disallowed as a result. In the case of `execve()`, the privilege escalation bits come into play, and may allow the resulting process extra privileges, based on the annotations on the executable file. + +##TASK CREDENTIALS + +In Linux, all of a task's credentials are held in (uid, gid) or through (groups, keys, LSM security) a refcounted structure of type `struct cred`. 
Each task points to its credentials by a pointer called `cred` in its `task_struct`. + +Once a set of credentials has been prepared and committed, it may not be changed, barring the following exceptions: + + 1. its reference count may be changed; + + 2. the reference count on the `group_info` struct it points to may be changed; + + 3. the reference count on the security data it points to may be changed; + + 4. the reference count on any keyrings it points to may be changed; + + 5. any keyrings it points to may be revoked, expired or have their security attributes changed; and + + 6. the contents of any keyrings to which it points may be changed (the whole point of keyrings being a shared set of credentials, modifiable by anyone with appropriate access). + +To alter anything in the cred struct, the copy-and-replace principle must be adhered to. First take a copy, then alter the copy and then use RCU to change the task pointer to make it point to the new copy. There are wrappers to aid with this (see below). + +A task may only alter its *own* credentials; it is no longer permitted for a task to alter another's credentials. This means the `capset()` system call is no longer permitted to take any PID other than the one of the current process. Also, the `keyctl_instantiate()` and `keyctl_negate()` functions no longer permit attachment to process-specific keyrings in the requesting process, as the instantiating process may need to create them. + +### IMMUTABLE CREDENTIALS + +Once a set of credentials has been made public (by calling `commit_creds()` for example), it must be considered immutable, barring two exceptions: + + 1. The reference count may be altered. + + 2. Whilst the keyring subscriptions of a set of credentials may not be changed, the keyrings subscribed to may have their contents altered. + +To catch accidental credential alteration at compile time, struct `task_struct` has *const* pointers to its credential sets, as does struct file. 
Furthermore, certain functions such as `get_cred()` and `put_cred()` operate on const pointers, thus rendering casts unnecessary, but require to temporarily ditch the const qualification to be able to alter the reference count. + +### ACCESSING TASK CREDENTIALS + +A task being able to alter only its own credentials permits the current process to read or replace its own credentials without the need for any form of locking - which simplifies things greatly. It can just call: +``` + const struct cred *current_cred() +``` +to get a pointer to its credentials structure, and it doesn't have to release it afterwards. + +There are convenience wrappers for retrieving specific aspects of a task's credentials (the value is simply returned in each case): +``` + uid_t current_uid(void) Current's real UID + gid_t current_gid(void) Current's real GID + uid_t current_euid(void) Current's effective UID + gid_t current_egid(void) Current's effective GID + uid_t current_fsuid(void) Current's file access UID + gid_t current_fsgid(void) Current's file access GID + kernel_cap_t current_cap(void) Current's effective capabilities + void *current_security(void) Current's LSM security pointer + struct user_struct *current_user(void) Current's user account +``` +There are also convenience wrappers for retrieving specific associated pairs of a task's credentials: +``` + void current_uid_gid(uid_t *, gid_t *); + void current_euid_egid(uid_t *, gid_t *); + void current_fsuid_fsgid(uid_t *, gid_t *); +``` +which return these pairs of values through their arguments after retrieving them from the current task's credentials. 
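
The pair-returning wrappers above are kernel-only, but the same real and effective IDs are visible from userspace through the POSIX calls `getuid()`, `getgid()`, `geteuid()` and `getegid()`. The following is a minimal userspace sketch that mirrors the calling shape of `current_uid_gid()` and `current_euid_egid()`. The `model_` names are hypothetical and are not kernel functions.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical userspace analogue of current_uid_gid(): hand back the
 * calling process's real UID and GID through the two pointers.
 */
void model_uid_gid(uid_t *uid, gid_t *gid)
{
	assert(uid && gid);
	*uid = getuid();
	*gid = getgid();
}

/*
 * Hypothetical userspace analogue of current_euid_egid(): hand back the
 * calling process's effective UID and GID through the two pointers.
 */
void model_euid_egid(uid_t *euid, gid_t *egid)
{
	assert(euid && egid);
	*euid = geteuid();
	*egid = getegid();
}
```

As with the kernel wrappers, nothing is returned that needs releasing afterwards; the values are plain copies.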
+ +In addition, there is a function for obtaining a reference on the current process's current set of credentials: +``` + const struct cred *get_current_cred(void); +``` +and functions for getting references to one of the credentials that don't actually live in struct cred: +``` + struct user_struct *get_current_user(void); + struct group_info *get_current_groups(void); +``` +which get references to the current process's user accounting structure and supplementary groups list respectively. + +Once a reference has been obtained, it must be released with `put_cred()`, `free_uid()` or `put_group_info()` as appropriate. + +### ACCESSING ANOTHER TASK'S CREDENTIALS + +Whilst a task may access its own credentials without the need for locking, the same is not true of a task wanting to access another task's credentials. It must use the RCU read lock and `rcu_dereference()`. + +The `rcu_dereference()` is wrapped by: +``` + const struct cred *__task_cred(struct task_struct *task); +``` +This should be used inside the RCU read lock, as in the following example: +``` + void foo(struct task_struct *t, struct foo_data *f) + { + const struct cred *tcred; + ... + rcu_read_lock(); + tcred = __task_cred(t); + f->uid = tcred->uid; + f->gid = tcred->gid; + f->groups = get_group_info(tcred->groups); + rcu_read_unlock(); + ... + } +``` +Should it be necessary to hold another task's credentials for a long period of time, and possibly to sleep whilst doing so, then the caller should get a reference on them using: +``` + const struct cred *get_task_cred(struct task_struct *task); +``` +This does all the RCU magic inside of it. The caller must call `put_cred()` on the credentials so obtained when it has finished with them. + +***Note***: The result of `__task_cred()` should not be passed directly to `get_cred()` as this may race with `commit_creds()`. 
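
The get/put discipline for held references can be illustrated with a small single-threaded userspace model. This is only a sketch: `struct model_cred` and the `model_*` functions are hypothetical stand-ins, the counter is a plain `int` rather than an atomic, and the object is freed immediately instead of being handed to RCU for deferred destruction.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct cred; not the kernel layout. */
struct model_cred {
	int usage;		/* reference count */
	unsigned int uid;
	unsigned int gid;
};

/* Allocate a credential set that starts with one reference for the caller. */
struct model_cred *model_alloc_cred(unsigned int uid, unsigned int gid)
{
	struct model_cred *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	c->usage = 1;
	c->uid = uid;
	c->gid = gid;
	return c;
}

/*
 * Take a reference through a const pointer. The const qualifier is
 * dropped only long enough to bump the count; nothing else is written.
 */
const struct model_cred *model_get_cred(const struct model_cred *cred)
{
	struct model_cred *nonconst = (struct model_cred *)cred;

	nonconst->usage++;
	return cred;
}

/*
 * Drop a reference. Returns 1 if this was the last reference and the
 * set was freed, 0 if other holders remain.
 */
int model_put_cred(const struct model_cred *cred)
{
	struct model_cred *nonconst = (struct model_cred *)cred;

	assert(nonconst->usage > 0);
	if (--nonconst->usage == 0) {
		free(nonconst);
		return 1;
	}
	return 0;
}
```

A holder that took a reference with `model_get_cred()` can keep reading the set after the original owner has called `model_put_cred()`, which is the property a long-lived reference is meant to provide.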
+ +There are a couple of convenience functions to access bits of another task's credentials, hiding the RCU magic from the caller: +``` + uid_t task_uid(task) Task's real UID + uid_t task_euid(task) Task's effective UID +``` +If the caller is holding the RCU read lock at the time anyway, then: +``` + __task_cred(task)->uid + __task_cred(task)->euid +``` +should be used instead. Similarly, if multiple aspects of a task's credentials need to be accessed, the RCU read lock should be taken, `__task_cred()` called, the result stored in a temporary pointer and then the credential aspects read from that before dropping the lock. This prevents the potentially expensive RCU magic from being invoked multiple times. + +Should some other single aspect of another task's credentials need to be +accessed, then this can be used: +``` + task_cred_xxx(task, member) +``` +where 'member' is a non-pointer member of the cred struct. For instance: +``` + uid_t task_cred_xxx(task, suid); +``` +will retrieve 'struct cred::suid' from the task, doing the appropriate RCU magic. This may not be used for pointer members as what they point to may disappear the moment the RCU read lock is dropped. + +### ALTERING CREDENTIALS + +As previously mentioned, a task may only alter its own credentials, and may not alter those of another task. This means that it doesn't need to use any locking to alter its own credentials. + +To alter the current process's credentials, a function should first prepare a new set of credentials by calling: +``` + struct cred *prepare_creds(void); +``` +This locks `current->cred_replace_mutex` and then allocates and constructs a duplicate of the current process's credentials, returning with the mutex still held if successful. It returns NULL if not successful (out of memory). 
+ +The mutex prevents `ptrace()` from altering the ptrace state of a process whilst security checks on credentials construction and changing is taking place as the ptrace state may alter the outcome, particularly in the case of `execve()`. + +The new credentials set should be altered appropriately, and any security checks and hooks done. Both the current and the proposed sets of credentials are available for this purpose as `current_cred()` will return the current set still at this point. + +When the credential set is ready, it should be committed to the current process by calling: +``` + int commit_creds(struct cred *new); +``` +This will alter various aspects of the credentials and the process, giving the LSM a chance to do likewise, then it will use `rcu_assign_pointer()` to actually commit the new credentials to `current->cred`, it will release `current->cred_replace_mutex` to allow `ptrace()` to take place, and it will notify the scheduler and others of the changes. + +This function is guaranteed to return 0, so that it can be tail-called at the end of such functions as `sys_setresuid()`. + +Note that this function consumes the caller's reference to the new credentials. The caller should *not* call `put_cred()` on the new credentials afterwards. + +Furthermore, once this function has been called on a new set of credentials, those credentials may *not* be changed further. + +Should the security checks fail or some other error occur after `prepare_creds()` has been called, then the following function should be invoked: +``` + void abort_creds(struct cred *new); +``` +This releases the lock on `current->cred_replace_mutex` that `prepare_creds()` got and then releases the new credentials. 
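
The prepare, commit and abort steps described above follow the copy-and-replace principle, which can be modelled in userspace. The sketch below is single-threaded and deliberately omits `cred_replace_mutex`, RCU and the LSM hooks; `struct sim_cred` and every `sim_*` name are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct cred; not the kernel layout. */
struct sim_cred {
	int usage;		/* reference count */
	unsigned int uid;
	unsigned int suid;
};

/* Stand-in for current->cred: the committed, read-only set. */
static const struct sim_cred *sim_current;

/* Drop one reference and free the set when the count reaches zero. */
static void sim_put(const struct sim_cred *c)
{
	struct sim_cred *m = (struct sim_cred *)c;

	if (m && --m->usage == 0)
		free(m);
}

/* Read-only view of the committed set, in the spirit of current_cred(). */
const struct sim_cred *sim_current_cred(void)
{
	return sim_current;
}

/* Install an initial committed set. Returns 0, or -1 if out of memory. */
int sim_init(unsigned int uid)
{
	struct sim_cred *c = malloc(sizeof(*c));

	if (!c)
		return -1;
	c->usage = 1;
	c->uid = uid;
	c->suid = uid;
	sim_put(sim_current);
	sim_current = c;
	return 0;
}

/* Models prepare_creds(): duplicate the committed set into a mutable copy. */
struct sim_cred *sim_prepare(void)
{
	struct sim_cred *new;

	assert(sim_current);
	new = malloc(sizeof(*new));
	if (!new)
		return NULL;
	*new = *sim_current;
	new->usage = 1;
	return new;
}

/*
 * Models commit_creds(): publish the copy, consuming the caller's
 * reference, and drop the reference held on the old set. Always 0.
 */
int sim_commit(struct sim_cred *new)
{
	const struct sim_cred *old = sim_current;

	sim_current = new;
	sim_put(old);
	return 0;
}

/* Models abort_creds(): discard a copy that was never committed. */
void sim_abort(struct sim_cred *new)
{
	sim_put(new);
}
```

In this model the committed set is never written after publication: a change is only visible once `sim_commit()` swaps the pointer, and an aborted copy leaves the committed set untouched.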
+ +A typical credentials alteration function would look something like this: +``` + int alter_suid(uid_t suid) + { + struct cred *new; + int ret; + + new = prepare_creds(); + if (!new) + return -ENOMEM; + + new->suid = suid; + ret = security_alter_suid(new); + if (ret < 0) { + abort_creds(new); + return ret; + } + + return commit_creds(new); + } +``` +### MANAGING CREDENTIALS + +There are some functions to help manage credentials: +``` + void put_cred(const struct cred *cred); +``` + This releases a reference to the given set of credentials. If the reference count reaches zero, the credentials will be scheduled for destruction by the RCU system. +``` + const struct cred *get_cred(const struct cred *cred); +``` + This gets a reference on a live set of credentials, returning a pointer to that set of credentials. +``` + struct cred *get_new_cred(struct cred *cred); +``` + This gets a reference on a set of credentials that is under construction and is thus still mutable, returning a pointer to that set of credentials. + +##OPEN FILE CREDENTIALS + +When a new file is opened, a reference is obtained on the opening task's credentials and this is attached to the file struct as `f_cred` in place of `f_uid` and `f_gid`. Code that used to access `file->f_uid` and `file->f_gid` should now access `file->f_cred->fsuid` and `file->f_cred->fsgid`. + +It is safe to access `f_cred` without the use of RCU or locking because the pointer will not change over the lifetime of the file struct, and nor will the contents of the cred struct pointed to, barring the exceptions listed above (see the Task Credentials section). + +##OVERRIDING THE VFS'S USE OF CREDENTIALS + +Under some circumstances it is desirable to override the credentials used by the VFS, and that can be done by calling into such as `vfs_mkdir()` with a different set of credentials. This is done in the following places: + + * sys_faccessat(). + + * do_coredump(). + + * nfs4recover.c. 
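diff --git a/zh-cn/security/credentials.md b/zh-cn/security/credentials.md new file mode 100644 index 0000000..b874e9c --- /dev/null +++ b/zh-cn/security/credentials.md @@ -0,0 +1,381 @@ +>Original: [Documentation/security/credentials.txt](https://www.kernel.org/doc/Documentation/security/credentials.txt) +>Translation: @L3w1s-L1u +>Review: [TODO] + +# CREDENTIALS IN LINUX + +Author: David Howells + +Contents: + + * Overview. + + * Types of credentials. + + * File markings. + + * Task credentials. + + - Immutable credentials. + - Accessing task credentials. + - Accessing another task's credentials. + - Altering credentials. + - Managing credentials. + + * Open file credentials. + + * Overriding the VFS's use of credentials. + +## OVERVIEW + + When one object in a Linux system acts upon another, the security check that Linux performs involves the following parts: + + 1. Objects. + + Objects here are the entities in a Linux system that userspace programs can act upon directly. Linux has a variety of such objects, including: + + * Tasks + * Files/inodes + * Sockets + * Message queues + * Shared memory segments + * Semaphores + * Keys + + Linux provides a set of credentials as part of the description of each of these objects. What the set contains depends on the type of object. + + 2. Object ownership. + + The credentials of the vast majority of objects include a subset that indicates the ownership of the object. This ownership is used for resource accounting and limitation (for example disk quotas and task `rlimits`). In a standard UNIX filesystem, for instance, ownership is identified by the UID on the inode. + + 3. The objective context. + + The credentials of these objects also include a subset that indicates the 'objective context' of the object. This subset is not necessarily the same as the one in 2. In standard UNIX files, for instance, it is identified by the UID and the GID on the inode. + + 4. Subjects. + + A subject is an object that acts upon another object. Most of the objects in the system are passive: they do not act on other objects within the system. Processes or tasks are the obvious exception: they do work; they access and manipulate other things. Objects other than tasks may also be subjects under some circumstances. For instance, an open file may send SIGIO to a task using the UID and EUID given to it by a task that called `fcntl(F_SETOWN)` upon it. In this case, the `file` struct has a subjective context too. + + 5. The subjective context. + + Linux has an additional interpretation of a subject's credentials. A subset of these credentials forms the 'subjective context'. When the subject acts upon another object, this context is used as part of the security calculation. A Linux task, for example, has the FSUID, FSGID and the supplementary group list as its credentials when accessing a file, and these are usually quite separate from the real UID and real GID that form the task's objective context. + + 6. Actions. + + A Linux subject has a number of actions it may perform upon an object. Which actions are available depends on the nature of the subject and the object. Actions include reading, writing, creating and deleting files; forking, signalling and tracing tasks. + + 7. Rules, access control lists and security calculations. + + When a subject acts upon an object, a security calculation is made. This involves taking the subjective and objective contexts and the requested action, and then searching one or more sets of rules to decide, given those contexts, whether the subject is granted permission to act on the object in the desired manner. + + There are two main sources of rules: + + a. 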
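Discretionary access control (DAC): + + Sometimes the objects in the system include sets of rules as part of their own description. Such a rule set is called an 'Access Control List' or ACL. A Linux file may supply more than one ACL. A traditional UNIX file, for example, has a permissions mask (an abbreviated ACL) with three fixed classes of subject ('user', 'group' and 'other'). Each class may be granted certain privileges ('read', 'write' and 'execute', each interpreted for the particular object being accessed). However, because UNIX file permissions do not allow arbitrary subjects to be specified, they are of limited use. + + A Linux file may also carry a POSIX ACL. This list can grant various permissions to arbitrary subjects. + + b. Mandatory access control (MAC): + + The system as a whole may have sets of access rules that apply to all subjects and objects, regardless of where they come from. SELinux and Smack are examples of this. With SELinux and Smack, each object is given a label as part of its credentials. When an action is requested, SELinux or Smack takes the labels of the subject and the object and looks for a rule that says whether the action is allowed. + +## TYPES OF CREDENTIALS + + The Linux kernel supports the following types of credentials: + + 1. Traditional UNIX credentials. + + Real User ID + Real Group ID + + The UID and GID are carried by the vast majority of Linux objects, even if they sometimes have to be invented (FAT or CIFS files, for example, which derive from Windows). The UID and GID (mostly) define the objective context of an object, with tasks being slightly different in some cases. + + Effective User ID (EUID), Saved User ID and FS User ID + Effective Group ID (EGID), Saved Group ID and FS Group ID + Supplementary groups + + These are additional credentials used only by tasks. Usually the EUID/EGID/GROUPS are used as the subjective context and the real UID/GID as the objective context. Note that for tasks this is not always the case. + + 2. Capabilities. + + Set of permitted capabilities + Set of inheritable capabilities + Set of effective capabilities + Capability bounding set + + These are carried only by tasks. They indicate superior capabilities that are granted piecemeal to a task and that an ordinary task would not otherwise have. They are manipulated implicitly by changes to the traditional UNIX credentials, and can also be manipulated directly with the `capset()` system call. + + The permitted capabilities are those that the process may grant itself into its effective or permitted sets through `capset()`. The inheritable set may also be constrained in this way. + + The effective capabilities are the ones that a task is actually allowed to use. + + The inheritable capabilities are the ones that may be passed across `execve()`. + + The bounding set limits the capabilities that may be inherited across `execve()`, especially when a binary is executed as UID 0. + + 3. Secure management flags (securebits). + These are carried only by tasks. They govern how the above credentials are manipulated and inherited over operations such as `execve()`. + + 4. Keys and keyrings. + + These are carried only by tasks. They carry and cache security tokens that do not fit into the other standard UNIX credential types. They make such things as network filesystem keys available to the file accesses performed by processes, without ordinary programs having to know about the security details involved. + + Keyrings are a special type of key. They carry sets of other keys and can be searched for a desired key. Each process may subscribe to a number of keyrings: + + Per-thread keyring + Per-process keyring + Per-session keyring + + When a process accesses a key that is not already present, the key is normally cached on one of these keyrings so that future searches can find it. + + For more information on using keys, see [Documentation/security/keys.txt](https://www.kernel.org/doc/Documentation/security/keys.txt). + + 5. LSM + + The Linux Security Module (LSM) allows extra controls to be placed over the operations that a task may perform. Linux currently supports two main LSM alternatives: SELinux and Smack. + + Both work by labelling the subjects and objects in the system and then applying a set of rules (a policy) that states which operations a subject with a given label may and may not perform on an object with a given label. + + 6. 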
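AF_KEY + + This is a socket-based approach to managing credentials for networking stacks [RFC2367](https://www.ietf.org/rfc/rfc2367.txt). It is not discussed in this document because it does not interact directly with task and file credentials; rather, it keeps system-level credentials. + +When a file is opened, the file struct that is created records part of the opening task's subjective context. This allows operations performed through that file struct to use those credentials directly instead of the subjective context of the task that issued the operation. An example is a file opened on a network filesystem, where the credentials of the opened file should be presented to the server regardless of who is actually reading from or writing to it. + +## FILE MARKINGS + + Files on disk or obtained over the network may carry markings that form the objective context of the file. Depending on the type of filesystem, these may include one or more of the following: + + * UNIX UID, GID, mode; + + * Windows user ID; + + * Access control list (ACL); + + * LSM security label; + + * UNIX exec privilege escalation bits (SUID/SGID); + + * File capabilities exec privilege escalation bits. + +These markings are compared to the task's subjective context to determine whether a given operation is allowed. When `execve()` is called, for example, the privilege escalation bits come into play and, based on the markings on the executable file, determine whether the resulting process gains extra privileges. + +## TASK CREDENTIALS + + In Linux, all of a task's credentials are held directly (UID, GID) or indirectly (groups, keys, LSM security) in a refcounted structure of type `struct cred`. Each task points to its credentials structure through the `cred` pointer in its `task_struct`. + + Once a set of credentials has been prepared and committed, it may not be changed, with the following exceptions: + + 1. its reference count may be changed; + + 2. the reference count on the `group_info` struct it points to may be changed; + + 3. the reference count on the security data it points to may be changed; + + 4. the reference count on any keyrings it points to may be changed; + + 5. any keyrings it points to may be revoked, expired or have their security attributes changed; + + 6. the contents of any keyrings it points to may be changed (this is the whole point of keyrings: a shared set of credentials that anyone with appropriate access may modify). + +To alter anything in the `cred` struct, the copy-and-replace principle must be followed. First take a copy, then alter the copy, and then use RCU to change the pointer in the task struct so that it points to the new copy. The kernel provides helper functions for this (see below). + +A task may only alter its *own* credentials; a task is no longer permitted to alter another task's credentials. This means that the `capset()` system call may no longer take an arbitrary PID, only that of the current task. Also, the `keyctl_instantiate()` and `keyctl_negate()` functions no longer permit attachment to process-specific keyrings in the requesting process, as the instantiating process may need to create them. + +### IMMUTABLE CREDENTIALS + +Once a set of credentials has been made public (by calling `commit_creds()`, for example), it must be considered immutable, with only two exceptions: + + 1. The reference count may be altered. + + 2. 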
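Whilst the keyring subscriptions of a set of credentials may not be changed, the contents of the subscribed keyrings may be altered. + +To catch accidental alteration of credentials at compile time, the pointers from `task_struct` to its credential sets are *const*, as are those in the file struct. Furthermore, because certain functions such as `get_cred()` and `put_cred()` operate on const pointers, casts are unnecessary for callers, although those functions must temporarily drop the const qualification in order to alter the reference count. + +### ACCESSING TASK CREDENTIALS + + Allowing a task to alter only its own credentials simplifies things greatly: the current task can read and replace its own credentials without any form of locking. A task can call +``` + const struct cred *current_cred(); +``` + to get a pointer to its credentials structure, and it does not have to release it afterwards. + + The kernel provides convenience wrappers for retrieving specific members of a task's credentials (each simply returns the value of the member): +``` + uid_t current_uid(void) Current's real UID + gid_t current_gid(void) Current's real GID + uid_t current_euid(void) Current's effective UID + gid_t current_egid(void) Current's effective GID + uid_t current_fsuid(void) Current's file access UID + gid_t current_fsgid(void) Current's file access GID + kernel_cap_t current_cap(void) Current's effective capabilities + void *current_security(void) Current's LSM security pointer + struct user_struct *current_user(void) Current's user account +``` + The kernel also provides convenience wrappers for retrieving pairs of a task's credentials: + +``` + void current_uid_gid(uid_t *, gid_t *); + void current_euid_egid(uid_t *, gid_t *); + void current_fsuid_fsgid(uid_t *, gid_t *); +``` + + These functions retrieve the values from the current task's credentials and return them as pairs through their arguments. + + In addition, the kernel provides a function for obtaining a reference on the current process's current set of credentials: + +``` + const struct cred *get_current_cred(void); +``` + and the following functions for getting references to credentials that do not actually live in struct cred: + +``` + struct user_struct *get_current_user(void); + struct group_info *get_current_groups(void); +``` + + These get references to the current process's user accounting structure and supplementary groups list respectively. + + Once a reference has been obtained, it must be released with `put_cred()`, `free_uid()` or `put_group_info()` as appropriate. + +### ACCESSING ANOTHER TASK'S CREDENTIALS + + Whilst a task may access its own credentials without locking, the same is not true of accessing another task's credentials. In that case the RCU read lock and `rcu_dereference()` must be used. + + The `rcu_dereference()` is wrapped by the following function: +``` + const struct cred *__task_cred(struct task_struct *task); +``` + This function should be used inside the RCU read lock, for example: + +``` + void foo(struct task_struct *t, struct foo_data *f) + { + const struct cred *tcred; + ... + rcu_read_lock(); + tcred = __task_cred(t); + f->uid = tcred->uid; + f->gid = tcred->gid; + f->groups = get_group_info(tcred->groups); + rcu_read_unlock(); + ... 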
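+ } +``` + + If another task's credentials need to be held for a long period of time, and particularly if the caller may sleep whilst doing so, the caller should get a reference on them using: +``` + const struct cred *get_task_cred(struct task_struct *task); +``` + All of the RCU magic is hidden inside this function. The caller must call `put_cred()` on the credentials so obtained when it has finished with them. + +***Note:*** The result of `__task_cred()` must not be passed directly to `get_cred()`, as this may race with `commit_creds()`. + + There are also some convenience functions for accessing parts of another task's credentials, hiding the RCU magic from the caller: +``` + uid_t task_uid(task) Task's real UID + uid_t task_euid(task) Task's effective UID +``` + If the caller is already holding the RCU read lock at the time, then the following should be used instead: +``` + __task_cred(task)->uid + __task_cred(task)->euid +``` + Similarly, if multiple members of a task's credentials need to be accessed, the RCU read lock should be taken, `__task_cred()` called, the result stored in a temporary pointer, the members read from that pointer, and the lock then dropped. This avoids invoking the potentially expensive RCU magic multiple times. + + If a single member of another task's credentials needs to be accessed, then this can be used: +``` + task_cred_xxx(task, member) +``` +where 'member' is a non-pointer member of the cred struct. For instance: +``` + uid_t task_cred_xxx(task, suid); +``` + retrieves `struct cred::suid` from the task, doing the appropriate RCU magic. This may not be used for pointer members of cred, as what they point to may have disappeared the moment the RCU read lock is dropped. + +### ALTERING CREDENTIALS + + As mentioned previously, a task may only alter its own credentials and may not alter those of another task. This means that it does not need any form of locking to alter its own credentials. + + To alter the current process's credentials, a function should first prepare a new set of credentials by calling: +``` + struct cred *prepare_creds(void); +``` + This locks `current->cred_replace_mutex` and then allocates and constructs a copy of the current process's credentials. If successful, it returns with the mutex still held. It returns NULL on failure (out of memory). + + The mutex prevents `ptrace()` from altering the ptrace state of the process whilst security checks on credentials construction and alteration are taking place, as the ptrace state may alter the outcome, particularly in the case of `execve()`. + + The new credential set should be altered appropriately, and all security checks and hooks carried out. Both the current and the new sets of credentials are available at this point, as `current_cred()` still returns the current set. + + When the credential set is ready, it should be committed to the current process by calling: +``` + int commit_creds(struct cred *new); +``` + This alters various aspects of the credentials and the process, giving the LSM a chance to do likewise. It then uses `rcu_assign_pointer()` to actually commit the new credentials to `current->cred`, releases `current->cred_replace_mutex` so that `ptrace()` may take place, and notifies the scheduler and others of the changes. + + This function is guaranteed to return 0, so that it can be tail-called at the end of functions such as `sys_setresuid()`. + + Note that this function consumes the caller's reference to the new credentials. The caller must *not* call `put_cred()` on the new credentials afterwards. + + Note also that once this function has been called on a new set of credentials, those credentials may *not* be changed further. + + Should the security checks fail or some other error occur after `prepare_creds()` has been called, then the following should be invoked: +``` + void abort_creds(struct cred *new); +``` + This releases the lock on `current->cred_replace_mutex` taken by `prepare_creds()` and then releases the new credentials. + + A typical credentials alteration function would look something like this: + +``` + int alter_suid(uid_t 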
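suid) + { + struct cred *new; + int ret; + + new = prepare_creds(); + if (!new) + return -ENOMEM; + + new->suid = suid; + ret = security_alter_suid(new); + if (ret < 0) { + abort_creds(new); + return ret; + } + + return commit_creds(new); + } + +``` + +### MANAGING CREDENTIALS + + The following functions help manage credentials: +``` + void put_cred(const struct cred *cred); +``` + This releases a reference to the given set of credentials. If the reference count reaches zero, the credentials are scheduled for destruction by the RCU system. +``` + const struct cred *get_cred(const struct cred *cred); +``` + This gets a reference on a live set of credentials and returns a pointer to that set. +``` + struct cred *get_new_cred(struct cred *cred); +``` + This gets a reference on a set of credentials that is under construction, and is thus still mutable, and returns a pointer to that set. + +## OPEN FILE CREDENTIALS + + When a new file is opened, the kernel obtains a reference on the opening task's credentials and attaches it to the file struct as `f_cred` in place of `f_uid` and `f_gid`. Code that used to access `file->f_uid` and `file->f_gid` should now access `file->f_cred->fsuid` and `file->f_cred->fsgid`. It is safe to access `f_cred` without RCU or any other locking, because neither the pointer nor the cred struct it points to will change over the lifetime of the file struct, barring the exceptions listed above (see the Task Credentials section). + +## OVERRIDING THE VFS'S USE OF CREDENTIALS + + Under some circumstances it is desirable to override the credentials used by the VFS. This can be done by calling functions such as `vfs_mkdir()` with a different set of credentials. This is done in the following places: + + * sys_faccessat() + + * do_coredump() + + * nfs4recover.c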