In this guide you’ll find a full list of Linux syscalls along with their definitions, parameters, and commonly used flags.
You can combine multiple flags with a bitwise OR (|) and pass the result as the argument in question.
Some notes about this guide:
- Calls that have long been deprecated or removed have been omitted.
- Items pertaining to outdated or infrequently used architectures (e.g. MIPS, PowerPC) are generally omitted.
- Structures are defined only once. If a struct is mentioned and cannot be found in the syscall’s own section, please search the document for its definition.
Source materials include man pages, kernel source, and kernel development headers.
Table of Contents
- List of Linux Syscalls
- Table of Contents
- read
- write
- open
- close
- stat
- fstat
- lstat
- poll
- lseek
- mmap
- mprotect
- munmap
- brk
- rt_sigaction
- rt_sigprocmask
- rt_sigreturn
- ioctl
- pread64
- pwrite64
- readv
- writev
- access
- pipe
- select
- sched_yield
- mremap
- msync
- mincore
- madvise
- shmget
- shmat
- shmctl
- dup
- dup2
- pause
- nanosleep
- getitimer
- alarm
- setitimer
- getpid
- sendfile
- socket
- connect
- accept
- sendto
- recvfrom
- sendmsg
- recvmsg
- shutdown
- bind
- listen
- getsockname
- getpeername
- socketpair
- setsockopt
- getsockopt
- clone
- fork
- vfork
- execve
- exit
- wait4
- kill
- getppid
- uname
- semget
- semop
- semctl
- shmdt
- msgget
- msgsnd
- msgrcv
- msgctl
- fcntl
- flock
- fsync
- fdatasync
- truncate
- ftruncate
- getdents
- getcwd
- chdir
- fchdir
- rename
- mkdir
- rmdir
- creat
- link
- unlink
- symlink
- readlink
- chmod
- fchmod
- chown
- fchown
- lchown
- umask
- gettimeofday
- getrlimit
- getrusage
- sysinfo
- times
- ptrace
- getuid
- syslog
- getgid
- setuid
- setgid
- geteuid
- getegid
- setpgid
- getppid
- getpgrp
- setsid
- setreuid
- setregid
- getgroups
- setgroups
- setresuid
- setresgid
- getresuid
- getresgid
- getpgid
- setfsuid
- setfsgid
- getsid
- capget
- capset
- rt_sigpending
- rt_sigtimedwait
- rt_sigqueueinfo
- rt_sigsuspend
- sigaltstack
- utime
- mknod
- uselib
- personality
- ustat
- statfs
- fstatfs
- sysfs
- getpriority
- setpriority
- sched_setparam
- sched_getparam
- sched_setscheduler
- sched_getscheduler
- sched_get_priority_max
- sched_get_priority_min
- sched_rr_get_interval
- mlock
- munlock
- mlockall
- munlockall
- vhangup
- modify_ldt
- pivot_root
- prctl
- arch_prctl
- adjtimex
- setrlimit
- chroot
- sync
- acct
- settimeofday
- mount
- umount2
- swapon
- swapoff
- reboot
- sethostname
- setdomainname
- iopl
- ioperm
- init_module
- delete_module
- quotactl
- gettid
- readahead
- setxattr
- lsetxattr
- fsetxattr
- getxattr
- lgetxattr
- fgetxattr
- listxattr
- llistxattr
- flistxattr
- removexattr
- lremovexattr
- fremovexattr
- tkill
- time
- futex
- sched_setaffinity
- sched_getaffinity
- set_thread_area
- io_setup
- io_destroy
- io_getevents
- io_submit
- io_cancel
- get_thread_area
- lookup_dcookie
- epoll_create
- getdents64
- set_tid_address
- restart_syscall
- semtimedop
- fadvise64
- timer_create
- timer_settime
- timer_gettime
- timer_getoverrun
- timer_delete
- clock_settime
- clock_gettime
- clock_getres
- clock_nanosleep
- exit_group
- epoll_wait
- epoll_ctl
- tgkill
- utimes
- mbind
- set_mempolicy
- get_mempolicy
- mq_open
- mq_unlink
- mq_timedsend
- mq_timedreceive
- mq_notify
- kexec_load
- waitid
- add_key
- request_key
- keyctl
- ioprio_set
- ioprio_get
- inotify_init
- inotify_add_watch
- inotify_rm_watch
- migrate_pages
- openat
- mkdirat
- mknodat
- fchownat
- unlinkat
- renameat
- linkat
- symlinkat
- readlinkat
- fchmodat
- faccessat
- pselect6
- ppoll
- unshare
- set_robust_list
- get_robust_list
- splice
- tee
- sync_file_range
- vmsplice
- move_pages
- utimensat
- epoll_pwait
- signalfd
- timerfd_create
- eventfd
- fallocate
- timerfd_settime
- timerfd_gettime
- accept4
- signalfd4
- eventfd2
- epoll_create1
- dup3
- pipe2
- inotify_init1
- preadv
- pwritev
- rt_tgsigqueueinfo
- perf_event_open
- recvmmsg
- fanotify_init
- fanotify_mark
- name_to_handle_at
- open_by_handle_at
- syncfs
- sendmmsg
- setns
- getcpu
- process_vm_readv
- process_vm_writev
- kcmp
- finit_module
read
Reads from a specified file using a file descriptor. Before using this call, you must first obtain a file descriptor using the open syscall. Returns the number of bytes read.
- fd – file descriptor
- buf – pointer to the buffer to fill with read contents
- count – number of bytes to read
write
Writes to a specified file using a file descriptor. Before using this call, you must first obtain a file descriptor using the open syscall. Returns the number of bytes written.
- fd – file descriptor
- buf – pointer to the buffer to write
- count – number of bytes to write
open
Opens or creates a file, depending on the flags passed to the call. Returns an integer with the file descriptor.
- pathname – pointer to a buffer containing the full path and filename
- flags – integer with operation flags (see below)
- mode – (optional) defines the permissions mode if file is to be created
open flags
- O_APPEND – append to existing file
- O_ASYNC – use signal-driven I/O
- O_CLOEXEC – use close-on-exec (avoids race conditions and lock contention)
- O_CREAT – create file if it doesn’t exist
- O_DIRECT – bypass cache (slower)
- O_DIRECTORY – fail if pathname isn’t a directory
- O_DSYNC – ensure output is sent to hardware and metadata written before return
- O_EXCL – used with O_CREAT, fail if the file already exists (ensures this call creates the file)
- O_LARGEFILE – allows use of file sizes represented by off64_t
- O_NOATIME – do not update access time when the file is read
- O_NOCTTY – if pathname is a terminal device, don’t become controlling terminal
- O_NOFOLLOW – fail if pathname is a symbolic link
- O_NONBLOCK – if possible, open file with non-blocking I/O
- O_NDELAY – same as O_NONBLOCK
- O_PATH – open descriptor for obtaining permissions and status of a file without allowing read/write operations
- O_SYNC – wait for I/O to complete before returning
- O_TMPFILE – create an unnamed, unreachable (via any other open call) temporary file
- O_TRUNC – if file exists and is opened for writing, truncate it to zero length (careful!)
close
Closes a file descriptor. After successful execution, it can no longer be used to reference the file.
- fd – file descriptor to close
stat
Returns information about a file in a structure named stat.
- path – pointer to the name of the file
- buf – pointer to the structure to receive file information
On success, the buf structure is filled with the following data:
struct stat {
    dev_t     st_dev;     /* device ID of device with file */
    ino_t     st_ino;     /* inode */
    mode_t    st_mode;    /* permission mode */
    nlink_t   st_nlink;   /* number of hard links to file */
    uid_t     st_uid;     /* owner user ID */
    gid_t     st_gid;     /* owner group ID */
    dev_t     st_rdev;    /* device ID (only if device file) */
    off_t     st_size;    /* total size (bytes) */
    blksize_t st_blksize; /* blocksize for I/O */
    blkcnt_t  st_blocks;  /* number of 512 byte blocks allocated */
    time_t    st_atime;   /* last access time */
    time_t    st_mtime;   /* last modification time */
    time_t    st_ctime;   /* last status change time */
};
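The fields above can be read back with a few lines of C. This is a minimal sketch using the glibc stat wrapper; file_size and is_directory are hypothetical helper names, not part of any API.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: returns the size of path in bytes, or -1 if stat fails. */
long long file_size(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_size;
}

/* Hypothetical helper: returns 1 if path names a directory, 0 otherwise
   (including when stat fails). */
int is_directory(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}
```

Swapping stat for lstat in either helper would report on a symbolic link itself instead of its target.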
fstat
Works exactly like the stat syscall except a file descriptor (fd) is provided instead of a path.
- fd – file descriptor
- buf – pointer to stat buffer (described in stat syscall)
Return data in buf is identical to the stat call.
lstat
Works exactly like the stat syscall, but if the file in question is a symbolic link, information on the link is returned rather than its target.
- path – full path to file
- buf – pointer to stat buffer (described in stat syscall)
Return data in buf is identical to the stat call.
poll
Wait for an event to occur on one of the specified file descriptors.
- fds – pointer to an array of pollfd structures (described below)
- nfds – number of pollfd items in the fds array
- timeout – number of milliseconds the syscall should block (zero returns immediately; a negative value blocks indefinitely)
struct pollfd {
    int   fd;      /* file descriptor */
    short events;  /* events requested for polling */
    short revents; /* events that occurred during polling */
};
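To make the timeout semantics concrete, here is a small sketch that polls a single descriptor for readability. The helper name wait_readable is hypothetical; POLLIN is the standard "data available to read" event bit.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd becomes readable within timeout_ms,
   0 on timeout, and -1 on error or if only an error/hangup event was reported. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int n = poll(&pfd, 1, timeout_ms);
    if (n < 0)
        return -1;
    if (n == 0)
        return 0; /* timed out with nothing ready */
    return (pfd.revents & POLLIN) ? 1 : -1;
}
```

Called with a timeout of 0 this acts as a non-blocking readiness check; with a negative timeout it waits until an event arrives.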
lseek
This syscall repositions the read/write offset of the associated file descriptor. Useful for setting the position to a specific location to read or write starting from that offset.
- fd – file descriptor
- offset – offset to read/write from
- whence – specifies offset relation and seek behavior
whence flags
- SEEK_SET – offset is the absolute offset position in the file
- SEEK_CUR – new position is the current offset plus offset
- SEEK_END – new position is the file size plus offset
- SEEK_DATA – set offset to next location greater than or equal to offset that contains data
- SEEK_HOLE – set offset to next hole in file greater than or equal to offset
Returns resulting offset in bytes from the start of the file.
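A common use of SEEK_END with a negative offset is reading the tail of a file. The following is a minimal sketch under that assumption; read_tail is a hypothetical helper name.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: reads the last n bytes of path into out.
   Returns bytes read, or -1 on error (including when the file is shorter than n,
   because seeking before the start of the file fails). */
ssize_t read_tail(const char *path, char *out, size_t n)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    /* Position the offset n bytes before the end of the file. */
    if (lseek(fd, -(off_t)n, SEEK_END) == (off_t)-1) {
        close(fd);
        return -1;
    }
    ssize_t got = read(fd, out, n);
    close(fd);
    return got;
}
```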
mmap
Maps files or devices into memory.
- addr – location hint for mapping location in memory; if NULL, the kernel assigns the address
- length – length of the mapping
- prot – specifies memory protection of the mapping
- flags – control visibility of mapping with other processes
- fd – file descriptor
- offset – file offset
Returns a pointer to the mapped area in memory.
prot flags
- PROT_EXEC – allows execution of mapped pages
- PROT_READ – allows reading of mapped pages
- PROT_WRITE – allows mapped pages to be written
- PROT_NONE – prevents access of mapped pages
flags
- MAP_SHARED – shares the mapping; updates are visible to other processes mapping the same region
- MAP_SHARED_VALIDATE – same as MAP_SHARED but ensures all flags are valid
- MAP_PRIVATE – creates a private copy-on-write mapping; updates are not visible to other processes
- MAP_32BIT – tells the kernel to locate mapping in the first 2 GB of the process address space
- MAP_ANONYMOUS – lets the mapping not be backed by any file (thus ignoring fd)
- MAP_FIXED – treats addr argument as an actual address and not a hint
- MAP_FIXED_NOREPLACE – same as MAP_FIXED but prevents clobbering existing mapped ranges
- MAP_GROWSDOWN – tells the kernel to expand mapping downward in memory (useful for stacks)
- MAP_HUGETLB – forces use of huge pages in mapping
- MAP_HUGE_2MB – use with MAP_HUGETLB to set 2 MB pages
- MAP_HUGE_1GB – use with MAP_HUGETLB to set 1 GB pages
- MAP_LOCKED – maps the region to be locked (similar behavior to mlock)
- MAP_NONBLOCK – prevents read-ahead for this mapping (only meaningful with MAP_POPULATE)
- MAP_NORESERVE – prevents allocation of swap space for this mapping
- MAP_POPULATE – tells the kernel to populate page tables for this mapping (causing read-ahead)
- MAP_STACK – tells the kernel to allocate address suitable for use in a stack
- MAP_UNINITIALIZED – prevents clearing of anonymous pages
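For a sense of how prot and flags are combined in practice, the sketch below requests a private anonymous mapping and later releases it with munmap. The helper names alloc_anon and free_anon are hypothetical; passing fd as -1 with MAP_ANONYMOUS is the conventional portable form.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* assumption: exposes MAP_ANONYMOUS under strict -std modes on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: maps len bytes of zero-filled, private, anonymous memory.
   Returns NULL on failure instead of MAP_FAILED. */
void *alloc_anon(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: unmaps a region obtained from alloc_anon. Returns 0 on success. */
int free_anon(void *p, size_t len)
{
    return munmap(p, len);
}
```

Note that mmap signals failure with MAP_FAILED rather than NULL, which is why the wrapper translates it.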
mprotect
Sets or adjusts protection on a region of memory.
- addr – pointer to region in memory (must be page-aligned)
- len – length of the region in bytes
- prot – protection flag
Returns zero when successful.
prot flags
- PROT_NONE – prevents access to memory
- PROT_READ – allows reading of memory
- PROT_EXEC – allows execution of memory
- PROT_WRITE – allows memory to be modified
- PROT_SEM – allows memory to be used in atomic operations
- PROT_GROWSUP – sets protection mode upward (for architectures that have a stack that grows upward)
- PROT_GROWSDOWN – sets protection mode downward (useful for stack memory)
munmap
Unmaps mapped files or devices.
- addr – pointer to mapped address
- len – size of mapping
Returns zero when successful.
brk
Alters the program break, which defines the end of the process’s data segment.
- addr – new program break address pointer
The glibc wrapper returns zero when successful; the raw syscall returns the new program break.
rt_sigaction
Change action taken when process receives a specific signal (except SIGKILL and SIGSTOP).
- signum – signal number
- act – structure for the new action
- oldact – structure that receives the old action
struct sigaction {
    void     (*sa_handler)(int);
    void     (*sa_sigaction)(int, siginfo_t *, void *);
    sigset_t   sa_mask;
    int        sa_flags;
    void     (*sa_restorer)(void);
};
siginfo_t {
    int          si_signo;     /* signal number */
    int          si_errno;     /* errno value */
    int          si_code;      /* signal code */
    int          si_trapno;    /* trap that caused hardware signal (unused on most architectures) */
    pid_t        si_pid;       /* sending PID */
    uid_t        si_uid;       /* real UID of sending program */
    int          si_status;    /* exit value or signal */
    clock_t      si_utime;     /* user time consumed */
    clock_t      si_stime;     /* system time consumed */
    sigval_t     si_value;     /* signal value */
    int          si_int;       /* POSIX.1b signal */
    void        *si_ptr;       /* POSIX.1b signal */
    int          si_overrun;   /* count of timer overrun */
    int          si_timerid;   /* timer ID */
    void        *si_addr;      /* memory location that generated fault */
    long         si_band;      /* band event */
    int          si_fd;        /* file descriptor */
    short        si_addr_lsb;  /* LSB of address */
    void        *si_lower;     /* lower bound when address violation occurred */
    void        *si_upper;     /* upper bound when address violation occurred */
    int          si_pkey;      /* protection key on PTE causing fault */
    void        *si_call_addr; /* address of system call instruction */
    int          si_syscall;   /* number of attempted syscall */
    unsigned int si_arch;      /* arch of attempted syscall */
}
rt_sigprocmask
Retrieve and/or set the signal mask of the thread.
- how – flag to determine call behavior
- set – new signal mask (NULL to leave unchanged)
- oldset – receives the previous signal mask (may be NULL)
Returns zero upon success.
how flags
- SIG_BLOCK – add the signals in set to the current mask
- SIG_UNBLOCK – remove the signals in set from the current mask
- SIG_SETMASK – set mask to set
rt_sigreturn
Return from signal handler and clean the stack frame.
ioctl
Set parameters of device files.
- d – open file descriptor of the device file
- request – request code
- ... – untyped pointer
Returns zero upon success in most cases.
pread64
Read from file or device starting at a specific offset.
- fd – file descriptor
- buf – pointer to read buffer
- count – bytes to read
- offset – offset to read from
Returns bytes read.
pwrite64
Write to file or device starting at a specific offset.
- fd – file descriptor
- buf – pointer to buffer
- count – bytes to write
- offset – offset to start writing
Returns bytes written.
readv
Read from file or device into multiple buffers.
- fd – file descriptor
- iov – pointer to an array of iovec structures
- iovcnt – number of buffers (described by iovec)
struct iovec {
    void  *iov_base; /* Starting address */
    size_t iov_len;  /* Number of bytes to transfer */
};
Returns bytes read.
writev
Write to file or device from multiple buffers.
- fd – file descriptor
- iov – pointer to an array of iovec structures
- iovcnt – number of buffers (described by iovec)
struct iovec {
    void  *iov_base; /* Starting address */
    size_t iov_len;  /* Number of bytes to transfer */
};
Returns bytes written.
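As an illustration of gathering several buffers into one call, the sketch below sends a header and a body with a single writev. The helper name write_two is hypothetical.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: writes header followed by body using one writev call.
   Returns the total bytes written, or -1 on error. */
ssize_t write_two(int fd, const char *header, const char *body)
{
    struct iovec iov[2];
    iov[0].iov_base = (void *)header; /* writev only reads from the buffers */
    iov[0].iov_len  = strlen(header);
    iov[1].iov_base = (void *)body;
    iov[1].iov_len  = strlen(body);
    return writev(fd, iov, 2);
}
```

The buffers are written in array order, so the reader sees the header bytes first.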
access
Check permissions of current user for a specified file or device.
- pathname – file or device
- mode – permissions check to perform
Returns zero on success.
pipe
Create a pipe.
- pipefd – array of two file descriptors that receives the read end (index 0) and write end (index 1) of the pipe
Returns zero on success.
select
Wait for file descriptors to become ready for I/O.
int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout)
- nfds – highest-numbered file descriptor to monitor, plus 1
- readfds – fixed buffer with list of file descriptors to wait for read access
- writefds – fixed buffer with list of file descriptors to wait for write access
- exceptfds – fixed buffer with list of file descriptors to wait for exceptional conditions
- timeout – timeval structure with time to wait before returning
On Linux, fd_set is an opaque fixed-size bitmask holding up to FD_SETSIZE descriptors; manipulate it with the FD_ZERO, FD_SET, FD_CLR, and FD_ISSET macros.
struct timeval {
    long tv_sec;  /* seconds */
    long tv_usec; /* microseconds */
};
Returns number of ready file descriptors, or zero if timeout occurs.
sched_yield
Yield CPU time back to the kernel or other processes.
Returns zero on success.
mremap
Shrink or enlarge a memory region, possibly moving it in the process.
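void *mremap(void *old_address, size_t old_size, size_t new_size, int flags, ... /* void *new_address */)
- old_address – pointer to the old address to remap
- old_size – size of old memory region
- new_size – size of new memory region
- flags – define additional behavior
flags
- MREMAP_MAYMOVE – allow the kernel to move the region if there isn’t enough room at its current location (without this flag the call fails instead)
- MREMAP_FIXED – move the mapping to new_address (must also specify MREMAP_MAYMOVE)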
msync
Synchronize a memory-mapped file previously mapped with mmap.
- addr – address of memory mapped file
- length – length of memory mapping
- flags – define additional behavior
flags
- MS_ASYNC – schedule sync but return immediately
- MS_SYNC – wait until sync is complete
- MS_INVALIDATE – invalidate other mappings of same file
Returns zero on success.
mincore
Check if pages are in memory.
- addr – address of memory to check
- length – length of memory segment
- vec – pointer to a byte array of at least (length+PAGE_SIZE-1) / PAGE_SIZE entries; the least significant bit of each byte is set if the corresponding page is in memory
Returns zero on success, but vec must be examined for the presence of pages in memory.
madvise
Advise kernel on how to use a given memory segment.
- addr – address of memory
- length – length of segment
- advice – advice flag
advice
- MADV_NORMAL – no advice (default)
- MADV_RANDOM – pages will be accessed in random order (read-ahead is less useful)
- MADV_SEQUENTIAL – pages will be accessed in sequential order
- MADV_WILLNEED – will need pages soon (hinting to kernel to schedule read-ahead)
- MADV_DONTNEED – do not need anytime soon (the kernel may free the pages)
shmget
Allocate System V shared memory segment.
- key – an identifier for the memory segment
- size – length of memory segment
- shmflg – behavior modifier flag
shmflg
- IPC_CREAT – create a new segment
- IPC_EXCL – used with IPC_CREAT, fail if the segment already exists
- SHM_HUGETLB – use huge pages when allocating segment
- SHM_HUGE_1GB – use 1 GB hugetlb size
- SHM_HUGE_2MB – use 2 MB hugetlb size
- SHM_NORESERVE – do not reserve swap space for this segment
shmat
Attach shared memory segment to calling process’s memory space.
- shmid – shared memory segment id
- shmaddr – address at which to attach the segment (NULL lets the kernel choose)
- shmflg – define additional behavior
shmflg
- SHM_RDONLY – attach segment as read-only
- SHM_REMAP – replace existing mapping
shmctl
Get or set control details on shared memory segment.
- shmid – shared memory segment id
- cmd – command flag
- buf – shmid_ds structure buffer for return or set parameters
struct shmid_ds {
    struct ipc_perm shm_perm;   /* Ownership and permissions */
    size_t          shm_segsz;  /* Size of shared segment (bytes) */
    time_t          shm_atime;  /* Last attach time */
    time_t          shm_dtime;  /* Last detach time */
    time_t          shm_ctime;  /* Last change time */
    pid_t           shm_cpid;   /* PID of shared segment creator */
    pid_t           shm_lpid;   /* PID of last shmat(2)/shmdt(2) syscall */
    shmatt_t        shm_nattch; /* Number of current attaches */
    ...
};
struct ipc_perm {
    key_t          __key; /* Key provided to shmget */
    uid_t          uid;   /* Effective UID of owner */
    gid_t          gid;   /* Effective GID of owner */
    uid_t          cuid;  /* Effective UID of creator */
    gid_t          cgid;  /* Effective GID of creator */
    unsigned short mode;  /* Permissions and SHM_DEST + SHM_LOCKED flags */
    unsigned short __seq; /* Sequence */
};
Successful IPC_INFO or SHM_INFO syscalls return the index of the highest used entry in the kernel’s array of shared memory segments. Successful SHM_STAT syscalls return the id of the memory segment provided in shmid. Everything else returns zero upon success.
cmd
- IPC_STAT – get shared memory segment info and place in buffer
- IPC_SET – set shared memory segment parameters defined in buffer
- IPC_RMID – mark shared memory segment to be removed
dup
Duplicate file descriptor.
- oldfd – file descriptor to copy
Returns new file descriptor.
dup2
Same as dup except dup2 uses the file descriptor number specified in newfd.
- oldfd – file descriptor to copy
- newfd – new file descriptor
pause
Wait for a signal, then return.
Returns -1 when signal received.
nanosleep
Same as sleep but with time specified in nanoseconds.
- req – pointer to timespec structure with the requested sleep time
- rem – pointer to structure that receives the remaining time if interrupted by a signal
struct timespec {
    time_t tv_sec;  /* time in seconds */
    long   tv_nsec; /* time in nanoseconds */
};
Returns zero upon successful sleep; if interrupted by a signal, returns -1 and the remaining time is copied into the rem structure.
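The req/rem pair makes it straightforward to resume an interrupted sleep. Below is a minimal sketch of that pattern; sleep_ms is a hypothetical helper name.

```c
#include <errno.h>
#include <time.h>

/* Hypothetical helper: sleeps for ms milliseconds, resuming with the remaining
   time whenever a signal interrupts the call. Returns 0 on success, -1 on error. */
int sleep_ms(long ms)
{
    struct timespec req = { .tv_sec = ms / 1000, .tv_nsec = (ms % 1000) * 1000000L };
    struct timespec rem;

    while (nanosleep(&req, &rem) != 0) {
        if (errno != EINTR)
            return -1; /* e.g. EINVAL for an out-of-range tv_nsec */
        req = rem;     /* continue with whatever time is left */
    }
    return 0;
}
```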
getitimer
Get value from an interval timer.
- which – which kind of timer
- curr_value – pointer to itimerval structure that receives the current timer value
struct itimerval {
    struct timeval it_interval; /* Interval for periodic timer */
    struct timeval it_value;    /* Time until next expiration */
};
Returns zero on success.
which timers
- ITIMER_REAL – timer uses real time
- ITIMER_VIRTUAL – timer uses user-mode CPU execution time
- ITIMER_PROF – timer uses both user and system CPU execution time
alarm
Set an alarm for delivery of signal SIGALRM.
- seconds – send SIGALRM in x seconds
Returns number of seconds remaining until a previously set alarm will trigger, or zero if no alarm was previously set.
setitimer
Arm or disarm the interval timer specified by which.
- which – which kind of timer
- new_value – pointer to itimerval structure with new timer details
- old_value – if not null, pointer to itimerval structure that receives the previous timer details
struct itimerval {
    struct timeval it_interval; /* Interval for periodic timer */
    struct timeval it_value;    /* Time until next expiration */
};
Returns zero on success.
getpid
Get PID of current process.
Returns the PID of the process.
sendfile
Transfer data between two files or devices.
- out_fd – file descriptor for destination
- in_fd – file descriptor for source
- offset – pointer to the position in in_fd to begin reading (NULL to use the current file offset)
- count – bytes to copy
Returns bytes written.
socket
Create an endpoint for network communication.
- domain – flag specifying type of socket
- type – flag specifying socket specifics
- protocol – flag specifying protocol for communication
domain flags
- AF_UNIX – Local communication
- AF_LOCAL – Same as AF_UNIX
- AF_INET – IPv4 Internet protocol
- AF_AX25 – Amateur radio AX.25 protocol
- AF_IPX – IPX, Novell protocols
- AF_APPLETALK – AppleTalk
- AF_X25 – ITU-T X.25 / ISO-8208 protocol
- AF_INET6 – IPv6 Internet protocol
- AF_DECnet – DECnet protocol sockets
- AF_KEY – Key management protocol (IPsec)
- AF_NETLINK – Kernel user interface device
- AF_PACKET – Low-level packet interface
- AF_RDS – Reliable Datagram Sockets (RDS)
- AF_PPPOX – Generic PPP transport layer for L2 tunnels (L2TP, PPPoE, etc.)
- AF_LLC – Logical link control (IEEE 802.2 LLC)
- AF_IB – InfiniBand native addressing
- AF_MPLS – Multiprotocol Label Switching
- AF_CAN – Controller Area Network automotive bus protocol
- AF_TIPC – TIPC (cluster domain sockets)
- AF_BLUETOOTH – Bluetooth low-level socket protocol
- AF_ALG – Interface to kernel cryptography API
- AF_VSOCK – VSOCK protocol for hypervisor-guest communication (VMware, etc.)
- AF_KCM – KCM, kernel connection multiplexer interface
- AF_XDP – XDP, express data path interface
type flags
- SOCK_STREAM – sequenced, reliable byte streams
- SOCK_DGRAM – datagrams (connectionless and unreliable messages, fixed maximum length)
- SOCK_SEQPACKET – sequenced, reliable transmission for datagrams
- SOCK_RAW – raw network protocol access
- SOCK_RDM – reliable datagram layer with possible out-of-order transmission
- SOCK_NONBLOCK – socket is non-blocking (avoids extra calls to fcntl)
- SOCK_CLOEXEC – set close-on-exec flag
Returns file descriptor on success.
connect
Connect to a socket.
- sockfd – socket file descriptor
- addr – pointer to socket address
- addrlen – size of address
Returns zero on success.
accept
Accept connection on socket.
sockfd
– socket file descriptoraddr
– pointer to socket addressaddrlen
– size of address
Returns file descriptor of accepted socket on success.
sendto
Send message on a socket.
- sockfd – socket file descriptor
- buf – buffer with message to send
- len – length of message
- flags – additional parameters
- dest_addr – pointer to destination address (may be NULL on a connected socket)
- addrlen – size of destination address
flags
- MSG_CONFIRM – informs link layer a reply has been received
- MSG_DONTROUTE – do not use gateway in transmission of packet
- MSG_DONTWAIT – perform non-blocking operation
- MSG_EOR – end of record
- MSG_MORE – more data to send
- MSG_NOSIGNAL – do not generate SIGPIPE signal if peer closed connection
- MSG_OOB – sends out-of-band data on supported sockets and protocols
recvfrom
Receive message from socket.
ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags, struct sockaddr *src_addr, socklen_t *addrlen)
- sockfd – socket file descriptor
- buf – buffer to receive message
- len – size of buffer
- flags – additional parameters
- src_addr – pointer to buffer that receives the source address
- addrlen – pointer to length of source address
flags
- MSG_CMSG_CLOEXEC – set close-on-exec flag for socket file descriptor
- MSG_DONTWAIT – perform operation in a non-blocking manner
- MSG_ERRQUEUE – queued errors should be received from the socket error queue
Returns bytes received successfully.
sendmsg
Similar to the sendto syscall but allows sending additional data via the msg argument.
- sockfd – socket file descriptor
- msg – pointer to msghdr structure with message to send (with headers)
- flags – same as sendto syscall
struct msghdr {
    void         *msg_name;       /* optional address */
    socklen_t     msg_namelen;    /* address size */
    struct iovec *msg_iov;        /* scatter/gather array */
    size_t        msg_iovlen;     /* number of array elements in msg_iov */
    void         *msg_control;    /* ancillary data */
    size_t        msg_controllen; /* ancillary data length */
    int           msg_flags;      /* flags on received message */
};
recvmsg
Receive message from socket.
- sockfd – socket file descriptor
- msg – pointer to msghdr structure (defined in sendmsg above) to receive
- flags – define additional behavior (see sendto syscall)
shutdown
Shut down full-duplex connection of a socket.
- sockfd – socket file descriptor
- how – flags defining additional behavior
Returns zero on success.
how
- SHUT_RD – prevent further receptions
- SHUT_WR – prevent further transmissions
- SHUT_RDWR – prevent further reception and transmission
bind
Bind name to a socket.
- sockfd – socket file descriptor
- addr – pointer to sockaddr structure with socket address
- addrlen – length of address
struct sockaddr {
    sa_family_t sa_family;
    char        sa_data[14];
}
Returns zero on success.
listen
Listen on a socket for connections.
- sockfd – socket file descriptor
- backlog – maximum length for pending connection queue
Returns zero on success.
getsockname
Get socket name.
- sockfd – socket file descriptor
- addr – pointer to buffer where socket name will be returned
- addrlen – pointer to length of buffer (updated on return)
Returns zero on success.
getpeername
Get the name of the connected peer socket.
- sockfd – socket file descriptor
- addr – pointer to buffer where peer name will be returned
- addrlen – pointer to length of buffer (updated on return)
Returns zero on success.
socketpair
Create pair of sockets already connected.
Arguments are identical to the socket syscall except the fourth argument (sv) is an integer array that is filled with the two socket descriptors.
Returns zero on success.
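A connected pair is handy for demonstrating the socket calls without any network setup. The sketch below sends a message through one end and receives it on the other; echo_over_socketpair is a hypothetical helper name, and send/recv are the glibc shorthands for sendto/recvfrom with no address.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: sends len bytes of msg over sv[0] and receives them on sv[1].
   Returns the number of bytes received, or -1 on error. */
ssize_t echo_over_socketpair(const char *msg, size_t len, char *out, size_t outsz)
{
    int sv[2];
    /* Type flags are OR'ed together, just like open flags. */
    if (socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, sv) != 0)
        return -1;

    ssize_t got = -1;
    if (send(sv[0], msg, len, MSG_NOSIGNAL) == (ssize_t)len)
        got = recv(sv[1], out, outsz, 0);

    close(sv[0]);
    close(sv[1]);
    return got;
}
```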
setsockopt
Set options on a socket.
- sockfd – socket file descriptor
- level – protocol level at which the option resides (e.g. SOL_SOCKET)
- optname – option to set
- optval – pointer to the value of the option
- optlen – length of option
Returns zero on success.
getsockopt
Get current options of a socket.
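- sockfd – socket file descriptor
- level – protocol level at which the option resides (e.g. SOL_SOCKET)
- optname – option to get
- optval – pointer to receive option value
- optlen – pointer to length of option (updated on return)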
Returns zero on success.
clone
Create child process.
int clone(int (*fn)(void *), void *stack, int flags, void *arg, ... /* pid_t *parent_tid, void *tls, pid_t *child_tid */)
- fn – pointer to initial execution address
- stack – pointer to child process’s stack
- flags – define behavior of clone syscall
- arg – pointer to arguments for child process
flags
- CLONE_CHILD_CLEARTID – clear id of child thread at location referenced by child_tid
- CLONE_CHILD_SETTID – store id of child thread at location referenced by child_tid
- CLONE_FILES – parent and child process share same file descriptors
- CLONE_FS – parent and child process share same filesystem information
- CLONE_IO – child process shares I/O context with parent
- CLONE_NEWCGROUP – child is created in new cgroup namespace
- CLONE_NEWIPC – child process created in new IPC namespace
- CLONE_NEWNET – create child in new network namespace
- CLONE_NEWNS – create child in new mount namespace
- CLONE_NEWPID – create child in new PID namespace
- CLONE_NEWUSER – create child in new user namespace
- CLONE_NEWUTS – create child process in new UTS namespace
- CLONE_PARENT – child gets the same parent as the calling process (it becomes a sibling of the caller)
- CLONE_PARENT_SETTID – store id of child thread at location referenced by parent_tid
- CLONE_PID – child process is created with same PID as parent
- CLONE_PIDFD – PID file descriptor of child process is placed in parent’s memory
- CLONE_PTRACE – if parent process is traced, trace child as well
- CLONE_SETTLS – thread local storage (TLS) descriptor is set to tls
- CLONE_SIGHAND – parent and child share signal handlers
- CLONE_SYSVSEM – child and parent share same System V semaphore adjustment values
- CLONE_THREAD – child is created in same thread group as parent
- CLONE_UNTRACED – if parent is traced, child is not traced
- CLONE_VFORK – parent process is suspended until child calls execve or _exit
- CLONE_VM – parent and child run in same memory space
fork
Create child process.
Returns PID of child process in the parent, and zero in the child.
vfork
Create child process without copying page tables of parent process.
Returns PID of child process.
execve
Execute a program.
- pathname – path to program to run
- argv – pointer to array of arguments for program
- envp – pointer to array of strings (in key=value format) for the environment
Does not return on success, returns -1 on error.
exit
Terminate calling process.
- status – status code to return to parent
Does not return a value.
wait4
Wait for a process to change state.
- pid – PID of process to wait for
- wstatus – pointer to an integer that receives the child’s status
- options – options flags for call
- rusage – pointer to structure filled with usage information about the child process on return
Returns PID of the child whose state changed.
options
- WNOHANG – return immediately if no child has exited
- WUNTRACED – return if child stops (but not traced with ptrace)
- WCONTINUED – return if stopped child resumed with SIGCONT
The following macros are not passed in options; they inspect the value stored in wstatus:
- WIFEXITED – return true if child terminated normally
- WEXITSTATUS – return exit status of child
- WIFSIGNALED – return true if child was terminated with signal
- WTERMSIG – return number of signal that caused child to terminate
- WCOREDUMP – return true if child core dumped
- WIFSTOPPED – return true if child was stopped by signal
- WSTOPSIG – return signal number that caused child to stop
- WIFCONTINUED – return true if child was resumed with SIGCONT
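Putting fork, exit, and the wait family together, the sketch below spawns a child that exits with a given code and reads that code back in the parent. It uses waitpid, the glibc wrapper that behaves like wait4 with a NULL rusage; run_child_exit_code is a hypothetical helper name.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that immediately exits with code, then
   waits for it. Returns the child's exit status, or -1 on error or abnormal exit. */
int run_child_exit_code(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code); /* child: terminate without running parent's atexit handlers */

    int wstatus = 0;
    if (waitpid(pid, &wstatus, 0) != pid)
        return -1;
    /* The status macros decode the integer filled in by the wait call. */
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

Only the low 8 bits of the exit code reach the parent, so codes above 255 would not round-trip.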
kill
Send a signal to process.
- pid – PID of process
- sig – number of signal to send to process
Returns zero on success.
getppid
Get PID of the calling process’s parent.
Returns the PID of parent of calling process.
uname
Get information about the kernel.
- buf – pointer to utsname structure to receive information
Returns zero on success.
struct utsname {
    char sysname[];    /* OS name (e.g. "Linux") */
    char nodename[];   /* node name */
    char release[];    /* OS release (e.g. "4.1.0") */
    char version[];    /* OS version */
    char machine[];    /* hardware identifier */
#ifdef _GNU_SOURCE
    char domainname[]; /* NIS or YP domain name */
#endif
};
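Reading the utsname fields might look like the following sketch, which formats three of them into a caller-supplied buffer. The helper name kernel_summary is hypothetical.

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Hypothetical helper: writes "sysname release machine" into out.
   Returns 0 on success, -1 if uname fails or the buffer is too small. */
int kernel_summary(char *out, size_t outsz)
{
    struct utsname u;
    if (uname(&u) != 0)
        return -1;
    int n = snprintf(out, outsz, "%s %s %s", u.sysname, u.release, u.machine);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```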
semget
Get System V semaphore set identifier.
- key – key of identifier to retrieve
- nsems – number of semaphores per set
- semflg – semaphore flags
Returns value of semaphore set identifier.
semop
Perform operation on specified semaphore(s).
- semid – id of semaphore
- sops – pointer to sembuf structure for operations
- nsops – number of operations
struct sembuf {
    ushort sem_num; /* semaphore index in array */
    short  sem_op;  /* semaphore operation */
    short  sem_flg; /* flags for operation */
};
Returns zero on success.
semctl
Perform control operation on semaphore.
- semid – semaphore set id
- semnum – number of semaphore in set
- cmd – operation to perform
Optional fourth argument is a semun union:
union semun {
    int              val;   /* SETVAL value */
    struct semid_ds *buf;   /* buffer for IPC_STAT, IPC_SET */
    unsigned short  *array; /* array for GETALL, SETALL */
    struct seminfo  *__buf; /* buffer for IPC_INFO */
};
Returns non-negative value corresponding to cmd flag on success, or -1 on error.
cmd
- IPC_STAT – copy information from kernel associated with semid into semid_ds referenced by arg.buf
- IPC_SET – write values of semid_ds structure referenced by arg.buf to the kernel’s data for the set
- IPC_RMID – remove semaphore set
- IPC_INFO – get information about system semaphore limits into seminfo structure
- SEM_INFO – return seminfo structure with same info as IPC_INFO except some fields are returned with info about resources consumed by semaphores
- SEM_STAT – return semid_ds structure like IPC_STAT but semid argument is index into kernel’s semaphore array
- SEM_STAT_ANY – return semid_ds structure with same info as SEM_STAT but sem_perm.mode isn’t checked for read permission
- GETALL – return semval for all semaphores in set specified by semid into arg.array
- GETNCNT – return value of semncnt for the semaphore of the set indexed by semnum
- GETPID – return value of sempid for the semaphore of the set indexed by semnum
- GETVAL – return value of semval for the semaphore of the set indexed by semnum
- GETZCNT – return value of semzcnt for the semaphore of the set indexed by semnum
- SETALL – set semval for all the semaphores of the set using arg.array
- SETVAL – set value of semval to arg.val for the semaphore of the set indexed by semnum
shmdt
Detach shared memory segment referenced by shmaddr.
- shmaddr – address of shared memory segment to detach
Returns zero on success.
msgget
Get System V message queue identifier.
- key – key identifying the message queue
- msgflg – if IPC_CREAT and IPC_EXCL are specified and queue exists for key, then msgget fails with return error set to EEXIST
Returns message queue identifier.
msgsnd
Send a message to a System V message queue.
- msqid – message queue id
- msgp – pointer to msgbuf structure
- msgsz – size in bytes of the mtext member of the msgbuf structure
- msgflg – flags defining specific behavior
struct msgbuf { long mtype; /* msg type, must be greater than zero */ char mtext[1]; /* msg text */ };
Returns zero on success.
msgflg
- IPC_NOWAIT – return immediately if no message of requested type is in queue (msgrcv) or if the queue is full (msgsnd)
- MSG_EXCEPT – use with msgtyp > 0 to read first message in queue with type different from msgtyp (msgrcv only)
- MSG_NOERROR – truncate message text if longer than msgsz bytes (msgrcv only)
msgrcv
Receive message from a System V message queue.
- msqid – message queue id
- msgp – pointer to msgbuf structure
- msgsz – maximum size in bytes of the mtext member of the msgbuf structure
- msgtyp – read first msg if 0, read first msg of msgtyp if > 0, or if negative, read first msg in queue with type less or equal to absolute value of msgtyp
- msgflg – flags defining specific behavior
struct msgbuf { long mtype; /* msg type, must be greater than zero */ char mtext[1]; /* msg text */ };
Returns number of bytes copied into mtext on success.
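A send and receive pair on one queue can be sketched as follows. This is a minimal sketch under stated assumptions: struct demo_msg and msg_roundtrip are hypothetical names, the queue is private (IPC_PRIVATE), and IPC_NOWAIT is used so that neither call blocks.

```c
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

/* Caller-defined message layout: a long type followed by the payload. */
struct demo_msg {
    long mtype;      /* must be greater than zero */
    char mtext[64];  /* payload */
};

/* Hypothetical helper: send text through a private queue and receive it
 * back into out (capacity outsz). Returns the number of payload bytes
 * received, or -1 on failure. */
long msg_roundtrip(const char *text, char *out, size_t outsz)
{
    struct demo_msg snd = { .mtype = 1 };
    struct demo_msg rcv;
    size_t len = strlen(text) + 1;  /* include the terminating null */

    if (len > sizeof(snd.mtext) || len > outsz)
        return -1;
    memcpy(snd.mtext, text, len);

    int msqid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
    if (msqid == -1)
        return -1;

    long got = -1;
    /* msgsz counts only the mtext bytes, not the leading mtype field. */
    if (msgsnd(msqid, &snd, len, IPC_NOWAIT) == 0)
        got = (long)msgrcv(msqid, &rcv, sizeof(rcv.mtext), 1, IPC_NOWAIT);
    if (got >= 0)
        memcpy(out, rcv.mtext, (size_t)got);

    msgctl(msqid, IPC_RMID, NULL);  /* queues persist until removed */
    return got;
}
```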
msgctl
System V message control.
- msqid – message queue id
- cmd – command to execute
- buf – pointer to buffer styled in msqid_ds
struct msqid_ds { struct ipc_perm msg_perm; /* ownership/permissions */ time_t msg_stime; /* last msgsnd(2) time */ time_t msg_rtime; /* last msgrcv(2) time */ time_t msg_ctime; /* last change time */ unsigned long __msg_cbytes; /* bytes in queue */ msgqnum_t msg_qnum; /* messages in queue */ msglen_t msg_qbytes; /* max bytes allowed in queue */ pid_t msg_lspid; /* PID of last msgsnd(2) */ pid_t msg_lrpid; /* PID of last msgrcv(2) */ };
struct msginfo { int msgpool; /* kb of buffer pool used */ int msgmap; /* max # of entries in message map */ int msgmax; /* max # of bytes per single message */ int msgmnb; /* max # of bytes in the queue */ int msgmni; /* max # of message queues */ int msgssz; /* message segment size */ int msgtql; /* max # of messages on queues */ unsigned short int msgseg; /* max # of segments unused in kernel */ };
Returns zero on success or modified return value based on cmd.
cmd
- IPC_STAT – copy data structure from kernel by msqid into msqid_ds structure referenced by buf
- IPC_SET – write values from msqid_ds structure referenced by buf to kernel, updating its msg_ctime
- IPC_RMID – remove message queue
- IPC_INFO – returns information about message queue limits into msginfo structure referenced by buf
- MSG_INFO – same as IPC_INFO except msginfo structure is filled with usage vs. max usage statistics
- MSG_STAT – same as IPC_STAT except msqid is an index into kernel’s internal array
fcntl
Manipulate a file descriptor.
- fd – file descriptor
- cmd – cmd flag
- /* arg */ – additional parameters based on cmd
Return value varies based on cmd flags.
cmd
Parameters in () are the optional /* arg */ with specified type.
- F_DUPFD – find lowest numbered file descriptor greater or equal to (int) and duplicate it, returning new file descriptor
- F_DUPFD_CLOEXEC – same as F_DUPFD but sets close-on-exec flag
- F_GETFD – return file descriptor flags
- F_SETFD – set file descriptor flags based on (int)
- F_GETFL – get file access mode and file status flags
- F_SETFL – set file status flags based on (int)
- F_GETLK – get record locks on file (pointer to struct flock)
- F_SETLK – set lock on file (pointer to struct flock)
- F_SETLKW – set lock on file with wait (pointer to struct flock)
- F_GETOWN – return process id receiving SIGIO and SIGURG
- F_SETOWN – set process id to receive SIGIO and SIGURG (int)
- F_GETOWN_EX – return file descriptor owner settings (struct f_owner_ex *)
- F_SETOWN_EX – direct IO signals on file descriptor (struct f_owner_ex *)
- F_GETSIG – return signal sent when IO is available
- F_SETSIG – set signal sent when IO is available (int)
- F_SETLEASE – obtain lease on file descriptor (int), where arg is F_RDLCK, F_WRLCK, or F_UNLCK
- F_GETLEASE – get current lease on file descriptor (F_RDLCK, F_WRLCK, or F_UNLCK is returned)
- F_NOTIFY – notify when dir referenced by file descriptor changes (int), where arg is a bit mask of DN_ACCESS, DN_MODIFY, DN_CREATE, DN_DELETE, DN_RENAME, and DN_ATTRIB
- F_SETPIPE_SZ – change size of pipe referenced by file descriptor to (int) bytes
- F_GETPIPE_SZ – get size of pipe referenced by file descriptor
flock
struct flock { ... short l_type; /* lock type: F_RDLCK, F_WRLCK, or F_UNLCK */ short l_whence; /* interpret l_start with SEEK_SET, SEEK_CUR, or SEEK_END */ off_t l_start; /* offset for lock */ off_t l_len; /* bytes to lock */ pid_t l_pid; /* PID of blocking process (F_GETLK only) */ ... };
f_owner_ex
struct f_owner_ex { int type; pid_t pid; };
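A common use of F_GETFL and F_SETFL is switching a descriptor to non-blocking mode. The sketch below assumes a hypothetical helper name, set_nonblocking; it reads the current flags first so that the other status flags are preserved, and combines them with O_NONBLOCK using a bitwise OR.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: turn on O_NONBLOCK for fd while keeping the other
 * file status flags. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);  /* no arg needed for F_GETFL */
    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    return 0;
}
```

After this call, a read on an empty pipe or socket fails immediately instead of blocking.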
flock
Apply or remove advisory lock on open file.
- fd – file descriptor
- operation – operation flag
Returns zero on success.
operation
- LOCK_SH – place shared lock
- LOCK_EX – place exclusive lock
- LOCK_UN – remove existing lock
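The exclusive-lock behavior can be observed by opening the same file twice. This is a sketch under stated assumptions: flock_conflicts is a hypothetical helper, and it uses LOCK_NB, a modifier from flock(2) that is not in the list above, so the second lock attempt fails with EWOULDBLOCK instead of blocking.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a second open file description is
 * refused an exclusive lock while the first one holds it, 0 if the lock
 * was unexpectedly granted, and -1 if setup failed. */
int flock_conflicts(const char *path)
{
    int a = open(path, O_RDWR | O_CREAT, 0600);
    int b = open(path, O_RDWR);
    int result = -1;

    if (a != -1 && b != -1 && flock(a, LOCK_EX) == 0) {
        /* LOCK_NB makes the call fail instead of waiting for the lock. */
        if (flock(b, LOCK_EX | LOCK_NB) == -1 && errno == EWOULDBLOCK)
            result = 1;
        else
            result = 0;
        flock(a, LOCK_UN);
    }

    if (a != -1)
        close(a);
    if (b != -1)
        close(b);
    unlink(path);  /* remove the scratch file */
    return result;
}
```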
fsync
Sync file’s data and metadata in memory to disk, flushing all write buffers and completing pending I/O.
- fd – file descriptor
Returns zero on success.
fdatasync
Sync file’s data (but not metadata, unless needed) to disk.
- fd – file descriptor
Returns zero on success.
truncate
Truncate file to a certain length.
- path – pointer to path of file
- length – length to truncate to
Returns zero on success.
ftruncate
Truncate file referenced by file descriptor to a certain length.
- fd – file descriptor
- length – length to truncate to
Returns zero on success.
getdents
Get directory entries from a specified file descriptor.
- fd – file descriptor of directory
- dirp – pointer to linux_dirent structure to receive return values
- count – size of dirp buffer
Returns bytes read on success.
struct linux_dirent { unsigned long d_ino; /* number of inode */ unsigned long d_off; /* offset to next linux_dirent */ unsigned short d_reclen; /* length of this linux_dirent */ char d_name[]; /* filename (null terminated) */ char pad; /* padding byte */ char d_type; /* type of file (see types below) */ };
types
- DT_BLK – block device
- DT_CHR – char device
- DT_DIR – directory
- DT_FIFO – FIFO named pipe
- DT_LNK – symlink
- DT_REG – regular file
- DT_SOCK – UNIX socket
- DT_UNKNOWN – unknown
getcwd
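Get current working directory.
- buf – pointer to buffer to receive path
- size – size of buf
The C library wrapper returns a pointer to the string containing the current working directory; the raw syscall returns the number of bytes written to buf, including the terminating null byte.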
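The raw syscall and the C library wrapper differ in what they return. As a sketch, assuming the hypothetical helper name raw_getcwd, the syscall can be invoked directly through syscall(2); on Linux it returns the length of the path written to buf, counting the terminating null byte, rather than a pointer.

```c
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: invoke the raw getcwd syscall. Returns the number
 * of bytes written to buf (path length plus the terminating null byte),
 * or -1 with errno set. */
long raw_getcwd(char *buf, unsigned long size)
{
    return syscall(SYS_getcwd, buf, size);
}
```

Code that wants the usual pointer-returning behavior should call getcwd from unistd.h instead.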
chdir
Change the current directory.
- path – pointer to string with name of path
Returns zero on success.
fchdir
Change the current directory to the one specified by supplied file descriptor.
- fd – file descriptor
Returns zero on success.
rename
Rename or move a file.
- oldpath – pointer to string with old path/name
- newpath – pointer to string with new path/name
Returns zero on success.
mkdir
Make a directory.
- pathname – pointer to string with directory name
- mode – file system permissions mode
Returns zero on success.
rmdir
Remove a directory.
- pathname – pointer to string with directory name
Returns zero on success.
creat
Create a file or device.
- pathname – pointer to string with file or device name
- mode – file system permissions mode
Returns a file descriptor on success.
link
Creates a hard link for a file.
- oldpath – pointer to string with old filename
- newpath – pointer to string with new filename
Returns zero on success.
unlink
Remove a file.
- pathname – pointer to string with path name
Return zero on success.
symlink
Create a symlink.
- oldpath – pointer to string with old path name
- newpath – pointer to string with new path name
Return zero on success.
readlink
Return name of a symbolic link.
- path – pointer to string with symlink name
- buf – pointer to buffer with result
- bufsiz – size of buffer for result
Returns number of bytes placed in buf.
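Because readlink reports a byte count and does not append a terminating null byte, callers usually terminate the buffer themselves. A minimal sketch, assuming a hypothetical helper named read_link_target:

```c
#include <unistd.h>

/* Hypothetical helper: read the target of the symlink at path into buf
 * and null-terminate it. Returns the target length, or -1 on error. */
ssize_t read_link_target(const char *path, char *buf, size_t bufsiz)
{
    if (bufsiz == 0)
        return -1;

    /* Reserve one byte: readlink does not add the terminating null. */
    ssize_t n = readlink(path, buf, bufsiz - 1);
    if (n == -1)
        return -1;

    buf[n] = '\0';
    return n;
}
```

If the target is longer than the buffer, readlink silently truncates it, so a result equal to bufsiz - 1 may indicate truncation.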
chmod
Set permission on a file or device.
- path – pointer to string with name of file or device
- mode – new permissions mode
Returns zero on success.
fchmod
Same as chmod but sets permissions on file or device referenced by file descriptor.
- fd – file descriptor
- mode – new permissions mode
Returns zero on success.
chown
Change owner of file or device.
- path – pointer to string with name of file or device
- owner – new owner of file or device
- group – new group of file or device
Returns zero on success.
fchown
Same as chown but sets owner and group on a file or device referenced by file descriptor.
- fd – file descriptor
- owner – new owner
- group – new group
Returns zero on success.
lchown
Same as chown but doesn’t dereference symlinks.
- path – pointer to string with name of file or device
- owner – new owner
- group – new group
Returns zero on success.
umask
Sets the mask used to create new files.
- mask – mask for new files
System call will always succeed and returns previous mask.
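Since umask only returns the previous mask as a side effect of setting a new one, reading the current mask takes two calls. A minimal sketch, assuming a hypothetical helper named current_umask (not safe if other threads create files between the two calls):

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: return the current file mode creation mask
 * without leaving it changed. */
mode_t current_umask(void)
{
    mode_t old = umask(0);  /* temporarily set 0, receive the old mask */
    umask(old);             /* restore the original mask */
    return old;
}
```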
gettimeofday
Get the current time and time zone.
- tv – pointer to timeval structure to retrieve time
- tz – pointer to timezone structure to receive time zone
struct timeval { time_t tv_sec; /* seconds */ suseconds_t tv_usec; /* microseconds */ };
struct timezone { int tz_minuteswest; /* minutes west of GMT */ int tz_dsttime; /* DST correction type */ };
Returns zero on success.
getrlimit
Get current resource limits.
- resource – resource flag
- rlim – pointer to rlimit structure
struct rlimit { rlim_t rlim_cur; /* soft limit */ rlim_t rlim_max; /* hard limit */ };
Returns zero on success and fills rlim structure with results.
resource flags
- RLIMIT_AS – max size of process virtual memory
- RLIMIT_CORE – max size of core file
- RLIMIT_CPU – max CPU time, in seconds
- RLIMIT_DATA – max size of process’s data segment
- RLIMIT_FSIZE – max size of files that process is allowed to create
- RLIMIT_LOCKS – max flock and fcntl leases allowed
- RLIMIT_MEMLOCK – max bytes of RAM allowed to be locked
- RLIMIT_MSGQUEUE – max size of POSIX message queues
- RLIMIT_NICE – max nice value
- RLIMIT_NOFILE – max number of files allowed to be opened plus one
- RLIMIT_NPROC – max number of processes or threads
- RLIMIT_RSS – max resident set pages
- RLIMIT_RTPRIO – real-time priority ceiling
- RLIMIT_RTTIME – limit in microseconds of real-time CPU scheduling
- RLIMIT_SIGPENDING – max number of queued signals
- RLIMIT_STACK – max size of process stack
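Querying one of the resource flags above can be sketched as follows. The helper name nofile_limits is hypothetical; it reads RLIMIT_NOFILE and hands back the soft and hard limits from the rlimit structure.

```c
#include <sys/resource.h>

/* Hypothetical helper: fetch the soft and hard limits on open file
 * descriptors. Returns 0 on success, -1 on error. */
int nofile_limits(rlim_t *soft, rlim_t *hard)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) == -1)
        return -1;

    *soft = rl.rlim_cur;  /* limit currently enforced */
    *hard = rl.rlim_max;  /* ceiling for the soft limit */
    return 0;
}
```

Either value may be RLIM_INFINITY, which means no limit is set for that resource.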
getrusage
Obtain resource usage.
- who – target flag
- usage – pointer to rusage structure
struct rusage { struct timeval ru_utime; /* used user CPU time */ struct timeval ru_stime; /* used system CPU time */ long ru_maxrss; /* maximum RSS */ long ru_ixrss; /* shared memory size */ long ru_idrss; /* unshared data size */ long ru_isrss; /* unshared stack size */ long ru_minflt; /* soft page faults */ long ru_majflt; /* hard page faults */ long ru_nswap; /* swaps */ long ru_inblock; /* block input operations */ long ru_oublock; /* block output operations */ long ru_msgsnd; /* sent # of IPC messages */ long ru_msgrcv; /* received # IPC messages */ long ru_nsignals; /* number of signals received */ long ru_nvcsw; /* voluntary context switches */ long ru_nivcsw; /* involuntary context switches */ };
Returns zero on success.
who target
- RUSAGE_SELF – get usage statistics for calling process
- RUSAGE_CHILDREN – get usage statistics for all children of calling process
- RUSAGE_THREAD – get usage statistics for calling thread
sysinfo
Return information about the system.
- info – pointer to sysinfo structure
struct sysinfo { long uptime; /* seconds since boot */ unsigned long loads[3]; /* 1/5/15 minute load avg */ unsigned long totalram; /* total usable memory size */ unsigned long freeram; /* available memory */ unsigned long sharedram; /* shared memory amount */ unsigned long bufferram; /* buffer memory usage */ unsigned long totalswap; /* swap space size */ unsigned long freeswap; /* swap space available */ unsigned short procs; /* total number of current processes */ unsigned long totalhigh; /* total high memory size */ unsigned long freehigh; /* available high memory size */ unsigned int mem_unit; /* memory unit size in bytes */ char _f[20-2*sizeof(long)-sizeof(int)]; /* padding to 64 bytes */ };
Returns zero on success and places system information in sysinfo structure.
times
Get process times.
- buf – pointer to tms structure
struct tms { clock_t tms_utime; /* user time */ clock_t tms_stime; /* system time */ clock_t tms_cutime; /* children user time */ clock_t tms_cstime; /* children system time */ };
Returns clock ticks since arbitrary point in past and may overflow. tms structure is filled with values.
ptrace
Trace a process.
- request – determine type of trace to perform
- pid – process id to trace
- addr – pointer to buffer for certain response values
- data – pointer to buffer used in certain types of traces
Returns zero on success (PTRACE_PEEK* requests return the requested word instead), placing trace data into addr and/or data, depending on trace details in request flags.
request flags
- PTRACE_TRACEME – indicate process traced by parent
- PTRACE_PEEKTEXT and PTRACE_PEEKDATA – read word at addr and return as result of call
- PTRACE_PEEKUSER – read word at addr in USER area of the traced process’s memory
- PTRACE_POKETEXT and PTRACE_POKEDATA – copy data into addr in traced process’s memory
- PTRACE_POKEUSER – copy data into addr in the traced process’s USER area in memory
- PTRACE_GETREGS – copy traced program’s general registers into data
- PTRACE_GETFPREGS – copy traced program’s floating-point registers into data
- PTRACE_GETREGSET – read traced program’s registers in architecture-agnostic way
- PTRACE_SETREGS – modify traced program’s general registers
- PTRACE_SETFPREGS – modify traced program’s floating-point registers
- PTRACE_SETREGSET – modify traced program’s registers (architecture-agnostic)
- PTRACE_GETSIGINFO – get info about signal that caused stop into siginfo_t structure
- PTRACE_SETSIGINFO – set signal info by copying siginfo_t structure from data into traced program
- PTRACE_PEEKSIGINFO – get siginfo_t structures without removing queued signals
- PTRACE_GETSIGMASK – copy mask of blocked signals into data which will be a sigset_t structure
- PTRACE_SETSIGMASK – change blocked signals mask to value in data which should be a sigset_t structure
- PTRACE_SETOPTIONS – set options from data, where data is a bit mask of the following options:
  - PTRACE_O_EXITKILL – send SIGKILL to traced program if tracing program exits
  - PTRACE_O_TRACECLONE – stop traced program at next clone syscall and start tracing new process
  - PTRACE_O_TRACEEXEC – stop traced program at next execve syscall
  - PTRACE_O_TRACEEXIT – stop the traced program at exit
  - PTRACE_O_TRACEFORK – stop traced program at next fork and start tracing forked process
  - PTRACE_O_TRACESYSGOOD – set bit 7 in signal number (SIGTRAP|0x80) when sending system call traps
  - PTRACE_O_TRACEVFORK – stop traced program at next vfork and start tracing new process
  - PTRACE_O_TRACEVFORKDONE – stop traced program after next vfork
  - PTRACE_O_TRACESECCOMP – stop traced program when seccomp rule is triggered
  - PTRACE_O_SUSPEND_SECCOMP – suspend traced program’s seccomp protections
- PTRACE_GETEVENTMSG – get message about most recent ptrace event and put in data of tracing program
- PTRACE_CONT – restart traced process that was stopped and if data is not zero, send number of signal to it
- PTRACE_SYSCALL and PTRACE_SINGLESTEP – restart traced process that was stopped but stop at entry or exit of next syscall (PTRACE_SYSCALL) or after a single instruction (PTRACE_SINGLESTEP)
- PTRACE_SYSEMU – continue, then stop on entry for next syscall (but don’t execute it)
- PTRACE_SYSEMU_SINGLESTEP – same as PTRACE_SYSEMU but single step if instruction isn’t a syscall
- PTRACE_LISTEN – restart traced program but prevent from executing (similar to SIGSTOP)
- PTRACE_INTERRUPT – stop the traced program
- PTRACE_ATTACH – attach to process pid
- PTRACE_SEIZE – attach to process pid but do not stop process
- PTRACE_SECCOMP_GET_FILTER – allows for dump of traced program’s classic BPF filters, where addr is the index of filter and data is pointer to structure sock_filter
- PTRACE_DETACH – detach then restart stopped traced program
- PTRACE_GET_THREAD_AREA – reads TLS entry from GDT with index specified by addr, placing copy of struct user_desc at data
- PTRACE_SET_THREAD_AREA – sets TLS entry in GDT with index specified by addr, assigning it struct user_desc at data
- PTRACE_GET_SYSCALL_INFO – get information about syscall that caused stop and place struct ptrace_syscall_info into data, where addr is size of buffer
struct ptrace_peeksiginfo_args { u64 off; /* queue position to start copying signals */ u32 flags; /* PTRACE_PEEKSIGINFO_SHARED or 0 */ s32 nr; /* # of signals to copy */ };
struct ptrace_syscall_info { __u8 op; /* type of syscall stop */ __u32 arch; /* AUDIT_ARCH_* value */ __u64 instruction_pointer; /* CPU instruction pointer */ __u64 stack_pointer; /* CPU stack pointer */ union { struct { /* op == PTRACE_SYSCALL_INFO_ENTRY */ __u64 nr; /* syscall number */ __u64 args[6]; /* syscall arguments */ } entry; struct { /* op == PTRACE_SYSCALL_INFO_EXIT */ __s64 rval; /* syscall return value */ __u8 is_error; /* syscall error flag */ } exit; struct { /* op == PTRACE_SYSCALL_INFO_SECCOMP */ __u64 nr; /* syscall number */ __u64 args[6]; /* syscall arguments */ __u32 ret_data; /* SECCOMP_RET_DATA part of SECCOMP_RET_TRACE return value */ } seccomp; }; };
getuid
Get UID of calling process.
Returns the UID. Always succeeds.
syslog
Read or clear kernel message buffer.
- type – function to perform
- bufp – pointer to buffer (used for reading)
- len – length of buffer
Returns bytes read, available to read, total size of kernel buffer, or 0, depending on type flag.
type flag
- SYSLOG_ACTION_READ – read len bytes of kernel message log into bufp, returns number of bytes read
- SYSLOG_ACTION_READ_ALL – read entire kernel message log into bufp, reading last len bytes from kernel, returning bytes read
- SYSLOG_ACTION_READ_CLEAR – read kernel message log into bufp, up to len bytes, then clear it, returning bytes read
- SYSLOG_ACTION_CLEAR – clear the kernel message log buffer, returns zero on success
- SYSLOG_ACTION_CONSOLE_OFF – prevents kernel messages being sent to the console
- SYSLOG_ACTION_CONSOLE_ON – enables kernel messages being sent to the console
- SYSLOG_ACTION_CONSOLE_LEVEL – sets the log level of messages (values 1 to 8 via len) to allow message filtering
- SYSLOG_ACTION_SIZE_UNREAD – returns number of bytes available for reading in kernel message log
- SYSLOG_ACTION_SIZE_BUFFER – returns size of kernel message buffer
getgid
Get GID of calling process.
Returns the GID. Always succeeds.
setuid
Set UID of calling process.
- uid – new UID
Returns zero on success.
setgid
Set GID of calling process.
- gid – new GID
Returns zero on success.
geteuid
Get effective UID of calling process.
Returns the effective UID. Always succeeds.
getegid
Get effective GID of calling process.
Returns the effective GID. Always succeeds.
setpgid
Set process group ID of a process.
- pid – process ID
- pgid – process group ID
Returns zero on success.
getppid
Get parent process ID of calling process.
Returns the parent process ID. Always succeeds.
getpgrp
Get process group ID of calling process.
Return process group ID.
setsid
Create session if calling process isn’t leader of a process group.
Returns created session ID.
setreuid
Set both real and effective UID for calling process.
- ruid – the real UID
- euid – the effective UID
Returns zero on success.
setregid
Set both real and effective GID for calling process.
- rgid – the real GID
- egid – the effective GID
Returns zero on success.
getgroups
Get a list of supplementary group IDs for calling process.
- size – size of array list
- list – array of gid_t to retrieve list
Returns number of supplementary group IDs retrieved into list.
setgroups
Set list of supplementary group IDs for calling process.
- size – size of array list
- list – array of gid_t to set list
Returns zero on success.
setresuid
Sets real, effective, and saved UID.
- ruid – the real UID
- euid – the effective UID
- suid – the saved UID
Returns zero on success.
setresgid
Sets real, effective, and saved GID.
- rgid – the real GID
- egid – the effective GID
- sgid – the saved GID
Returns zero on success.
getresuid
Get the real, effective, and saved UID.
- ruid – the real UID
- euid – the effective UID
- suid – the saved UID
Returns zero on success.
getresgid
Get the real, effective, and saved GID.
- rgid – the real GID
- egid – the effective GID
- sgid – the saved GID
Returns zero on success.
getpgid
Get process group ID of a process.
- pid – process ID
Returns process group ID.
setfsuid
Set UID for filesystem checks.
Always returns previous filesystem UID.
setfsgid
Set GID for filesystem checks.
Always returns previous filesystem GID.
getsid
Get session ID.
Returns session ID.
capget
Get capabilities of a thread.
- hdrp – capability header structure
- datap – capability data structure
typedef struct __user_cap_header_struct { __u32 version; int pid; } *cap_user_header_t;
typedef struct __user_cap_data_struct { __u32 effective; __u32 permitted; __u32 inheritable; } *cap_user_data_t;
Returns zero on success.
capset
Set capabilities of a thread.
- hdrp – capability header structure
- datap – capability data structure
typedef struct __user_cap_header_struct { __u32 version; int pid; } *cap_user_header_t;
typedef struct __user_cap_data_struct { __u32 effective; __u32 permitted; __u32 inheritable; } *cap_user_data_t;
Returns zero on success.
rt_sigpending
Return set of signals that are pending delivery to calling process or thread.
- set – pointer to sigset_t structure to retrieve mask of signals
rt_sigtimedwait
Suspend execution (until timeout) of calling process or thread until a signal referenced in set is pending.
- set – pointer to sigset_t structure to define signals to wait for
- info – if not null, pointer to siginfo_t structure with info about signal
- timeout – a timespec structure setting a maximum time to wait before resuming execution
struct timespec { long tv_sec; /* time in seconds */ long tv_nsec; /* time in nanoseconds */ };
rt_sigqueueinfo
Queue a signal.
- tgid – thread group id
- sig – signal to send
- info – pointer to structure siginfo_t
Returns zero on success.
rt_sigsuspend
Wait for a signal.
- mask – pointer to sigset_t structure (defined in sigaction)
Always returns with -1.
sigaltstack
Set/get signal stack context.
- ss – pointer to stack_t structure representing new signal stack
- oss – pointer to stack_t structure used for getting information on current signal stack
typedef struct { void *ss_sp; /* stack base address */ int ss_flags; /* flags */ size_t ss_size; /* bytes in stack */ } stack_t;
Returns zero on success.
utime
Change the last access and modification time of a file.
- filename – pointer to string with filename
- times – pointer to utimbuf structure
struct utimbuf { time_t actime; /* access time */ time_t modtime; /* modification time */ };
Returns zero on success.
mknod
Create a special file (usually used for device files).
- pathname – pointer to string with full path of file to create
- mode – permissions and type of file
- dev – device number
Returns zero on success.
uselib
Load a shared library.
- library – pointer to string with full path of library file
Return zero on success.
personality
Set process execution domain (personality)
- persona – domain of persona
Returns previous persona on success unless persona is set to 0xFFFFFFFF.
ustat
Get filesystem statistics.
- dev – number of device with mounted filesystem
- ubuf – pointer to ustat structure for return values
struct ustat { daddr_t f_tfree; /* free blocks */ ino_t f_tinode; /* free inodes */ char f_fname[6]; /* filesystem name */ char f_fpack[6]; /* filesystem pack name */ };
Returns zero on success and ustat structure referenced by ubuf is filled with statistics.
statfs
Get filesystem statistics.
- path – pointer to string with filename of any file on the mounted filesystem
- buf – pointer to statfs structure
struct statfs { __SWORD_TYPE f_type; /* filesystem type */ __SWORD_TYPE f_bsize; /* optimal transfer block size */ fsblkcnt_t f_blocks; /* total blocks */ fsblkcnt_t f_bfree; /* free blocks */ fsblkcnt_t f_bavail; /* free blocks available to unprivileged user */ fsfilcnt_t f_files; /* total file nodes */ fsfilcnt_t f_ffree; /* free file nodes */ fsid_t f_fsid; /* filesystem id */ __SWORD_TYPE f_namelen; /* maximum length of filenames */ __SWORD_TYPE f_frsize; /* fragment size */ __SWORD_TYPE f_spare[5]; };
Returns zero on success.
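Reading a single field from the statfs structure can be sketched as follows. The helper name fs_max_name_len is hypothetical; it reports f_namelen, the maximum filename length, for the filesystem that contains the given path.

```c
#include <sys/vfs.h>

/* Hypothetical helper: return the maximum filename length on the
 * filesystem containing path, or -1 on error. */
long fs_max_name_len(const char *path)
{
    struct statfs st;

    if (statfs(path, &st) == -1)
        return -1;

    return (long)st.f_namelen;
}
```

Any file on the mounted filesystem works as path, so "/" is enough to query the root filesystem.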
fstatfs
Works just like statfs except provides filesystem statistics via a file descriptor.
- fd – file descriptor
- buf – pointer to statfs structure
Returns zero on success.
sysfs
Get filesystem type information.
int sysfs(int option, const char *fsname)
int sysfs(int option, unsigned int fs_index, char *buf)
int sysfs(int option)
- option – when set to 3, return number of filesystem types in kernel, or can be 1 or 2 as indicated below
- fsname – pointer to string with name of filesystem (set option to 1)
- fs_index – index of filesystem type whose null-terminated identifier string is written to buffer at buf (set option to 2)
- buf – pointer to buffer
Returns filesystem index when option is 1, zero for 2, and number of filesystem types in kernel for 3.
getpriority
Get priority of a process.
- which – flag determining which priority to get
- who – ID of process, process group, or user, as selected by which
Returns priority of specified process.
which
- PRIO_PROCESS – process
- PRIO_PGRP – process group
- PRIO_USER – user ID
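The C library getpriority wrapper returns the nice value itself, so -1 can be a legitimate result rather than an error. The sketch below, with the hypothetical helper name own_nice_value, clears errno before the call to tell the two cases apart; a who value of 0 selects the calling process.

```c
#include <errno.h>
#include <sys/resource.h>

/* Hypothetical helper: store the nice value of the calling process in
 * *prio. Returns 0 on success, -1 on error. */
int own_nice_value(int *prio)
{
    errno = 0;  /* -1 is a valid nice value, so errno disambiguates */
    int value = getpriority(PRIO_PROCESS, 0);  /* who == 0: caller */

    if (value == -1 && errno != 0)
        return -1;

    *prio = value;
    return 0;
}
```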
setpriority
Set priority of a process.
- which – flag determining which priority to set
- who – ID of process, process group, or user, as selected by which
- prio – priority value (-20 to 19)
Returns zero on success.
sched_setparam
Set scheduling parameters of a process.
- pid – PID of process
- param – pointer to sched_param structure
Returns zero on success.
sched_getparam
Get scheduling parameters of a process.
- pid – PID of process
- param – pointer to sched_param structure
Returns zero on success.
sched_setscheduler
Set scheduling policy and parameters for a process.
- pid – PID of process
- policy – policy flag
- param – pointer to sched_param structure
Returns zero on success.
policy
- SCHED_OTHER – standard round-robin time sharing policy
- SCHED_FIFO – first-in-first-out scheduling policy
- SCHED_BATCH – executes processes in a batch-style schedule
- SCHED_IDLE – denotes a process to be set for low priority (background)
sched_getscheduler
Get scheduling policy for a process.
- pid – PID of process
Returns policy flag (see sched_setscheduler).
sched_get_priority_max
Get static priority maximum.
- policy – policy flag (see sched_setscheduler)
Returns maximum priority value for provided policy.
sched_get_priority_min
Get static priority minimum.
- policy – policy flag (see sched_setscheduler)
Returns minimum priority value for provided policy.
sched_rr_get_interval
Get SCHED_RR interval for a process.
- pid – PID of process
- tp – pointer to timespec structure
Returns zero on success and fills tp with intervals for pid if SCHED_RR is the scheduling policy.
mlock
Lock all or part of calling process’s memory.
- addr – pointer to start of address space
- len – length of address space to lock
Returns zero on success.
munlock
Unlock all or part of calling process’s memory.
- addr – pointer to start of address space
- len – length of address space to unlock
Returns zero on success.
mlockall
Lock all address space of calling process’s memory.
- flags – flags defining additional behavior
flags
- MCL_CURRENT – lock all pages as of time of calling this syscall
- MCL_FUTURE – lock all pages that are mapped to this process in the future
- MCL_ONFAULT – lock all current pages (or future pages, along with MCL_FUTURE) only when they are page faulted
munlockall
Unlock all address space of calling process’s memory.
Returns zero on success.
vhangup
Send a "hangup" signal to the current terminal.
Returns zero on success.
modify_ldt
Read or write to the local descriptor table for a process.
- func – 0 for read, 1 for write
- ptr – pointer to LDT
- bytecount – bytes to read, or for write, size of user_desc structure
struct user_desc { unsigned int entry_number; unsigned int base_addr; unsigned int limit; unsigned int seg_32bit:1; unsigned int contents:2; unsigned int read_exec_only:1; unsigned int limit_in_pages:1; unsigned int seg_not_present:1; unsigned int useable:1; };
Returns bytes read or zero for success when writing.
pivot_root
Change root mount.
- new_root – pointer to string with path to new mount
- put_old – pointer to string with path for old mount
Returns zero on success.
prctl
Perform operations on the calling process or thread.
- option – specify operation flag
- arg2, arg3, arg4, and arg5 – variables used depending on option, see option flags
option
PR_CAP_AMBIENT
– read/change ambient capability of calling thread referencing value inarg2
, in regards to:PR_CAP_AMBIENT_RAISE
– capability inarg3
is added to ambient setPR_CAP_AMBIENT_LOWER
– capability inarg3
is removed from ambient setPR_CAP_AMBIENT_IS_SET
– returns1
if capability inarg3
is in the ambient set,0
if notPR_CAP_AMBIENT_CLEAR_ALL
– remove all capabilities from ambient set, setarg3
to0
PR_CAPBSET_READ
– return1
if capability specified inarg2
is in calling thread’s capability bounding set,0
if notPR_CAPBSET_DROP
– if calling thread hasCAP_SETPCAP
capability in user namespace, drop capability inarg2
from capability bounding set for calling processPR_SET_CHILD_SUBREAPER
– ifarg2
is not zero, set "child subreaper" attribute for calling process, ifarg2
is zero, unsetPR_GET_CHILD_SUBREAPER
– return "child subreaper" setting of calling process in location pointed to byarg2
PR_SET_DUMPABLE
– set state of dumpable flag viaarg2
PR_GET_DUMPABLE
– return current dumpable flag for calling processPR_SET_ENDIAN
– set endian-ness of calling process toarg2
viaPR_ENDIAN_BIG
,PR_ENDIAN_LITTLE
, orPR_ENDIAN_PPC_LITTLE
PR_GET_ENDIAN
– return endian-ness of calling process to location pointed byarg2
PR_SET_KEEPCAPS
– set state of calling process’s "keep capabilities" flag viaarg2
PR_GET_KEEPCAPS
– return current state of calling process’s "keep capabilities" flagPR_MCE_KILL
– set machine check memory corruption kill policy for calling process viaarg2
PR_MCE_KILL_GET
– return current per-process machine check kill policyPR_SET_MM
– modify kernel memory map descriptor fields of calling process, wherearg2
is one of the following options andarg3
is the new value to setPR_SET_MM_START_CODE
– set address above which program text can runPR_SET_MM_END_CODE
– set address below which program text can runPR_SET_MM_START_DATA
– set address above which initialized and uninitialized data are placedPR_SET_MM_END_DATA
– set address below which initialized and uninitialized data are placedPR_SET_MM_START_STACK
– set start address of stackPR_SET_MM_START_BRK
– set address above which program heap can be expanded withbrk
PR_SET_MM_BRK
– set currentbrk
valuePR_SET_MM_ARG_START
– set address above which command line is placedPR_SET_MM_ARG_END
– set address below which command line is placedPR_SET_MM_ENV_START
– set address above which environment is placedPR_SET_MM_ENV_END
– set address below which environment is placedPR_SET_MM_AUXV
– set new aux vector, witharg3
providing new address andarg4
containing size of vectorPR_SET_MM_EXE_FILE
– Supersede/proc/pid/exe
symlink with a new one pointing to file descriptor inarg3
PR_SET_MM_MAP
– provide one-shot access to all addresses by passing structprctl_mm_map
pointer inarg3
with size inarg4
PR_SET_MM_MAP_SIZE
– returns size ofprctl_mm_map
structure, wherearg4
is pointer to unsigned int
- PR_MPX_ENABLE_MANAGEMENT – enable kernel management of memory protection extensions
- PR_MPX_DISABLE_MANAGEMENT – disable kernel management of memory protection extensions
- PR_SET_NAME – set name of calling thread to null-terminated string pointed to by arg2
- PR_GET_NAME – get name of calling thread as null-terminated string in 16-byte buffer referenced by pointer in arg2
- PR_SET_NO_NEW_PRIVS – set calling process no_new_privs attribute to value in arg2
- PR_GET_NO_NEW_PRIVS – return value of no_new_privs for calling process
- PR_SET_PDEATHSIG – set parent-death signal of calling process to arg2
- PR_GET_PDEATHSIG – return value of parent-death signal in location pointed to by arg2
- PR_SET_SECCOMP – set "seccomp" mode for calling process via arg2
- PR_GET_SECCOMP – get "seccomp" mode of calling process
- PR_SET_SECUREBITS – set "securebits" flags of calling thread to value in arg2
- PR_GET_SECUREBITS – return "securebits" flags of calling process
- PR_GET_SPECULATION_CTRL – return state of speculation misfeature specified in arg2
- PR_SET_SPECULATION_CTRL – set state of speculation misfeature specified in arg2
- PR_SET_THP_DISABLE – set state of "THP disable" flag for calling process
- PR_TASK_PERF_EVENTS_DISABLE – disable all performance counters for calling process
- PR_TASK_PERF_EVENTS_ENABLE – enable performance counters for calling process
- PR_GET_THP_DISABLE – return current setting of "THP disable" flag
- PR_GET_TID_ADDRESS – return clear_child_tid address set by set_tid_address
- PR_SET_TIMERSLACK – set current timer slack value for calling process
- PR_GET_TIMERSLACK – return current timer slack value for calling process
- PR_SET_TIMING – set statistical process timing or accurate timestamp-based process timing by flag in arg2 (PR_TIMING_STATISTICAL or PR_TIMING_TIMESTAMP)
- PR_GET_TIMING – return process timing method in use
- PR_SET_TSC – set state of flag determining if timestamp counter can be read by process in arg2 (PR_TSC_ENABLE or PR_TSC_SIGSEGV)
- PR_GET_TSC – return state of flag determining whether timestamp counter can be read in location pointed to by arg2
Returns zero on success or value specified in option flag.
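As a minimal sketch of the PR_SET_NAME and PR_GET_NAME options listed above, the helper below sets the calling thread's name and reads it back. The name rename_thread is hypothetical and not part of any library; the unused prctl arguments are passed as zero, which is an assumption about good practice rather than a requirement stated in this guide.

```c
#define _GNU_SOURCE
#include <sys/prctl.h>

/* Hypothetical helper: set the calling thread's name, then read it back.
 * `out` must hold at least 16 bytes, the size PR_GET_NAME writes into.
 * Returns 0 on success, -1 if either prctl call fails. */
int rename_thread(const char *name, char out[16])
{
    if (prctl(PR_SET_NAME, (unsigned long)name, 0, 0, 0) != 0)
        return -1;
    if (prctl(PR_GET_NAME, (unsigned long)out, 0, 0, 0) != 0)
        return -1;
    return 0;
}
```

Names longer than 15 characters are silently truncated by the kernel, so a caller comparing the result should expect at most 15 characters plus the terminating null byte.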
arch_prctl
Set architecture-specific thread state.
- code – defines additional behavior
- addr or *addr – address, or pointer in the case of "get" operations
- ARCH_SET_FS – set 64-bit base for FS register to addr
- ARCH_GET_FS – return 64-bit base value for FS register of current process in memory referenced by addr
- ARCH_SET_GS – set 64-bit base address for GS register to addr
- ARCH_GET_GS – return 64-bit base value for GS register of current process in memory referenced by addr
Returns zero on success.
adjtimex
Tunes kernel clock.
- buf – pointer to buffer with timex structure
struct timex { int modes; /* mode selector */ long offset; /* time offset in nanoseconds if STA_NANO flag set, otherwise microseconds */ long freq; /* frequency offset */ long maxerror; /* max error in microseconds */ long esterror; /* est. error in microseconds */ int status; /* clock command / status */ long constant; /* PLL (phase-locked loop) time constant */ long precision; /* clock precision in microseconds, read-only */ long tolerance; /* clock frequency tolerance, read-only */ struct timeval time; /* current time (read-only, except ADJ_SETOFFSET) */ long tick; /* microseconds between clock ticks */ long ppsfreq; /* PPS (pulse per second) frequency, read-only */ long jitter; /* PPS jitter, read-only, in nanoseconds if STA_NANO flag set, otherwise microseconds */ int shift; /* PPS interval duration in seconds, read-only */ long stabil; /* PPS stability, read-only */ long jitcnt; /* PPS count of jitter limit exceeded events, read-only */ long calcnt; /* PPS count of calibration intervals, read-only */ long errcnt; /* PPS count of calibration errors, read-only */ long stbcnt; /* PPS count of stability limit exceeded events, read-only */ int tai; /* TAI offset set by previous ADJ_TAI operations, in seconds, read-only */ /* padding bytes to allow future expansion */ };
Returns clock state: TIME_OK, TIME_INS, TIME_DEL, TIME_OOP, TIME_WAIT, or TIME_ERROR.
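The sketch below shows a read-only use of adjtimex: with the modes field set to zero nothing is changed, and the call only fills in the timex structure and returns the clock state. The helper name query_clock_state is hypothetical, and the claim that a read-only query needs no special privilege is an assumption that may not hold in restricted sandboxes.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/timex.h>

/* Hypothetical helper: query the kernel clock without modifying it.
 * Returns the clock state (TIME_OK ... TIME_ERROR) or -1 on failure. */
int query_clock_state(struct timex *out)
{
    memset(out, 0, sizeof(*out));
    out->modes = 0;          /* no ADJ_* bits set: read-only query */
    return adjtimex(out);
}
```

After a successful call, fields such as tick and maxerror in the structure hold the current kernel values.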
setrlimit
Set resource limits.
- resource – type of resource to set (see getrlimit for list)
- rlim – pointer to rlimit structure
struct rlimit { rlim_t rlim_cur; /* soft limit */ rlim_t rlim_max; /* hard limit */ };
Returns zero on success.
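A small sketch of the usual read-modify-write pattern with the rlimit structure follows: read the current limits with getrlimit, lower only the soft limit, and write the pair back. The helper name lower_soft_nofile is hypothetical, and RLIMIT_NOFILE is chosen purely for illustration.

```c
#define _GNU_SOURCE
#include <sys/resource.h>

/* Hypothetical helper: lower the soft limit on open files to at most `want`,
 * leaving the hard limit untouched. Returns 0 on success, -1 on failure. */
int lower_soft_nofile(rlim_t want)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    if (want < rl.rlim_cur)
        rl.rlim_cur = want;   /* never raise the limit in this sketch */
    return setrlimit(RLIMIT_NOFILE, &rl);
}
```

Lowering a soft limit needs no privilege; raising the hard limit does.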
chroot
Change root directory.
- path – pointer to string containing path to new root directory
Returns zero on success.
sync
Flush filesystem caches to disk.
Returns zero on success.
acct
Toggle process accounting.
- filename – pointer to string with path of existing file
Returns zero on success.
settimeofday
Set the time of day.
- tv – pointer to timeval structure of new time (see gettimeofday for structure)
- tz – pointer to timezone structure (see gettimeofday for structure)
Returns zero on success.
mount
Mount a file system.
unsigned long mountflags, const void *data)
- source – pointer to string containing device path
- target – pointer to string containing mount target path
- filesystemtype – pointer to filesystem type (see /proc/filesystems for supported filesystems)
- mountflags – flags or mount options
- data – usually a comma-separated list of options understood by the filesystem type
Returns zero on success.
mountflags
- MS_BIND – perform bind mount, making file or subtree visible at another point within file system
- MS_DIRSYNC – make dir changes synchronous
- MS_MANDLOCK – allow mandatory locking
- MS_MOVE – move subtree, source specifies existing mount point and target specifies new location
- MS_NOATIME – don’t update access time
- MS_NODEV – don’t allow access to special files
- MS_NODIRATIME – don’t update access times for directories
- MS_NOEXEC – don’t allow programs to be executed
- MS_NOSUID – don’t honor SUID or SGID bits when running programs
- MS_RDONLY – mount read-only
- MS_RELATIME – update last access time if current value of atime is less than or equal to mtime or ctime
- MS_REMOUNT – remount existing mount
- MS_SILENT – suppress display of printk() warning messages in kernel log
- MS_STRICTATIME – always update atime when accessed
- MS_SYNCHRONOUS – make writes synchronous
umount2
Unmount a filesystem.
- target – pointer to string with filesystem to unmount
- flags – additional options
Returns zero on success.
flags
- MNT_FORCE – force unmount even if busy, which can cause data loss
- MNT_DETACH – perform lazy unmount and make mount point unavailable for new access, then actually unmount when mount isn’t busy
- MNT_EXPIRE – mark mount point as expired
- UMOUNT_NOFOLLOW – do not dereference target if symlink
swapon
Start swapping to specified device.
- path – pointer to string with path to device
- swapflags – flags for additional options
Returns zero on success.
swapflags
- SWAP_FLAG_PREFER – new swap area will have a higher priority than the default priority level
- SWAP_FLAG_DISCARD – discard or trim freed swap pages (for SSDs)
swapoff
Stop swapping to specified device.
- path – pointer to string with path to device
Returns zero on success.
reboot
Reboot the system.
- magic – must be set to LINUX_REBOOT_MAGIC1 for this call to work
- magic2 – must be set to LINUX_REBOOT_MAGIC2 (or LINUX_REBOOT_MAGIC2A, LINUX_REBOOT_MAGIC2B, or LINUX_REBOOT_MAGIC2C) for this call to work
- arg – pointer to additional argument flag
Does not return on success, returns -1 on failure.
arg
- LINUX_REBOOT_CMD_CAD_OFF – CTRL+ALT+DELETE is disabled, and CTRL+ALT+DELETE will send SIGINT to init
- LINUX_REBOOT_CMD_CAD_ON – CTRL+ALT+DELETE enabled
- LINUX_REBOOT_CMD_HALT – halt system and display "System halted."
- LINUX_REBOOT_CMD_KEXEC – execute a kernel previously loaded with kexec_load, requires CONFIG_KEXEC in kernel
- LINUX_REBOOT_CMD_POWER_OFF – power down system
- LINUX_REBOOT_CMD_RESTART – restart system and display "Restarting system."
- LINUX_REBOOT_CMD_RESTART2 – restart system and display "Restarting system with command '%s'."
sethostname
Set hostname of machine.
- name – pointer to string with new name
- len – length of new name
Returns zero on success.
setdomainname
Set NIS domain name.
- name – pointer to string with new name
- len – length of new name
Returns zero on success.
iopl
Change I/O privilege level.
- level – new privilege level
Returns zero on success.
ioperm
Set I/O permissions.
- from – starting port address
- num – number of bits
- turn_on – non-zero enables access, zero disables it
Returns zero on success.
init_module
Load module into kernel from a module image held in a memory buffer.
- module_image – pointer to buffer with binary image of module to load
- len – size of buffer
- param_values – pointer to string with parameters for the module
Returns zero on success.
delete_module
Unload a kernel module.
- name – pointer to string with name of module
- flags – modify behavior of unload
Returns zero on success.
flags
- O_NONBLOCK – immediately return from syscall
- O_NONBLOCK | O_TRUNC – unload module immediately even if reference count is not zero
quotactl
Change disk quotas.
- cmd – command flag
- special – pointer to string with path to mounted block device
- id – user or group ID
- addr – address of data structure, optional for some cmd flags
cmd
- Q_QUOTAON – turn on quotas for filesystem referenced by special, with id specifying quota format to use:
  - QFMT_VFS_OLD – original format
  - QFMT_VFS_V0 – standard VFS v0 format
  - QFMT_VFS_V1 – format with support for 32-bit UIDs and GIDs
- Q_QUOTAOFF – turn off quotas for filesystem referenced by special
- Q_GETQUOTA – get quota limits and usage for a user or group id, referenced by id, where addr is pointer to dqblk structure
- Q_GETNEXTQUOTA – same as Q_GETQUOTA but returns info for next id greater than or equal to id that has quota set, where addr points to nextdqblk structure
- Q_SETQUOTA – set quota info for user or group id, using dqblk structure referenced by addr
- Q_GETINFO – get info about quotafile, where addr points to dqinfo structure
- Q_SETINFO – set information about quotafile, where addr points to dqinfo structure
- Q_GETFMT – get quota format used on filesystem referenced by special, where addr points to 4 byte buffer where format number will be stored
- Q_SYNC – update on-disk copy of quota usage for filesystem
- Q_GETSTATS – get statistics about quota subsystem, where addr points to a dqstats structure
- Q_XQUOTAON – enable quotas for an XFS filesystem
- Q_XQUOTAOFF – disable quotas on an XFS filesystem
- Q_XGETQUOTA – on XFS filesystems, get disk quota limits and usage for user id specified by id, where addr points to fs_disk_quota structure
- Q_XGETNEXTQUOTA – same as Q_XGETQUOTA but returns fs_disk_quota referenced by addr for next id greater than or equal to id that has quota set
- Q_XSETQLIM – on XFS filesystems, set disk quota for UID, where addr points to fs_disk_quota structure
- Q_XGETQSTAT – returns XFS specific quota info in fs_quota_stat referenced by addr
- Q_XGETQSTATV – returns XFS specific quota info in fs_quota_statv referenced by addr
- Q_XQUOTARM – on XFS filesystems, free disk space used by quotas, where addr references unsigned int value containing flags (same as d_flags field of fs_disk_quota structure)
struct dqblk { uint64_t dqb_bhardlimit; /* absolute limit on quota blocks alloc */ uint64_t dqb_bsoftlimit; /* preferred limit on quota blocks */ uint64_t dqb_curspace; /* current space used in bytes */ uint64_t dqb_ihardlimit; /* max number of allocated inodes */ uint64_t dqb_isoftlimit; /* preferred inode limit */ uint64_t dqb_curinodes; /* current allocated inodes */ uint64_t dqb_btime; /* time limit for excessive use over quota */ uint64_t dqb_itime; /* time limit for excessive files */ uint32_t dqb_valid; /* bit mask of QIF_* constants */ };
struct nextdqblk { uint64_t dqb_bhardlimit; uint64_t dqb_bsoftlimit; uint64_t dqb_curspace; uint64_t dqb_ihardlimit; uint64_t dqb_isoftlimit; uint64_t dqb_curinodes; uint64_t dqb_btime; uint64_t dqb_itime; uint32_t dqb_valid; uint32_t dqb_id; };
struct dqinfo { uint64_t dqi_bgrace; /* time before soft limit becomes hard limit */ uint64_t dqi_igrace; /* time before soft inode limit becomes hard limit */ uint32_t dqi_flags; /* flags for quotafile */ uint32_t dqi_valid; };
struct fs_disk_quota { int8_t d_version; /* version of structure */ int8_t d_flags; /* XFS_{USER,PROJ,GROUP}_QUOTA */ uint16_t d_fieldmask; /* field specifier */ uint32_t d_id; /* project, UID, or GID */ uint64_t d_blk_hardlimit; /* absolute limit on disk blocks */ uint64_t d_blk_softlimit; /* preferred limit on disk blocks */ uint64_t d_ino_hardlimit; /* max # allocated inodes */ uint64_t d_ino_softlimit; /* preferred inode limit */ uint64_t d_bcount; /* # disk blocks owned by user */ uint64_t d_icount; /* # inodes owned by user */ int32_t d_itimer; /* zero if within inode limits */ int32_t d_btimer; /* as above for disk blocks */ uint16_t d_iwarns; /* # warnings issued regarding # of inodes */ uint16_t d_bwarns; /* # warnings issued regarding disk blocks */ int32_t d_padding2; /* padding */ uint64_t d_rtb_hardlimit; /* absolute limit on realtime disk blocks */ uint64_t d_rtb_softlimit; /* preferred limit on realtime disk blocks */ uint64_t d_rtbcount; /* # realtime blocks owned */ int32_t d_rtbtimer; /* as above, but for realtime disk blocks */ uint16_t d_rtbwarns; /* # warnings issued regarding realtime disk blocks */ int16_t d_padding3; /* padding */ char d_padding4[8]; /* padding */ };
struct fs_quota_stat { int8_t qs_version; /* version for future changes */ uint16_t qs_flags; /* XFS_QUOTA_{U,P,G}DQ_{ACCT,ENFD} */ int8_t qs_pad; /* padding */ struct fs_qfilestat qs_uquota; /* user quota storage info */ struct fs_qfilestat qs_gquota; /* group quota storage info */ uint32_t qs_incoredqs; /* number of dqots in core */ int32_t qs_btimelimit; /* limit for blocks timer */ int32_t qs_itimelimit; /* limit for inodes timer */ int32_t qs_rtbtimelimit; /* limit for realtime blocks timer */ uint16_t qs_bwarnlimit; /* limit for # of warnings */ uint16_t qs_iwarnlimit; /* limit for # of warnings */ };
struct fs_qfilestatv { uint64_t qfs_ino; /* inode number */ uint64_t qfs_nblks; /* number of BBs (512-byte blocks) */ uint32_t qfs_nextents; /* number of extents */ uint32_t qfs_pad; /* pad for 8-byte alignment */ };
struct fs_quota_statv { int8_t qs_version; /* version for future changes */ uint8_t qs_pad1; /* pad for 16-bit alignment */ uint16_t qs_flags; /* XFS_QUOTA_.* flags */ uint32_t qs_incoredqs; /* number of dquots incore */ struct fs_qfilestatv qs_uquota; /* user quota info */ struct fs_qfilestatv qs_gquota; /* group quota info */ struct fs_qfilestatv qs_pquota; /* project quota info */ int32_t qs_btimelimit; /* limit for blocks timer */ int32_t qs_itimelimit; /* limit for inodes timer */ int32_t qs_rtbtimelimit; /* limit for realtime blocks timer */ uint16_t qs_bwarnlimit; /* limit for # of warnings */ uint16_t qs_iwarnlimit; /* limit for # of warnings */ uint64_t qs_pad2[8]; /* padding */ };
Returns zero on success.
gettid
Get thread ID.
Returns thread ID of calling thread.
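A minimal sketch of invoking gettid through the generic syscall() wrapper, which works even with C libraries that do not export a gettid() function. The helper name current_tid is hypothetical.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the kernel thread ID of the caller. */
pid_t current_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}
```

In the main thread of a process the value equals getpid(); in any other thread it differs.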
readahead
Read file into page cache.
- fd – file descriptor of file to read ahead
- offset – offset from start of file to read
- count – number of bytes to read
Returns zero on success.
setxattr
Set extended attribute value.
size_t size, int flags)
- path – pointer to string with filename
- name – pointer to string with attribute name
- value – pointer to string with attribute value
- size – size of value
- flags – set to XATTR_CREATE to create attribute, XATTR_REPLACE to replace
Returns zero on success.
lsetxattr
Set extended attribute value of symbolic link.
size_t size, int flags)
- path – pointer to string with symlink
- name – pointer to string with attribute name
- value – pointer to string with attribute value
- size – size of value
- flags – set to XATTR_CREATE to create attribute, XATTR_REPLACE to replace
Returns zero on success.
fsetxattr
Set extended attribute value of file referenced by file descriptor.
- fd – file descriptor of file in question
- name – pointer to string with attribute name
- value – pointer to string with attribute value
- size – size of value
- flags – set to XATTR_CREATE to create attribute, XATTR_REPLACE to replace
Returns zero on success.
getxattr
Get extended attribute value.
- path – pointer to string with filename
- name – pointer to string with attribute name
- value – pointer to buffer that receives attribute value
- size – size of value buffer
Returns size of extended attribute value.
lgetxattr
Get extended attribute value from symlink.
- path – pointer to string with symlink
- name – pointer to string with attribute name
- value – pointer to buffer that receives attribute value
- size – size of value buffer
Returns size of extended attribute value.
fgetxattr
Get extended attribute value from file referenced by file descriptor.
- fd – file descriptor of file in question
- name – pointer to string with attribute name
- value – pointer to buffer that receives attribute value
- size – size of value buffer
Returns size of extended attribute value.
listxattr
List extended attribute names.
- path – pointer to string with filename
- list – pointer to list of attribute names
- size – size of list buffer
Returns size of name list.
llistxattr
List extended attribute names for a symlink.
- path – pointer to string with symlink
- list – pointer to list of attribute names
- size – size of list buffer
Returns size of name list.
flistxattr
List extended attribute names for file referenced by file descriptor.
- fd – file descriptor of file in question
- list – pointer to list of attribute names
- size – size of list buffer
Returns size of name list.
removexattr
Remove an extended attribute.
- path – pointer to string with filename
- name – pointer to string with name of attribute to remove
Returns zero on success.
lremovexattr
Remove an extended attribute of a symlink.
- path – pointer to string with symlink
- name – pointer to string with name of attribute to remove
Returns zero on success.
fremovexattr
Remove an extended attribute of a file referenced by a file descriptor.
- fd – file descriptor of file in question
- name – pointer to string with name of attribute to remove
Returns zero on success.
tkill
Send a signal to a thread.
- tid – thread id
- sig – signal to send
Returns zero on success.
time
Get time in seconds.
- t – if not NULL, return value is also stored in referenced memory address
Returns time (in seconds) since UNIX Epoch.
futex
Fast user-space locking.
int *uaddr2, int val3)
- uaddr – pointer to address of value to monitor for change
- op – operation flag
- val – value whose meaning depends on op
- timeout – pointer to timespec structure with timeout
- uaddr2 – pointer to integer used for some operations
- val3 – additional argument in some operations
Return value depends on operation detailed below.
op
- FUTEX_WAIT – atomically verifies that uaddr still contains value val and sleeps awaiting FUTEX_WAKE on this address
- FUTEX_WAKE – wakes at most val processes waiting on futex address
- FUTEX_REQUEUE – wakes up val processes and requeues all waiters on futex at address uaddr2
- FUTEX_CMP_REQUEUE – similar to FUTEX_REQUEUE but first checks if location uaddr contains value of val3
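The FUTEX_WAIT value check and FUTEX_WAKE count described above can be sketched as follows. glibc provides no futex() function, so the sketch goes through syscall(); the helper names futex_op, wait_would_not_block, and wake_waiters are hypothetical. Note that calling wait_would_not_block with an expected value that does match would put the caller to sleep until another thread wakes it.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical thin wrapper around the raw syscall (no timeout, no uaddr2). */
static long futex_op(uint32_t *uaddr, int op, uint32_t val)
{
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

/* Returns 1 if FUTEX_WAIT refused to sleep because *uaddr != expected,
 * which the kernel reports as EAGAIN. */
int wait_would_not_block(uint32_t *uaddr, uint32_t expected)
{
    long rc = futex_op(uaddr, FUTEX_WAIT, expected);
    return rc == -1 && errno == EAGAIN;
}

/* Wake at most n waiters on uaddr; returns how many were actually woken. */
long wake_waiters(uint32_t *uaddr, uint32_t n)
{
    return futex_op(uaddr, FUTEX_WAKE, n);
}
```

The compare-then-sleep step is what lets a lock implementation avoid a lost wake-up: if the word changed between the user-space check and the syscall, the kernel returns immediately instead of sleeping.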
sched_setaffinity
Set process CPU affinity mask.
- pid – PID of process
- cpusetsize – length of data at mask
- mask – pointer to mask
Returns zero on success.
sched_getaffinity
Get process CPU affinity mask.
- pid – PID of process
- cpusetsize – length of data at mask
- mask – pointer to mask
Returns zero on success with mask placed in memory referenced by mask.
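As a hedged sketch of reading the affinity mask, the helper below asks for the mask of the calling thread (a pid of 0) through the glibc wrapper and counts the set bits. The helper name allowed_cpu_count is hypothetical, and the fixed-size cpu_set_t is assumed to be large enough for the machine.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Hypothetical helper: number of CPUs the calling thread may run on,
 * or -1 if the mask could not be read. */
int allowed_cpu_count(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    return CPU_COUNT(&set);
}
```

Passing the same mask type to sched_setaffinity, after clearing bits with CPU_CLR, restricts the thread to a subset of those CPUs.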
set_thread_area
Set thread local storage area.
- u_info – pointer to user_desc structure
Returns zero on success.
io_setup
Create async I/O context.
- nr_events – total number of events to receive
- ctx_idp – pointer reference to created handle
Returns zero on success.
io_destroy
Destroy async I/O context.
- ctx_id – ID of context to destroy
Returns zero on success.
io_getevents
Read async I/O events from queue.
*eventsstruct, timespec *timeout)
- ctx_id – AIO context ID
- min_nr – minimum number of events to read
- nr – number of events to read
- eventsstruct – pointer to io_event structure
- timeout – pointer to timespec timeout structure
Returns number of events read, or zero if no events are available or are less than min_nr.
io_submit
Submit async I/O blocks for processing.
- ctx_id – AIO context ID
- nrstruct – number of structures
- iocbpp – pointer to iocb structure
Returns number of iocb submitted.
io_cancel
Cancel previously submitted async I/O operation.
- ctx_id – AIO context ID
- iocb – pointer to iocb structure
- result – pointer to io_event structure
Returns zero on success and copies event to memory referenced by result.
get_thread_area
Get a thread local storage area.
- u_info – pointer to user_desc structure to receive data
Returns zero on success.
lookup_dcookie
Return directory entry’s path.
- cookie – unique identifier of a directory entry
- buffer – pointer to buffer that receives full path of directory entry
- len – length of buffer
Returns bytes written to buffer with path string.
epoll_create
Open epoll file descriptor.
- size – ignored, but must be greater than 0
Returns file descriptor.
getdents64
Get directory entries.
- fd – file descriptor of directory
- dirp – pointer to linux_dirent64 structure for results
- count – size of the dirp buffer
struct linux_dirent64 { ino64_t d_ino; /* 64-bit inode number */ off64_t d_off; /* 64-bit offset to next structure */ unsigned short d_reclen; /* length of this linux_dirent64 */ unsigned char d_type; /* file type */ char d_name[]; /* null-terminated filename */ };
Returns bytes read, and at end of directory returns zero.
set_tid_address
Set pointer to thread ID.
- tidptr – pointer to thread ID
Returns thread ID of calling thread.
restart_syscall
Restart a syscall.
Returns value of system call it restarts.
semtimedop
Same as the semop syscall, except that if the calling thread would sleep, the duration is limited to timeout.
- semid – id of semaphore
- sops – pointer to sembuf structure for operations
- nsops – number of operations
- timeout – pointer to timespec structure with the maximum time the calling thread may sleep
Returns zero on success.
fadvise64
Predeclare access pattern for file data to allow kernel to optimize I/O operations.
- fd – file descriptor of file in question
- offset – offset where access will begin
- len – length of anticipated access, or 0 to end of file
- advice – advice to give kernel
Returns zero on success.
advice
- POSIX_FADV_NORMAL – application has no specific advice
- POSIX_FADV_SEQUENTIAL – application expects to access data sequentially
- POSIX_FADV_RANDOM – data will be accessed randomly
- POSIX_FADV_NOREUSE – data will be accessed only once
- POSIX_FADV_WILLNEED – data will be needed in near future
- POSIX_FADV_DONTNEED – data will not be needed in near future
timer_create
Create POSIX per-process timer.
- clockid – type of clock to use
- sevp – pointer to sigevent structure explaining how caller will be notified when timer expires
- timerid – pointer to buffer that will receive timer ID
Returns zero on success.
union sigval { int sival_int; void *sival_ptr; };
struct sigevent { int sigev_notify; /* method of notification */ int sigev_signo; /* notification signal */ union sigval sigev_value; /* data to pass with notification */ void (*sigev_notify_function) (union sigval); /* Function used for thread notification */ void *sigev_notify_attributes; /* attributes for notification thread */ pid_t sigev_notify_thread_id; /* id of thread to signal */ };
clockid
- CLOCK_REALTIME – settable system wide real time clock
- CLOCK_MONOTONIC – nonsettable monotonically increasing clock measuring time from unspecified point in past
- CLOCK_PROCESS_CPUTIME_ID – clock measuring CPU time consumed by the calling process and its threads
- CLOCK_THREAD_CPUTIME_ID – clock measuring CPU time consumed by calling thread
timer_settime
Arm or disarm POSIX per-process timer.
struct itimerspec *old_value)
- timerid – id of timer
- flags – specify TIMER_ABSTIME to process new_value->it_value as an absolute value
- new_value – pointer to itimerspec structure defining new initial and new interval for timer
- old_value – pointer to structure to receive previous timer details
struct itimerspec { struct timespec it_interval; /* interval */ struct timespec it_value; /* expiration */ };
Returns zero on success.
timer_gettime
Returns time until next expiration from POSIX per-process timer.
- timerid – id of timer
- curr_value – pointer to itimerspec structure where current timer values are returned
Returns zero on success.
timer_getoverrun
Get overrun count on a POSIX per-process timer.
- timerid – id of timer
Returns overrun count of specified timer.
timer_delete
Delete POSIX per-process timer.
- timerid – id of timer
Returns zero on success.
clock_settime
Set specified clock.
- clk_id – clock id
- tp – pointer to timespec structure with clock details
Returns zero on success.
clock_gettime
Get time from specified clock.
- clk_id – clock id
- tp – pointer to timespec structure returned with clock details
Returns zero on success.
clock_getres
Obtain resolution of specified clock.
- clk_id – clock id
- res – pointer to timespec structure returned with details
Returns zero on success.
clock_nanosleep
High-resolution sleep with specifiable clock.
*request, struct timespec *remain)
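- clock_id – type of clock to use
- flags – specify TIMER_ABSTIME to interpret request as an absolute value
- request – pointer to timespec structure with sleep interval, or absolute wake-up time when TIMER_ABSTIME is set
- remain – pointer to timespec structure to receive remaining time on sleep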
Returns zero after sleep interval.
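A minimal sketch of a relative sleep that resumes after signal interruptions follows. Unlike most calls in this guide, the glibc clock_nanosleep wrapper returns the error number directly instead of setting errno, which is why the loop compares the return value with EINTR. The helper name sleep_ms is hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <time.h>

/* Hypothetical helper: sleep for `ms` milliseconds on CLOCK_MONOTONIC,
 * restarting with the remaining time if a signal interrupts the sleep.
 * Returns 0 on success or a positive error number. */
int sleep_ms(long ms)
{
    struct timespec req = { ms / 1000, (ms % 1000) * 1000000L };
    struct timespec rem;
    int rc;

    while ((rc = clock_nanosleep(CLOCK_MONOTONIC, 0, &req, &rem)) == EINTR)
        req = rem;   /* only valid for relative sleeps (flags == 0) */
    return rc;
}
```

With TIMER_ABSTIME the remain argument is not filled in, and the same absolute request can simply be passed again after an interruption.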
exit_group
Exit all threads in a process.
- status – status code to return
Does not return.
epoll_wait
Wait for I/O event on epoll file descriptor.
- epfd – epoll file descriptor
- events – pointer to epoll_event structure with events available to calling process
- maxevents – maximum number of events, must be greater than zero
- timeout – timeout in milliseconds
typedef union epoll_data { void *ptr; int fd; uint32_t u32; uint64_t u64; } epoll_data_t;
struct epoll_event { uint32_t events; /* epoll events */ epoll_data_t data; /* user data variable */ };
Returns number of file descriptors ready for requested I/O or zero if timeout occurred before any were available.
epoll_ctl
Control interface for epoll file descriptor.
- epfd – epoll file descriptor
- op – operation flag
- fd – file descriptor for target file
- event – pointer to epoll_event structure with event, purpose altered by op
Returns zero on success.
op
- EPOLL_CTL_ADD – add fd to interest list
- EPOLL_CTL_MOD – change settings associated with fd in interest list to new settings specified in event
- EPOLL_CTL_DEL – remove target file descriptor fd from interest list, with event argument ignored
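The three epoll calls fit together as sketched below: create an instance with epoll_create, register a descriptor with EPOLL_CTL_ADD, then block in epoll_wait. The helper name wait_readable is hypothetical, and creating a fresh epoll instance per call is done only to keep the example short; a real event loop would keep one instance open.

```c
#define _GNU_SOURCE
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable.
 * Returns the number of ready descriptors (0 or 1), or -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    struct epoll_event ready;
    int n = -1;
    int epfd = epoll_create(1);   /* size is ignored but must be > 0 */

    if (epfd < 0)
        return -1;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0)
        n = epoll_wait(epfd, &ready, 1, timeout_ms);
    close(epfd);
    return n;
}
```

A timeout of 0 polls without blocking and -1 waits indefinitely.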
tgkill
Send signal to a thread.
- tgid – thread group id
- tid – thread id
- sig – signal to send
Returns zero on success.
utimes
Change file last access and modification times.
- filename – pointer to string with file in question
- times – array of timeval structures where times[0] specifies new access time and times[1] specifies new modification time
Returns zero on success.
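The two-element times array can be filled as in the sketch below, which sets whole-second access and modification times on a file. The helper name set_file_times is hypothetical.

```c
#define _GNU_SOURCE
#include <sys/stat.h>
#include <sys/time.h>

/* Hypothetical helper: set access and modification times (whole seconds
 * since the Epoch) on `path`. Returns 0 on success, -1 on failure. */
int set_file_times(const char *path, time_t atime, time_t mtime)
{
    struct timeval times[2] = {
        { atime, 0 },   /* times[0]: new access time */
        { mtime, 0 },   /* times[1]: new modification time */
    };
    return utimes(path, times);
}
```

Passing NULL instead of the array sets both timestamps to the current time.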
mbind
Set NUMA memory policy on a memory range.
*nodemask, unsigned long maxnode, unsigned flags)
- addr – pointer to starting memory address
- len – length of memory segment
- mode – NUMA mode
- nodemask – pointer to mask defining nodes that mode applies to
- maxnode – max number of bits for nodemask
- flags – set MPOL_F_STATIC_NODES to specify physical nodes, MPOL_F_RELATIVE_NODES to specify node ids relative to set allowed by thread's current cpuset
Returns zero on success.
mode
- MPOL_DEFAULT – remove any nondefault policy and restore default behavior
- MPOL_BIND – specify policy restricting memory allocation to node specified in nodemask
- MPOL_INTERLEAVE – specify page allocations be interleaved across set of nodes specified in nodemask
- MPOL_PREFERRED – set preferred node for allocation
- MPOL_LOCAL – mode specifies "local allocation": memory is allocated on the node of the CPU that triggers allocation
set_mempolicy
Set default NUMA memory policy for thread and its offspring.
unsigned long maxnode)
- mode – NUMA mode
- nodemask – pointer to mask defining node that mode applies to
- maxnode – max number of bits for nodemask
Returns zero on success.
get_mempolicy
Get NUMA memory policy for thread and its offspring.
void *addr, unsigned long flags)
- mode – NUMA mode
- nodemask – pointer to mask defining node that mode applies to
- maxnode – max number of bits for nodemask
- addr – pointer to memory region
- flags – defines behavior of call
Returns zero on success.
flags
- MPOL_F_NODE or 0 (zero preferred) – get information about calling thread’s default policy and store in nodemask buffer
- MPOL_F_MEMS_ALLOWED – mode argument is ignored and the set of nodes the thread is allowed to specify in subsequent calls is returned in nodemask
- MPOL_F_ADDR – get information about policy for addr
mq_open
Create a new POSIX message queue or open an existing one.
mqd_t mq_open(const char *name, int oflag, mode_t mode, struct mq_attr *attr)
- name – pointer to string with name of queue
- oflag – define operation of call
- mode – permissions to place on queue
- attr – pointer to mq_attr structure to define parameters of queue
struct mq_attr { long mq_flags; /* flags (not used for mq_open) */ long mq_maxmsg; /* max messages on queue */ long mq_msgsize; /* max message size in bytes */ long mq_curmsgs; /* messages currently in queue (not used for mq_open) */ };
oflag
- O_RDONLY – open queue to only receive messages
- O_WRONLY – open queue to send messages
- O_RDWR – open queue for both send and receive
- O_CLOEXEC – set close-on-exec flag for message queue descriptor
- O_CREAT – create message queue if it doesn’t exist
- O_EXCL – if O_CREAT specified and queue already exists, fail with EEXIST
- O_NONBLOCK – open queue in nonblocking mode
mq_unlink
Remove message queue.
- name – pointer to string with queue name
Returns zero on success.
mq_timedsend
Send message to message queue.
const struct timespec *abs_timeout)
- mqdes – descriptor pointing to message queue
- msg_ptr – pointer to message
- msg_len – length of message
- msg_prio – priority of message
- abs_timeout – pointer to timespec structure defining timeout
Returns zero on success.
mq_timedreceive
Receive a message from a message queue.
- mqdes – descriptor pointing to message queue
- msg_ptr – pointer to buffer to receive message
- msg_len – length of buffer referenced by msg_ptr
Returns number of bytes in received message.
mq_notify
Register to receive notification when message is available in a message queue.
- mqdes – descriptor pointing to message queue
- sevp – pointer to sigevent structure
Returns zero on success.
kexec_load
Load new kernel for execution at a later time.
kexec_segment *segments, unsigned long flags)
- entry – entry address in kernel image
- nr_segments – number of segments referenced by segments pointer
- segments – pointer to kexec_segment structure defining kernel layout
- flags – modify behavior of call
struct kexec_segment { void *buf; /* user space buffer */ size_t bufsz; /* user space buffer length */ void *mem; /* physical address of kernel */ size_t memsz; /* physical address length */ };
Returns zero on success.
flags
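- KEXEC_FILE_UNLOAD – unload currently loaded kernel
- KEXEC_FILE_ON_CRASH – load new kernel in memory region reserved for crash kernel
- KEXEC_FILE_NO_INITRAMFS – specify that loading initrd/initramfs is optional
The KEXEC_FILE_* flags above are the ones accepted by kexec_file_load; kexec_load itself takes KEXEC_ON_CRASH and KEXEC_PRESERVE_CONTEXT.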
waitid
Wait for change of state in process.
- idtype – defines id scope, specifying P_PID for process id, P_PGID for process group id, or P_ALL to wait for any child where id is ignored
- id – id of process or process group, defined by idtype
- infop – pointer to siginfo_t structure filled in by return
- options – modifies behavior of syscall
Returns zero on success.
options
- WNOHANG – return immediately if no child has exited
- WUNTRACED – also return if child has stopped but is not traced
- WCONTINUED – also return if stopped child has resumed via SIGCONT
- WIFEXITED – returns true if child was terminated normally
- WEXITSTATUS – returns exit status of child
- WIFSIGNALED – returns true if child process terminated by signal
- WTERMSIG – returns signal that caused child process to terminate
- WCOREDUMP – returns true if child produced core dump
- WIFSTOPPED – returns true if child process stopped by delivery of signal
- WSTOPSIG – returns number of signal that caused child to stop
- WIFCONTINUED – returns true if child process was resumed via SIGCONT
- WEXITED – wait for terminated children
- WSTOPPED – wait for stopped children via delivery of signal
- WCONTINUED – wait for previously stopped children that were resumed via SIGCONT
- WNOWAIT – leave child in waitable state
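As a hedged sketch of waitid with the WEXITED option, the helper below forks a child that exits immediately and then reads the exit status from the siginfo_t structure rather than from the W* macros. The helper name child_exit_status is hypothetical.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`, wait for it
 * with waitid, and return the exit status the kernel reports (or -1). */
int child_exit_status(int code)
{
    siginfo_t info;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);                 /* child: terminate immediately */

    if (waitid(P_PID, (id_t)pid, &info, WEXITED) != 0)
        return -1;
    if (info.si_code != CLD_EXITED)  /* killed by a signal, not a normal exit */
        return -1;
    return info.si_status;
}
```

With waitid, si_code distinguishes a normal exit (CLD_EXITED) from termination by signal (CLD_KILLED or CLD_DUMPED), and si_status carries the exit code or signal number accordingly.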
add_key
Add key to kernel’s key management.
*payload, size_t plen, key_serial_t keyring)
- type – pointer to string with type of key
- description – pointer to string with description of key
- payload – key to add
- plen – length of key
- keyring – serial number of keyring or special flag
Returns serial number of created key.
keyring
- KEY_SPEC_THREAD_KEYRING – specifies caller’s thread-specific keyring
- KEY_SPEC_PROCESS_KEYRING – specifies caller’s process-specific keyring
- KEY_SPEC_SESSION_KEYRING – specifies caller’s session-specific keyring
- KEY_SPEC_USER_KEYRING – specifies caller’s UID-specific keyring
- KEY_SPEC_USER_SESSION_KEYRING – specifies caller’s UID-session keyring
request_key
Request key from kernel’s key management.
const char *callout_info, key_serial_t keyring)
- type – pointer to string with type of key
- description – pointer to string with description of key
- callout_info – pointer to string set if key isn’t found
- keyring – serial number of keyring or special flag
Returns serial number of key found on success.
keyctl
Manipulate kernel’s key management.
- cmd – command flag modifying syscall behavior
- ... – additional arguments per cmd flag
Return value depends on cmd.
cmd
- KEYCTL_GET_KEYRING_ID – ask for keyring id
- KEYCTL_JOIN_SESSION_KEYRING – join or start named session keyring
- KEYCTL_UPDATE – update key
- KEYCTL_REVOKE – revoke key
- KEYCTL_CHOWN – set ownership of key
- KEYCTL_SETPERM – set permissions on a key
- KEYCTL_DESCRIBE – describe key
- KEYCTL_CLEAR – clear contents of keyring
- KEYCTL_LINK – link key into keyring
- KEYCTL_UNLINK – unlink key from keyring
- KEYCTL_SEARCH – search for key in keyring
- KEYCTL_READ – read key or keyring’s contents
- KEYCTL_INSTANTIATE – instantiate partially constructed key
- KEYCTL_NEGATE – negate partially constructed key
- KEYCTL_SET_REQKEY_KEYRING – set default request-key keyring
- KEYCTL_SET_TIMEOUT – set timeout on a key
- KEYCTL_ASSUME_AUTHORITY – assume authority to instantiate key
ioprio_set
Set I/O scheduling class and priority.
- which – flag specifying target of who
- who – id determined by which flag
- ioprio – bit mask specifying scheduling class and priority to assign to who process
Returns zero on success.
which
- IOPRIO_WHO_PROCESS – who is process or thread id, or 0 to use calling thread
- IOPRIO_WHO_PGRP – who is a process group id identifying all members of a process group, or 0 to operate on the process group of which the calling process is a member
- IOPRIO_WHO_USER – who is UID identifying all processes that have a matching real UID
ioprio_get
Get I/O scheduling class and priority.
- which – flag specifying target of who
- who – id determined by which flag
Returns the ioprio value of the process with the highest I/O priority among all matching processes.
inotify_init
Initialize an inotify instance.
Returns file descriptor of new inotify event queue.
inotify_add_watch
Add watch to an initialized inotify instance.
- fd – file descriptor referring to inotify instance with watch list to be modified
- pathname – pointer to string with path to monitor
- mask – mask of events to be monitored
Returns watch descriptor on success.
inotify_rm_watch
Remove existing watch from inotify instance.
- fd – file descriptor associated with watch
- wd – watch descriptor
Returns zero on success.
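The three inotify calls fit together roughly as shown below. This is only a sketch, assuming Linux with glibc and a writable /tmp; the helper name watch_create_event and the scratch directory template are made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Hypothetical helper: watch a fresh temporary directory, create `name`
 * in it, and check that the first event is IN_CREATE for that name.
 * Returns 1 if so, 0 if not, -1 on error. */
int watch_create_event(const char *name)
{
    char dir[] = "/tmp/inotify-demo-XXXXXX";
    char path[PATH_MAX];
    /* Buffer aligned for struct inotify_event, big enough for one event. */
    char buf[sizeof(struct inotify_event) + NAME_MAX + 1]
        __attribute__((aligned(__alignof__(struct inotify_event))));
    int fd, wd, file, result = -1;
    ssize_t n;

    assert(name != NULL);
    if (mkdtemp(dir) == NULL)
        return -1;

    fd = inotify_init();
    if (fd < 0)
        goto out_dir;
    wd = inotify_add_watch(fd, dir, IN_CREATE);
    if (wd < 0)
        goto out_fd;

    snprintf(path, sizeof(path), "%s/%s", dir, name);
    file = open(path, O_CREAT | O_WRONLY, 0600);
    if (file < 0)
        goto out_watch;
    close(file);

    /* The create event is already queued, so this read does not block. */
    n = read(fd, buf, sizeof(buf));
    if (n >= (ssize_t)sizeof(struct inotify_event)) {
        const struct inotify_event *ev = (const struct inotify_event *)buf;
        result = ev->wd == wd && (ev->mask & IN_CREATE) &&
                 ev->len > 0 && strcmp(ev->name, name) == 0;
    }
    unlink(path);
out_watch:
    inotify_rm_watch(fd, wd);
out_fd:
    close(fd);
out_dir:
    rmdir(dir);
    return result;
}
```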
migrate_pages
Move pages in process to another set of nodes.
long migrate_pages(int pid, unsigned long maxnode, const unsigned long *old_nodes, const unsigned long *new_nodes)
- pid – PID of process in question
- maxnode – max nodes in old_nodes and new_nodes masks
- old_nodes – pointer to mask of node numbers to move from
- new_nodes – pointer to mask of node numbers to move to
Returns number of pages that couldn’t be moved.
openat
Open file relative to directory file descriptor.
int openat(int dirfd, const char *pathname, int flags, mode_t mode)
- dirfd – file descriptor of directory
- pathname – pointer to string with path name
- flags – see open syscall
- mode – see open syscall
Returns new file descriptor on success.
mkdirat
Create directory relative to directory file descriptor.
- dirfd – file descriptor of directory
- pathname – pointer to string with path name
- mode – see mkdir syscall
Returns zero on success.
mknodat
Create a special file relative to directory file descriptor.
- dirfd – file descriptor of directory
- pathname – pointer to string with path name
- mode – see mknod syscall
- dev – device number
Returns zero on success.
fchownat
Change ownership of file relative to directory file descriptor.
- dirfd – file descriptor of directory
- pathname – pointer to string with path name
- owner – user id (UID)
- group – group id (GID)
- flags – if AT_SYMLINK_NOFOLLOW is specified, do not dereference symlinks
unlinkat
Delete name and possibly file it references.
- dirfd – file descriptor of directory
- pathname – pointer to string with path name
- flags – 0 to behave like unlink, or AT_REMOVEDIR to behave like rmdir
Returns zero on success.
renameat
Change name or location of file relative to directory file descriptor.
- olddirfd – file descriptor of directory with source
- oldpath – pointer to string with path name to source
- newdirfd – file descriptor of directory with target
- newpath – pointer to string with path name to target
Returns zero on success.
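The directory-relative calls above are usually combined around a single directory descriptor. The following is a minimal sketch of that pattern, assuming Linux with glibc and a writable /tmp; dirfd_roundtrip and the file names it uses are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>      /* renameat */
#include <stdlib.h>
#include <sys/stat.h>   /* mkdirat */
#include <unistd.h>

/* Hypothetical helper: exercise mkdirat, openat, renameat and unlinkat
 * against one directory descriptor.
 * Returns 0 if every step succeeded, -1 otherwise. */
int dirfd_roundtrip(void)
{
    char base[] = "/tmp/dirfd-demo-XXXXXX";
    int dirfd, fd, ok = -1;

    if (mkdtemp(base) == NULL)
        return -1;
    dirfd = open(base, O_RDONLY | O_DIRECTORY);
    if (dirfd < 0) {
        rmdir(base);
        return -1;
    }

    /* Every relative path below is resolved against dirfd, not the cwd. */
    if (mkdirat(dirfd, "sub", 0700) == 0) {
        fd = openat(dirfd, "sub/a.txt", O_CREAT | O_WRONLY | O_EXCL, 0600);
        if (fd >= 0) {
            close(fd);
            if (renameat(dirfd, "sub/a.txt", dirfd, "b.txt") == 0 &&
                unlinkat(dirfd, "b.txt", 0) == 0)
                ok = 0;
        }
        /* AT_REMOVEDIR makes unlinkat act like rmdir. */
        if (unlinkat(dirfd, "sub", AT_REMOVEDIR) != 0)
            ok = -1;
    }
    close(dirfd);
    rmdir(base);
    return ok;
}
```

Holding one descriptor for the base directory keeps all operations anchored to the same directory even if it is renamed while the program runs.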
linkat
Create a hard link relative to directory file descriptor.
- olddirfd – file descriptor of directory with source
- oldpath – pointer to string with path name to source
- newdirfd – file descriptor of directory with target
- newpath – pointer to string with path name to target
- flags – see link
Returns zero on success.
symlinkat
Create a symbolic link relative to directory file descriptor.
- target – pointer to string with the path the new link points to
- newdirfd – file descriptor of directory that linkpath is relative to
- linkpath – pointer to string with path name of the new symbolic link
Returns zero on success.
readlinkat
Read contents of symbolic link pathname relative to directory file descriptor.
- dirfd – file descriptor of directory that pathname is relative to
- pathname – pointer to string with symlink path
- buf – pointer to buffer receiving symlink pathname
- bufsiz – size of buf
Returns number of bytes placed into buf on success.
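A short sketch of symlinkat followed by readlinkat, assuming Linux with glibc and a writable /tmp. The helper symlink_roundtrip and the link name "link" are made up for illustration; note that readlinkat does not NUL-terminate the buffer, so the caller has to.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: create a symlink named "link" inside a temporary
 * directory and read it back. Copies the link contents into out
 * (NUL-terminated) and returns their length, or -1 on error. */
ssize_t symlink_roundtrip(const char *target, char *out, size_t outsz)
{
    char base[] = "/tmp/symlink-demo-XXXXXX";
    ssize_t n = -1;
    int dirfd;

    assert(target != NULL && out != NULL);
    if (outsz == 0 || mkdtemp(base) == NULL)
        return -1;

    dirfd = open(base, O_RDONLY | O_DIRECTORY);
    if (dirfd >= 0) {
        if (symlinkat(target, dirfd, "link") == 0) {
            /* Leave room for the terminating NUL we add ourselves. */
            n = readlinkat(dirfd, "link", out, outsz - 1);
            if (n >= 0)
                out[n] = '\0';
            unlinkat(dirfd, "link", 0);
        }
        close(dirfd);
    }
    rmdir(base);
    return n;
}
```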
fchmodat
Change permissions of file relative to a directory file descriptor.
- dirfd – file descriptor of directory
- pathname – pointer to string with file in question
- mode – permissions mask
- flags – see chmod
Returns zero on success.
faccessat
Check user’s permissions for a given file relative to a directory file descriptor.
- dirfd – file descriptor of directory
- pathname – pointer to string with file in question
- mode – specify check to perform
- flags – see access
Returns zero if permissions are granted.
pselect6
Synchronous I/O multiplexing. Works just like select, but takes a timespec timeout and a signal mask.
int pselect6(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, const struct timespec *timeout, const sigset_t *sigmask)
- nfds – highest-numbered file descriptor to monitor, plus 1
- readfds – fixed buffer with list of file descriptors to wait for read access
- writefds – fixed buffer with list of file descriptors to wait for write access
- exceptfds – fixed buffer with list of file descriptors to wait for exceptional conditions
- timeout – pointer to timespec structure with time to wait before returning
- sigmask – pointer to signal mask
Returns number of file descriptors contained in returned descriptor sets.
ppoll
Wait for an event on a file descriptor like poll but allows for a signal to interrupt timeout.
int ppoll(struct pollfd *fds, nfds_t nfds, const struct timespec *timeout_ts, const sigset_t *sigmask)
- fds – pointer to an array of pollfd structures (see poll)
- nfds – number of pollfd items in the fds array
- timeout_ts – pointer to timespec structure with the maximum time the syscall should block (NULL blocks indefinitely)
- sigmask – signal mask
Returns number of structures having nonzero revents fields, or zero upon timeout.
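As a rough sketch of the call, the helper below waits for one descriptor with a timespec timeout while SIGINT is blocked. It assumes Linux with glibc; wait_readable and its millisecond parameter are illustrative choices, not part of the ppoll interface.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <poll.h>
#include <signal.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable
 * while SIGINT is blocked.
 * Returns the ppoll result: 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, long timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    struct timespec ts;
    sigset_t mask;

    assert(timeout_ms >= 0);
    ts.tv_sec = timeout_ms / 1000;
    ts.tv_nsec = (timeout_ms % 1000) * 1000000L;

    sigemptyset(&mask);
    sigaddset(&mask, SIGINT);   /* mask is in effect only during the call */
    return ppoll(&pfd, 1, &ts, &mask);
}
```

With an empty pipe the helper should time out and return 0; after one byte is written to the pipe it should return 1.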
unshare
Disassociate parts of process execution context.
flags
– define behavior of call
flags
- CLONE_FILES – unshare file descriptor table so calling process no longer shares file descriptors with other processes
- CLONE_FS – unshare file system attributes so calling process no longer shares its root or current directory, or umask with other processes
- CLONE_NEWIPC – unshare System V IPC namespace so calling process has private copy of System V IPC namespace not shared with other processes
- CLONE_NEWNET – unshare network namespace so calling process is moved to a new network namespace not shared with other processes
- CLONE_NEWNS – unshare mount namespace
- CLONE_NEWUTS – unshare UTS namespace
- CLONE_SYSVSEM – unshare System V semaphore undo values
set_robust_list
Set list of robust futexes.
- head – pointer to list head
- len – size of the structure pointed to by head
Returns zero on success.
get_robust_list
Get list of robust futexes.
- pid – thread/process id, or if 0 current process id is used
- head_ptr – pointer to location receiving list head
- len_ptr – pointer to location receiving size of list head
Returns zero on success.
splice
Splice data to/from a pipe.
- fd_in – file descriptor referring to a pipe for input
- fd_out – file descriptor referring to a pipe for output
- off_in – null if fd_in refers to a pipe, otherwise points to offset for read
- off_out – null if fd_out refers to a pipe, otherwise points to offset for write
- len – total bytes to transfer
- flags – defines additional behavior related to syscall
Returns number of bytes spliced to or from pipe.
flags
- SPLICE_F_MOVE – try to move pages instead of copying
- SPLICE_F_NONBLOCK – try not to block I/O
- SPLICE_F_MORE – advise kernel that more data coming in subsequent splice
- SPLICE_F_GIFT – only for vmsplice, gift user pages to kernel
tee
Duplicate pipe content.
- fd_in – file descriptor referring to a pipe for input
- fd_out – file descriptor referring to a pipe for output
- len – total bytes to transfer
- flags – defines additional behavior related to syscall (see flags for splice)
Returns number of bytes duplicated between pipes.
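The difference between tee and a normal read is that tee leaves the data in the input pipe. A minimal sketch of that behavior, assuming Linux with glibc; tee_duplicates is a hypothetical helper and the 256-byte limit is arbitrary.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write msg into one pipe, duplicate it into a
 * second pipe with tee(), then confirm both pipes hold the same bytes.
 * Returns 1 if they match, 0 if not, -1 on error. */
int tee_duplicates(const char *msg)
{
    int a[2], b[2];
    char from_a[256], from_b[256];
    size_t len;
    int result = -1;

    assert(msg != NULL);
    len = strlen(msg);
    if (len == 0 || len > sizeof(from_a))
        return -1;
    if (pipe(a) != 0)
        return -1;
    if (pipe(b) != 0) {
        close(a[0]);
        close(a[1]);
        return -1;
    }

    /* tee() copies without consuming, so pipe a can still be read. */
    if (write(a[1], msg, len) == (ssize_t)len &&
        tee(a[0], b[1], len, 0) == (ssize_t)len &&
        read(a[0], from_a, len) == (ssize_t)len &&
        read(b[0], from_b, len) == (ssize_t)len)
        result = memcmp(from_a, msg, len) == 0 &&
                 memcmp(from_b, msg, len) == 0;

    close(a[0]);
    close(a[1]);
    close(b[0]);
    close(b[1]);
    return result;
}
```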
sync_file_range
Sync file segment with disk.
- fd – file descriptor of file in question
- offset – offset to begin sync
- nbytes – number of bytes to sync
- flags – defines additional behavior
Returns zero on success.
flags
- SYNC_FILE_RANGE_WAIT_BEFORE – wait for write-out of all pages in range already submitted to device driver before performing any write
- SYNC_FILE_RANGE_WRITE – write all dirty pages in range not already submitted for write
- SYNC_FILE_RANGE_WAIT_AFTER – wait for write-out of all pages in range after performing any write
vmsplice
Splice user pages into pipe.
ssize_t vmsplice(int fd, const struct iovec *iov, unsigned long nr_segs, unsigned int flags)
- fd – file descriptor of pipe
- iov – pointer to array of iovec structures
- nr_segs – number of ranges of user memory described by the array
- flags – defines additional behavior (see splice)
Returns number of bytes transferred into pipe.
move_pages
Move pages of process to another node.
long move_pages(int pid, unsigned long count, void **pages, const int *nodes, int *status, int flags)
- pid – process id
- count – number of pages to move
- pages – array of pointers to pages to move
- nodes – array of integers specifying location to move each page
- status – array of integers to receive status of each page
- flags – defines additional behavior
Returns zero on success.
flags
- MPOL_MF_MOVE – move only pages in exclusive use
- MPOL_MF_MOVE_ALL – pages shared between multiple processes can also be moved
utimensat
Change timestamps with nanosecond precision.
int utimensat(int dirfd, const char *pathname, const struct timespec times[2], int flags)
- dirfd – directory file descriptor
- pathname – pointer to string with path of file
- times – array of timestamps, where times[0] is new last access time and times[1] is new last modification time
- flags – if AT_SYMLINK_NOFOLLOW is specified, update timestamps on the symlink itself rather than the file it refers to
Returns zero on success.
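The layout of the times array can be illustrated with a small sketch, assuming Linux with glibc and a writable /tmp. The helper set_and_read_mtime is hypothetical; it uses the special tv_nsec value UTIME_OMIT to leave the access time untouched.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: set the modification time of a scratch file to
 * mtime_sec (seconds since the epoch), leaving its access time alone.
 * Returns the mtime read back with stat(), or -1 on error. */
long set_and_read_mtime(long mtime_sec)
{
    char path[] = "/tmp/utimens-demo-XXXXXX";
    struct timespec times[2];
    struct stat st;
    long result = -1;
    int fd;

    assert(mtime_sec >= 0);
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    close(fd);

    times[0].tv_sec = 0;
    times[0].tv_nsec = UTIME_OMIT;  /* times[0]: last access, unchanged */
    times[1].tv_sec = mtime_sec;    /* times[1]: last modification */
    times[1].tv_nsec = 0;

    if (utimensat(AT_FDCWD, path, times, 0) == 0 && stat(path, &st) == 0)
        result = (long)st.st_mtime;
    unlink(path);
    return result;
}
```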
epoll_pwait
Wait for I/O event on epoll file descriptor. Same as epoll_wait with a signal mask.
int epoll_pwait(int epfd, struct epoll_event *events, int maxevents, int timeout, const sigset_t *sigmask)
- epfd – epoll file descriptor
- events – pointer to epoll_event structure with events available to calling process
- maxevents – maximum number of events, must be greater than zero
- timeout – timeout in milliseconds
- sigmask – signal mask in effect while waiting
Returns number of file descriptors ready for requested I/O or zero if timeout occurred before any were available.
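A minimal sketch of the call sequence, assuming Linux with glibc. The helper epoll_wait_readable is hypothetical, and blocking SIGINT is just an example choice of mask.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: register fd for EPOLLIN and wait up to timeout_ms
 * with SIGINT blocked for the duration of the wait.
 * Returns the number of ready descriptors (0 on timeout), -1 on error. */
int epoll_wait_readable(int fd, int timeout_ms)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    struct epoll_event ready[1];
    sigset_t mask;
    int epfd, n = -1;

    assert(timeout_ms >= 0);
    epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;

    sigemptyset(&mask);
    sigaddset(&mask, SIGINT);
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0)
        n = epoll_pwait(epfd, ready, 1, timeout_ms, &mask);
    close(epfd);
    return n;
}
```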
signalfd
Create file descriptor that can receive signals.
- fd – if -1, create new file descriptor, otherwise use existing file descriptor
- mask – signal mask
- flags – set to SFD_NONBLOCK to assign O_NONBLOCK on new file descriptor, or SFD_CLOEXEC to set FD_CLOEXEC flag on new file descriptor
Returns file descriptor on success.
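A signal has to be blocked before a signalfd can receive it, otherwise its normal disposition applies. The sketch below shows one way to do this, assuming Linux with glibc and a single-threaded caller; catch_via_signalfd is a hypothetical helper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <unistd.h>

/* Hypothetical helper: block signo, send it to ourselves, and read it
 * from a signalfd. Returns the signal number reported by the
 * descriptor, or -1 on error. */
int catch_via_signalfd(int signo)
{
    sigset_t mask, old;
    struct signalfd_siginfo info;
    int fd, result = -1;

    assert(signo > 0);
    sigemptyset(&mask);
    sigaddset(&mask, signo);
    /* Block the signal so it stays pending instead of being delivered. */
    if (sigprocmask(SIG_BLOCK, &mask, &old) != 0)
        return -1;

    fd = signalfd(-1, &mask, SFD_CLOEXEC);
    if (fd >= 0) {
        /* Reading the descriptor consumes the pending signal. */
        if (raise(signo) == 0 &&
            read(fd, &info, sizeof(info)) == (ssize_t)sizeof(info))
            result = (int)info.ssi_signo;
        close(fd);
    }
    sigprocmask(SIG_SETMASK, &old, NULL);
    return result;
}
```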
timerfd_create
Create timer that notifies a file descriptor.
- clockid – specify CLOCK_REALTIME or CLOCK_MONOTONIC
- flags – set to TFD_NONBLOCK to assign O_NONBLOCK on new file descriptor, or TFD_CLOEXEC to set FD_CLOEXEC flag on new file descriptor
Returns new file descriptor.
eventfd
Create file descriptor for event notification.
- initval – initial value of counter maintained by kernel
- flags – define additional behavior
Returns new eventfd file descriptor.
flags
- EFD_CLOEXEC – set close-on-exec flag on new file descriptor (FD_CLOEXEC)
- EFD_NONBLOCK – set O_NONBLOCK on new file descriptor, saving extra call to fcntl to set this status
- EFD_SEMAPHORE – perform semaphore-like semantics for reads from new file descriptor
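The effect of EFD_SEMAPHORE on reads can be seen in a short sketch, assuming Linux with glibc. The helper eventfd_sum is hypothetical: it adds two values to the counter and performs a single read.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: add a and b to an eventfd counter, then read it
 * once. With semaphore != 0 the descriptor uses EFD_SEMAPHORE.
 * Returns the value read, or -1 on error. */
int64_t eventfd_sum(uint64_t a, uint64_t b, int semaphore)
{
    uint64_t value = 0;
    int64_t result = -1;
    int fd = eventfd(0, EFD_CLOEXEC | (semaphore ? EFD_SEMAPHORE : 0));

    if (fd < 0)
        return -1;
    /* Each write adds to the counter; reads and writes are 8 bytes. */
    if (write(fd, &a, sizeof(a)) == (ssize_t)sizeof(a) &&
        write(fd, &b, sizeof(b)) == (ssize_t)sizeof(b) &&
        read(fd, &value, sizeof(value)) == (ssize_t)sizeof(value))
        result = (int64_t)value;
    close(fd);
    return result;
}
```

Without EFD_SEMAPHORE a read returns the whole counter and resets it to zero, so eventfd_sum(3, 4, 0) yields 7. With EFD_SEMAPHORE each read returns 1 and decrements the counter by one, so eventfd_sum(3, 4, 1) yields 1.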
fallocate
Allocate file space.
- fd – file descriptor in question
- mode – defines behavior
- offset – starting range of allocation
- len – length of allocation
mode
- FALLOC_FL_KEEP_SIZE – do not change file size even if offset+len is greater than the original file size
- FALLOC_FL_PUNCH_HOLE – deallocate space in specified range, zeroing blocks
timerfd_settime
Arms or disarms timer referenced by fd.
int timerfd_settime(int fd, int flags, const struct itimerspec *new_value, struct itimerspec *old_value)
- fd – file descriptor
- flags – set to 0 to start relative timer, or TFD_TIMER_ABSTIME to use absolute timer
- new_value – pointer to itimerspec structure to set value
- old_value – pointer to itimerspec structure to receive previous value after successful update
Returns zero on success.
timerfd_gettime
Get current setting of timer referenced by fd.
- fd – file descriptor
- curr_value – pointer to itimerspec structure with current timer value
Returns zero on success.
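The three timerfd calls are typically used together. The following is a sketch under the assumption of Linux with glibc; wait_one_shot is a hypothetical helper that arms a relative one-shot timer and blocks until it fires.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: arm a one-shot relative timer for delay_ms, block
 * until it fires, and return the expiration count read from the
 * descriptor. Returns -1 on error or if delay_ms is not positive (an
 * all-zero it_value would disarm the timer instead of arming it). */
int64_t wait_one_shot(long delay_ms)
{
    struct itimerspec spec = {
        .it_interval = { 0, 0 },    /* no repeat */
        .it_value = { delay_ms / 1000, (delay_ms % 1000) * 1000000L },
    };
    struct itimerspec left;
    uint64_t expirations = 0;
    int64_t result = -1;
    int fd;

    if (delay_ms <= 0)
        return -1;
    fd = timerfd_create(CLOCK_MONOTONIC, TFD_CLOEXEC);
    if (fd < 0)
        return -1;

    /* flags == 0 selects a relative timer; left receives time remaining. */
    if (timerfd_settime(fd, 0, &spec, NULL) == 0 &&
        timerfd_gettime(fd, &left) == 0 &&
        read(fd, &expirations, sizeof(expirations)) ==
            (ssize_t)sizeof(expirations))
        result = (int64_t)expirations;
    close(fd);
    return result;
}
```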
accept4
Same as accept syscall, with an added flags argument.
signalfd4
Same as signalfd syscall, with a flags argument.
eventfd2
Same as eventfd, with a flags argument.
epoll_create1
Same as epoll_create, but takes a flags argument instead of size.
dup3
Same as dup2 except calling program can force close-on-exec flag to be set on new file descriptor.
pipe2
Same as pipe, with an added flags argument.
inotify_init1
Same as inotify_init, with a flags argument.
preadv
Same as readv but adds offset argument to mark start of input.
pwritev
Same as writev but adds offset argument to mark start of output.
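The offset argument can be illustrated with a small round trip, sketched here under the assumption of Linux with glibc and a writable /tmp. The helper vectored_roundtrip and its fixed "head"/"tail" payloads are made up for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: scatter-write "head" and "tail" at byte offset
 * `off` of a scratch file, then gather-read them back from the same
 * offset into out (at least 9 bytes).
 * Returns the number of bytes read, or -1 on error. */
ssize_t vectored_roundtrip(off_t off, char *out)
{
    char path[] = "/tmp/preadv-demo-XXXXXX";
    char first[4], second[4];
    struct iovec wr[2] = {
        { .iov_base = "head", .iov_len = 4 },
        { .iov_base = "tail", .iov_len = 4 },
    };
    struct iovec rd[2] = {
        { .iov_base = first, .iov_len = sizeof(first) },
        { .iov_base = second, .iov_len = sizeof(second) },
    };
    ssize_t n = -1;
    int fd;

    assert(out != NULL);
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);               /* file lives until fd is closed */

    /* Neither call moves the file offset associated with fd. */
    if (pwritev(fd, wr, 2, off) == 8)
        n = preadv(fd, rd, 2, off);
    if (n == 8) {
        memcpy(out, first, 4);
        memcpy(out + 4, second, 4);
        out[8] = '\0';
    }
    close(fd);
    return n;
}
```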
rt_tgsigqueueinfo
Not intended for application use. Instead, use sigqueue or pthread_sigqueue.
perf_event_open
Start performance monitoring.
int perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu, int group_fd, unsigned long flags)
- attr – pointer to perf_event_attr structure for additional configuration
- pid – process id
- cpu – cpu id
- group_fd – create event groups
- flags – defines additional behavior options
struct perf_event_attr {
    __u32 type;                 /* event type */
    __u32 size;                 /* attribute structure size */
    __u64 config;               /* type-specific configuration */
    union {
        __u64 sample_period;    /* sampling period */
        __u64 sample_freq;      /* sampling frequency */
    };
    __u64 sample_type;          /* specify values included in sample */
    __u64 read_format;          /* specify values returned in read */
    __u64 disabled : 1,         /* off by default */
          inherit : 1,          /* inherited by children */
          pinned : 1,           /* must always be on PMU */
          exclusive : 1,        /* only group on PMU */
          exclude_user : 1,     /* don't count user */
          exclude_kernel : 1,   /* don't count kernel */
          exclude_hv : 1,       /* don't count hypervisor */
          exclude_idle : 1,     /* don't count when idle */
          mmap : 1,             /* include mmap data */
          comm : 1,             /* include comm data */
          freq : 1,             /* use freq, not period */
          inherit_stat : 1,     /* per task counts */
          enable_on_exec : 1,   /* next exec enables */
          task : 1,             /* trace fork/exit */
          watermark : 1,        /* wakeup_watermark */
          precise_ip : 2,       /* skid constraint */
          mmap_data : 1,        /* non-exec mmap data */
          sample_id_all : 1,    /* sample_type all events */
          exclude_host : 1,     /* don't count in host */
          exclude_guest : 1,    /* don't count in guest */
          exclude_callchain_kernel : 1, /* exclude kernel callchains */
          exclude_callchain_user : 1,   /* exclude user callchains */
          __reserved_1 : 41;
    union {
        __u32 wakeup_events;    /* every x events, wake up */
        __u32 wakeup_watermark; /* bytes before wakeup */
    };
    __u32 bp_type;              /* breakpoint type */
    union {
        __u64 bp_addr;          /* address of breakpoint */
        __u64 config1;          /* extension of config */
    };
    union {
        __u64 bp_len;           /* breakpoint length */
        __u64 config2;          /* extension of config1 */
    };
    __u64 branch_sample_type;   /* enum perf_branch_sample_type */
    __u64 sample_regs_user;     /* user regs to dump on samples */
    __u32 sample_stack_user;    /* stack size to dump on samples */
    __u32 __reserved_2;         /* align to u64 */
};
Returns new open file descriptor on success.
flags
- PERF_FLAG_FD_NO_GROUP – allows creating event as part of event group without a leader
- PERF_FLAG_FD_OUTPUT – reroute output from event to group leader
- PERF_FLAG_PID_CGROUP – activate per-container full system monitoring
recvmmsg
Receive multiple messages on a socket using single syscall.
int recvmmsg(int sockfd, struct mmsghdr *msgvec, unsigned int vlen, int flags, struct timespec *timeout)
- sockfd – socket file descriptor
- msgvec – pointer to array of mmsghdr structures
- vlen – size of msgvec array
- flags – specify flags from recvmsg or specify MSG_WAITFORONE to activate MSG_DONTWAIT after receipt of first message
- timeout – pointer to timespec structure specifying timeout
Returns number of messages received in msgvec on success.
fanotify_init
Create fanotify group.
- flags – defines additional parameters
- event_f_flags – defines file status flags set on file descriptors created for fanotify events
Returns new file descriptor on success.
flags
- FAN_CLASS_PRE_CONTENT – allow receipt of events notifying access or attempted access of a file before it contains its final content
- FAN_CLASS_CONTENT – allow receipt of events notifying access or attempted access of a file containing final content
- FAN_REPORT_FID – allow receipt of events containing info about filesystem related to an event
- FAN_CLASS_NOTIF – default value, allowing only for receipt of events notifying file access
event_f_flags
- O_RDONLY – read-only access
- O_WRONLY – write-only access
- O_RDWR – read/write access
- O_LARGEFILE – support files exceeding 2 GB
- O_CLOEXEC – enable close-on-exec flag for file descriptor
fanotify_mark
Add/remove/modify a fanotify mark on a file.
int fanotify_mark(int fanotify_fd, unsigned int flags, uint64_t mask, int dirfd, const char *pathname)
- fanotify_fd – file descriptor from fanotify_init
- flags – defines additional behavior
- mask – mask of events
- dirfd – use depends on flags and pathname, see dirfd below
Returns zero on success.
dirfd
- If pathname is NULL, dirfd is a file descriptor to be marked
- If pathname is NULL and dirfd is AT_FDCWD then current working directory is marked
- If pathname is an absolute path, dirfd is ignored
- If pathname is a relative path and dirfd is not AT_FDCWD, then pathname and dirfd define the file to be marked
- If pathname is a relative path and dirfd is AT_FDCWD, then pathname is used to determine file to be marked
flags
- FAN_MARK_ADD – events in mask are added to mark or ignore mask
- FAN_MARK_REMOVE – events in mask are removed from mark or ignore mask
- FAN_MARK_FLUSH – remove all marks for filesystems, for mounts, or all marks for files and directories from fanotify group
- FAN_MARK_DONT_FOLLOW – if pathname is a symlink, mark link instead of file it refers to
- FAN_MARK_ONLYDIR – if object marked is not a directory, then raise error
- FAN_MARK_MOUNT – mark mount point specified by pathname
- FAN_MARK_FILESYSTEM – mark filesystem specified by pathname
- FAN_MARK_IGNORED_MASK – events in mask will be added or removed from ignore mask
- FAN_MARK_IGNORED_SURV_MODIFY – ignore mask will outlast modify events
mask
- FAN_ACCESS – create event when file or dir is accessed
- FAN_MODIFY – create event when file is modified
- FAN_CLOSE_WRITE – create event when file that is writable is closed
- FAN_CLOSE_NOWRITE – create event when a file that is read-only or a directory is closed
- FAN_OPEN – create event when file or dir opened
- FAN_OPEN_EXEC – create event when file is opened to be executed
- FAN_ATTRIB – create event when file or dir metadata is changed
- FAN_CREATE – create event when file or dir is created in marked directory
- FAN_DELETE – create event when file or dir is deleted in marked directory
- FAN_DELETE_SELF – create event when marked file or dir is deleted
- FAN_MOVED_FROM – create event when file or dir is moved from a marked directory
- FAN_MOVED_TO – create event when file or dir has been moved to a marked directory
- FAN_MOVE_SELF – create event when marked file or directory is moved
- FAN_Q_OVERFLOW – create event when overflow of event queue occurs
- FAN_OPEN_PERM – create event when a process requests permission to open file or directory
- FAN_OPEN_EXEC_PERM – create event when a process requests permission to open a file to execute
- FAN_ACCESS_PERM – create event when a process requests permission to read a file or directory
- FAN_ONDIR – create events when directories themselves are accessed
- FAN_EVENT_ON_CHILD – create events applying to the immediate children of marked directories
name_to_handle_at
Returns file handle and mount ID for file specified by dirfd and pathname.
int name_to_handle_at(int dirfd, const char *pathname, struct file_handle *handle, int *mount_id, int flags)
- dirfd – directory file descriptor
- pathname – pointer to string with full path to file
- handle – pointer to file_handle structure
- mount_id – pointer to int receiving ID of filesystem mount containing pathname
Returns zero on success and mount_id is populated.
open_by_handle_at
Opens file corresponding to handle that is returned from name_to_handle_at syscall.
- mount_fd – file descriptor
- handle – pointer to file_handle structure
- flags – same flags as for open syscall
struct file_handle {
    unsigned int handle_bytes;  /* size of f_handle (in/out) */
    int handle_type;            /* type of handle (out) */
    unsigned char f_handle[0];  /* file id (sized by caller) (out) */
};
Returns a file descriptor.
syncfs
Flush filesystem cache specified by a file descriptor.
- fd – file descriptor of a file residing on the filesystem to flush
Returns zero on success.
sendmmsg
Send multiple messages via socket.
- sockfd – file descriptor specifying socket
- msgvec – pointer to array of mmsghdr structures
- vlen – number of messages to send
- flags – flags defining operation (same as sendto flags)
struct mmsghdr {
    struct msghdr msg_hdr;      /* header of message */
    unsigned int msg_len;       /* bytes transmitted */
};
Returns number of messages sent from msgvec.
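How sendmmsg and recvmmsg use an array of mmsghdr structures can be sketched as follows, assuming Linux with glibc. The helper batch_roundtrip, the socketpair transport, and the two payload strings are illustrative choices only.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send two datagrams with one sendmmsg() call over
 * a socketpair and collect them with one recvmmsg() call. Returns the
 * number of messages received, or -1 on error; the length of each
 * received message is stored in lens[0] and lens[1]. */
int batch_roundtrip(unsigned int lens[2])
{
    int sv[2];
    char in0[16], in1[16];
    struct iovec out_iov[2] = {
        { .iov_base = "one", .iov_len = 3 },
        { .iov_base = "three", .iov_len = 5 },
    };
    struct iovec in_iov[2] = {
        { .iov_base = in0, .iov_len = sizeof(in0) },
        { .iov_base = in1, .iov_len = sizeof(in1) },
    };
    struct mmsghdr out[2], in[2];
    int n = -1;

    assert(lens != NULL);
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
        return -1;

    memset(out, 0, sizeof(out));
    memset(in, 0, sizeof(in));
    for (int i = 0; i < 2; i++) {
        out[i].msg_hdr.msg_iov = &out_iov[i];
        out[i].msg_hdr.msg_iovlen = 1;
        in[i].msg_hdr.msg_iov = &in_iov[i];
        in[i].msg_hdr.msg_iovlen = 1;
    }

    if (sendmmsg(sv[0], out, 2, 0) == 2) {
        /* Both datagrams are queued already, so do not block. */
        n = recvmmsg(sv[1], in, 2, MSG_DONTWAIT, NULL);
        if (n == 2) {
            lens[0] = in[0].msg_len;    /* filled in by the kernel */
            lens[1] = in[1].msg_len;
        }
    }
    close(sv[0]);
    close(sv[1]);
    return n;
}
```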
setns
Reassociate a thread with namespace.
- fd – file descriptor specifying a namespace
- nstype – specify type of namespace (0 allows any namespace)
Returns zero on success.
nstype
- CLONE_NEWCGROUP – file descriptor must reference cgroup namespace
- CLONE_NEWIPC – file descriptor must reference IPC namespace
- CLONE_NEWNET – file descriptor must reference network namespace
- CLONE_NEWNS – file descriptor must reference a mount namespace
- CLONE_NEWPID – file descriptor must reference descendant PID namespace
- CLONE_NEWUSER – file descriptor must reference user namespace
- CLONE_NEWUTS – file descriptor must reference UTS namespace
getcpu
Return CPU/NUMA node for calling process or thread.
- cpu – pointer to the CPU number
- node – pointer to the NUMA node number
- tcache – set to NULL (no longer used)
Returns zero on success.
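A minimal sketch of invoking getcpu through the generic syscall wrapper, assuming Linux with glibc on an architecture that defines SYS_getcpu. The helper name current_cpu_and_node is hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: store the CPU and NUMA node the calling thread is
 * running on. Returns zero on success, -1 on error. */
int current_cpu_and_node(unsigned int *cpu, unsigned int *node)
{
    assert(cpu != NULL && node != NULL);
    /* The third argument (tcache) is unused and passed as NULL. */
    return (int)syscall(SYS_getcpu, cpu, node, NULL);
}
```

The values are only a snapshot: the scheduler may move the thread to another CPU right after the call returns.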
process_vm_readv
Copy data from a remote (another) process to the local (calling) process.
ssize_t process_vm_readv(pid_t pid, const struct iovec *local_iov, unsigned long liovcnt, const struct iovec *remote_iov, unsigned long riovcnt, unsigned long flags)
- pid – source process ID
- local_iov – pointer to iovec structure with details about local address space
- liovcnt – number of elements in local_iov
- remote_iov – pointer to iovec structure with details about remote address space
- riovcnt – number of elements in remote_iov
- flags – unused, set to 0
Returns number of bytes read.
process_vm_writev
Copy data from the local (calling) process to a remote (another) process.
ssize_t process_vm_writev(pid_t pid, const struct iovec *local_iov, unsigned long liovcnt, const struct iovec *remote_iov, unsigned long riovcnt, unsigned long flags)
- pid – target process ID
- local_iov – pointer to iovec structure with details about local address space
- liovcnt – number of elements in local_iov
- remote_iov – pointer to iovec structure with details about remote address space
- riovcnt – number of elements in remote_iov
- flags – unused, set to 0
struct iovec {
    void *iov_base;     /* start address */
    size_t iov_len;     /* bytes to transfer */
};
Returns number of bytes written.
kcmp
Compare two processes to see if they share resources in the kernel.
- pid1 – the first process ID
- pid2 – the second process ID
- type – type of resource to compare
- idx1 – type-specific resource index
- idx2 – type-specific resource index
Returns zero if processes share the same resource.
type flags
- KCMP_FILE – check if file descriptors specified in idx1 and idx2 are shared by both processes
- KCMP_FILES – check if the two processes share the same set of open file descriptors (idx1 and idx2 are not used)
- KCMP_FS – check if the two processes share the same filesystem information (for example, the filesystem root, file mode creation mask, working directory, etc.)
- KCMP_IO – check if processes share the same I/O context
- KCMP_SIGHAND – check if processes share same table of signal dispositions
- KCMP_SYSVSEM – check if processes share same semaphore undo operations
- KCMP_VM – check if processes share same address space
- KCMP_EPOLL_TFD – check if file descriptor referenced in idx1 of process pid1 is present in epoll referenced by idx2 of process pid2, where idx2 is a pointer to a kcmp_epoll_slot structure describing target file
struct kcmp_epoll_slot {
    __u32 efd;
    __u32 tfd;
    __u64 toff;
};
finit_module
Load module into kernel with module file specified by file descriptor.
- fd – file descriptor of kernel module file to load
- param_values – pointer to string with parameters for module
- flags – flags for module load
Returns zero on success.
flags
- MODULE_INIT_IGNORE_MODVERSIONS – ignore symbol version hashes
- MODULE_INIT_IGNORE_VERMAGIC – ignore kernel version magic