The following list describes many of the files and directories under the /proc hierarchy.
5:cpuacct,cpu,cpuset:/daemons
This is a write-only file, writable only by the owner of the process.
The following values may be written to the file:
A further value can be written to affect a different bit:
The /proc/[pid]/clear_refs file is present only if the CONFIG_PROC_PAGE_MONITOR kernel configuration option is enabled.
This file provides a superset of the prctl(2) PR_SET_NAME and PR_GET_NAME operations, and is employed by pthread_setname_np(3) when used to rename threads other than the caller.
$ cd /proc/20/cwd; /bin/pwd
Note that the pwd command is often a shell built-in, and might not work properly. In bash(1), you may use pwd -P.
In a multithreaded process, the contents of this symbolic link are not available if the main thread has already terminated (typically by calling pthread_exit(3)).
$ strings /proc/1/environ
Under Linux 2.0 and earlier, /proc/[pid]/exe is a pointer to the binary which was executed, and appears as a symbolic link. A readlink(2) call on this file under Linux 2.0 returns a string in the format:
[device]:inode
For example, [0301]:1502 would be inode 1502 on device major 03 (IDE, MFM, etc. drives) minor 01 (first partition on the first drive).
find(1) with the -inum option can be used to locate the file.
For file descriptors for pipes and sockets, the entries will be symbolic links whose content is the file type with the inode. A readlink(2) call on this file returns a string in the format:
type:[inode]
For example, socket:[2248868] will be a socket and its inode is 2248868. For sockets, that inode can be used to find more information in one of the files under /proc/net/.
For file descriptors that have no corresponding inode (e.g., file descriptors produced by epoll_create(2), eventfd(2), inotify_init(2), signalfd(2), and timerfd(2)), the entry will be a symbolic link with contents of the form
anon_inode:<file-type>
In some cases, the file-type is surrounded by square brackets.
For example, an epoll file descriptor will have a symbolic link whose content is the string anon_inode:[eventpoll].
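The link formats described above can be told apart with a small parser. The following is a minimal sketch in Python; the function name classify_fd_target and the (kind, detail) return shape are illustrative choices, not part of any kernel or library interface.

```python
import re

# Patterns mirror the link formats described in the text:
#   type:[inode]            e.g. socket:[2248868]
#   anon_inode:<file-type>  file-type optionally in square brackets
_TYPED_INODE = re.compile(r"^(?P<type>[a-z_]+):\[(?P<inode>\d+)\]$")
_ANON_INODE = re.compile(r"^anon_inode:\[?(?P<ftype>[^\]]+)\]?$")


def classify_fd_target(target):
    """Classify the string returned by readlink(2) on /proc/[pid]/fd/N.

    Returns a (kind, detail) tuple (a hypothetical convention):
      ("anon_inode", file_type), (type, inode_number), or ("path", target).
    """
    m = _ANON_INODE.match(target)
    if m:
        return ("anon_inode", m.group("ftype"))
    m = _TYPED_INODE.match(target)
    if m:
        return (m.group("type"), int(m.group("inode")))
    return ("path", target)


if __name__ == "__main__":
    import os

    # Inspect the caller's own descriptors (Linux only).
    for name in sorted(os.listdir("/proc/self/fd"), key=int):
        try:
            link = os.readlink("/proc/self/fd/" + name)
        except OSError:
            continue  # descriptor closed between listdir and readlink
        print(name, classify_fd_target(link))
```

Anything that matches neither pattern is treated as an ordinary pathname, which is the usual case for regular files and terminals.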
In a multithreaded process, the contents of this directory are not available if the main thread has already terminated (typically by calling pthread_exit(3)).
Programs that will take a filename
as a command-line argument, but will not take input from standard input
if no argument is supplied, or that write to a file named as a command-line
argument, but will not send their output to standard output if no argument
is supplied, can nevertheless be made to use standard input or standard
output using /proc/[pid]/fd. For example, assuming that -i is the flag designating
an input file and -o is the flag designating an output file:
$ foobar -i /proc/self/fd/0 -o /proc/self/fd/1 ...
and you have a working filter.
/proc/self/fd/N is approximately the same as /dev/fd/N in some UNIX and UNIX-like systems. Most Linux MAKEDEV scripts symbolically link /dev/fd to /proc/self/fd, in fact.
Most systems provide symbolic links /dev/stdin,
/dev/stdout, and /dev/stderr, which respectively link to the files 0, 1,
and 2 in /proc/self/fd. Thus the example command above could be written
as:
$ foobar -i /dev/stdin -o /dev/stdout ...
$ cat /proc/12015/fdinfo/4
pos:    1000
flags:  01002002
The pos field is a decimal number showing the current file offset. The flags field is an octal number that displays the file access mode and file status flags (see open(2)).
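Because pos is decimal and flags is octal, the two fields need different conversions. The sketch below shows one way to read them in Python; parse_fdinfo and access_mode_name are hypothetical helper names, and the access-mode constants are the usual Linux values from open(2), written out so that the example does not depend on the host platform.

```python
# Linux values of the access-mode bits (see open(2)); hard-coded here
# as an assumption so the sketch behaves the same on any host.
O_ACCMODE = 0o3
ACCESS_MODES = {0o0: "O_RDONLY", 0o1: "O_WRONLY", 0o2: "O_RDWR"}


def parse_fdinfo(text):
    """Return (pos, flags) from the text of /proc/[pid]/fdinfo/N.

    pos is parsed as decimal and flags as octal, as described above.
    Any other fields present in the file are ignored.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return int(fields["pos"]), int(fields["flags"], 8)


def access_mode_name(flags):
    """Name the file access mode encoded in the low bits of flags."""
    return ACCESS_MODES.get(flags & O_ACCMODE, "unknown")


if __name__ == "__main__":
    sample = "pos:\t1000\nflags:\t01002002\n"
    pos, flags = parse_fdinfo(sample)
    print(pos, oct(flags), access_mode_name(flags))
```

For the sample shown in the text, the low two bits of 01002002 are 2, so the descriptor was opened for reading and writing.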
The files in this directory are readable only by the owner of the process.
# cat /proc/3828/io
rchar: 323934931
wchar: 323929600
syscr: 632687
syscw: 632675
read_bytes: 0
write_bytes: 323932160
cancelled_write_bytes: 0
The fields are as follows:
$ ls -l /proc/self/map_files/
lr--------. 1 root root 64 Apr 16 21:31 3252e00000-3252e20000 -> /usr/lib64/ld-2.15.so
...
Although these entries are present for memory regions that were mapped
with the MAP_FILE flag, the way anonymous shared memory (regions created
with the MAP_ANON | MAP_SHARED flags) is implemented in Linux means that
such regions also appear in this directory. Here is an example where the
target file is the deleted /dev/zero file:
lrw-------. 1 root root 64 Apr 16 21:33 7fc075d2f000-7fc075e6f000 -> /dev/zero (deleted)
This directory appears only if the CONFIG_CHECKPOINT_RESTORE kernel configuration option is enabled.
The format of the file is:
address           perms offset  dev   inode   pathname
00400000-00452000 r-xp 00000000 08:02 173521  /usr/bin/dbus-daemon
00651000-00652000 r--p 00051000 08:02 173521  /usr/bin/dbus-daemon
00652000-00655000 rw-p 00052000 08:02 173521  /usr/bin/dbus-daemon
00e03000-00e24000 rw-p 00000000 00:00 0       [heap]
00e24000-011f7000 rw-p 00000000 00:00 0       [heap]
...
35b1800000-35b1820000 r-xp 00000000 08:02 135522  /usr/lib64/ld-2.15.so
35b1a1f000-35b1a20000 r--p 0001f000 08:02 135522  /usr/lib64/ld-2.15.so
35b1a20000-35b1a21000 rw-p 00020000 08:02 135522  /usr/lib64/ld-2.15.so
35b1a21000-35b1a22000 rw-p 00000000 00:00 0
35b1c00000-35b1dac000 r-xp 00000000 08:02 135870  /usr/lib64/libc-2.15.so
35b1dac000-35b1fac000 ---p 001ac000 08:02 135870  /usr/lib64/libc-2.15.so
35b1fac000-35b1fb0000 r--p 001ac000 08:02 135870  /usr/lib64/libc-2.15.so
35b1fb0000-35b1fb2000 rw-p 001b0000 08:02 135870  /usr/lib64/libc-2.15.so
...
f2c6ff8c000-7f2c7078c000 rw-p 00000000 00:00 0    [stack:986]
...
7fffb2c0d000-7fffb2c2e000 rw-p 00000000 00:00 0   [stack]
7fffb2d48000-7fffb2d49000 r-xp 00000000 00:00 0   [vdso]
The address field is the address space in the process that the mapping occupies. The perms field is a set of permissions:
r = read
w = write
x = execute
s = shared
p = private (copy on write)
The offset field is the offset into the file or other backing object; dev is the device (major:minor); inode is the inode on that device. 0 indicates that no inode is associated with the memory region, as would be the case with BSS (uninitialized data).
The pathname field will usually be the file that is backing the mapping. For ELF files, you can easily coordinate with the offset field by looking at the Offset field in the ELF program headers (readelf -l).
There are additional helpful pseudo-paths:
Under Linux 2.0, there is no field giving pathname.
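As an illustration of the field layout described above, the following Python sketch splits one line of /proc/[pid]/maps into its fields. The Mapping tuple and parse_maps_line are hypothetical names introduced for this example; an absent pathname is represented as an empty string.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    start: int      # first address of the mapping
    end: int        # first address past the mapping
    perms: str      # e.g. "r-xp"
    offset: int     # offset into the backing file
    dev: str        # "major:minor"
    inode: int      # 0 if no inode is associated with the region
    pathname: str   # "" for anonymous mappings


def parse_maps_line(line):
    """Parse one line of /proc/[pid]/maps into a Mapping."""
    # The first five fields never contain white space. The pathname,
    # if present, is everything that follows and may contain spaces.
    parts = line.split(None, 5)
    addr, perms, offset, dev, inode = parts[:5]
    pathname = parts[5].rstrip("\n") if len(parts) > 5 else ""
    start, end = (int(x, 16) for x in addr.split("-"))
    return Mapping(start, end, perms, int(offset, 16), dev, int(inode), pathname)


if __name__ == "__main__":
    with open("/proc/self/maps") as f:   # Linux only
        for raw in f:
            m = parse_maps_line(raw)
            print(hex(m.start), m.end - m.start, m.perms, m.pathname)
```

The address and offset fields are hexadecimal, while the inode is decimal, so each is converted with its own base.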
36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue
(1)(2)(3)   (4)   (5)      (6)      (7)   (8) (9)   (10)         (11)
(1) mount ID: unique identifier of the mount (may be reused after umount(2)).
(2) parent ID: ID of parent mount (or of self for the top of the mount tree).
(3) major:minor: value of st_dev for files on filesystem (see stat(2)).
(4) root: root of the mount within the filesystem.
(5) mount point: mount point relative to the process’s root.
(6) mount options: per-mount options.
(7) optional fields: zero or more fields of the form "tag[:value]".
(8) separator: marks the end of the optional fields.
(9) filesystem type: name of filesystem in the form "type[.subtype]".
(10) mount source: filesystem-specific information or "none".
(11) super options: per-superblock options.
For more information on mount propagation see: Documentation/filesystems/sharedsubtree.txt in the Linux kernel source tree.
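Since the number of optional fields varies, a parser has to locate the separator before it can find the filesystem type, mount source, and super options. A minimal Python sketch follows; parse_mountinfo_line and its dictionary keys are illustrative names, and the sketch assumes that fields are separated by white space, with any white space inside paths escaped by the kernel.

```python
def parse_mountinfo_line(line):
    """Split one line of /proc/[pid]/mountinfo into named fields.

    The first six fields are fixed. Zero or more optional fields
    follow, terminated by a single "-" separator, after which come
    the filesystem type, mount source, and super options.
    """
    fields = line.split()
    sep = fields.index("-", 6)  # raises ValueError if the separator is missing
    return {
        "mount_id": int(fields[0]),
        "parent_id": int(fields[1]),
        "major_minor": fields[2],
        "root": fields[3],
        "mount_point": fields[4],
        "mount_options": fields[5].split(","),
        "optional_fields": fields[6:sep],
        "fs_type": fields[sep + 1],
        "mount_source": fields[sep + 2],
        "super_options": fields[sep + 3].split(","),
    }


if __name__ == "__main__":
    sample = ("36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - "
              "ext3 /dev/root rw,errors=continue")
    for key, value in parse_mountinfo_line(sample).items():
        print(key, value)
```

Searching for the separator from index 6 onward avoids mistaking an earlier field for it.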
device /dev/sda7 mounted on /home with fstype ext3 [statistics]
(       1      )            ( 2 )             (3 ) (4)
(1) The name of the mounted device (or "nodevice" if there is no corresponding device).
(2) The mount point within the filesystem tree.
(3) The filesystem type.
(4) Optional statistics and configuration information. Currently (as at Linux 2.6.26), only NFS filesystems export information via this field.
See namespaces(7) for more information.
The badness heuristic assigns a value to each candidate task ranging from 0 (never kill) to 1000 (always kill) to determine which process is targeted. The units are roughly a proportion along that range of allowed memory the process may allocate from, based on an estimation of its current memory and swap use. For example, if a task is using all allowed memory, its badness score will be 1000. If it is using half of its allowed memory, its score will be 500.
There is an additional factor included in the badness score: root processes are given 3% extra memory over other tasks.
The amount of "allowed" memory depends on the context in which the OOM-killer was called. If it is due to the memory assigned to the allocating task’s cpuset being exhausted, the allowed memory represents the set of mems assigned to that cpuset (see cpuset(7)). If it is due to a mempolicy’s node(s) being exhausted, the allowed memory represents the set of mempolicy nodes. If it is due to a memory limit (or swap limit) being reached, the allowed memory is that configured limit. Finally, if it is due to the entire system being out of memory, the allowed memory represents all allocatable resources.
The value of oom_score_adj is added to the badness score before it is used to determine which task to kill. Acceptable values range from -1000 (OOM_SCORE_ADJ_MIN) to +1000 (OOM_SCORE_ADJ_MAX). This allows user space to control the preference for OOM-killing, ranging from always preferring a certain task to completely disabling OOM-killing for it. The lowest possible value, -1000, is equivalent to disabling OOM-killing entirely for that task, since it will always report a badness score of 0.
Consequently, it is very simple for user space to define the amount of memory to consider for each task. Setting an oom_score_adj value of +500, for example, is roughly equivalent to allowing the remainder of tasks sharing the same system, cpuset, mempolicy, or memory controller resources to use at least 50% more memory. A value of -500, on the other hand, would be roughly equivalent to discounting 50% of the task’s allowed memory from being considered as scoring against the task.
For backward compatibility with previous kernels, /proc/[pid]/oom_adj can still be used to tune the badness score. Its value is scaled linearly with oom_score_adj.
Writing to /proc/[pid]/oom_score_adj or /proc/[pid]/oom_adj will change the other with its scaled value.
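The arithmetic described above can be illustrated with a deliberately simplified model. The Python sketch below is not the kernel's implementation: it ignores the 3% allowance for root processes, treats memory use as a single number, and clamps the result to the 0 to 1000 range as the description implies. The name badness_sketch is hypothetical.

```python
OOM_SCORE_ADJ_MIN = -1000
OOM_SCORE_ADJ_MAX = 1000


def badness_sketch(used_kb, allowed_kb, oom_score_adj=0):
    """Approximate the badness score described in the text.

    Simplified model (an assumption, not kernel code): the base score
    is the task's share of its allowed memory scaled to 0..1000,
    oom_score_adj is added, and the result is clamped to 0..1000.
    """
    if not OOM_SCORE_ADJ_MIN <= oom_score_adj <= OOM_SCORE_ADJ_MAX:
        raise ValueError("oom_score_adj must be in the range -1000..1000")
    if oom_score_adj == OOM_SCORE_ADJ_MIN:
        return 0  # OOM-killing is disabled for this task
    base = 1000 * used_kb // allowed_kb
    return max(0, min(1000, base + oom_score_adj))


if __name__ == "__main__":
    print(badness_sketch(50, 100))        # half of allowed memory in use
    print(badness_sketch(50, 100, -500))  # same task, discounted by 500
    print(badness_sketch(25, 100, 500))   # small task made more attractive
```

Under this model a task using half of its allowed memory with an adjustment of -500 scores 0, which matches the "discounting 50%" interpretation given above.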
- If set, the page is present in RAM.
- If set, the page is in swap space.
In a multithreaded process, the contents of this symbolic link are not available if the main thread has already terminated (typically by calling pthread_exit(3)).
00400000-0048a000 r-xp 00000000 fd:03 960637 /bin/bash
Size:                552 kB
Rss:                 460 kB
Pss:                 100 kB
Shared_Clean:        452 kB
Shared_Dirty:          0 kB
Private_Clean:         8 kB
Private_Dirty:         0 kB
Referenced:          460 kB
Anonymous:             0 kB
AnonHugePages:         0 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:                0 kB
The first of these lines shows the same information as is displayed for the mapping in /proc/[pid]/maps. The remaining lines show the size of the mapping, the amount of the mapping that is currently resident in RAM ("Rss"), the process’s proportional share of this mapping ("Pss"), the number of clean and dirty shared pages in the mapping, and the number of clean and dirty private pages in the mapping. "Referenced" indicates the amount of memory currently marked as referenced or accessed. "Anonymous" shows the amount of memory that does not belong to any file. "Swap" shows how much would-be-anonymous memory is also used, but out on swap.
The "KernelPageSize" entry is the page size used by the kernel to back a VMA. This matches the size used by the MMU in the majority of cases. However, one counter-example occurs on PPC64 kernels whereby a kernel using 64K as a base page size may still use 4K pages for the MMU on older processors. To distinguish the two, the kernel reports "MMUPageSize" as the page size used by the MMU.
The "Locked" entry indicates whether the mapping is locked in memory or not.
The "VmFlags" field lists the kernel flags associated with the particular virtual memory area as two-letter codes. The codes are the following:
rd - readable
wr - writable
ex - executable
sh - shared
mr - may read
mw - may write
me - may execute
ms - may share
gd - stack segment grows down
pf - pure PFN range
dw - disabled write to the mapped file
lo - pages are locked in memory
io - memory mapped I/O area
sr - sequential read advice provided
rr - random read advice provided
dc - do not copy area on fork
de - do not expand area on remapping
ac - area is accountable
nr - swap space is not reserved for the area
ht - area uses huge tlb pages
nl - non-linear mapping
ar - architecture specific flag
dd - do not include area into core dump
sd - soft-dirty flag
mm - mixed map area
hg - huge page advice flag
nh - no-huge page advice flag
mg - mergeable advice flag
The /proc/[pid]/smaps file is present only if the CONFIG_PROC_PAGE_MONITOR kernel configuration option is enabled.
The fields, in order, with their proper scanf(3) format specifiers, are:
The format for this field was %lu before Linux 2.6.
Before Linux 2.6, this was a scaled value based on the scheduler weighting given to this process.
The format for this field was %lu before Linux 2.6.
The format for this field was %lu before Linux 2.6.22.
size       (1) total program size (same as VmSize in /proc/[pid]/status)
resident   (2) resident set size (same as VmRSS in /proc/[pid]/status)
share      (3) shared pages (i.e., backed by a file)
text       (4) text (code)
lib        (5) library (unused in Linux 2.6)
data       (6) data + stack
dt         (7) dirty pages (unused in Linux 2.6)
$ cat /proc/$$/status
Name:   bash
State:  S (sleeping)
Tgid:   3515
Pid:    3515
PPid:   3452
TracerPid:      0
Uid:    1000    1000    1000    1000
Gid:    100     100     100     100
FDSize: 256
Groups: 16 33 100
VmPeak:     9136 kB
VmSize:     7896 kB
VmLck:         0 kB
VmHWM:      7572 kB
VmRSS:      6316 kB
VmData:     5224 kB
VmStk:        88 kB
VmExe:       572 kB
VmLib:      1708 kB
VmPTE:        20 kB
VmSwap:        0 kB
Threads:        1
SigQ:   0/3067
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000010000
SigIgn: 0000000000384004
SigCgt: 000000004b813efb
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: ffffffffffffffff
Seccomp:        0
Cpus_allowed:   00000001
Cpus_allowed_list:      0
Mems_allowed:   1
Mems_allowed_list:      0
voluntary_ctxt_switches:        150
nonvoluntary_ctxt_switches:     545
If the process is blocked, but not in a system call, then the file displays -1 in place of the system call number, followed by just the values of the stack pointer and program counter. If the process is not blocked, then the file contains just the string "running".
This file is present only if the kernel was configured with CONFIG_HAVE_ARCH_TRACEHOOK.
In a multithreaded process, the contents of the /proc/[pid]/task directory are not available if the main thread has already terminated (typically by calling pthread_exit(3)).
The uid_map file exposes the mapping of user IDs from the user namespace of the process pid to the user namespace of the process that opened uid_map (but see a qualification to this point below). In other words, processes that are in different user namespaces will potentially see different values when reading from a particular uid_map file, depending on the user ID mappings for the user namespaces of the reading processes.
Each line in the file specifies a 1-to-1 mapping of a range of contiguous user IDs between two user namespaces. The specification in each line takes the form of three numbers delimited by white space. The first two numbers specify the starting user ID in each user namespace. The third number specifies the length of the mapped range. In detail, the fields are interpreted as follows:
- The start of the range of user IDs in the user namespace of the process pid.
- The start of the range of user IDs to which the user IDs specified by field one map. How field two is interpreted depends on whether the process that opened uid_map and the process pid are in the same user namespace, as follows:
In order for a process to write to the /proc/[pid]/uid_map (/proc/[pid]/gid_map) file, the following requirements must be met:
(2^order) * PAGE_SIZE
The binary buddy allocator algorithm inside the kernel will split one chunk into two chunks of a smaller order (thus with half the size) or combine two contiguous chunks into one larger chunk of a higher order (thus with double the size) to satisfy allocation requests and to counter memory fragmentation. The order matches the column number, when starting to count at zero.
For example, on an x86_64 system:
Node 0, zone      DMA      1      1      1      0      2      1      1      0      1      1      3
Node 0, zone    DMA32     65     47      4     81     52     28     13     10      5      1    404
Node 0, zone   Normal    216     55    189    101     84     38     37     27      5      3    587
In this example, there is one node containing three zones and there are 11 different chunk sizes. If the page size is 4 kilobytes, then the first zone called DMA (on x86 the first 16 megabytes of memory) has 1 chunk of 4 kilobytes (order 0) available and has 3 chunks of 4 megabytes (order 10) available.
If the memory is heavily fragmented, the counters for higher order chunks will be zero and allocation of large contiguous areas will fail.
Further information about the zones can be found in /proc/zoneinfo.
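The chunk-size formula and the column-to-order correspondence described above can be applied mechanically. The following Python sketch parses one buddyinfo line and totals the free memory it represents; parse_buddyinfo_line and free_bytes are hypothetical helper names, and the 4096-byte default page size is an assumption that matches the example in the text.

```python
def parse_buddyinfo_line(line):
    """Return (node, zone, counts) for one line of /proc/buddyinfo.

    counts[order] is the number of free chunks of that order; the
    order matches the column number, counting from zero.
    """
    tokens = line.split()
    # Expected shape: "Node", "<n>,", "zone", "<name>", count, count, ...
    node = int(tokens[1].rstrip(","))
    zone = tokens[3]
    counts = [int(t) for t in tokens[4:]]
    return node, zone, counts


def free_bytes(counts, page_size=4096):
    """Total free memory, using chunk size = (2^order) * PAGE_SIZE."""
    return sum(n * (2 ** order) * page_size for order, n in enumerate(counts))


if __name__ == "__main__":
    sample = "Node 0, zone      DMA      1      1      1      0      2      1      1      0      1      1      3"
    node, zone, counts = parse_buddyinfo_line(sample)
    print(node, zone, free_bytes(counts) // 1024, "kB free")
```

On a live system the page size can be obtained with os.sysconf("SC_PAGE_SIZE") rather than assumed.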
ID: 1
signal: 60/00007fff86e452a8
notify: signal/pid.2634
ClockID: 0
ID: 0
signal: 60/00007fff86e452a8
notify: signal/pid.2634
ClockID: 1
The lines shown for each timer have the following meanings:
cat /lib/modules/$(uname -r)/build/.config
Incidentally, this file may be used by mount(8) when no filesystem is specified and it didn’t manage to determine the filesystem type. Then filesystems contained in this file are tried (except those that are marked with "nodev").
cache              buffer size in KB
capacity           number of sectors
driver             driver version
geometry           physical and logical geometry
identify           in hexadecimal
media              media type
model              manufacturer’s model number
settings           drive settings
smart_thresholds   in hexadecimal
smart_values       in hexadecimal
The hdparm(8) utility provides access to this information in a friendly format.
The total length of the file is the size of physical memory (RAM) plus 4KB.
Information in this file is retrieved with the dmesg(1) program.
0 - KPF_LOCKED
1 - KPF_ERROR
2 - KPF_REFERENCED
3 - KPF_UPTODATE
4 - KPF_DIRTY
5 - KPF_LRU
6 - KPF_ACTIVE
7 - KPF_SLAB
8 - KPF_WRITEBACK
9 - KPF_RECLAIM
10 - KPF_BUDDY
11 - KPF_MMAP (since Linux 2.6.31)
12 - KPF_ANON (since Linux 2.6.31)
13 - KPF_SWAPCACHE (since Linux 2.6.31)
14 - KPF_SWAPBACKED (since Linux 2.6.31)
15 - KPF_COMPOUND_HEAD (since Linux 2.6.31)
16 - KPF_COMPOUND_TAIL (since Linux 2.6.31)
17 - KPF_HUGE (since Linux 2.6.31)
18 - KPF_UNEVICTABLE (since Linux 2.6.31)
19 - KPF_HWPOISON (since Linux 2.6.31)
20 - KPF_NOPAGE (since Linux 2.6.31)
21 - KPF_KSM (since Linux 2.6.32)
22 - KPF_THP (since Linux 3.4)
For further details on the meanings of these bits, see the kernel source file Documentation/vm/pagemap.txt. Before kernel 2.6.29, KPF_WRITEBACK, KPF_RECLAIM, KPF_BUDDY, and KPF_LOCKED did not report correctly.
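A flags value can be turned back into flag names by testing each bit position in turn. The Python sketch below does this for a single 64-bit value; decode_kpageflags is a hypothetical helper, and the table assigns KPF_HUGE to bit 17, which is our understanding of the kernel header include/uapi/linux/kernel-page-flags.h and should be checked against the kernel source in use.

```python
# Bit positions of the page flags. KPF_HUGE is taken to be bit 17
# (an assumption based on the kernel's kernel-page-flags.h).
KPF_NAMES = {
    0: "KPF_LOCKED", 1: "KPF_ERROR", 2: "KPF_REFERENCED",
    3: "KPF_UPTODATE", 4: "KPF_DIRTY", 5: "KPF_LRU",
    6: "KPF_ACTIVE", 7: "KPF_SLAB", 8: "KPF_WRITEBACK",
    9: "KPF_RECLAIM", 10: "KPF_BUDDY", 11: "KPF_MMAP",
    12: "KPF_ANON", 13: "KPF_SWAPCACHE", 14: "KPF_SWAPBACKED",
    15: "KPF_COMPOUND_HEAD", 16: "KPF_COMPOUND_TAIL", 17: "KPF_HUGE",
    18: "KPF_UNEVICTABLE", 19: "KPF_HWPOISON", 20: "KPF_NOPAGE",
    21: "KPF_KSM", 22: "KPF_THP",
}


def decode_kpageflags(value):
    """Return the names of the flags set in value, lowest bit first."""
    return [name for bit, name in sorted(KPF_NAMES.items())
            if value & (1 << bit)]


if __name__ == "__main__":
    # A made-up value with bits 2, 3, 5, and 11 set.
    print(decode_kpageflags(0b100000101100))
```

Bits that are not in the table are silently ignored, so the function stays usable if newer kernels define additional flags.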
This 1GB is memory which has been "committed" to by the VM and can be used at any time by the allocating application. With strict overcommit enabled on the system (mode 2 in /proc/sys/vm/overcommit_memory), allocations which would exceed the CommitLimit will not be permitted. This is useful if one needs to guarantee that processes will not fail due to lack of memory once that memory has been successfully allocated.
IP address     HW type   Flags     HW address          Mask   Device
192.168.0.50   0x1       0x2       00:50:BF:25:68:F3   *      eth0
192.168.0.250  0x1       0xc       00:00:00:00:00:00   *      eth0
Here "IP address" is the IPv4 address of the machine and the "HW type" is the hardware type of the address from RFC 826. The flags are the internal flags of the ARP structure (as defined in /usr/include/linux/if_arp.h) and the "HW address" is the data link layer mapping for that IP address if it is known.
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo: 2776770   11307    0    0    0     0          0         0  2776770   11307    0    0    0     0       0          0
  eth0: 1215645    2751    0    0    0     0          0         0  1782404    4324    0    0    0   427       0          0
  ppp0: 1622270    5552    1    0    0     0          0         0   354130    5669    0    0    0     0       0          0
  tap0:    7714      81    0    0    0     0          0         0     7714      81    0    0    0     0       0          0
indx interface_name  dmi_u dmi_g dmi_address
2    eth0            1     0     01005e000001
3    eth1            1     0     01005e000001
4    eth2            1     0     01005e000001
sl  local_address rem_address   st tx_queue rx_queue tr rexmits  tm->when uid
 1: 01642C89:0201 0C642C89:03FF 01 00000000:00000001 01:000071BA 00000000 0
 1: 00000000:0801 00000000:0000 0A 00000000:00000000 00:00000000 6F000100 0
 1: 00000000:0201 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0
Num RefCount Protocol Flags    Type St Path
 0: 00000002 00000000 00000000 0001 03
 1: 00000001 00000000 00010000 0001 01 /dev/printer
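The local_address and rem_address columns above pair a hexadecimal IPv4 address with a hexadecimal port. The Python sketch below decodes one such field. It assumes, as is our understanding for IPv4 sockets, that the address is the 32-bit network-order value printed as a host-order integer, so that on a little-endian machine the bytes appear reversed; decode_ipv4_endpoint and its byteorder parameter are hypothetical names.

```python
import socket
import struct


def decode_ipv4_endpoint(field, byteorder="little"):
    """Decode an "ADDRESS:PORT" hex pair into (dotted_quad, port).

    byteorder names the byte order of the host that produced the
    file (an assumption of this sketch): "little" for x86 and most
    ARM systems, "big" for big-endian machines.
    """
    addr_hex, port_hex = field.split(":")
    fmt = "<I" if byteorder == "little" else ">I"
    packed = struct.pack(fmt, int(addr_hex, 16))
    return socket.inet_ntoa(packed), int(port_hex, 16)


if __name__ == "__main__":
    for field in ("01642C89:0201", "0C642C89:03FF", "00000000:0801"):
        print(field, "->", decode_ipv4_endpoint(field))
```

The port is a plain hexadecimal number and needs no byte swapping.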
Here "Num" is the kernel table slot number, "RefCount" is the number of users of the socket, "Protocol" is currently always 0, and "Flags" represents the internal kernel flags holding the status of the socket. Currently, type is always "1" (UNIX domain datagram sockets are not yet supported in the kernel). "St" is the internal state of the socket and "Path" is the bound path (if any) of the socket.
  1   4207     0  2 65535     0     0        0  1
 (1)   (2)    (3)(4)  (5)    (6)   (7)      (8)
- The ID of the queue. This matches what is specified in the --queue-num or --queue-balance options to the iptables(8) NFQUEUE target. See iptables-extensions(8) for more information.
- The netlink port id subscribed to the queue.
- The number of packets currently queued and waiting to be processed by the application.
- The copy mode of the queue. It is either 1 (metadata only) or 2 (also copy payload data to userspace).
- Copy range, i.e. how many bytes of packet payload should be copied to userspace at most.
- queue dropped. Number of packets that had to be dropped by the kernel because too many packets are already waiting for userspace to send back the mandatory accept/drop verdicts.
- queue user dropped. Number of packets that were dropped within the netlink subsystem. Such drops usually happen when the corresponding socket buffer is full, i.e. userspace is not able to read messages fast enough.
- sequence number. Every queued packet is associated with a (32-bit) monotonically-increasing sequence number. This shows the ID of the most recent packet queued.
This file has been deprecated in favor of a new /proc interface for PCI (/proc/bus/pci). It became optional in Linux 2.2 (available with CONFIG_PCI_OLD_PROC set at kernel compilation). It became once more nonoptionally enabled in Linux 2.4. Next, it was deprecated in Linux 2.6 (still available with CONFIG_PCI_LEGACY_PROC set), and finally removed altogether since Linux 2.6.17.
You can also write to some of the files to reconfigure the subsystem or switch certain features on or off.
The command
echo 'scsi add-single-device 1 0 5 0' > /proc/scsi/scsi
will cause host scsi1 to scan on SCSI channel 0 for a device on ID 5 LUN 0. If there is already a device known on this address or the address is invalid, an error will be returned.
Reading these files will usually show driver and host configuration, statistics, and so on.
Writing to these files allows different things on different hosts. For example, with the latency and nolatency commands, root can switch on and off command latency measurement code in the eata_dma driver. With the lockup and unlock commands, root can control bus lockups simulated by the scsi_debug driver.
cache-name num-active-objs total-objs object-size num-active-slabs total-slabs num-pages-per-slab
See slabinfo(5) for details.
echo 100000 > /proc/sys/fs/file-max
Privileged processes (CAP_SYS_ADMIN) can override the file-max limit.
Starting with Linux 2.4, there is no longer a static limit on the number of inodes, and this file is removed.
nr_inodes is the number of inodes the system has allocated. nr_free_inodes represents the number of free inodes.
preshrink is nonzero when nr_inodes > inode-max and the system needs to prune the inode list instead of allocating more; since Linux 2.4, this field is a dummy value (always zero).
- the target is a regular file;
- the target file does not have its set-user-ID permission bit enabled;
- the target file does not have both its set-group-ID and group-executable permission bits enabled; and
- the caller has permission to read and write the target file (either via the file’s permissions mask or because it has suitable capabilities).
# echo 'darkstar' > /proc/sys/kernel/hostname
# echo 'mydomain' > /proc/sys/kernel/domainname
has the same effect as
# hostname 'darkstar'
# domainname 'mydomain'
Note, however, that the classic darkstar.frop.org has the hostname "darkstar" and DNS (Internet Domain Name Server) domainname "frop.org", not to be confused with the NIS (Network Information Service) or YP (Yellow Pages) domainname. These two domain names are in general different. For a detailed discussion see the hostname(1) man page.
0 - disable sysrq completely
1 - enable all functions of sysrq
>1 - bit mask of allowed sysrq functions, as follows:
2 - enable control of console logging level
4 - enable control of keyboard (SAK, unraw)
8 - enable debugging dumps of processes etc.
16 - enable sync command
32 - enable remount read-only
64 - enable signaling of processes (term, kill, oom-kill)
128 - allow reboot/poweroff
256 - allow nicing of all real-time tasks
This file is present only if the CONFIG_MAGIC_SYSRQ kernel configuration option is enabled. For further details see the Linux kernel source file Documentation/sysrq.txt.
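Because values greater than 1 are a bit mask, a given setting can be decoded by testing each listed bit. The following Python sketch does so using the table above; describe_sysrq is a hypothetical helper and the description strings are shortened from the list.

```python
# Bit values and meanings taken from the list above.
SYSRQ_FUNCTIONS = {
    2: "control of console logging level",
    4: "control of keyboard (SAK, unraw)",
    8: "debugging dumps of processes etc.",
    16: "sync command",
    32: "remount read-only",
    64: "signaling of processes (term, kill, oom-kill)",
    128: "reboot/poweroff",
    256: "nicing of all real-time tasks",
}


def describe_sysrq(value):
    """Return a list describing what a sysrq setting permits."""
    if value == 0:
        return ["sysrq disabled completely"]
    if value == 1:
        return ["all sysrq functions enabled"]
    return [name for bit, name in sorted(SYSRQ_FUNCTIONS.items())
            if value & bit]


if __name__ == "__main__":
    # 176 = 128 + 32 + 16
    for line in describe_sysrq(176):
        print(line)
```

A mask is built the other way around by adding the desired values, for example 16 + 32 + 128 = 176 to permit sync, remount read-only, and reboot/poweroff.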
#5 Wed Feb 25 21:49:24 MET 1998
The "#5" means that this is the fifth kernel built from this source base and the date behind it indicates the time the kernel was built.
To free pagecache, use:
echo 1 > /proc/sys/vm/drop_caches
To free dentries and inodes, use:
echo 2 > /proc/sys/vm/drop_caches
To free pagecache, dentries and inodes, use:
echo 3 > /proc/sys/vm/drop_caches
Because writing to this file is a nondestructive operation and dirty objects are not freeable, the user should run sync(1) first.
The file has one of the following values:
This feature is active only on architectures/platforms with advanced machine check handling and depends on the hardware capabilities.
Applications can override the memory_failure_early_kill setting individually with the prctl(2) PR_MCE_KILL operation.
If this contains the value zero, this information is suppressed. On very large systems with thousands of tasks, it may not be feasible to dump the memory state information for each one. Such systems should not be forced to incur a performance penalty in OOM situations when the information may not be desired.
If this is set to nonzero, this information is shown whenever the OOM-killer actually kills a memory-hogging task.
The default value is 0.
If this is set to zero, the OOM-killer will scan through the entire tasklist and select a task based on heuristics to kill. This normally selects a rogue memory-hogging task that frees up a large amount of memory when killed.
If this is set to nonzero, the OOM-killer simply kills the task that triggered the out-of-memory condition. This avoids a possibly expensive tasklist scan.
If /proc/sys/vm/panic_on_oom is nonzero, it takes precedence over whatever value is used in /proc/sys/vm/oom_kill_allocating_task.
The default value is 0.
Only one of overcommit_kbytes or overcommit_ratio can have an effect: if overcommit_kbytes has a nonzero value, then it is used to calculate CommitLimit, otherwise overcommit_ratio is used. Writing a value to either of these files causes the value in the other file to be set to zero.
In mode 2 (available since Linux 2.6), the total virtual address space that can be allocated (CommitLimit in /proc/meminfo) is calculated as
CommitLimit = (total_RAM - total_huge_TLB) * overcommit_ratio / 100 + total_swap
where:
Since Linux 3.14, if the value in /proc/sys/vm/overcommit_kbytes is nonzero, then CommitLimit is instead calculated as:
CommitLimit = overcommit_kbytes + total_swap
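The two formulas can be combined into one small function. The Python sketch below is an illustration only; commit_limit_kb is a hypothetical name, all quantities are assumed to be in kilobytes, and integer division stands in for whatever rounding the kernel applies.

```python
def commit_limit_kb(total_ram_kb, total_huge_tlb_kb, total_swap_kb,
                    overcommit_ratio, overcommit_kbytes=0):
    """Compute CommitLimit for overcommit mode 2, in kilobytes.

    If overcommit_kbytes is nonzero it takes precedence over
    overcommit_ratio, as described in the text.
    """
    if overcommit_kbytes:
        return overcommit_kbytes + total_swap_kb
    return ((total_ram_kb - total_huge_tlb_kb) * overcommit_ratio // 100
            + total_swap_kb)


if __name__ == "__main__":
    # Made-up figures: 8 GB of RAM, no huge pages, 2 GB of swap.
    print(commit_limit_kb(8000000, 0, 2000000, overcommit_ratio=50))
    print(commit_limit_kb(8000000, 0, 2000000, overcommit_ratio=50,
                          overcommit_kbytes=1000000))
```

With the made-up figures above, a ratio of 50 gives a limit of 6000000 kB, while setting overcommit_kbytes to 1000000 gives 3000000 kB regardless of the ratio.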
If this file is set to the value 0, the kernel’s OOM-killer will kill some rogue process. Usually, the OOM-killer is able to kill a rogue process and the system will survive.
If this file is set to the value 1, then the kernel normally panics when out-of-memory happens. However, if a process limits allocations to certain nodes using memory policies (mbind(2) MPOL_BIND) or cpusets (cpuset(7)) and those nodes reach memory exhaustion status, one process may be killed by the OOM-killer. No panic occurs in this case: because other nodes’ memory may be free, this means the system as a whole may not have reached an out-of-memory situation yet.
If this file is set to the value 2, the kernel always panics when an out-of-memory condition occurs.
The default value is 0. The values 1 and 2 are intended for failover in clustered systems; select either according to your failover policy.
If enabled in the kernel (CONFIG_TIMER_STATS), but not used, it has almost zero runtime overhead and a relatively small data-structure overhead. Even if collection is enabled at runtime, overhead is low: all the locking is per-CPU and lookup is hashed.
The /proc/timer_stats file is used both to control the sampling facility and to read out the sampled information.
The timer_stats functionality is inactive on bootup. A sampling period can be started using the following command:
# echo 1 > /proc/timer_stats
The following command stops a sampling period:
# echo 0 > /proc/timer_stats
The statistics can be retrieved by:
$ cat /proc/timer_stats
While sampling is enabled, each readout from /proc/timer_stats will see newly updated statistics. Once sampling is disabled, the sampled information is kept until a new sample period is started. This allows multiple readouts.
Sample output from /proc/timer_stats:
$ cat /proc/timer_stats
Timer Stats Version: v0.3
Sample period: 1.764 s
Collection: active
  255,     0 swapper/3        hrtimer_start_range_ns (tick_sched_timer)
   71,     0 swapper/1        hrtimer_start_range_ns (tick_sched_timer)
   58,     0 swapper/0        hrtimer_start_range_ns (tick_sched_timer)
    4,  1694 gnome-shell      mod_delayed_work_on (delayed_work_timer_fn)
   17,     7 rcu_sched        rcu_gp_kthread (process_timeout)
...
    1,  4911 kworker/u16:0    mod_delayed_work_on (delayed_work_timer_fn)
   1D,  2522 kworker/0:0      queue_delayed_work_on (delayed_work_timer_fn)
1029 total events, 583.333 events/sec
Linux version 1.0.9 (quinlan@phaze) #1 Sat May 14 01:51:54 EDT 1994
This manual page is incomplete, possibly inaccurate, and is the kind of thing that needs to be updated very often.
The Linux kernel source files: Documentation/filesystems/proc.txt, Documentation/sysctl/fs.txt, Documentation/sysctl/kernel.txt, Documentation/sysctl/net.txt, and Documentation/sysctl/vm.txt.