UNIX Process Overview (Inside a Linux Process, and Types of Process)

by Himanshu Arora on February 10, 2012

A process is a running instance of a program. In this article we used two terms ‘program’ and ‘running instance’. Suppose we run a program simultaneously 5 times, then corresponding to each instance there will be a process running in the system. So we say that a process is a “running instance” of a program.

As you already know, you can use ps command to view the processes running on your system. For effective use of the ps command, refer to 7 Practical PS Command Examples for Process Monitoring.

1. Peeping Inside a Process

Now, since we are clear with what exactly a process is, lets dig a bit deeper to see what a process consists of. A Unix process can be thought of as a container which contains:

Program Instructions

Program instructions are kept in text segments which are executed by CPU. Usually for programs like text editors which are executed frequently the text segment is shared. This segment has read only privileges which means that a program cannot modify its text segment.

Data

Mostly the data is kept in data segment. Data segment can be classified into initialized data segment and uninitialized data segment. As the name suggest, initialized data segment contains those global variables which are initialized before hand while uninitialized data segment (also known as ‘BSS’ segment) contains uninitialized global variables. Also, static variables are stored in data segment.

Local variables which are local to functions are stored on stack. Stack is particular to a function and besides containing the information about local variables it also contains information about the address where the flow will return once the execution of function is done. Stack also contains information about the callers environment, like some of the machine registers are also stored on stack. A function which is called allocates memory for its local variables and temporary variables on stack itself. In case of recursive function an independent stack for each function call exists.

Then there is data which is stored on heap. This memory for this data is allocated on runtime on heap segment. Heap segment is not local to a process but shared across processes. This is the reason why C programmers worry a lot about memory leaks which are caused on heap segment and may affect other processes on the system.

Command line arguments and environment variables

A process also contains room for storing environment variables and the command line arguments that we pass to the program. Usually the vector containing the command line information is stored here and then the address of this vector of information and number of elements in vector is copied to ‘argv’ and ‘argc’ (the two arguments to ‘main()’ function).

Besides the above information, a process also contains information like

State of its I/O
Its priority and other control information

One of the most important control information for a process is the privileges. A process directly inherits all the privileges of the user who has triggered this process. For example a process triggered by user who does not have superuser privileges cannot do stuff that require root privileges while a process triggered by root can do any thing that it is programmed to do. An exception to the above rule is where a process can acquire greater privileges than the user who triggered it if the setuid or setgid bit is set for that particular process. But we will not go into much detail about it here(refer to the man pages of setuid and setgid for more information on this).

2. Background and foreground processes

As we already discussed that we can start a process by its name in Unix. Like some standard programs ‘ls’, ‘ps’ etc can be started by just typing their name on the shell prompt. There are two ways in which we can start a process

Starting in foreground
Starting in background

Suppose there is a utility that consumes some time and does a count. Lets say the the name of the utility is ‘count’ Now to trigger and run the program in foreground, I run the following command (where ‘count’ is the name of the binary from the code above) :

$ ./count
Counting done

So we see that, after running the binary ‘./count’, it took almost 10 seconds before the output was displayed on stdout and until then the shell was occupied by this process only. ie You could not perform any other operation on the same shell. Now, to trigger a process in background, add ‘&’ at the end of the command:

$ ./count &
[1] 4120

$ # Do some work on shell while the above program is working in the background

$ Counting done

The ampersand ‘&’ sign indicates that this process needs to be run as a background process. By running a background process, we can have access to the shell for doing any further operations. Like, in the output above, after running the binary ‘count’ in background, I used a couple of more commands on the same shell and when the binary ‘count’ was done with its processing, the output was thrown back on the same shell(the last line). So we can conclude that by default every process runs in foreground, receives input(if any) from keyboard and returns output to the user. While a background process is one which gets disconnected from the keyboard and user can use the same shell to do more operations.

For more information on foreground and background processes refer to: How to Manage UNIX Background Jobs

3. Types of process

So we see that process is a concept that is fundamental to an operating system. Almost every activity on an OS takes form of a process to do some stuff. There are different types of processes running on a system, some of them are :

Child processes

A process that is created by some other process during run-time. Usually child processes are created to execute some binary from within an existing process. Child processes are created using fork() system call. Normally process are made to run through shell/terminal. In that case the shell becomes the parent and the executed process becomes the child process. On Unix/Linux each process has a parent except the init process(we will learn about this later).

Daemon Processes

These are special processes that run in background. They are system related process that have no associated terminal. These processes run will root permissions and usually provide services to processes. As we already know that a daemon process does not have an attached terminal, well to achieve this the process has to be detached from the terminal. The ideal way on Linux/Unix to do this is to run a process through terminal and from within this process create another process and then terminate the parent process. Since the parent is terminated so now the child will become independent of the terminal and would be taken over by init process and hence would become a daemon process. A typical example would be a mail daemon that waits for the arrival of e-mails and notify when a mail is received.

Orphan processes

Usually a process creates a child process (as described above) and when the child process terminates, a signal is issued to the parent so that parent can do all the stuff that it is required to do when one of the child gets terminated. But there are situations when parent gets killed. In that case the child processes become orphan and then taken under by the init process. Though the init process takes the ownership of the orphan process but still these process are called as orphan as their original parents no longer exists.

Zombie process

When a child process gets terminated or completes its execution, then its entry in the process table remains until the parent process fetches the status information of the terminated child. So, until then the terminated process enters zombie state and is known as zombie process. When a process is terminated then all the memory and resources associated with the process are released but the entry of the process in process table exists. A signal SIGCHILD is send to the parent of the process (that just terminated). Typically, the handler of this signal in the parent executes a ‘wait’ call that fetches the exit status of the terminated process and then the entry of this zombie process from the process table is also removed.

4. The init process

As we discussed earlier, init process is the 5th stage in the 6 Stage of Linux Boot Process.

You would be cognizant of the famous ‘chicken and egg’ theory regarding who came first. In terms of processes, as each process has a parent process, the same question can be asked about parent or child process. Well, fortunately there is an answer here. The answer is the init process that is started as a first process during boot sequence. That means there is no parent of init process. Lets verify it, since PID of init is ‘1’, we use the ps command :

So we see from the output that PPID is 0, which means that there is no parent for this process.

$ ps -l 1
F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY        TIME CMD
4 S     0     1     0  0  80   0 -  5952 poll_s ?          0:00 /sbin/init

The following are some of the commands that deal with processes: Top command, HTOP Command, PS command, Kill (pkill, xkill) command.

Add your comment

If you enjoyed this article, you might also like..

Comments on this entry are closed.

Ganesh b February 10, 2012, 7:35 am

thanks for sharing such a nice
informative document. keep it
up.

∞
BalaC February 13, 2012, 3:40 am

@Himanshu: “But there are situations when parent gets killed. In that case the child processes become orphan”. Can you provide some example

∞
kumarbka February 13, 2012, 9:23 am

Thanks for the share and nice info…

∞
Himanshu Arora February 13, 2012, 10:29 am

@BalaC : Creating a Daemon process is one example where the parent process is killed, child becomes orphan and finally taken over by the init process.

∞
pupu February 14, 2012, 5:16 am

Nice article. Just one note – many daemons novadays run under different user than root. They are usually started with root UID , but they just do whatever requires root priviledges (read config file, bind to priviledged port) and then switch to another, less priviledged user.

∞
Amit Dhiman August 31, 2013, 4:36 am

Thanx For such an informational Article . I have gone through the article , and found some really good knowledge .

There is an ambiguity , you are saying that heap memory for all the processes is shared , means allocated from a single source of huge heap memory .But i read at many places that . every process has its own heap memory .
you can follow this link.

∞
Sam October 29, 2013, 10:15 pm

Good job guys, good job.

∞

Next post: UNIX Kernel: Reentrant Kernels, Synchronization and Critical Sections

Previous post: How To Install Apache Hadoop Pseudo Distributed Mode on a Single Node