Linux Memory Management – Swapping, Caches and Shared VM

by Himanshu Arora on February 27, 2012

This article is part of our ongoing UNIX kernel overview series.

In the previous article of the kernel series, we discussed Linux virtual memory and demand paging.

Though virtual memory and demand paging are the building blocks of the Linux memory management system, there are various other concepts that make Linux memory management very powerful.

In this article we will touch on some of these concepts: swapping, caching and shared virtual memory.

I. Linux Swapping

Suppose a process needs one of its virtual pages to be brought into physical memory, but physical memory has no room for any more pages.

What will happen in this case?

Well, the OS has to make room for this page in physical memory. For this to happen, a page already residing in physical memory needs to be discarded.

Now, if the page to be discarded is from an executable image or data file and has not been written to, it can simply be discarded. Whenever it is required again, the same page can be brought back into physical memory from the same executable image or data file.

But let's suppose the page the OS is going to discard has been written to. This kind of page is known as a dirty page.

A dirty page has to be preserved so that it can be used at some later stage. When dirty pages are removed from physical memory, they are saved in a special file known as a swap file. This is known as swapping.
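The discard-versus-swap decision described above can be sketched in a few lines of Python. This is only a toy model and not kernel code: the `Page` class, the `evict` helper and the dictionary standing in for a swap file are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Toy model of a physical page; not a kernel structure."""
    number: int
    dirty: bool = False        # set when the page has been written to
    file_backed: bool = True   # backed by an executable image or data file


def evict(page, swap_file):
    """Free a page and report whether it was discarded or swapped."""
    if page.file_backed and not page.dirty:
        # Clean, file-backed page: it can be re-read from its file later.
        return "discard"
    # Dirty page: its contents exist only in memory, so save them first.
    swap_file[page.number] = f"contents of page {page.number}"
    return "swap"


swap_file = {}  # stands in for a swap file on disk
print(evict(Page(1), swap_file))              # discard
print(evict(Page(2, dirty=True), swap_file))  # swap
```

Only the second call touches the swap file, which is the expensive path the OS tries to avoid.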

Accessing a swapped page takes a very long time compared to the speed of the processor.

So the OS needs a good swapping algorithm to decide between discarding and swapping. An inefficient swap algorithm may lead to a phenomenon in which the OS is so busy writing pages to swap files and reading them back into physical memory that it devotes very little time to the real work that processes demand. This phenomenon is known as thrashing.

The set of pages that a process is continuously using is known as its working set. A good swap algorithm would rarely let the OS get into thrashing and would also try to keep the working set of every process in physical memory.
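To see why the working set matters, here is a small simulation that counts page faults for a sequence of page accesses. It assumes a strict least-recently-used replacement policy and a made-up access pattern; the `count_faults` function is a hypothetical helper and not something Linux provides.

```python
from collections import OrderedDict


def count_faults(references, frames):
    """Count page faults for a reference string under LRU replacement."""
    resident = OrderedDict()  # pages in memory, least recently used first
    faults = 0
    for page in references:
        if page in resident:
            resident.move_to_end(page)    # now the most recently used
            continue
        faults += 1
        if len(resident) >= frames:
            resident.popitem(last=False)  # evict the least recently used
        resident[page] = True
    return faults


working_set = [0, 1, 2, 3] * 25       # 100 accesses cycling over 4 pages
print(count_faults(working_set, 4))   # 4
print(count_faults(working_set, 3))   # 100
```

With four frames the whole working set fits and only the first access to each page faults. With three frames every single access faults, which is thrashing in miniature.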

Linux decides which pages are to be kept in memory and which pages are to be removed using a ‘Least Recently Used’ (LRU) scheme.

In this scheme, each page in physical memory has an age associated with it. The age changes depending on whether the page is being accessed or not. A page that is accessed frequently is considered young, while a page that is not being accessed grows older. Older pages are preferred when choosing what to swap out or discard from physical memory.
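The aging idea can be illustrated with a toy model like the one below. It is a sketch under simple assumptions (age is a plain counter that is reset on access), and the `AgedPages` class is a hypothetical name; the real kernel bookkeeping is more involved.

```python
class AgedPages:
    """Toy page-aging model: a larger age means longer since last access."""

    def __init__(self):
        self.age = {}  # page name -> age in ticks

    def add(self, page):
        self.age[page] = 0

    def touch(self, page):
        # An accessed page becomes young again.
        self.age[page] = 0

    def tick(self):
        # Every page grows older by one tick.
        for page in self.age:
            self.age[page] += 1

    def pick_victim(self):
        # The oldest page is the preferred candidate for swapping/discarding.
        victim = max(self.age, key=self.age.get)
        del self.age[victim]
        return victim


pages = AgedPages()
for name in ("A", "B", "C"):
    pages.add(name)
pages.tick()
pages.touch("A")
pages.tick()
pages.touch("C")
print(pages.pick_victim())  # B
```

Page B is never touched, so it ends up the oldest and is chosen first.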

II. Caches

Faster processors and operating systems are continually being developed to get the most out of a system. Beyond raw speed, one thing that makes the processor, the operating system and their interaction faster is the use of caches.

Some of the important caches in Linux are described below.

1. Linux Swap Cache

As discussed above, only dirty pages are swapped, since only modified pages need to be preserved. Now suppose a page was modified and swapped out, and was later brought back into physical memory. If it needs to be swapped out again but has not been modified in the meantime, there is no need to write it to the swap file again. The page can simply be discarded, because this version of the page is already in the swap file. This saves a good amount of time that would otherwise have been wasted.

To implement this, Linux makes use of a swap cache.

A swap cache is nothing but a list of page table entries with one entry per physical page.
Each entry corresponds to a swapped-out page and records which swap file holds the page and its exact location within that swap file.
If a page table entry in the swap cache is non-zero, it represents a page that is held in a swap file and has not been modified since.
If a page has an entry in the swap cache and is then modified, its entry is removed from the swap cache.
This way the cache contains information only about pages that have not been modified since they were last swapped.

So we see that the swap cache helps a lot in increasing the efficiency of the swapping mechanism.
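The bookkeeping described above can be sketched as follows. This is a simplified model, not the kernel's data structure: the `SwapCache` class, its method names and the swap file name `swapfile0` are all hypothetical, and a Python dictionary replaces the list of page table entries.

```python
class SwapCache:
    """Toy swap cache: page -> swap slot, kept only while the copy is current."""

    def __init__(self):
        self.entries = {}   # page -> (swap_file, offset)
        self.next_slot = 0
        self.writes = 0     # number of (expensive) writes to the swap file

    def swap_in(self, page, swap_file, offset):
        # Page brought back from swap: remember where its unchanged copy lives.
        self.entries[page] = (swap_file, offset)

    def mark_modified(self, page):
        # The page was written to, so the copy in the swap file is stale.
        self.entries.pop(page, None)

    def swap_out(self, page, swap_file="swapfile0"):
        if page in self.entries:
            # Unmodified since the last swap: just discard, no write needed.
            return self.entries[page]
        self.writes += 1
        slot = (swap_file, self.next_slot)
        self.next_slot += 1
        return slot


cache = SwapCache()
slot = cache.swap_out(7)   # first swap: the page is written out
cache.swap_in(7, *slot)    # brought back and not modified
cache.swap_out(7)          # the existing copy is reused
print(cache.writes)        # 1
```

Calling `mark_modified(7)` before the second `swap_out` would remove the entry and force a second write.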

2. Hardware Cache

As discussed in the previous article, a processor reads page table entries to convert a virtual address to a physical address. Usually a processor stores the information from page table entries in a hardware cache.

This hardware cache consists of Translation Look-aside Buffers, or TLBs.

Whenever a processor needs to translate a virtual address, it tries to fetch the page table entry information from the TLBs. If it finds the entry, it proceeds; if it does not, it tells the OS that a TLB miss has occurred and asks the OS to fix things up.

A TLB miss is delivered to the OS through an exception mechanism that is processor dependent. (Some processors, x86 among them, refill the TLB in hardware by walking the page tables themselves and involve the OS only when no valid page table entry exists.) The OS finds the correct entry and updates the TLB with it. When the exception is cleared (after the OS fixes the problem), the processor searches the TLBs again and this time finds a valid entry.
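The lookup, miss and refill sequence can be modelled roughly like this. The page size of 4096 bytes, the two-entry capacity and the oldest-first eviction are assumptions made for the example, and the `TLB` class is hypothetical; a real TLB is a hardware structure.

```python
PAGE_SIZE = 4096  # assumed page size for the example


class TLB:
    """Toy TLB: a small cache of virtual page number -> physical frame number."""

    def __init__(self, page_table, capacity=2):
        self.page_table = page_table  # the full table maintained by the "OS"
        self.capacity = capacity
        self.entries = {}             # cached translations
        self.misses = 0

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.entries:
            # TLB miss: the "OS" looks up the page table and refills the TLB.
            self.misses += 1
            if len(self.entries) >= self.capacity:
                # Evict the oldest inserted entry (dicts keep insertion order).
                self.entries.pop(next(iter(self.entries)))
            self.entries[vpn] = self.page_table[vpn]
        # The retried lookup now finds a valid entry.
        return self.entries[vpn] * PAGE_SIZE + offset


tlb = TLB({0: 5, 1: 9})
print(hex(tlb.translate(0x1010)))  # 0x9010
tlb.translate(0x1020)              # same page: served from the TLB
print(tlb.misses)                  # 1
```

The second translation hits the cached entry, so the page table is consulted only once.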

3. Linux Buffer Cache

A buffer cache contains data buffers that the block device drivers use.

A block device is one that operates on blocks of data, i.e. it can be accessed by reading or writing fixed-size chunks, or blocks, of data. The buffer cache is indexed so that a block can be found quickly; the device identifier is used for indexing.

The buffer cache makes reading and writing efficient and fast. Consider a block device such as a hard disk. Going to the disk for every read or write would be quite expensive. The buffer cache sits in between and saves time: reads and writes are done against the cache, and the cache takes care of the rest.
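A minimal read-side sketch of this idea is shown below. It assumes the cache is keyed by the device identifier together with a block number, which is a choice made for this example. `BufferCache` and `fake_disk` are hypothetical names, and `fake_disk` merely stands in for a real block device driver.

```python
class BufferCache:
    """Toy buffer cache keyed by (device identifier, block number)."""

    def __init__(self, read_block_from_device):
        self.read_block_from_device = read_block_from_device  # slow path
        self.buffers = {}
        self.device_reads = 0

    def read(self, device, block):
        key = (device, block)
        if key not in self.buffers:
            # Cache miss: go to the device once and keep the block in memory.
            self.device_reads += 1
            self.buffers[key] = self.read_block_from_device(device, block)
        return self.buffers[key]


def fake_disk(device, block):
    # Stand-in for a real block device driver (hypothetical).
    return f"data from {device} block {block}".encode()


cache = BufferCache(fake_disk)
cache.read("sda", 42)
cache.read("sda", 42)      # served from memory
print(cache.device_reads)  # 1
```

Write handling (marking buffers dirty and flushing them later) is left out to keep the sketch short.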

To view swap, memory, page, block I/O, traps, disks and CPU activity, you can use tools like vmstat or sar.
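On Linux, some of the same memory and swap figures are also exposed as plain text in /proc/meminfo. The sketch below parses text in that format; `parse_meminfo` is a hypothetical helper and the `SAMPLE` values are made up for illustration rather than taken from a real machine.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields in /proc/meminfo are reported in kB.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


# Made-up sample in the /proc/meminfo format.
SAMPLE = """\
MemTotal:        8000000 kB
Cached:          2000000 kB
SwapTotal:       1000000 kB
SwapFree:         750000 kB
"""

info = parse_meminfo(SAMPLE)
print(info["SwapTotal"] - info["SwapFree"])  # 250000
```

On a Linux system the same function can be fed the real file, for example `parse_meminfo(open("/proc/meminfo").read())`.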

III. Shared Virtual Memory

When code is written, developers take great care that no piece of code is unnecessarily repeated. For example, functions are used in programs so that the same piece of code can be called from anywhere within the program.

Groups of commonly used functions are collected into libraries. This is where shared memory comes in: a library is loaded into memory once and can then be used by multiple processes.

Virtual memory makes it easy for processes to share memory, because physical addresses are mapped through page tables, and the same physical page frame number can appear in the page tables of multiple processes. This concept is known as shared virtual memory.
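The mapping can be pictured with two toy page tables that point at the same frame. Everything here is illustrative: the frame number, the virtual page numbers and the `read_page` helper are made up, and Python dictionaries stand in for real page tables.

```python
# Frame number -> contents of that physical page frame.
physical_memory = {12: bytearray(b"libc code")}

# Each process has its own page table: virtual page number -> frame number.
page_table_a = {3: 12}
page_table_b = {8: 12}  # a different virtual page, the same physical frame


def read_page(page_table, vpn):
    """Return the physical frame that a virtual page maps to."""
    return physical_memory[page_table[vpn]]


# Both processes see the very same frame, not two copies of it.
print(read_page(page_table_a, 3) is read_page(page_table_b, 8))  # True

# A change made through one mapping is visible through the other.
read_page(page_table_a, 3)[0:4] = b"LIBC"
print(bytes(read_page(page_table_b, 8)))  # b'LIBC code'
```

The two processes use different virtual page numbers, yet only one copy of the data exists in physical memory.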


Bilal MK February 27, 2012, 3:52 am

Nice article…

Frank Chang February 29, 2012, 11:48 am

Very good article!! : )

Anonymous March 1, 2012, 3:20 am

Could you please provide tutorial on memory swapping, paging with the examples.
It will help us to understand the concept clearly and we can use that in our daily practices.

Thanks,
Manoj_K

gus3 March 1, 2012, 6:18 am

The TLB is not the only resident in hardware cache. Once a physical address is known for fetching, the hardware next looks in either the instruction (I) cache, if the fetch is requesting code, or the data (D) cache, if the fetch is requesting an instruction operand. If the contents of the physical address have been fetched recently, the appropriate cache will send the requested contents back to the CPU, much more quickly than a fetch from system RAM could supply.

If the requested contents are not present in the cache, then the contents must be fetched from the system RAM. Most of the time, this happens automatically (e.g. the x86 family), but it isn’t a requirement. The old Alpha processor would actually manage the caches and RAM fetching from software.
