载入中。。。

载入中。。。

公告

载入中。。。

文章

评论

留言

连接

   流媒体世界

信息

登陆

搜索

2006-10-11 18:04:00
Linux MMU summary(4)

 
4. Page Swap and Cache
 
A user process page is kept in either the page cache or the swap cache, the kernel tries to keep as much as pages remain in the memory, so that page faults can be serviced quickly, but, physical memory is always limited resources, thus, swap cache can be a better-than-none replacement.
 
Logically, the cache is a layer between the kernel memory management code and the disk I/O code. Practically, the cache here means, only freeing the memory has to be necessary, we swap the pages to disk, otherwize, keeping the pages as long as they can.
 
The kernel maintains several page lists which comprise the whole page cache:
1. active_list: pages on the list have page->age > 0, may be clean or dirty, and may be mapped by process page-table entries.
2. inactive_dirty_list: pages on the list have page->age == 0, may be clean or dirty, and are not mapped by any process PTE.
3. inactive_clean_list: each zone has its own inactive_clean_list, which contains clean pages whose age == 0, not mapped by any process PTE.
 
When a page fault is occurred, the kernel first looks for the fault page in the page cache, if found, it can be moved directly to the active_list.
 
Overall, the kernel performs the page replacement more like "not recently used", rather than strict LRU. Furthermore, each page either for an executable image or mmap()ed file is associated with a per-inode cache, allowing the disk file to be used as backing storage for the page. For those anonymous pages(i.e. the pages like malloc()ed memory) are assigned an entry in the system swap file, and those pages are maintained in the swap cache.
 
The life cycle of a user page:
 
1. Page loading, there are three ways that can lead to a page loaded from the disk.
    a. The process attempts to access the page, it is then read in by the page-fault handler and added to the page cache and to the process page table.
    b. The page is read in during a swap read-ahead operation. The reason is simple as a cluster of blocks on disk is easy to read sequentially.
    c. The page is read in during a mmap cluster read-ahead operation, in which case a sequential of adjacent pages following the fault page in a mmaped file is read.
 
2. when the page is written by the process, it becomes dirty. At this point, the page is still on the active_list.
 
3. suppose the page is not used for a while, thus the page->age count is gradually reduced by the periodic invocations(actually, the prime is refill_inactive()) of the kernel swap daemon kswapd(). The frequence of kswapd() invocations will increase as the memory pressure increases.
 
4. if the physical memory is tight, kswapd() will call swap_out() to try to evict pages from the process's virtual address space.Since the page hasn't been referenced and has age 0, the PTE will be dropped, and the only remaining reference to the page is the one resulting from its presence in the page cache (assuming, of course, that no other process has mapped the file in the meantime). swap_out() does not actually swap the page out; rather, it simply removes the process's reference to the page, and depends upon the page cache and swap machinery to ensure the page gets written to disk if necessary. (If a PTE has been referenced when swap_out() examines it, the mapped page is aged up - made younger - rather than being unmapped.) 
 
5. Time passes... a little or a lot, depending on memory demand.
 
6. refill_inactive_scan() comes along, trying to find pages that can be moved to the inactive_dirty list. Since the page is not mapped by any process and has age 0, it is moved from the active_list to the inactive_dirty list.
 
7. Process A attempts to access the page, but it's not present in the process VM since the PTE has been cleared by swap_out(). The fault handler calls __find_page_nolock() to try to locate the page in the page cache, and lo and behold, it's there, so the PTE can be immediately restored, and the page is moved to the active_list, where it remains as long as it is actively used by the process.
 
8. More time passes... swap_out() clears the process 's PTE for the page, refill_inactive_scan() deactivates the page, moving it to the inactive_dirty list.
 
9. More time passes... memory gets low.
 
10. page_launder() is invoked to clean some dirty pages. It finds P on the inactive_dirty_list, notices that it's actually dirty, and attempts to write it out to the disk. When the page has been written, it can then be moved to the inactive_clean_list. The following sequence of events occurs when page_launder() actually decides to write out a page:
 
    * Lock the page.

    * We determine the page needs to be written, so we call the writepage method of the page's mapping. That call invokes some filesystem-specific code to perform an asynchronous write to disk with the page locked. At that point, page_launder() is finished with the page: it remains on the inactive_dirty_list, and will be unlocked once the async write completes.

    * Next time page_launder() is called it will find the page clean and move it to the inactive_clean_list, assuming no process has found it in the pagecache and started using it in the meantime.     

 

11. page_launder() runs again, finds the page unused and clean, and moves it to the inactive_clean_list of the page's zone.

 

12. An attempt is made by someone to allocate a single free page from the page's zone. Since the request is for a single page, it can be satisfied by reclaiming an inactive_clean page; The page is chosen for reclamation. reclaim_page() removes the page from the page cache (thereby ensuring that no other process will be able to gain a reference to it during page fault handling), and it is given to the caller as a free page.

Or:

kreclaimd comes along trying to create free memory. It reclaims the page and then frees it.

 

Note that this is only one possible sequence of events: a page can live in the page cache for a long time, aging, being deactivated, being recovered by processes during page fault handling and thereby reactivated, aging, being deactivated, being laundered, being recovered and reactivated...

Pages can be recovered from the inactive_clean and active lists as well as from the inactive_dirty list. Read-only pages, naturally, are never dirty, so page_launder() can move them from the inactive_dirty_list to the inactive_clean_list "for free," so to speak.

Pages on the inactive_clean list are periodically examined by the kreclaimd kernel thread and freed. The purpose of this is to try to produce larger contiguous free memory blocks, which are needed in some situations.

Finally, note that the page is in essence a logic page, though of course it is instantiated by some particular physical page.

* This summary is heavily based on the Joseph Knapka's summary, most of the words cann't be packed any more, I just copied down and put here for reference. What a great summary, thanks!



  • 标签:Linux MMU 
  • 发表评论:
    载入中。。。
    Powered by Oblog.