path: root/fs/f2fs/checkpoint.c
Age | Commit message | Author
2014-04-02 | f2fs: use list_for_each_entry{_safe} for simplifying code | Chao Yu
This patch uses list_for_each_entry{_safe} instead of list_for_each{_safe} to simplify the code.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
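
As an illustration of the kind of simplification this commit describes, here is a minimal userspace sketch. The list helpers are simplified stand-ins for the kernel's <linux/list.h> macros, and struct orphan_entry and the sum_inos_* functions are hypothetical names chosen for the example, not code from the patch. It relies on the GCC/Clang __typeof__ extension.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's <linux/list.h> helpers. */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define list_for_each(pos, head) \
	for (pos = (head)->next; pos != (head); pos = pos->next)

#define list_for_each_entry(pos, head, member)				\
	for (pos = container_of((head)->next, __typeof__(*pos), member);	\
	     &pos->member != (head);					\
	     pos = container_of(pos->member.next, __typeof__(*pos), member))

static void list_init(struct list_head *head)
{
	head->next = head->prev = head;
}

static void list_add_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Hypothetical entry type embedding a list node. */
struct orphan_entry {
	unsigned int ino;
	struct list_head list;
};

/* Before: walk raw list nodes and convert each one by hand. */
static unsigned int sum_inos_open_coded(struct list_head *head)
{
	struct list_head *this;
	unsigned int sum = 0;

	list_for_each(this, head) {
		struct orphan_entry *e =
			container_of(this, struct orphan_entry, list);
		sum += e->ino;
	}
	return sum;
}

/* After: the iterator yields the containing entry directly. */
static unsigned int sum_inos_entry(struct list_head *head)
{
	struct orphan_entry *e;
	unsigned int sum = 0;

	list_for_each_entry(e, head, list)
		sum += e->ino;
	return sum;
}

/* Usage example: both variants visit the same entries. */
static void demo_list_iteration(void)
{
	struct list_head head;
	struct orphan_entry a = { .ino = 3 }, b = { .ino = 4 };

	list_init(&head);
	list_add_tail(&a.list, &head);
	list_add_tail(&b.list, &head);
	assert(sum_inos_open_coded(&head) == 7);
	assert(sum_inos_entry(&head) == 7);
}
```

The entry-based form removes the temporary list_head cursor and the explicit container_of() call from every loop body, which is where the readability gain comes from.
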
2014-04-02 | f2fs: avoid free slab cache under spinlock | Chao Yu
Move kmem_cache_free out of the spinlock protection region for better performance.
Change log from v1:
 o remove spinlock protection for kmem_cache_free in destroy_node_manager, as suggested by Jaegeuk Kim.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-20 | f2fs: call f2fs_wait_on_page_writeback instead of native function | Jaegeuk Kim
If a page is under writeback, f2fs can deadlock inside writepages. This is caused by f2fs merging IOs internally, so when this case is detected, let's submit the merged IOs, which is what f2fs_wait_on_page_writeback does.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-18 | f2fs: introduce nr_pages_to_write for segment alignment | Jaegeuk Kim
This patch introduces nr_pages_to_write to align page writes to the segment or other operational unit size, which can be tuned according to the system environment.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-18 | f2fs: increase pages_skipped when skipping writepages | Jaegeuk Kim
This patch increases pages_skipped when skipping writepages.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-18 | f2fs: avoid small data writes by skipping writepages | Jaegeuk Kim
This patch introduces nr_pages_to_skip(sbi, type) to determine whether writepages can be skipped. The dentry, node, and meta pages can be controlled by F2FS without breaking the FS consistency.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-18 | f2fs: introduce get_dirty_dents for readability | Jaegeuk Kim
get_dirty_dents gives us the number of dirty dentry pages.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-10 | f2fs: remove the unused ctor argument of f2fs_kmem_cache_create() | Gu Zheng
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-28 | f2fs: fix dirty page accounting when redirty | Chao Yu
We should de-account the dirty counters for a page when it is redirtied in ->writepage().
Wu Fengguang described this in 'commit 971767caf632190f77a40b4011c19948232eed75':
"writeback: fix dirtied pages accounting on redirty
De-account the accumulative dirty counters on page redirty.
Page redirties (very common in ext4) will introduce mismatch between counters (a) and (b)
a) NR_DIRTIED, BDI_DIRTIED, tsk->nr_dirtied
b) NR_WRITTEN, BDI_WRITTEN
This will introduce systematic errors in balanced_rate and result in dirty page position errors (ie. the dirty pages are no longer balanced around the global/bdi setpoints)."
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-27 | f2fs: readahead contiguous SSA blocks for f2fs_gc | Chao Yu
If there are multiple segments in one section, f2fs_gc reads their SSA blocks, which have contiguous addresses, one by one. This may hurt performance, so let's read ahead the SSA blocks by merging multiple read requests.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17 | f2fs: show counts of checkpoint in status | Changman Lee
This patch shows the checkpoint counts in the f2fs status.
Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17 | f2fs: introduce ra_meta_pages to readahead CP/NAT/SIT pages | Chao Yu
This patch helps us clean up the readahead code by merging the ra_{sit,nat}_pages functions into ra_meta_pages. Additionally, the new function is used to read ahead cp blocks in recover_orphan_inodes.
Change log from v1:
 o fix an infinite loop bug pointed out by Jaegeuk Kim.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17 | f2fs: clean up redundant function call | Jaegeuk Kim
This patch integrates inode_[inc|dec]_dirty_dents with inc_page_count to remove redundant calls.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17 | f2fs: fix f2fs_write_meta_page at no checkpoint status | Jaegeuk Kim
If f2fs has entered an erroneous checkpoint status, it should skip writing meta pages instead of redirtying them. Otherwise, the partition cannot be unmounted even though f2fs is in read-only status.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-22 | f2fs: introduce NODE_MAPPING for code consistency | Jaegeuk Kim
This patch adds NODE_MAPPING, which is similar to META_MAPPING introduced by Gu Zheng.
Cc: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-22 | f2fs: remove the orphan block page array | Gu Zheng
Since orphan_blocks may be as large as 504, storing such a large array on the kernel stack is neither safe nor rigorous, as Dan Carpenter said. In fact, grab_meta_page has locked the page in the page cache, and we can use find_get_page() to fetch the page safely downstream, so we can remove the page array directly.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-22 | f2fs: add help function META_MAPPING | Gu Zheng
Introduce the helper function META_MAPPING() to get the address space of the cached meta blocks.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-14 | f2fs: use spinlock rather than mutex for better speed | Gu Zheng
With the 2 previous changes, all the long-running operations are moved out of the protection region, so here we can use a spinlock rather than a mutex (orphan_inode_mutex) for lower overhead.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-14 | f2fs: move alloc new orphan node out of lock protection region | Gu Zheng
Move the allocation of a new orphan node out of the lock protection region.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-14 | f2fs: move grabbing orphan pages out of protection region | Gu Zheng
Move grabbing the orphan block pages out of the protection region, and grab all the orphan block pages ahead of time.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
[Jaegeuk Kim: remove unnecessary code pointed out by Chao Yu]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-26 | f2fs: convert max_orphans to a field of f2fs_sb_info | Gu Zheng
Previously, we needed to calculate the maximum orphan number every time we tried to acquire an orphan inode, but it is a stable value once the super block has been initialized. So converting it to a field of f2fs_sb_info and using it directly when needed seems a better choice.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 | f2fs: write dirty meta pages collectively | Jaegeuk Kim
This patch enhances writing dirty meta pages collectively in the background. During file data writes, it is better to avoid writing small numbers of dirty meta pages frequently. So let's give it a chance to collect a number of dirty meta pages for a while.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 | f2fs: refactor bio->rw handling | Jaegeuk Kim
This patch introduces f2fs_io_info to mitigate the complex parameter list.
struct f2fs_io_info {
	enum page_type type;	/* contains DATA/NODE/META/META_FLUSH */
	int rw;			/* contains R/RS/W/WS */
	int rw_flag;		/* contains REQ_META/REQ_PRIO */
}
1. f2fs_write_data_pages
 - DATA
 - WRITE_SYNC is set when wbc->WB_SYNC_ALL.
2. sync_node_pages
 - NODE
 - WRITE_SYNC all the time
3. sync_meta_pages
 - META
 - WRITE_SYNC all the time
 - REQ_META | REQ_PRIO all the time
 ** f2fs_submit_merged_bio() handles META_FLUSH.
4. ra_nat_pages, ra_sit_pages, ra_sum_pages
 - META
 - READ_SYNC
Cc: Fan Li <fanofcode.li@samsung.com>
Cc: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 | f2fs: add unlikely() macro for compiler more aggressively | Jaegeuk Kim
This patch adds the unlikely() macro to most of the code. The basic rule is to add it when:
 - checking unusual errors,
 - checking page mappings,
 - and the other unlikely conditions.
Change log from v1:
 - Don't add unlikely for the NULL test and error test: advised by Andi Kleen.
Cc: Chao Yu <chao2.yu@samsung.com>
Cc: Andi Kleen <andi@firstfloor.org>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 | f2fs: add unlikely() macro for compiler optimization | Chao Yu
As we know, some of our branch conditions will rarely be true. So we can add 'unlikely' to let the compiler optimize this code; this way we can drop unneeded 'jump' instructions to improve performance.
change log:
 o add *unlikely* in as many places as possible across all the source files at once, as suggested by Jaegeuk Kim.
Suggested-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
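
For readers unfamiliar with the annotation, the following is a small userspace sketch of how likely()/unlikely() are commonly defined on top of the GCC/Clang __builtin_expect builtin. The function check_block_addr and its parameters are hypothetical and only illustrate an "unusual error" check of the kind these commits describe; the real f2fs call sites are not reproduced here.

```c
#include <assert.h>

/* Common definitions built on the GCC/Clang branch-prediction hint. */
#define likely(x)	__builtin_expect(!!(x), 1)
#define unlikely(x)	__builtin_expect(!!(x), 0)

/*
 * Hypothetical range check: an out-of-range block address is treated as
 * a rare error, so the hint lets the compiler lay out the common path
 * as the straight-line fall-through.
 */
static int check_block_addr(unsigned long blk_addr, unsigned long max_addr)
{
	if (unlikely(blk_addr >= max_addr))
		return -1;	/* rare error path */
	return 0;		/* common path */
}

/* Usage example: the hint never changes the result of the condition. */
static void demo_unlikely(void)
{
	assert(check_block_addr(5, 10) == 0);
	assert(check_block_addr(10, 10) == -1);
	assert(unlikely(42) == 1);
	assert(unlikely(0) == 0);
}
```

The macro only affects code layout and static branch prediction, so a wrong hint costs performance but never correctness.
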
2013-12-23 | f2fs: refactor bio-related operations | Jaegeuk Kim
This patch integrates redundant bio operations on read and write IOs.
1. Move bio-related codes to the top of data.c.
2. Replace f2fs_submit_bio with f2fs_submit_merged_bio, which handles read bios additionally.
3. Introduce __submit_merged_bio to submit the merged bio.
4. Change f2fs_readpage to f2fs_submit_page_bio.
5. Introduce f2fs_submit_page_mbio to integrate previous submit_read_page and submit_write_page.
Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 | f2fs: convert recover_orphan_inodes to void | Chao Yu
recover_orphan_inodes() never returns an error, so we don't need to check its return value.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
[Jaegeuk Kim: add description]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 | f2fs: avoid to calculate incorrect max orphan number | Chao Yu
Because we write node summaries when do_checkpoint runs with the umount flag, the maximum number of orphan blocks should additionally be reduced by NR_CURSEG_NODE_TYPE.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Shu Tan <shu.tan@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 | f2fs: bug fix on bit overflow from 32bits to 64bits | Jaegeuk Kim
This patch fixes some bit overflows by the shift operations.
Dan Carpenter reported potential bugs on bit overflows as follows.
fs/f2fs/segment.c:910 submit_write_page() warn: should 'blk_addr << ((sbi)->log_blocksize - 9)' be a 64 bit type?
fs/f2fs/checkpoint.c:429 get_valid_checkpoint() warn: should '1 << ()' be a 64 bit type?
fs/f2fs/data.c:408 f2fs_readpage() warn: should 'blk_addr << ((sbi)->log_blocksize - 9)' be a 64 bit type?
fs/f2fs/data.c:457 submit_read_page() warn: should 'blk_addr << ((sbi)->log_blocksize - 9)' be a 64 bit type?
fs/f2fs/data.c:525 get_data_block_ro() warn: should 'i << blkbits' be a 64 bit type?
Bug-Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
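
The class of bug behind these warnings can be reproduced in a few lines of userspace C. This is a sketch, not the patched kernel code: the function names are hypothetical, and it assumes the usual platforms where int is 32 bits wide. The expression shape mirrors 'blk_addr << (log_blocksize - 9)' from the warnings, which converts a block address into a 512-byte sector number.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Buggy form: the shift is evaluated in 32 bits, so the high bits are
 * lost before the result is widened to 64 bits for the return value.
 */
static uint64_t block_to_sector_32bit_shift(uint32_t blk_addr,
					    unsigned int log_blocksize)
{
	return blk_addr << (log_blocksize - 9);
}

/* Fixed form: widen the operand first, then shift in 64 bits. */
static uint64_t block_to_sector_64bit_shift(uint32_t blk_addr,
					    unsigned int log_blocksize)
{
	return (uint64_t)blk_addr << (log_blocksize - 9);
}

/* Usage example with 4 KiB blocks (log_blocksize = 12, shift by 3). */
static void demo_shift_overflow(void)
{
	uint32_t blk_addr = 0x20000000u;	/* 2^29 */

	/* 2^29 << 3 = 2^32, which does not fit in 32 bits and wraps to 0. */
	assert(block_to_sector_32bit_shift(blk_addr, 12) == 0);
	assert(block_to_sector_64bit_shift(blk_addr, 12) == 0x100000000ULL);
}
```

Small block addresses give identical results in both forms, which is why this kind of bug tends to show up only on large volumes.
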
2013-12-23 | f2fs: fix a potential out of range issue | Gu Zheng
Fix a potential out of range issue introduced by commit:
22fb72225a f2fs: simplify write_orphan_inodes for better readable
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 | f2fs: send REQ_META or REQ_PRIO when reading meta area | Changman Lee
Let's send REQ_META or REQ_PRIO when reading meta area such as NAT/SIT etc.
Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 | f2fs: move the list_head initialization into the lock protection region | Gu Zheng
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 | f2fs: simplify write_orphan_inodes for better readable | Gu Zheng
Simplify write_orphan_inodes for better readability. Because we hold the orphan_inode_mutex, it is safe to use list_for_each_entry instead of list_for_each_safe.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-11-08 | f2fs: cleanup waiting routine for writeback pages in cp | Changman Lee
Use the general method supported by the kernel.
 o changes from v1
   If any waiter exists at end io, wake it up.
Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-10-29 | f2fs: add an option to avoid unnecessary BUG_ONs | Jaegeuk Kim
If you want to remove unnecessary BUG_ONs, you can just turn off F2FS_CHECK_FS in your kernel config.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-10-25 | f2fs: add tracepoint for set_page_dirty | Jaegeuk Kim
This patch adds a tracepoint for set_page_dirty.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-10-25 | f2fs: use bool for booleans | Haicheng Li
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-10-25 | f2fs: clean up several status-related operations | Jaegeuk Kim
This patch cleans up improper definitions that update some status information.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-10-22 | f2fs: introduce f2fs_kmem_cache_alloc to hide the unfailed, kmem cache allocation | Gu Zheng
Introduce the unfailed version of kmem_cache_alloc, named f2fs_kmem_cache_alloc, to hide the retry routine and make the code a bit cleaner.
v2:
 Fix the wrong use of the 'retry' tag pointed out by Gao feng.
 Use neater code to remove the redundant tag, as suggested by Haicheng Li.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
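
The "unfailed" wrapper idea can be sketched in userspace as a loop that retries until the allocator succeeds. Everything named here is an assumption for illustration: alloc_nofail, the alloc_fn callback type, and flaky_alloc are hypothetical, and sched_yield() merely stands in for whatever rescheduling point the kernel helper uses between attempts. The actual body of f2fs_kmem_cache_alloc is not shown in this log.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator callback so the retry logic can be exercised. */
typedef void *(*alloc_fn)(size_t size);

/*
 * Retry until the allocation succeeds, yielding the CPU between
 * attempts. Callers never see NULL, so they need no retry label of
 * their own. Note that this loops forever if the allocator can never
 * succeed.
 */
static void *alloc_nofail(alloc_fn alloc, size_t size)
{
	void *entry;

	for (;;) {
		entry = alloc(size);
		if (entry)
			return entry;
		sched_yield();	/* stand-in for a kernel reschedule point */
	}
}

/* Test allocator that fails on its first two calls, then uses malloc. */
static int flaky_calls;

static void *flaky_alloc(size_t size)
{
	if (++flaky_calls <= 2)
		return NULL;
	return malloc(size);
}

/* Usage example: the caller sees one successful call, not the retries. */
static void demo_alloc_nofail(void)
{
	void *p;

	flaky_calls = 0;
	p = alloc_nofail(flaky_alloc, 16);
	assert(p != NULL);
	assert(flaky_calls == 3);
	free(p);
}
```

Hiding the loop in one helper is what lets each call site drop its own retry tag, which matches the cleanup the commit message describes.
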
2013-10-18 | f2fs: avoid to write during the recovery | Jaegeuk Kim
This patch enhances the recovery routine not to write any data/node/meta until its completion. If any writes are sent to the disk, it could contaminate the written history that will be used for further recovery.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-10-18 | f2fs: avoid wait if IO end up when do_checkpoint for better performance | Gu Zheng
Previously, do_checkpoint() called congestion_wait() to wait for the previously submitted node/meta/data pages to be written back. congestion_wait() waits for a fixed period (e.g. HZ / 50), and there was no additional wake-up mechanism if the IO finished before that period elapsed. Yuan Zhong found a situation in which the pages had already been written back, but the checkpoint thread was still waiting for congestion_wait() to return.
So here we store the checkpoint task in f2fs_sb when doing a checkpoint. The task waits for IO completion if there is IO in flight, and the end-IO path wakes up the checkpoint task when the IO finishes.
Thanks to Yuan Zhong for his earlier work on this problem.
Reported-by: Yuan Zhong <yuan.mark.zhong@samsung.com>
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-10-07 | f2fs: use rw_sem instead of fs_lock(locks mutex) | Gu Zheng
The fs_locks are used to block other operations (e.g., recovery) while a checkpoint is in progress, and every other operation (besides checkpoint) needs to acquire an fs_lock. This causes a serious problem: if too many threads acquire fs_lock concurrently, they block each other, which may lead to performance problems. Although some optimization patches were introduced to improve the usage of fs_lock, the thorough solution is to replace fs_lock with a *rw_sem*. The checkpoint routine takes write_sem and other operations take read_sem, so we can still block other operations (e.g., recovery) while doing a checkpoint, and other operations no longer disturb each other. This avoids the problem described above completely.
Because of the weakness of rw_sem, the above change may introduce a potential problem: the checkpoint thread might get starved if other threads are intensively locking the read semaphore for I/O. (Pointed out by Xu Jin.) To avoid this, a wait_list is introduced: pending read semaphore operations are put on the wait_list if the checkpoint thread is waiting for the write semaphore, and are woken up when the checkpoint thread releases the write semaphore.
Thanks to Kim for the previous review and testing; performance tests of this patch from others would be very welcome.
V2:
 -fix the potential starvation problem.
 -use more suitable func name suggested by Xu Jin.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
[Jaegeuk Kim: adjust minor coding standard]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-09-25 | f2fs: don't let the orphan inode counter underflow | Russ W. Knize
Accounting errors from buggy code calling the acquire/release/remove orphan inode interfaces can cause n_orphans to underflow, which will then cause acquire_orphan_inode() to return -ENOSPC on the next operation. This commit guards against that condition.
Signed-off-by: Russ Knize <rknize@motorola.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
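
To make the failure mode concrete: an unsigned counter decremented at zero wraps to a huge value, so the next capacity check sees the limit as exceeded. The sketch below is a userspace model under assumptions. struct orphan_counter, acquire_orphan, and release_orphan are hypothetical names, and returning -EINVAL on an unbalanced release is one possible guard, not necessarily the one this commit chose.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of the orphan accounting state. */
struct orphan_counter {
	unsigned int n_orphans;
	unsigned int max_orphans;
};

/* Reserve one orphan slot, or fail if the limit has been reached. */
static int acquire_orphan(struct orphan_counter *c)
{
	if (c->n_orphans >= c->max_orphans)
		return -ENOSPC;
	c->n_orphans++;
	return 0;
}

/*
 * Release one slot. Without the zero check, an unbalanced release would
 * wrap n_orphans to UINT_MAX and every later acquire would return
 * -ENOSPC.
 */
static int release_orphan(struct orphan_counter *c)
{
	if (c->n_orphans == 0)
		return -EINVAL;	/* guard: refuse to underflow */
	c->n_orphans--;
	return 0;
}

/* Usage example: a stray release no longer poisons later acquires. */
static void demo_orphan_counter(void)
{
	struct orphan_counter c = { .n_orphans = 0, .max_orphans = 2 };

	assert(release_orphan(&c) == -EINVAL);
	assert(c.n_orphans == 0);
	assert(acquire_orphan(&c) == 0);
	assert(acquire_orphan(&c) == 0);
	assert(acquire_orphan(&c) == -ENOSPC);
	assert(release_orphan(&c) == 0);
	assert(c.n_orphans == 1);
}
```
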
2013-08-09 | f2fs: introduce cur_cp_version function to reduce code size | Jaegeuk Kim
This patch introduces a new inline function, cur_cp_version, to reduce redundant code.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-07-30 | f2fs: fix handling orphan inodes | Jaegeuk Kim
This patch fixes mishandling of the sbi->n_orphans variable.
If users request lots of f2fs_unlink() calls, check_orphan_space() could be contended. In such a case, sbi->n_orphans can be read incorrectly, so that f2fs_unlink() falls into the wrong state, which results in the failure of add_orphan_inode().
So, let's increment sbi->n_orphans virtually prior to the actual orphan inode handling. After that, let's release sbi->n_orphans by calling release_orphan_inode or remove_orphan_inode.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-07-30 | f2fs: use list_for_each rather than list_for_each_safe, in remove_orphan_inode() | Gu Zheng
As we remove only the single target node, list_for_each is enough; to clean up further, we use list_for_each_entry instead.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-07-02 | f2fs: fix crc endian conversion | Jaegeuk Kim
While calculating CRC for the checkpoint block, we use __u32, but when storing the crc value to the disk, we use __le32. Let's fix the inconsistency.
Reported-and-Tested-by: Oded Gabbay <ogabbay@advaoptical.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
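
The distinction between a CPU-order value and its little-endian on-disk form can be shown with a portable userspace sketch. The helpers put_le32 and get_le32 are hypothetical stand-ins written for this example; kernel code would use conversions such as cpu_to_le32() and le32_to_cpu() instead, and the exact change made by this commit is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Store a CPU-order 32-bit value as little-endian bytes, independent of
 * the host byte order.
 */
static void put_le32(unsigned char *dst, uint32_t v)
{
	dst[0] = (unsigned char)(v & 0xff);
	dst[1] = (unsigned char)((v >> 8) & 0xff);
	dst[2] = (unsigned char)((v >> 16) & 0xff);
	dst[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Load little-endian bytes back into a CPU-order 32-bit value. */
static uint32_t get_le32(const unsigned char *src)
{
	return (uint32_t)src[0] |
	       ((uint32_t)src[1] << 8) |
	       ((uint32_t)src[2] << 16) |
	       ((uint32_t)src[3] << 24);
}

/*
 * Usage example: a CRC computed in CPU order is converted on store and
 * converted back on load before being compared.
 */
static void demo_crc_endian(void)
{
	unsigned char on_disk[4];
	uint32_t crc = 0x12345678u;	/* placeholder value, not a real CRC */

	put_le32(on_disk, crc);
	assert(on_disk[0] == 0x78 && on_disk[3] == 0x12);
	assert(get_le32(on_disk) == crc);
}
```

On a little-endian host the conversion is a no-op, so mixing the two representations goes unnoticed there and only misbehaves on big-endian machines.
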
2013-06-07 | f2fs: fix iget/iput of dir during recovery | Jaegeuk Kim
It is possible that iput is skipped after iget during the recovery.
In recover_dentry(),
 dir = f2fs_iget();
 ...
 if (de && inode->i_ino == le32_to_cpu(de->ino))
	goto out;
In this case, this dir is not able to be added in dirty_dir_inode_list. The actual linking is done only when set_page_dirty() is called.
So let's add this newly got inode into the list explicitly, and put it at the end of the recovery routine.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28 | f2fs: fix incorrect iputs during the dentry recovery | Jaegeuk Kim
- iget/iput flow in the dentry recovery process
1. *dir* = f2fs_iget
2. set FI_DELAY_IPUT to *dir*
3. add *dir* to the dirty_dir_list
 - __f2fs_add_link
 - recover_dentry
4. iput *dir* by remove_dirty_dir_inode
 - sync_dirty_dir_inodes
 - write_checkpoint
If *dir*'s i_count is not 1 (i.e., root dir), remove_dirty_dir_inode is called later and then iput is triggered again due to the FI_DELAY_IPUT flag. So, let's unset the flag properly once iput is triggered.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28 | f2fs: iput only if whole data blocks are flushed | Jaegeuk Kim
If some unwritten blocks remain from the recovery, we should not call iput on that directory inode. Otherwise, we can lose some dentry blocks after the recovery.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>