author Linus Torvalds <torvalds@linux-foundation.org> 2026-06-22 18:44:48 -0700
committer Linus Torvalds <torvalds@linux-foundation.org> 2026-06-22 18:44:48 -0700
commit 502d801f0ab03e4f32f9a33d203154ce84887921
tree 8dd98de794f62fae7a0a5117ed232c6edc478fe2
parent 4708cac0e22cfd217f48f7cec3c35e5922efcccd
parent 803d09a554055aba160a62abd1e4b1260b899dc1

Merge tag 'erofs-for-7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs updates from Gao Xiang:
 "The most notable change is the removal of the fscache backend: it has
  been deprecated for almost two years, mainly because EROFS file-backed
  mounts and fanotify pre-content hooks (together with erofs-utils) now
  provide better functionality and simpler codebase. In addition,
  fscache has depended on netfslib for years, which is undesirable for
  EROFS since it is a local filesystem. More details in [1].

  In addition, sparse support has been added to the pcluster layout,
  which is helpful for large sparse AI datasets, and map requests for
  chunk-based inodes have been optimized to be more efficient as well.

  There are also the usual fixes and cleanups.

  Summary:

   - Report more consecutive chunks of the same type for each iomap
     request

   - Add sparse support for the pcluster layout

   - Update the EROFS documentation overview

   - Remove the deprecated fscache backend

   - Various fixes and cleanups"

Link: https://lore.kernel.org/r/20260622013622.934174-1-hsiangkao@linux.alibaba.com [1]

* tag 'erofs-for-7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: handle 48-bit blocks_hi for compressed inodes
  erofs: remove fscache backend entirely
  erofs: simplify RCU read critical sections
  erofs: add sparse support to pcluster layout
  erofs: add folio order to trace_erofs_read_folio
  erofs: introduce erofs_map_chunks()
  erofs: call erofs_exit_ishare() before rcu_barrier()
  erofs: update the overview of the documentation
  erofs: clean up erofs_ishare_fill_inode()

-rw-r--r--  Documentation/filesystems/erofs.rst  138
-rw-r--r--  fs/erofs/Kconfig                      21
-rw-r--r--  fs/erofs/Makefile                      1
-rw-r--r--  fs/erofs/data.c                      135
-rw-r--r--  fs/erofs/erofs_fs.h                    2
-rw-r--r--  fs/erofs/fscache.c                   664
-rw-r--r--  fs/erofs/inode.c                       7
-rw-r--r--  fs/erofs/internal.h                   72
-rw-r--r--  fs/erofs/ishare.c                     47
-rw-r--r--  fs/erofs/super.c                      98
-rw-r--r--  fs/erofs/zdata.c                      38
-rw-r--r--  fs/erofs/zmap.c                       33
-rw-r--r--  include/trace/events/erofs.h           9
13 files changed, 231 insertions, 1034 deletions
diff --git a/Documentation/filesystems/erofs.rst b/Documentation/filesystems/erofs.rst
index fe06308e546c..4230884fb359 100644
--- a/Documentation/filesystems/erofs.rst
+++ b/Documentation/filesystems/erofs.rst
@@ -7,83 +7,90 @@ EROFS - Enhanced Read-Only File System
Overview
========
-EROFS filesystem stands for Enhanced Read-Only File System. It aims to form a
-generic read-only filesystem solution for various read-only use cases instead
-of just focusing on storage space saving without considering any side effects
-of runtime performance.
-
-It is designed to meet the needs of flexibility, feature extendability and user
-payload friendly, etc. Apart from those, it is still kept as a simple
-random-access friendly high-performance filesystem to get rid of unneeded I/O
-amplification and memory-resident overhead compared to similar approaches.
-
-It is implemented to be a better choice for the following scenarios:
-
- - read-only storage media or
-
- - part of a fully trusted read-only solution, which means it needs to be
+EROFS (Enhanced Read-Only File System) is a modern, efficient, and secure
+read-only kernel filesystem designed for various use cases including immutable
+system images, container images, application sandbox images, and dataset
+distribution.
+
+An immutable image filesystem can be regarded as an enhanced archive format
+which allows golden images to be built once and mounted everywhere -- images are
+bit-for-bit identical across all deployments and can be verified, audited, or
+shared without concerns about runtime modifications (in this model, all user
+writes should be redirected into another trusted filesystem, for example, via
+overlayfs for copy-on-write-style redirection, by design).
+
+EROFS is a dedicated implementation of the image filesystem idea above, with a
+flexible, hierarchical on-disk design so that needed features can be enabled on
+demand. Filesystem data in the core format is strictly block-aligned in order
+to perform optimally on all kinds of storage media, including block devices and
+memory-backed devices. The on-disk format is easy to parse and purposely avoids
+the unnecessary metadata redundancy found in generic writable filesystems, which
+can suffer from extra inconsistency issues -- making it ideal for security
+auditing and untrusted remote access. In addition, designs such as inline data,
+inline/shared extended attributes, and optimized (de)compression provide better
+space efficiency while maintaining high performance.
+
+In short, EROFS aims to be a better fit for the following scenarios:
+
+ - As part of a secure immutable storage solution, where it needs to be
immutable and bit-for-bit identical to the official golden image for
- their releases due to security or other considerations and
-
- - hope to minimize extra storage space with guaranteed end-to-end performance
- by using compact layout, transparent file compression and direct access,
- especially for those embedded devices with limited memory and high-density
- hosts with numerous containers.
+ each individual copy, in order to meet security, data sharing, and/or
+ other requirements;
-Here are the main features of EROFS:
+ - Minimizing storage overhead with guaranteed end-to-end performance
+ by using compact (meta)data layout, optimized transparent data compression,
+ deduplication and direct access, especially for those embedded devices with
+ limited memory and high-density hosts with numerous containers.
- - Little endian on-disk design;
+Here is the list of highlights:
- - Block-based distribution and file-based distribution over fscache are
- supported;
+ - Little endian on-disk design with 48-bit block addressing, supporting up
+ to 1 EiB filesystem capacity with 4 KiB block size;
- - Support multiple devices to refer to external blobs, which can be used
- for container images;
+ - Two compact inode metadata layouts for space and performance efficiency:
- - 32-bit block addresses for each device, therefore 16TiB address space at
- most with 4KiB block size for now;
+ ======================== ======== ======================================
+ compact extended
+ ======================== ======== ======================================
+ Inode core metadata size 32 bytes 64 bytes
+ Max file size 4 GiB 16 EiB (also limited by max. vol size)
+ Max uids/gids 65536 4294967296
+ Nanosecond timestamps no yes
+ Max hardlinks 65536 4294967296
+ ======================== ======== ======================================
- - Two inode layouts for different requirements:
+ - Support tailpacking inline data for better space efficiency and reduce
+ unneeded I/O amplification;
- ===================== ============ ======================================
- compact (v1) extended (v2)
- ===================== ============ ======================================
- Inode metadata size 32 bytes 64 bytes
- Max file size 4 GiB 16 EiB (also limited by max. vol size)
- Max uids/gids 65536 4294967296
- Per-inode timestamp no yes (64 + 32-bit timestamp)
- Max hardlinks 65536 4294967296
- Metadata reserved 8 bytes 18 bytes
- ===================== ============ ======================================
+ - Block-based and file-backed distribution are both supported;
- - Support extended attributes as an option;
+ - Multiple devices to reference external data blobs: inode data can be
+ optionally placed into external blobs, which enables image layering and data
+ sharing among different filesystems;
- - Support a bloom filter that speeds up negative extended attribute lookups;
+ - Inline and shared extended attributes with an optional bloom filter that
+ speeds up negative extended attribute lookups;
- - Support POSIX.1e ACLs by using extended attributes;
+ - POSIX.1e ACLs by using extended attributes;
- - Support transparent data compression as an option:
- LZ4, MicroLZMA, DEFLATE and Zstandard algorithms can be used on a per-file
- basis; In addition, inplace decompression is also supported to avoid bounce
- compressed buffers and unnecessary page cache thrashing.
+ - Transparent data compression as an option: Supported algorithms (LZ4,
+ MicroLZMA, DEFLATE and Zstandard) can be selected on a per-inode basis.
+ Both the on-disk metadata and decompression runtime have been heavily
+ optimized to minimize the overhead for better performance.
- - Support chunk-based data deduplication and rolling-hash compressed data
- deduplication;
+ - Merging tail-end data into a special inode as fragments;
- - Support tailpacking inline compared to byte-addressed unaligned metadata
- or smaller block size alternatives;
+ - Chunk-based deduplication and rolling-hash compressed data deduplication;
- - Support merging tail-end data into a special inode as fragments.
+ - Direct I/O and FSDAX support on uncompressed inodes for use cases such as
+ secure containers, loop devices, and ramdisks that do not need page caching;
- - Support large folios to make use of THPs (Transparent Hugepages);
+ - Page cache sharing among inodes with identical content fingerprints on
+ the same machine.
- - Support direct I/O on uncompressed files to avoid double caching for loop
- devices;
+For more detailed information, please refer to our documentation site:
- - Support FSDAX on uncompressed images for secure containers and ramdisks in
- order to get rid of unnecessary page cache.
-
- - Support file-based on-demand loading with the Fscache infrastructure.
+- https://erofs.docs.kernel.org
The following git tree provides the file system user-space tools under
development, such as a formatting tool (mkfs.erofs), an on-disk consistency &
@@ -91,10 +98,6 @@ compatibility checking tool (fsck.erofs), and a debugging tool (dump.erofs):
- git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git
-For more information, please also refer to the documentation site:
-
-- https://erofs.docs.kernel.org
-
Bugs and patches are welcome, please kindly help us and send to the following
linux-erofs mailing list:
@@ -127,12 +130,9 @@ dax A legacy option which is an alias for ``dax=always``.
device=%s Specify a path to an extra device to be used together.
directio (For file-backed mounts) Use direct I/O to access backing
files, and asynchronous I/O will be enabled if supported.
-fsid=%s Specify a filesystem image ID for Fscache back-end.
-domain_id=%s Specify a trusted domain ID for fscache mode so that
- different images with the same blobs, identified by blob IDs,
- can share storage within the same trusted domain.
- Also used for different filesystems with inode page sharing
- enabled to share page cache within the trusted domain.
+domain_id=%s Specify a trusted domain ID. Filesystems sharing the same
+ domain ID can share page cache across mounts when inode
+ page sharing is enabled. (not shown in mountinfo output)
fsoffset=%llu Specify block-aligned filesystem offset for the primary device.
inode_share Enable inode page sharing for this filesystem. Inodes with
identical content within the same domain ID can share the
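
The documentation hunk above replaces the old "32-bit block addresses ... 16TiB" wording with "48-bit block addressing, supporting up to 1 EiB filesystem capacity with 4 KiB block size". Both figures follow from the same arithmetic: the number of addressable blocks times the block size. The following is only a userspace sketch of that calculation; `max_capacity_bytes()` is a hypothetical helper invented for illustration and is not part of the kernel or of this merge.

```c
#include <stdint.h>

/*
 * Hypothetical helper (illustration only, not kernel code):
 * maximum addressable capacity in bytes for a filesystem with
 * 2^addr_bits block addresses and 2^blkszbits bytes per block.
 *
 * The caller must keep addr_bits + blkszbits below 64, otherwise
 * the shift would overflow a 64-bit integer.
 */
static uint64_t max_capacity_bytes(unsigned int addr_bits,
				   unsigned int blkszbits)
{
	return UINT64_C(1) << (addr_bits + blkszbits);
}
```

With 4 KiB blocks (`blkszbits` = 12), 48 address bits give 2^60 bytes (1 EiB), while the former 32 address bits give 2^44 bytes (16 TiB), matching the old and new documentation text.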
diff --git a/fs/erofs/Kconfig b/fs/erofs/Kconfig
index 97c48ebe8458..4789b1077d8c 100644
--- a/fs/erofs/Kconfig
+++ b/fs/erofs/Kconfig
@@ -3,13 +3,11 @@
config EROFS_FS
tristate "EROFS filesystem support"
depends on BLOCK
- select CACHEFILES if EROFS_FS_ONDEMAND
select CRC32
select CRYPTO if EROFS_FS_ZIP_ACCEL
select CRYPTO_DEFLATE if EROFS_FS_ZIP_ACCEL
select FS_IOMAP
select LZ4_DECOMPRESS if EROFS_FS_ZIP
- select NETFS_SUPPORT if EROFS_FS_ONDEMAND
select XXHASH if EROFS_FS_XATTR
select XZ_DEC if EROFS_FS_ZIP_LZMA
select XZ_DEC_MICROLZMA if EROFS_FS_ZIP_LZMA
@@ -109,9 +107,6 @@ config EROFS_FS_BACKED_BY_FILE
be used to simplify error-prone lifetime management of unnecessary
virtual block devices.
- Note that this feature, along with ongoing fanotify pre-content
- hooks, will eventually replace "EROFS over fscache."
-
If you don't want to enable this feature, say N.
config EROFS_FS_ZIP
@@ -172,20 +167,6 @@ config EROFS_FS_ZIP_ACCEL
If unsure, say N.
-config EROFS_FS_ONDEMAND
- bool "EROFS fscache-based on-demand read support (deprecated)"
- depends on EROFS_FS
- select FSCACHE
- select CACHEFILES_ONDEMAND
- help
- This permits EROFS to use fscache-backed data blobs with on-demand
- read support.
-
- It is now deprecated and scheduled to be removed from the kernel
- after fanotify pre-content hooks are landed.
-
- If unsure, say N.
-
config EROFS_FS_PCPU_KTHREAD
bool "EROFS per-cpu decompression kthread workers"
depends on EROFS_FS_ZIP
@@ -207,7 +188,7 @@ config EROFS_FS_PCPU_KTHREAD_HIPRI
config EROFS_FS_PAGE_CACHE_SHARE
bool "EROFS page cache share support (experimental)"
- depends on EROFS_FS && EROFS_FS_XATTR && !EROFS_FS_ONDEMAND
+ depends on EROFS_FS && EROFS_FS_XATTR
help
This enables page cache sharing among inodes with identical
content fingerprints on the same machine.
diff --git a/fs/erofs/Makefile b/fs/erofs/Makefile
index a80e1762b607..30423496786f 100644
--- a/fs/erofs/Makefile
+++ b/fs/erofs/Makefile
@@ -9,5 +9,4 @@ erofs-$(CONFIG_EROFS_FS_ZIP_DEFLATE) += decompressor_deflate.o
erofs-$(CONFIG_EROFS_FS_ZIP_ZSTD) += decompressor_zstd.o
erofs-$(CONFIG_EROFS_FS_ZIP_ACCEL) += decompressor_crypto.o
erofs-$(CONFIG_EROFS_FS_BACKED_BY_FILE) += fileio.o
-erofs-$(CONFIG_EROFS_FS_ONDEMAND) += fscache.o
erofs-$(CONFIG_EROFS_FS_PAGE_CACHE_SHARE) += ishare.o
diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index 44da21c9d777..9aa48c8d67d1 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -80,9 +80,7 @@ int erofs_init_metabuf(struct erofs_buf *buf, struct super_block *sb,
if (erofs_is_fileio_mode(sbi)) {
buf->file = sbi->dif0.file; /* some fs like FUSE needs it */
buf->mapping = buf->file->f_mapping;
- } else if (erofs_is_fscache_mode(sb))
- buf->mapping = sbi->dif0.fscache->inode->i_mapping;
- else
+ } else
buf->mapping = sb->s_bdev->bd_mapping;
return 0;
}
@@ -98,17 +96,73 @@ void *erofs_read_metabuf(struct erofs_buf *buf, struct super_block *sb,
return erofs_bread(buf, offset, true);
}
-int erofs_map_blocks(struct inode *inode, struct erofs_map_blocks *map)
+static int erofs_map_chunks(struct inode *inode, struct erofs_map_blocks *map)
{
struct erofs_buf buf = __EROFS_BUF_INITIALIZER;
struct super_block *sb = inode->i_sb;
- unsigned int unit, blksz = sb->s_blocksize;
struct erofs_inode *vi = EROFS_I(inode);
struct erofs_inode_chunk_index *idx;
- erofs_blk_t startblk, addrmask;
- bool tailpacking;
+ unsigned int unit = vi->chunkformat & EROFS_CHUNK_FORMAT_INDEXES ?
+ sizeof(*idx) : EROFS_BLOCK_MAP_ENTRY_SIZE;
+ erofs_blk_t addrmask = (vi->chunkformat & EROFS_CHUNK_FORMAT_48BIT) ?
+ BIT_ULL(48) - 1 : BIT_ULL(32) - 1;
+ u64 nr = map->m_la >> vi->chunkbits, chunksize = 1ULL << vi->chunkbits;
+ erofs_off_t pos = ALIGN(erofs_iloc(inode) + vi->inode_isize +
+ vi->xattr_isize, unit) + unit * nr;
+ /* m_llen will be clamped to EOF in the end */
+ erofs_off_t endpos = round_up(pos + 1, sb->s_blocksize);
+ u64 last, addr;
+
+ idx = erofs_read_metabuf(&buf, sb, pos, erofs_inode_in_metabox(inode));
+ if (IS_ERR(idx))
+ return PTR_ERR(idx);
+
+ map->m_la = nr << vi->chunkbits;
+ map->m_llen = 0;
+ nr = 0;
+ do {
+ if (unit == EROFS_BLOCK_MAP_ENTRY_SIZE) {
+ addr = le32_to_cpu(((__le32 *)idx)[nr]);
+ if (addr == (u32)EROFS_NULL_ADDR)
+ addr = EROFS_NULL_ADDR;
+ } else {
+ addr = (((u64)le16_to_cpu(idx[nr].startblk_hi) << 32) |
+ le32_to_cpu(idx[nr].startblk_lo)) & addrmask;
+ if (addr ^ (EROFS_NULL_ADDR & addrmask))
+ addr |= (u64)(le16_to_cpu(idx[nr].device_id) &
+ EROFS_SB(sb)->device_id_mask) << 48;
+ else
+ addr = EROFS_NULL_ADDR;
+ }
+ if (!nr) {
+ last = addr;
+ continue;
+ }
+ /* expand and account the prior chunk here */
+ map->m_llen += chunksize;
+ if (last != EROFS_NULL_ADDR)
+ last += erofs_blknr(sb, chunksize);
+ } while (addr == last && pos + (++nr) * unit < endpos);
+
+ if (last != EROFS_NULL_ADDR) {
+ map->m_pa = erofs_pos(sb, last & addrmask) - map->m_llen;
+ map->m_deviceid = last >> 48;
+ map->m_flags = EROFS_MAP_MAPPED;
+ }
+ if (addr == last)
+ map->m_llen += chunksize;
+ map->m_llen = min_t(erofs_off_t, map->m_llen,
+ round_up(inode->i_size - map->m_la, sb->s_blocksize));
+ erofs_put_metabuf(&buf);
+ return 0;
+}
+
+int erofs_map_blocks(struct inode *inode, struct erofs_map_blocks *map)
+{
+ struct super_block *sb = inode->i_sb;
+ struct erofs_inode *vi = EROFS_I(inode);
+ bool tailinline = (vi->datalayout == EROFS_INODE_FLAT_INLINE);
erofs_off_t pos;
- u64 chunknr;
int err = 0;
trace_erofs_map_blocks_enter(inode, map, 0);
@@ -116,13 +170,10 @@ int erofs_map_blocks(struct inode *inode, struct erofs_map_blocks *map)
map->m_flags = 0;
if (map->m_la >= inode->i_size)
goto out;
-
- if (vi->datalayout != EROFS_INODE_CHUNK_BASED) {
- tailpacking = (vi->datalayout == EROFS_INODE_FLAT_INLINE);
- if (!tailpacking && vi->startblk == EROFS_NULL_ADDR)
- goto out;
- pos = erofs_pos(sb, erofs_iblks(inode) - tailpacking);
-
+ if (vi->datalayout == EROFS_INODE_CHUNK_BASED) {
+ err = erofs_map_chunks(inode, map);
+ } else if (tailinline || vi->startblk != EROFS_NULL_ADDR) {
+ pos = erofs_pos(sb, erofs_iblks(inode) - tailinline);
map->m_flags = EROFS_MAP_MAPPED;
if (map->m_la < pos) {
map->m_pa = erofs_pos(sb, vi->startblk) + map->m_la;
@@ -132,57 +183,15 @@ int erofs_map_blocks(struct inode *inode, struct erofs_map_blocks *map)
vi->xattr_isize + erofs_blkoff(sb, map->m_la);
map->m_llen = inode->i_size - map->m_la;
map->m_flags |= EROFS_MAP_META;
- }
- goto out;
- }
-
- if (vi->chunkformat & EROFS_CHUNK_FORMAT_INDEXES)
- unit = sizeof(*idx); /* chunk index */
- else
- unit = EROFS_BLOCK_MAP_ENTRY_SIZE; /* block map */
-
- chunknr = map->m_la >> vi->chunkbits;
- pos = ALIGN(erofs_iloc(inode) + vi->inode_isize +
- vi->xattr_isize, unit) + unit * chunknr;
-
- idx = erofs_read_metabuf(&buf, sb, pos, erofs_inode_in_metabox(inode));
- if (IS_ERR(idx)) {
- err = PTR_ERR(idx);
- goto out;
- }
- map->m_la = chunknr << vi->chunkbits;
- map->m_llen = min_t(erofs_off_t, 1UL << vi->chunkbits,
- round_up(inode->i_size - map->m_la, blksz));
- if (vi->chunkformat & EROFS_CHUNK_FORMAT_INDEXES) {
- addrmask = (vi->chunkformat & EROFS_CHUNK_FORMAT_48BIT) ?
- BIT_ULL(48) - 1 : BIT_ULL(32) - 1;
- startblk = (((u64)le16_to_cpu(idx->startblk_hi) << 32) |
- le32_to_cpu(idx->startblk_lo)) & addrmask;
- if ((startblk ^ EROFS_NULL_ADDR) & addrmask) {
- map->m_deviceid = le16_to_cpu(idx->device_id) &
- EROFS_SB(sb)->device_id_mask;
- map->m_pa = erofs_pos(sb, startblk);
- map->m_flags = EROFS_MAP_MAPPED;
- }
- } else {
- startblk = le32_to_cpu(*(__le32 *)idx);
- if (startblk != (u32)EROFS_NULL_ADDR) {
- map->m_pa = erofs_pos(sb, startblk);
- map->m_flags = EROFS_MAP_MAPPED;
+ if (erofs_blkoff(sb, map->m_pa) + map->m_llen >
+ sb->s_blocksize) {
+ erofs_err(sb, "inline data across blocks @ nid %llu", vi->nid);
+ return -EFSCORRUPTED;
+ }
}
}
- erofs_put_metabuf(&buf);
out:
- if (!err) {
- map->m_plen = map->m_llen;
- /* inline data should be located in the same meta block */
- if ((map->m_flags & EROFS_MAP_META) &&
- erofs_blkoff(sb, map->m_pa) + map->m_plen > blksz) {
- erofs_err(sb, "inline data across blocks @ nid %llu", vi->nid);
- DBG_BUGON(1);
- return -EFSCORRUPTED;
- }
- }
+ map->m_plen = err ? 0 : map->m_llen;
trace_erofs_map_blocks_exit(inode, map, 0, err);
return err;
}
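
The data.c change above lets a single map request cover several consecutive chunks when they are physically contiguous (or are all holes), instead of always returning one chunk. The core idea can be sketched in userspace as below. This is a simplified illustration under stated assumptions, not the kernel implementation: `NULL_ADDR`, `struct extent` and `coalesce_chunks()` are hypothetical names, the chunk addresses are taken from a plain in-memory array, and the sketch omits what the real `erofs_map_chunks()` also does (decoding on-disk entries, device IDs, stopping at the end of the current metadata block, and clamping the length to EOF).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for EROFS_NULL_ADDR: marks a chunk as a hole. */
#define NULL_ADDR UINT64_MAX

/* Hypothetical result type: one merged run of chunks. */
struct extent {
	uint64_t start_blk;	/* first physical block, or NULL_ADDR for a hole */
	uint64_t nr_chunks;	/* number of chunks merged into this run */
};

/*
 * Starting at chunk index 'first' (which must be < nr), merge as many
 * following chunks as possible into one run. A mapped run continues
 * while each chunk starts exactly blks_per_chunk blocks after the
 * previous one; a hole run continues while the next chunk is a hole too.
 */
static struct extent coalesce_chunks(const uint64_t *chunk_blk, size_t nr,
				     size_t first, uint64_t blks_per_chunk)
{
	struct extent ext = { .start_blk = chunk_blk[first], .nr_chunks = 1 };
	uint64_t expect = ext.start_blk;
	size_t i;

	for (i = first + 1; i < nr; i++) {
		/* For holes, the expected "address" stays NULL_ADDR. */
		if (expect != NULL_ADDR)
			expect += blks_per_chunk;
		if (chunk_blk[i] != expect)
			break;
		ext.nr_chunks++;
	}
	return ext;
}
```

For example, with 4 blocks per chunk and chunk start blocks {100, 104, 108, 200, hole, hole, 300}, a request at chunk 0 yields one run of three chunks starting at block 100, and a request at chunk 4 yields a two-chunk hole. Fewer, larger extents mean fewer mapping calls per read.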
diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
index 7871b16c1d33..16ec4fd33ac6 100644
--- a/fs/erofs/erofs_fs.h
+++ b/fs/erofs/erofs_fs.h
@@ -396,6 +396,8 @@ enum {
/* (noncompact only, HEAD) This pcluster refers to partial decompressed data */
#define Z_EROFS_LI_PARTIAL_REF (1 << 15)
+/* (noncompact only, HEAD) This pcluster can also be regarded as a HOLE */
+#define Z_EROFS_LI_HOLE (1 << 14)
/* Set on 1st non-head lcluster to store compressed block counti (in blocks) */
#define Z_EROFS_LI_D0_CBLKCNT (1 << 11)
diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
deleted file mode 100644
index 685c68774379..000000000000
--- a/fs/erofs/fscache.c
+++ /dev/null
@@ -1,664 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/*
- * Copyright (C) 2022, Alibaba Cloud
- * Copyright (C) 2022, Bytedance Inc. All rights reserved.
- */
-#include <linux/fscache.h>
-#include "internal.h"
-
-static DEFINE_MUTEX(erofs_domain_list_lock);
-static DEFINE_MUTEX(erofs_domain_cookies_lock);
-static LIST_HEAD(erofs_domain_list);
-static LIST_HEAD(erofs_domain_cookies_list);
-static struct vfsmount *erofs_pseudo_mnt;
-
-struct erofs_fscache_io {
- struct netfs_cache_resources cres;
- struct iov_iter iter;
- netfs_io_terminated_t end_io;
- void *private;
- refcount_t ref;
-};
-
-struct erofs_fscache_rq {
- struct address_space *mapping; /* The mapping being accessed */
- loff_t start; /* Start position */
- size_t len; /* Length of the request */
- size_t submitted; /* Length of submitted */
- short error; /* 0 or error that occurred */
- refcount_t ref;
-};
-
-static bool erofs_fscache_io_put(struct erofs_fscache_io *io)
-{
- if (!refcount_dec_and_test(&io->ref))
- return false;
- if (io->cres.ops)
- io->cres.ops->end_operation(&io->cres);
- kfree(io);
- return true;
-}
-
-static void erofs_fscache_req_complete(struct erofs_fscache_rq *req)
-{
- struct folio *folio;
- bool failed = req->error;
- pgoff_t start_page = req->start / PAGE_SIZE;
- pgoff_t last_page = ((req->start + req->len) / PAGE_SIZE) - 1;
-
- XA_STATE(xas, &req->mapping->i_pages, start_page);
-
- rcu_read_lock();
- xas_for_each(&xas, folio, last_page) {
- if (xas_retry(&xas, folio))
- continue;
- if (!failed)
- folio_mark_uptodate(folio);
- folio_unlock(folio);
- }
- rcu_read_unlock();
-}
-
-static void erofs_fscache_req_put(struct erofs_fscache_rq *req)
-{
- if (!refcount_dec_and_test(&req->ref))
- return;
- erofs_fscache_req_complete(req);
- kfree(req);
-}
-
-static struct erofs_fscache_rq *erofs_fscache_req_alloc(struct address_space *mapping,
- loff_t start, size_t len)
-{
- struct erofs_fscache_rq *req = kzalloc_obj(*req);
-
- if (!req)
- return NULL;
- req->mapping = mapping;
- req->start = start;
- req->len = len;
- refcount_set(&req->ref, 1);
- return req;
-}
-
-static void erofs_fscache_req_io_put(struct erofs_fscache_io *io)
-{
- struct erofs_fscache_rq *req = io->private;
-
- if (erofs_fscache_io_put(io))
- erofs_fscache_req_put(req);
-}
-
-static void erofs_fscache_req_end_io(void *priv, ssize_t transferred_or_error)
-{
- struct erofs_fscache_io *io = priv;
- struct erofs_fscache_rq *req = io->private;
-
- if (IS_ERR_VALUE(transferred_or_error))
- req->error = transferred_or_error;
- erofs_fscache_req_io_put(io);
-}
-
-static struct erofs_fscache_io *erofs_fscache_req_io_alloc(struct erofs_fscache_rq *req)
-{
- struct erofs_fscache_io *io = kzalloc_obj(*io);
-
- if (!io)
- return NULL;
- io->end_io = erofs_fscache_req_end_io;
- io->private = req;
- refcount_inc(&req->ref);
- refcount_set(&io->ref, 1);
- return io;
-}
-
-/*
- * Read data from fscache described by cookie at pstart physical address
- * offset, and fill the read data into buffer described by io->iter.
- */
-static int erofs_fscache_read_io_async(struct fscache_cookie *cookie,
- loff_t pstart, struct erofs_fscache_io *io)
-{
- enum netfs_io_source source;
- struct netfs_cache_resources *cres = &io->cres;
- struct iov_iter *iter = &io->iter;
- int ret;
-
- ret = fscache_begin_read_operation(cres, cookie);
- if (ret)
- return ret;
-
- while (iov_iter_count(iter)) {
- size_t orig_count = iov_iter_count(iter), len = orig_count;
- unsigned long flags = 1 << NETFS_SREQ_ONDEMAND;
-
- source = cres->ops->prepare_ondemand_read(cres,
- pstart, &len, LLONG_MAX, &flags, 0);
- if (WARN_ON(len == 0))
- source = NETFS_INVALID_READ;
- if (source != NETFS_READ_FROM_CACHE) {
- erofs_err(NULL, "prepare_ondemand_read failed (source %d)", source);
- return -EIO;
- }
-
- iov_iter_truncate(iter, len);
- refcount_inc(&io->ref);
- ret = fscache_read(cres, pstart, iter, NETFS_READ_HOLE_FAIL,
- io->end_io, io);
- if (ret == -EIOCBQUEUED)
- ret = 0;
- if (ret) {
- erofs_err(NULL, "fscache_read failed (ret %d)", ret);
- return ret;
- }
- if (WARN_ON(iov_iter_count(iter)))
- return -EIO;
-
- iov_iter_reexpand(iter, orig_count - len);
- pstart += len;
- }
- return 0;
-}
-
-struct erofs_fscache_bio {
- struct erofs_fscache_io io;
- struct bio bio; /* w/o bdev to share bio_add_page/endio() */
- struct bio_vec bvecs[BIO_MAX_VECS];
-};
-
-static void erofs_fscache_bio_endio(void *priv, ssize_t transferred_or_error)
-{
- struct erofs_fscache_bio *io = priv;
-
- if (IS_ERR_VALUE(transferred_or_error))
- io->bio.bi_status = errno_to_blk_status(transferred_or_error);
- bio_endio(&io->bio);
- BUILD_BUG_ON(offsetof(struct erofs_fscache_bio, io) != 0);
- erofs_fscache_io_put(&io->io);
-}
-
-struct bio *erofs_fscache_bio_alloc(struct erofs_map_dev *mdev)
-{
- struct erofs_fscache_bio *io;
-
- io = kmalloc_obj(*io, GFP_KERNEL | __GFP_NOFAIL);
- bio_init(&io->bio, NULL, io->bvecs, BIO_MAX_VECS, REQ_OP_READ);
- io->io.private = mdev->m_dif->fscache->cookie;
- io->io.end_io = erofs_fscache_bio_endio;
- refcount_set(&io->io.ref, 1);
- return &io->bio;
-}
-
-void erofs_fscache_submit_bio(struct bio *bio)
-{
- struct erofs_fscache_bio *io = container_of(bio,
- struct erofs_fscache_bio, bio);
- int ret;
-
- iov_iter_bvec(&io->io.iter, ITER_DEST, io->bvecs, bio->bi_vcnt,
- bio->bi_iter.bi_size);
- ret = erofs_fscache_read_io_async(io->io.private,
- bio->bi_iter.bi_sector << 9, &io->io);
- erofs_fscache_io_put(&io->io);
- if (!ret)
- return;
- bio->bi_status = errno_to_blk_status(ret);
- bio_endio(bio);
-}
-
-static int erofs_fscache_meta_read_folio(struct file *data, struct folio *folio)
-{
- struct erofs_fscache *ctx = folio->mapping->host->i_private;
- int ret = -ENOMEM;
- struct erofs_fscache_rq *req;
- struct erofs_fscache_io *io;
-
- req = erofs_fscache_req_alloc(folio->mapping,
- folio_pos(folio), folio_size(folio));
- if (!req) {
- folio_unlock(folio);
- return ret;
- }
-
- io = erofs_fscache_req_io_alloc(req);
- if (!io) {
- req->error = ret;
- goto out;
- }
- iov_iter_xarray(&io->iter, ITER_DEST, &folio->mapping->i_pages,
- folio_pos(folio), folio_size(folio));
-
- ret = erofs_fscache_read_io_async(ctx->cookie, folio_pos(folio), io);
- if (ret)
- req->error = ret;
-
- erofs_fscache_req_io_put(io);
-out:
- erofs_fscache_req_put(req);
- return ret;
-}
-
-static int erofs_fscache_data_read_slice(struct erofs_fscache_rq *req)
-{
- struct address_space *mapping = req->mapping;
- struct inode *inode = mapping->host;
- struct super_block *sb = inode->i_sb;
- struct erofs_fscache_io *io;
- struct erofs_map_blocks map;
- struct erofs_map_dev mdev;
- loff_t pos = req->start + req->submitted;
- size_t count;
- int ret;
-
- map.m_la = pos;
- ret = erofs_map_blocks(inode, &map);
- if (ret)
- return ret;
-
- if (map.m_flags & EROFS_MAP_META) {
- struct erofs_buf buf = __EROFS_BUF_INITIALIZER;
- struct iov_iter iter;
- size_t size = map.m_llen;
- void *src;
-
- src = erofs_read_metabuf(&buf, sb, map.m_pa,
- erofs_inode_in_metabox(inode));
- if (IS_ERR(src))
- return PTR_ERR(src);
-
- iov_iter_xarray(&iter, ITER_DEST, &mapping->i_pages, pos, PAGE_SIZE);
- if (copy_to_iter(src, size, &iter) != size) {
- erofs_put_metabuf(&buf);
- return -EFAULT;
- }
- iov_iter_zero(PAGE_SIZE - size, &iter);
- erofs_put_metabuf(&buf);
- req->submitted += PAGE_SIZE;
- return 0;
- }
-
- count = req->len - req->submitted;
- if (!(map.m_flags & EROFS_MAP_MAPPED)) {
- struct iov_iter iter;
-
- iov_iter_xarray(&iter, ITER_DEST, &mapping->i_pages, pos, count);
- iov_iter_zero(count, &iter);
- req->submitted += count;
- return 0;
- }
-
- count = min_t(size_t, map.m_llen - (pos - map.m_la), count);
- DBG_BUGON(!count || count % PAGE_SIZE);
-
- mdev = (struct erofs_map_dev) {
- .m_deviceid = map.m_deviceid,
- .m_pa = map.m_pa,
- };
- ret = erofs_map_dev(sb, &mdev);
- if (ret)
- return ret;
-
- io = erofs_fscache_req_io_alloc(req);
- if (!io)
- return -ENOMEM;
- iov_iter_xarray(&io->iter, ITER_DEST, &mapping->i_pages, pos, count);
- ret = erofs_fscache_read_io_async(mdev.m_dif->fscache->cookie,
- mdev.m_pa + (pos - map.m_la), io);
- erofs_fscache_req_io_put(io);
-
- req->submitted += count;
- return ret;
-}
-
-static int erofs_fscache_data_read(struct erofs_fscache_rq *req)
-{
- int ret;
-
- do {
- ret = erofs_fscache_data_read_slice(req);
- if (ret)
- req->error = ret;
- } while (!ret && req->submitted < req->len);
- return ret;
-}
-
-static int erofs_fscache_read_folio(struct file *file, struct folio *folio)
-{
- struct erofs_fscache_rq *req;
- int ret;
-
- req = erofs_fscache_req_alloc(folio->mapping,
- folio_pos(folio), folio_size(folio));
- if (!req) {
- folio_unlock(folio);
- return -ENOMEM;
- }
-
- ret = erofs_fscache_data_read(req);
- erofs_fscache_req_put(req);
- return ret;
-}
-
-static void erofs_fscache_readahead(struct readahead_control *rac)
-{
- struct erofs_fscache_rq *req;
-
- if (!readahead_count(rac))
- return;
-
- req = erofs_fscache_req_alloc(rac->mapping,
- readahead_pos(rac), readahead_length(rac));
- if (!req)
- return;
-
- /* The request completion will drop refs on the folios. */
- while (readahead_folio(rac))
- ;
-
- erofs_fscache_data_read(req);
- erofs_fscache_req_put(req);
-}
-
-static const struct address_space_operations erofs_fscache_meta_aops = {
- .read_folio = erofs_fscache_meta_read_folio,
-};
-
-const struct address_space_operations erofs_fscache_access_aops = {
- .read_folio = erofs_fscache_read_folio,
- .readahead = erofs_fscache_readahead,
-};
-
-static void erofs_fscache_domain_put(struct erofs_domain *domain)
-{
- mutex_lock(&erofs_domain_list_lock);
- if (refcount_dec_and_test(&domain->ref)) {
- list_del(&domain->list);
- if (list_empty(&erofs_domain_list)) {
- kern_unmount(erofs_pseudo_mnt);
- erofs_pseudo_mnt = NULL;
- }
- fscache_relinquish_volume(domain->volume, NULL, false);
- mutex_unlock(&erofs_domain_list_lock);
- kfree_sensitive(domain->domain_id);
- kfree(domain);
- return;
- }
- mutex_unlock(&erofs_domain_list_lock);
-}
-
-static int erofs_fscache_register_volume(struct super_block *sb)
-{
- struct erofs_sb_info *sbi = EROFS_SB(sb);
- char *domain_id = sbi->domain_id;
- struct fscache_volume *volume;
- char *name;
- int ret = 0;
-
- name = kasprintf(GFP_KERNEL, "erofs,%s",
- domain_id ? domain_id : sbi->fsid);
- if (!name)
- return -ENOMEM;
-
- volume = fscache_acquire_volume(name, NULL, NULL, 0);
- if (IS_ERR_OR_NULL(volume)) {
- erofs_err(sb, "failed to register volume for %s", name);
- ret = volume ? PTR_ERR(volume) : -EOPNOTSUPP;
- volume = NULL;
- }
-
- sbi->volume = volume;
- kfree(name);
- return ret;
-}
-
-static int erofs_fscache_init_domain(struct super_block *sb)
-{
- int err;
- struct erofs_domain *domain;
- struct erofs_sb_info *sbi = EROFS_SB(sb);
-
- domain = kzalloc_obj(struct erofs_domain);
- if (!domain)
- return -ENOMEM;
-
- domain->domain_id = kstrdup(sbi->domain_id, GFP_KERNEL);
- if (!domain->domain_id) {
- kfree(domain);
- return -ENOMEM;
- }
-
- err = erofs_fscache_register_volume(sb);
- if (err)
- goto out;
-
- if (!erofs_pseudo_mnt) {
- struct vfsmount *mnt = kern_mount(&erofs_anon_fs_type);
- if (IS_ERR(mnt)) {
- err = PTR_ERR(mnt);
- goto out;
- }
- erofs_pseudo_mnt = mnt;
- }
-
- domain->volume = sbi->volume;
- refcount_set(&domain->ref, 1);
- list_add(&domain->list, &erofs_domain_list);
- sbi->domain = domain;
- return 0;
-out:
- kfree_sensitive(domain->domain_id);
- kfree(domain);
- return err;
-}
-
-static int erofs_fscache_register_domain(struct super_block *sb)
-{
- int err;
- struct erofs_domain *domain;
- struct erofs_sb_info *sbi = EROFS_SB(sb);
-
- mutex_lock(&erofs_domain_list_lock);
- list_for_each_entry(domain, &erofs_domain_list, list) {
- if (!strcmp(domain->domain_id, sbi->domain_id)) {
- sbi->domain = domain;
- sbi->volume = domain->volume;
- refcount_inc(&domain->ref);
- mutex_unlock(&erofs_domain_list_lock);
- return 0;
- }
- }
- err = erofs_fscache_init_domain(sb);
- mutex_unlock(&erofs_domain_list_lock);
- return err;
-}
-
-static struct erofs_fscache *erofs_fscache_acquire_cookie(struct super_block *sb,
- char *name, unsigned int flags)
-{
- struct fscache_volume *volume = EROFS_SB(sb)->volume;
- struct erofs_fscache *ctx;
- struct fscache_cookie *cookie;
- struct super_block *isb;
- struct inode *inode;
- int ret;
-
- ctx = kzalloc_obj(*ctx);
- if (!ctx)
- return ERR_PTR(-ENOMEM);
- INIT_LIST_HEAD(&ctx->node);
- refcount_set(&ctx->ref, 1);
-
- cookie = fscache_acquire_cookie(volume, FSCACHE_ADV_WANT_CACHE_SIZE,
- name, strlen(name), NULL, 0, 0);
- if (!cookie) {
- erofs_err(sb, "failed to get cookie for %s", name);
- ret = -EINVAL;
- goto err;
- }
- fscache_use_cookie(cookie, false);
-
- /*
- * Allocate anonymous inode in global pseudo mount for shareable blobs,
- * so that they are accessible among erofs fs instances.
- */
- isb = flags & EROFS_REG_COOKIE_SHARE ? erofs_pseudo_mnt->mnt_sb : sb;
- inode = new_inode(isb);
- if (!inode) {
- erofs_err(sb, "failed to get anon inode for %s", name);
- ret = -ENOMEM;
- goto err_cookie;
- }
-
- inode->i_size = OFFSET_MAX;
- inode->i_mapping->a_ops = &erofs_fscache_meta_aops;
- mapping_set_gfp_mask(inode->i_mapping, GFP_KERNEL);
- inode->i_blkbits = EROFS_SB(sb)->blkszbits;
- inode->i_private = ctx;
-
- ctx->cookie = cookie;
- ctx->inode = inode;
- return ctx;
-
-err_cookie:
- fscache_unuse_cookie(cookie, NULL, NULL);
- fscache_relinquish_cookie(cookie, false);
-err:
- kfree(ctx);
- return ERR_PTR(ret);
-}
-
-static void erofs_fscache_relinquish_cookie(struct erofs_fscache *ctx)
-{
- fscache_unuse_cookie(ctx->cookie, NULL, NULL);
- fscache_relinquish_cookie(ctx->cookie, false);
- iput(ctx->inode);
- kfree(ctx->name);
- kfree(ctx);
-}
-
-static struct erofs_fscache *erofs_domain_init_cookie(struct super_block *sb,
- char *name, unsigned int flags)
-{
- struct erofs_fscache *ctx;
- struct erofs_domain *domain = EROFS_SB(sb)->domain;
-
- ctx = erofs_fscache_acquire_cookie(sb, name, flags);
- if (IS_ERR(ctx))
- return ctx;
-
- ctx->name = kstrdup(name, GFP_KERNEL);
- if (!ctx->name) {
- erofs_fscache_relinquish_cookie(ctx);
- return ERR_PTR(-ENOMEM);
- }
-
- refcount_inc(&domain->ref);
- ctx->domain = domain;
- list_add(&ctx->node, &erofs_domain_cookies_list);
- return ctx;
-}
-
-static struct erofs_fscache *erofs_domain_register_cookie(struct super_block *sb,
- char *name, unsigned int flags)
-{
- struct erofs_fscache *ctx;
- struct erofs_domain *domain = EROFS_SB(sb)->domain;
-
- flags |= EROFS_REG_COOKIE_SHARE;
- mutex_lock(&erofs_domain_cookies_lock);
- list_for_each_entry(ctx, &erofs_domain_cookies_list, node) {
- if (ctx->domain != domain || strcmp(ctx->name, name))
- continue;
- if (!(flags & EROFS_REG_COOKIE_NEED_NOEXIST)) {
- refcount_inc(&ctx->ref);
- } else {
- erofs_err(sb, "%s already exists in domain %s", name,
- domain->domain_id);
- ctx = ERR_PTR(-EEXIST);
- }
- mutex_unlock(&erofs_domain_cookies_lock);
- return ctx;
- }
- ctx = erofs_domain_init_cookie(sb, name, flags);
- mutex_unlock(&erofs_domain_cookies_lock);
- return ctx;
-}
-
-struct erofs_fscache *erofs_fscache_register_cookie(struct super_block *sb,
- char *name,
- unsigned int flags)
-{
- if (EROFS_SB(sb)->domain_id)
- return erofs_domain_register_cookie(sb, name, flags);
- return erofs_fscache_acquire_cookie(sb, name, flags);
-}
-
-void erofs_fscache_unregister_cookie(struct erofs_fscache *ctx)
-{
- struct erofs_domain *domain = NULL;
-
- if (!ctx)
- return;
- if (!ctx->domain)
- return erofs_fscache_relinquish_cookie(ctx);
-
- mutex_lock(&erofs_domain_cookies_lock);
- if (refcount_dec_and_test(&ctx->ref)) {
- domain = ctx->domain;
- list_del(&ctx->node);
- erofs_fscache_relinquish_cookie(ctx);
- }
- mutex_unlock(&erofs_domain_cookies_lock);
- if (domain)
- erofs_fscache_domain_put(domain);
-}
-
-int erofs_fscache_register_fs(struct super_block *sb)
-{
- int ret;
- struct erofs_sb_info *sbi = EROFS_SB(sb);
- struct erofs_fscache *fscache;
- unsigned int flags = 0;
-
- if (sbi->domain_id)
- ret = erofs_fscache_register_domain(sb);
- else
- ret = erofs_fscache_register_volume(sb);
- if (ret)
- return ret;
-
- /*
- * When shared domain is enabled, using NEED_NOEXIST to guarantee
- * the primary data blob (aka fsid) is unique in the shared domain.
- *
- * For non-shared-domain case, fscache_acquire_volume() invoked by
- * erofs_fscache_register_volume() has already guaranteed
- * the uniqueness of primary data blob.
- *
- * Acquired domain/volume will be relinquished in kill_sb() on error.
- */
- if (sbi->domain_id)
- flags |= EROFS_REG_COOKIE_NEED_NOEXIST;
- fscache = erofs_fscache_register_cookie(sb, sbi->fsid, flags);
- if (IS_ERR(fscache))
- return PTR_ERR(fscache);
-
- sbi->dif0.fscache = fscache;
- return 0;
-}
-
-void erofs_fscache_unregister_fs(struct super_block *sb)
-{
- struct erofs_sb_info *sbi = EROFS_SB(sb);
-
- erofs_fscache_unregister_cookie(sbi->dif0.fscache);
-
- if (sbi->domain)
- erofs_fscache_domain_put(sbi->domain);
- else
- fscache_relinquish_volume(sbi->volume, NULL, false);
-
- sbi->dif0.fscache = NULL;
- sbi->volume = NULL;
- sbi->domain = NULL;
-}
diff --git a/fs/erofs/inode.c b/fs/erofs/inode.c
index a188c570087a..45afe5c50de8 100644
--- a/fs/erofs/inode.c
+++ b/fs/erofs/inode.c
@@ -191,8 +191,9 @@ static int erofs_read_inode(struct inode *inode)
err = -EFSCORRUPTED;
goto err_out;
} else {
- inode->i_blocks = le32_to_cpu(copied.i_u.blocks_lo) <<
- (sb->s_blocksize_bits - 9);
+ inode->i_blocks = (le32_to_cpu(copied.i_u.blocks_lo) |
+ ((u64)le16_to_cpu(copied.i_nb.blocks_hi) << 32)) <<
+ (sb->s_blocksize_bits - 9);
}
if (vi->datalayout == EROFS_INODE_CHUNK_BASED) {
@@ -255,7 +256,7 @@ static int erofs_fill_inode(struct inode *inode)
}
mapping_set_large_folios(inode->i_mapping);
- aops = erofs_get_aops(inode, false);
+ aops = erofs_get_aops(inode);
if (IS_ERR(aops))
return PTR_ERR(aops);
inode->i_mapping->a_ops = aops;
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 4792490161ec..580f8d9f14e7 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -23,6 +23,8 @@
__printf(2, 3) void _erofs_printk(struct super_block *sb, const char *fmt, ...);
#define erofs_err(sb, fmt, ...) \
_erofs_printk(sb, KERN_ERR fmt "\n", ##__VA_ARGS__)
+#define erofs_warn(sb, fmt, ...) \
+ _erofs_printk(sb, KERN_WARNING fmt "\n", ##__VA_ARGS__)
#define erofs_info(sb, fmt, ...) \
_erofs_printk(sb, KERN_INFO fmt "\n", ##__VA_ARGS__)
@@ -41,7 +43,6 @@ typedef u64 erofs_blk_t;
struct erofs_device_info {
char *path;
- struct erofs_fscache *fscache;
struct file *file;
struct dax_device *dax_dev;
u64 fsoff, dax_part_off;
@@ -78,24 +79,6 @@ struct erofs_sb_lz4_info {
u16 max_pclusterblks;
};
-struct erofs_domain {
- refcount_t ref;
- struct list_head list;
- struct fscache_volume *volume;
- char *domain_id;
-};
-
-struct erofs_fscache {
- struct fscache_cookie *cookie;
- struct inode *inode; /* anonymous inode for the blob */
-
- /* used for share domain mode */
- struct erofs_domain *domain;
- struct list_head node;
- refcount_t ref;
- char *name;
-};
-
struct erofs_xattr_prefix_item {
struct erofs_xattr_long_prefix *prefix;
u8 infix_len;
@@ -160,10 +143,6 @@ struct erofs_sb_info {
struct completion s_kobj_unregister;
erofs_off_t dir_ra_bytes;
- /* fscache support */
- struct fscache_volume *volume;
- struct erofs_domain *domain;
- char *fsid;
char *domain_id;
};
@@ -189,12 +168,6 @@ static inline bool erofs_is_fileio_mode(struct erofs_sb_info *sbi)
extern struct file_system_type erofs_anon_fs_type;
-static inline bool erofs_is_fscache_mode(struct super_block *sb)
-{
- return IS_ENABLED(CONFIG_EROFS_FS_ONDEMAND) &&
- !erofs_is_fileio_mode(EROFS_SB(sb)) && !sb->s_bdev;
-}
-
enum {
EROFS_ZIP_CACHE_DISABLED,
EROFS_ZIP_CACHE_READAHEAD,
@@ -411,11 +384,9 @@ struct erofs_map_dev {
};
extern const struct super_operations erofs_sops;
-
extern const struct address_space_operations erofs_aops;
extern const struct address_space_operations erofs_fileio_aops;
extern const struct address_space_operations z_erofs_aops;
-extern const struct address_space_operations erofs_fscache_access_aops;
extern const struct inode_operations erofs_generic_iops;
extern const struct inode_operations erofs_symlink_iops;
@@ -428,10 +399,6 @@ extern const struct file_operations erofs_ishare_fops;
extern const struct iomap_ops z_erofs_iomap_report_ops;
-/* flags for erofs_fscache_register_cookie() */
-#define EROFS_REG_COOKIE_SHARE 0x0001
-#define EROFS_REG_COOKIE_NEED_NOEXIST 0x0002
-
void *erofs_read_metadata(struct super_block *sb, struct erofs_buf *buf,
erofs_off_t *offset, int *lengthp);
void erofs_unmap_metabuf(struct erofs_buf *buf);
@@ -471,7 +438,7 @@ static inline void *erofs_vm_map_ram(struct page **pages, unsigned int count)
}
static inline const struct address_space_operations *
-erofs_get_aops(struct inode *realinode, bool no_fscache)
+erofs_get_aops(struct inode *realinode)
{
if (erofs_inode_is_data_compressed(EROFS_I(realinode)->datalayout)) {
if (!IS_ENABLED(CONFIG_EROFS_FS_ZIP))
@@ -481,9 +448,6 @@ erofs_get_aops(struct inode *realinode, bool no_fscache)
"EXPERIMENTAL EROFS subpage compressed block support in use. Use at your own risk!");
return &z_erofs_aops;
}
- if (IS_ENABLED(CONFIG_EROFS_FS_ONDEMAND) && !no_fscache &&
- erofs_is_fscache_mode(realinode->i_sb))
- return &erofs_fscache_access_aops;
if (IS_ENABLED(CONFIG_EROFS_FS_BACKED_BY_FILE) &&
erofs_is_fileio_mode(EROFS_SB(realinode->i_sb)))
return &erofs_fileio_aops;
@@ -546,36 +510,6 @@ static inline struct bio *erofs_fileio_bio_alloc(struct erofs_map_dev *mdev) { r
static inline void erofs_fileio_submit_bio(struct bio *bio) {}
#endif
-#ifdef CONFIG_EROFS_FS_ONDEMAND
-int erofs_fscache_register_fs(struct super_block *sb);
-void erofs_fscache_unregister_fs(struct super_block *sb);
-
-struct erofs_fscache *erofs_fscache_register_cookie(struct super_block *sb,
- char *name, unsigned int flags);
-void erofs_fscache_unregister_cookie(struct erofs_fscache *fscache);
-struct bio *erofs_fscache_bio_alloc(struct erofs_map_dev *mdev);
-void erofs_fscache_submit_bio(struct bio *bio);
-#else
-static inline int erofs_fscache_register_fs(struct super_block *sb)
-{
- return -EOPNOTSUPP;
-}
-static inline void erofs_fscache_unregister_fs(struct super_block *sb) {}
-
-static inline
-struct erofs_fscache *erofs_fscache_register_cookie(struct super_block *sb,
- char *name, unsigned int flags)
-{
- return ERR_PTR(-EOPNOTSUPP);
-}
-
-static inline void erofs_fscache_unregister_cookie(struct erofs_fscache *fscache)
-{
-}
-static inline struct bio *erofs_fscache_bio_alloc(struct erofs_map_dev *mdev) { return NULL; }
-static inline void erofs_fscache_submit_bio(struct bio *bio) {}
-#endif
-
#ifdef CONFIG_EROFS_FS_PAGE_CACHE_SHARE
int __init erofs_init_ishare(void);
void erofs_exit_ishare(void);
diff --git a/fs/erofs/ishare.c b/fs/erofs/ishare.c
index 6ed66b17359b..0868c12fc15b 100644
--- a/fs/erofs/ishare.c
+++ b/fs/erofs/ishare.c
@@ -40,49 +40,42 @@ static int erofs_ishare_iget5_set(struct inode *inode, void *data)
bool erofs_ishare_fill_inode(struct inode *inode)
{
struct erofs_sb_info *sbi = EROFS_SB(inode->i_sb);
- struct erofs_inode *vi = EROFS_I(inode);
const struct address_space_operations *aops;
+ struct erofs_inode *vi = EROFS_I(inode);
struct erofs_inode_fingerprint fp;
- struct inode *sharedinode;
- unsigned long hash;
+ struct inode *si;
- aops = erofs_get_aops(inode, true);
+ aops = erofs_get_aops(inode);
if (IS_ERR(aops))
return false;
if (erofs_xattr_fill_inode_fingerprint(&fp, inode, sbi->domain_id))
return false;
- hash = xxh32(fp.opaque, fp.size, 0);
- sharedinode = iget5_locked(erofs_ishare_mnt->mnt_sb, hash,
- erofs_ishare_iget5_eq, erofs_ishare_iget5_set,
- &fp);
- if (!sharedinode) {
- kfree(fp.opaque);
- return false;
- }
- if (inode_state_read_once(sharedinode) & I_NEW) {
- sharedinode->i_mapping->a_ops = aops;
- sharedinode->i_size = vi->vfs_inode.i_size;
- unlock_new_inode(sharedinode);
+ si = iget5_locked(erofs_ishare_mnt->mnt_sb,
+ xxh32(fp.opaque, fp.size, 0),
+ erofs_ishare_iget5_eq, erofs_ishare_iget5_set, &fp);
+ if (si && (inode_state_read_once(si) & I_NEW)) {
+ si->i_mapping->a_ops = aops;
+ si->i_size = inode->i_size;
+ unlock_new_inode(si);
} else {
kfree(fp.opaque);
- if (aops != sharedinode->i_mapping->a_ops) {
- iput(sharedinode);
+ if (!si || aops != si->i_mapping->a_ops) {
+ iput(si);
return false;
}
- if (sharedinode->i_size != vi->vfs_inode.i_size) {
- _erofs_printk(inode->i_sb, KERN_WARNING
- "size(%lld:%lld) not matches for the same fingerprint\n",
- vi->vfs_inode.i_size, sharedinode->i_size);
- iput(sharedinode);
+ if (si->i_size != inode->i_size) {
+ erofs_warn(inode->i_sb, "i_size mismatch (%lld != %lld) for the same fingerprint",
+ inode->i_size, si->i_size);
+ iput(si);
return false;
}
}
- vi->sharedinode = sharedinode;
+ vi->sharedinode = si;
INIT_LIST_HEAD(&vi->ishare_list);
- spin_lock(&EROFS_I(sharedinode)->ishare_lock);
- list_add(&vi->ishare_list, &EROFS_I(sharedinode)->ishare_list);
- spin_unlock(&EROFS_I(sharedinode)->ishare_lock);
+ spin_lock(&EROFS_I(si)->ishare_lock);
+ list_add(&vi->ishare_list, &EROFS_I(si)->ishare_list);
+ spin_unlock(&EROFS_I(si)->ishare_lock);
return true;
}
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 802add6652fd..86fa5c6a0c70 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -126,7 +126,6 @@ static int erofs_init_device(struct erofs_buf *buf, struct super_block *sb,
struct erofs_device_info *dif, erofs_off_t *pos)
{
struct erofs_sb_info *sbi = EROFS_SB(sb);
- struct erofs_fscache *fscache;
struct erofs_deviceslot *dis;
struct file *file;
bool _48bit;
@@ -145,12 +144,7 @@ static int erofs_init_device(struct erofs_buf *buf, struct super_block *sb,
return -ENOMEM;
}
- if (erofs_is_fscache_mode(sb)) {
- fscache = erofs_fscache_register_cookie(sb, dif->path, 0);
- if (IS_ERR(fscache))
- return PTR_ERR(fscache);
- dif->fscache = fscache;
- } else if (!sbi->devs->flatdev) {
+ if (!sbi->devs->flatdev) {
file = erofs_is_fileio_mode(sbi) ?
filp_open(dif->path, O_RDONLY | O_LARGEFILE, 0) :
bdev_file_open_by_path(dif->path,
@@ -216,7 +210,7 @@ static int erofs_scan_devices(struct super_block *sb,
if (!ondisk_extradevs)
return 0;
- if (!sbi->devs->extra_devices && !erofs_is_fscache_mode(sb))
+ if (!sbi->devs->extra_devices)
sbi->devs->flatdev = true;
sbi->device_id_mask = roundup_pow_of_two(ondisk_extradevs + 1) - 1;
@@ -372,8 +366,6 @@ static int erofs_read_superblock(struct super_block *sb)
erofs_info(sb, "EXPERIMENTAL 48-bit layout support in use. Use at your own risk!");
if (erofs_sb_has_metabox(sbi))
erofs_info(sb, "EXPERIMENTAL metadata compression support in use. Use at your own risk!");
- if (erofs_is_fscache_mode(sb))
- erofs_info(sb, "[deprecated] fscache-based on-demand read feature in use. Use at your own risk!");
out:
erofs_put_metabuf(&buf);
return ret;
@@ -393,8 +385,7 @@ static void erofs_default_options(struct erofs_sb_info *sbi)
enum {
Opt_user_xattr, Opt_acl, Opt_cache_strategy, Opt_dax, Opt_dax_enum,
- Opt_device, Opt_fsid, Opt_domain_id, Opt_directio, Opt_fsoffset,
- Opt_inode_share,
+ Opt_device, Opt_domain_id, Opt_directio, Opt_fsoffset, Opt_inode_share,
};
static const struct constant_table erofs_param_cache_strategy[] = {
@@ -418,7 +409,6 @@ static const struct fs_parameter_spec erofs_fs_parameters[] = {
fsparam_flag("dax", Opt_dax),
fsparam_enum("dax", Opt_dax_enum, erofs_dax_param_enums),
fsparam_string("device", Opt_device),
- fsparam_string("fsid", Opt_fsid),
fsparam_string("domain_id", Opt_domain_id),
fsparam_flag_no("directio", Opt_directio),
fsparam_u64("fsoffset", Opt_fsoffset),
@@ -509,25 +499,14 @@ static int erofs_fc_parse_param(struct fs_context *fc,
}
++sbi->devs->extra_devices;
break;
-#ifdef CONFIG_EROFS_FS_ONDEMAND
- case Opt_fsid:
- kfree(sbi->fsid);
- sbi->fsid = kstrdup(param->string, GFP_KERNEL);
- if (!sbi->fsid)
- return -ENOMEM;
- break;
-#endif
-#if defined(CONFIG_EROFS_FS_ONDEMAND) || defined(CONFIG_EROFS_FS_PAGE_CACHE_SHARE)
case Opt_domain_id:
- kfree_sensitive(sbi->domain_id);
- sbi->domain_id = no_free_ptr(param->string);
- break;
-#else
- case Opt_fsid:
- case Opt_domain_id:
- errorfc(fc, "%s option not supported", erofs_fs_parameters[opt].name);
+ if (!IS_ENABLED(CONFIG_EROFS_FS_PAGE_CACHE_SHARE)) {
+ errorfc(fc, "%s option not supported", erofs_fs_parameters[opt].name);
+ } else {
+ kfree_sensitive(sbi->domain_id);
+ sbi->domain_id = no_free_ptr(param->string);
+ }
break;
-#endif
case Opt_directio:
if (!IS_ENABLED(CONFIG_EROFS_FS_BACKED_BY_FILE))
errorfc(fc, "%s option not supported", erofs_fs_parameters[opt].name);
@@ -620,12 +599,7 @@ static void erofs_set_sysfs_name(struct super_block *sb)
{
struct erofs_sb_info *sbi = EROFS_SB(sb);
- if (sbi->domain_id && sbi->fsid)
- super_set_sysfs_name_generic(sb, "%s,%s", sbi->domain_id,
- sbi->fsid);
- else if (sbi->fsid)
- super_set_sysfs_name_generic(sb, "%s", sbi->fsid);
- else if (erofs_is_fileio_mode(sbi))
+ if (erofs_is_fileio_mode(sbi))
super_set_sysfs_name_generic(sb, "%s",
bdi_dev_name(sb->s_bdi));
else
@@ -680,11 +654,6 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc)
sb->s_blocksize = PAGE_SIZE;
sb->s_blocksize_bits = PAGE_SHIFT;
- if (erofs_is_fscache_mode(sb)) {
- err = erofs_fscache_register_fs(sb);
- if (err)
- return err;
- }
err = super_setup_bdi(sb);
if (err)
return err;
@@ -703,11 +672,6 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc)
return err;
if (sb->s_blocksize_bits != sbi->blkszbits) {
- if (erofs_is_fscache_mode(sb)) {
- errorfc(fc, "unsupported blksize for fscache mode");
- return -EINVAL;
- }
-
if (erofs_is_fileio_mode(sbi)) {
sb->s_blocksize = 1 << sbi->blkszbits;
sb->s_blocksize_bits = sbi->blkszbits;
@@ -716,14 +680,9 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc)
return -EINVAL;
}
}
-
- if (sbi->dif0.fsoff) {
- if (sbi->dif0.fsoff & (sb->s_blocksize - 1))
- return invalfc(fc, "fsoffset %llu is not aligned to block size %lu",
- sbi->dif0.fsoff, sb->s_blocksize);
- if (erofs_is_fscache_mode(sb))
- return invalfc(fc, "cannot use fsoffset in fscache mode");
- }
+ if (sbi->dif0.fsoff & (sb->s_blocksize - 1))
+ return invalfc(fc, "fsoffset %llu is not aligned to block size %lu",
+ sbi->dif0.fsoff, sb->s_blocksize);
if (test_opt(&sbi->opt, DAX_ALWAYS) && sbi->blkszbits != PAGE_SHIFT) {
erofs_info(sb, "unsupported blocksize for DAX");
@@ -793,16 +752,13 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc)
static int erofs_fc_get_tree(struct fs_context *fc)
{
- struct erofs_sb_info *sbi = fc->s_fs_info;
int ret;
- if (IS_ENABLED(CONFIG_EROFS_FS_ONDEMAND) && sbi->fsid)
- return get_tree_nodev(fc, erofs_fc_fill_super);
-
ret = get_tree_bdev_flags(fc, erofs_fc_fill_super,
IS_ENABLED(CONFIG_EROFS_FS_BACKED_BY_FILE) ?
GET_TREE_BDEV_QUIET_LOOKUP : 0);
if (IS_ENABLED(CONFIG_EROFS_FS_BACKED_BY_FILE) && ret == -ENOTBLK) {
+ struct erofs_sb_info *sbi = fc->s_fs_info;
struct file *file;
if (!fc->source)
@@ -827,8 +783,8 @@ static int erofs_fc_reconfigure(struct fs_context *fc)
DBG_BUGON(!sb_rdonly(sb));
- if (new_sbi->fsid || new_sbi->domain_id)
- erofs_info(sb, "ignoring reconfiguration for fsid|domain_id.");
+ if (new_sbi->domain_id)
+ erofs_info(sb, "ignoring reconfiguration for domain_id.");
if (test_opt(&new_sbi->opt, POSIX_ACL))
fc->sb_flags |= SB_POSIXACL;
@@ -848,8 +804,6 @@ static int erofs_release_device_info(int id, void *ptr, void *data)
fs_put_dax(dif->dax_dev, NULL);
if (dif->file)
fput(dif->file);
- erofs_fscache_unregister_cookie(dif->fscache);
- dif->fscache = NULL;
kfree(dif->path);
kfree(dif);
return 0;
@@ -867,7 +821,6 @@ static void erofs_free_dev_context(struct erofs_dev_context *devs)
static void erofs_sb_free(struct erofs_sb_info *sbi)
{
erofs_free_dev_context(sbi->devs);
- kfree(sbi->fsid);
kfree_sensitive(sbi->domain_id);
if (sbi->dif0.file)
fput(sbi->dif0.file);
@@ -928,14 +881,12 @@ static void erofs_kill_sb(struct super_block *sb)
{
struct erofs_sb_info *sbi = EROFS_SB(sb);
- if ((IS_ENABLED(CONFIG_EROFS_FS_ONDEMAND) && sbi->fsid) ||
- sbi->dif0.file)
+ if (sbi->dif0.file)
kill_anon_super(sb);
else
kill_block_super(sb);
erofs_drop_internal_inodes(sbi);
fs_put_dax(sbi->dif0.dax_dev, NULL);
- erofs_fscache_unregister_fs(sb);
erofs_sb_free(sbi);
sb->s_fs_info = NULL;
}
@@ -950,7 +901,6 @@ static void erofs_put_super(struct super_block *sb)
erofs_drop_internal_inodes(sbi);
erofs_free_dev_context(sbi->devs);
sbi->devs = NULL;
- erofs_fscache_unregister_fs(sb);
}
static struct file_system_type erofs_fs_type = {
@@ -962,14 +912,12 @@ static struct file_system_type erofs_fs_type = {
};
MODULE_ALIAS_FS("erofs");
-#if defined(CONFIG_EROFS_FS_ONDEMAND) || defined(CONFIG_EROFS_FS_PAGE_CACHE_SHARE)
+#ifdef CONFIG_EROFS_FS_PAGE_CACHE_SHARE
static void erofs_free_anon_inode(struct inode *inode)
{
struct erofs_inode *vi = EROFS_I(inode);
-#ifdef CONFIG_EROFS_FS_PAGE_CACHE_SHARE
kfree(vi->fingerprint.opaque);
-#endif
kmem_cache_free(erofs_inode_cachep, vi);
}
@@ -1048,11 +996,11 @@ shrinker_err:
static void __exit erofs_module_exit(void)
{
unregister_filesystem(&erofs_fs_type);
+ erofs_exit_ishare();
- /* Ensure all RCU free inodes / pclusters are safe to be destroyed. */
+ /* ensure all delayed rcu free inodes & pclusters are flushed */
rcu_barrier();
- erofs_exit_ishare();
erofs_exit_sysfs();
z_erofs_exit_subsystem();
erofs_exit_shrinker();
@@ -1099,12 +1047,6 @@ static int erofs_show_options(struct seq_file *seq, struct dentry *root)
seq_puts(seq, ",dax=never");
if (erofs_is_fileio_mode(sbi) && test_opt(opt, DIRECT_IO))
seq_puts(seq, ",directio");
- if (IS_ENABLED(CONFIG_EROFS_FS_ONDEMAND)) {
- if (sbi->fsid)
- seq_printf(seq, ",fsid=%s", sbi->fsid);
- if (sbi->domain_id)
- seq_printf(seq, ",domain_id=%s", sbi->domain_id);
- }
if (sbi->dif0.fsoff)
seq_printf(seq, ",fsoffset=%llu", sbi->dif0.fsoff);
if (test_opt(opt, INODE_SHARE))
diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
index c6240dccbb0f..74520e910259 100644
--- a/fs/erofs/zdata.c
+++ b/fs/erofs/zdata.c
@@ -806,6 +806,7 @@ static int z_erofs_pcluster_begin(struct z_erofs_frontend *fe)
struct super_block *sb = fe->inode->i_sb;
struct z_erofs_pcluster *pcl = NULL;
void *ptr = NULL;
+ bool needretry;
int ret;
DBG_BUGON(fe->pcl);
@@ -825,19 +826,16 @@ static int z_erofs_pcluster_begin(struct z_erofs_frontend *fe)
}
ptr = map->buf.page;
} else {
- while (1) {
+ do {
rcu_read_lock();
pcl = xa_load(&EROFS_SB(sb)->managed_pslots, map->m_pa);
- if (!pcl || z_erofs_get_pcluster(pcl)) {
- DBG_BUGON(pcl && map->m_pa != pcl->pos);
- rcu_read_unlock();
- break;
- }
+ needretry = pcl && !z_erofs_get_pcluster(pcl);
rcu_read_unlock();
- }
+ } while (needretry);
}
if (pcl) {
+ DBG_BUGON(map->m_pa != pcl->pos);
fe->pcl = pcl;
ret = -EEXIST;
} else {
@@ -1459,21 +1457,19 @@ static void z_erofs_decompress_kickoff(struct z_erofs_decompressqueue *io,
if (sbi->sync_decompress == EROFS_SYNC_DECOMPRESS_AUTO)
sbi->sync_decompress = EROFS_SYNC_DECOMPRESS_FORCE_ON;
#ifdef CONFIG_EROFS_FS_PCPU_KTHREAD
- struct kthread_worker *worker;
+ scoped_guard(rcu) {
+ struct kthread_worker *worker;
- rcu_read_lock();
- worker = rcu_dereference(
+ worker = rcu_dereference(
z_erofs_pcpu_workers[raw_smp_processor_id()]);
- if (!worker) {
- INIT_WORK(&io->u.work, z_erofs_decompressqueue_work);
- queue_work(z_erofs_workqueue, &io->u.work);
- } else {
- kthread_queue_work(worker, &io->u.kthread_work);
+ if (worker) {
+ kthread_queue_work(worker, &io->u.kthread_work);
+ return;
+ }
}
- rcu_read_unlock();
-#else
- queue_work(z_erofs_workqueue, &io->u.work);
+ INIT_WORK(&io->u.work, z_erofs_decompressqueue_work);
#endif
+ queue_work(z_erofs_workqueue, &io->u.work);
return;
}
gfp_flag = memalloc_noio_save();
@@ -1714,8 +1710,6 @@ static void z_erofs_submit_queue(struct z_erofs_frontend *f,
drain_io:
if (erofs_is_fileio_mode(EROFS_SB(sb)))
erofs_fileio_submit_bio(bio);
- else if (erofs_is_fscache_mode(sb))
- erofs_fscache_submit_bio(bio);
else
submit_bio(bio);
@@ -1744,8 +1738,6 @@ drain_io:
if (!bio) {
if (erofs_is_fileio_mode(EROFS_SB(sb)))
bio = erofs_fileio_bio_alloc(&mdev);
- else if (erofs_is_fscache_mode(sb))
- bio = erofs_fscache_bio_alloc(&mdev);
else
bio = bio_alloc(mdev.m_bdev, BIO_MAX_VECS,
REQ_OP_READ, GFP_NOIO);
@@ -1774,8 +1766,6 @@ drain_io:
if (bio) {
if (erofs_is_fileio_mode(EROFS_SB(sb)))
erofs_fileio_submit_bio(bio);
- else if (erofs_is_fscache_mode(sb))
- erofs_fscache_submit_bio(bio);
else
submit_bio(bio);
}
diff --git a/fs/erofs/zmap.c b/fs/erofs/zmap.c
index e1a02a2c8406..bab521613552 100644
--- a/fs/erofs/zmap.c
+++ b/fs/erofs/zmap.c
@@ -15,8 +15,9 @@ struct z_erofs_maprecorder {
u8 type, headtype;
u16 clusterofs;
u16 delta[2];
- erofs_blk_t pblk, compressedblks;
+ erofs_blk_t pblk;
erofs_off_t nextpackoff;
+ int compressedblks;
bool partialref, in_mbox;
};
@@ -54,7 +55,12 @@ static int z_erofs_load_full_lcluster(struct z_erofs_maprecorder *m, u64 lcn)
} else {
m->partialref = !!(advise & Z_EROFS_LI_PARTIAL_REF);
m->clusterofs = le16_to_cpu(di->di_clusterofs);
- m->pblk = le32_to_cpu(di->di_u.blkaddr);
+ if (advise & Z_EROFS_LI_HOLE) {
+ m->compressedblks = 0;
+ m->pblk = EROFS_NULL_ADDR;
+ } else {
+ m->pblk = le32_to_cpu(di->di_u.blkaddr);
+ }
}
return 0;
}
@@ -309,9 +315,10 @@ static int z_erofs_get_extent_compressedlen(struct z_erofs_maprecorder *m,
((m->headtype == Z_EROFS_LCLUSTER_TYPE_PLAIN ||
m->headtype == Z_EROFS_LCLUSTER_TYPE_HEAD2) && !bigpcl2) ||
(lcn << vi->z_lclusterbits) >= inode->i_size)
- m->compressedblks = 1;
+ if (m->compressedblks < 0)
+ m->compressedblks = 1;
- if (m->compressedblks)
+ if (m->compressedblks >= 0)
goto out;
err = z_erofs_load_lcluster_from_disk(m, lcn, false);
@@ -329,19 +336,22 @@ static int z_erofs_get_extent_compressedlen(struct z_erofs_maprecorder *m,
DBG_BUGON(lcn == initial_lcn &&
m->type == Z_EROFS_LCLUSTER_TYPE_NONHEAD);
- if (m->type == Z_EROFS_LCLUSTER_TYPE_NONHEAD && m->delta[0] != 1) {
+ if (m->type != Z_EROFS_LCLUSTER_TYPE_NONHEAD) {
+ /*
+ * if the 1st NONHEAD lcluster is actually PLAIN or HEAD type
+ * rather than CBLKCNT, it's a 1 block-sized pcluster.
+ */
+ if (m->compressedblks < 0)
+ m->compressedblks = 1;
+ } else if (m->delta[0] != 1 || m->compressedblks < 0) {
erofs_err(sb, "bogus CBLKCNT @ lcn %llu of nid %llu", lcn, vi->nid);
DBG_BUGON(1);
return -EFSCORRUPTED;
}
- /*
- * if the 1st NONHEAD lcluster is actually PLAIN or HEAD type rather
- * than CBLKCNT, it's a 1 block-sized pcluster.
- */
- if (m->type != Z_EROFS_LCLUSTER_TYPE_NONHEAD || !m->compressedblks)
- m->compressedblks = 1;
out:
+ if (!m->compressedblks)
+ m->map->m_flags &= ~EROFS_MAP_MAPPED;
m->map->m_plen = erofs_pos(sb, m->compressedblks);
return 0;
}
@@ -395,6 +405,7 @@ static int z_erofs_map_blocks_fo(struct inode *inode,
.inode = inode,
.map = map,
.in_mbox = erofs_inode_in_metabox(inode),
+ .compressedblks = -1,
};
unsigned int endoff;
unsigned long initial_lcn;
diff --git a/include/trace/events/erofs.h b/include/trace/events/erofs.h
index cd0e3fd8c23f..0a178cb10fb1 100644
--- a/include/trace/events/erofs.h
+++ b/include/trace/events/erofs.h
@@ -90,7 +90,7 @@ TRACE_EVENT(erofs_read_folio,
__field(erofs_nid_t, nid )
__field(int, dir )
__field(pgoff_t, index )
- __field(int, uptodate)
+ __field(unsigned int, order )
__field(bool, raw )
),
@@ -99,16 +99,15 @@ TRACE_EVENT(erofs_read_folio,
__entry->nid = EROFS_I(inode)->nid;
__entry->dir = S_ISDIR(inode->i_mode);
__entry->index = folio->index;
- __entry->uptodate = folio_test_uptodate(folio);
+ __entry->order = folio_order(folio);
__entry->raw = raw;
),
- TP_printk("dev = (%d,%d), nid = %llu, %s, index = %lu, uptodate = %d "
- "raw = %d",
+ TP_printk("dev = (%d,%d), nid = %llu, %s, index = %lu, order = %u, raw = %d",
show_dev_nid(__entry),
show_file_type(__entry->dir),
(unsigned long)__entry->index,
- __entry->uptodate,
+ __entry->order,
__entry->raw)
);
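
Note on the fs/erofs/inode.c hunk above: for compressed inodes, i_blocks is now built from blocks_lo (32 bits) and blocks_hi (16 bits) and then shifted to 512-byte units. The userspace sketch below only illustrates that arithmetic. The helper name blocks_to_sectors48() and its parameters are hypothetical and are not part of the kernel; it assumes both fields have already been converted to CPU endianness.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace helper (not a kernel function): combine the
 * low 32 bits and high 16 bits of a block count into one 48-bit value,
 * then convert filesystem blocks into 512-byte sectors.
 */
static inline uint64_t blocks_to_sectors48(uint32_t blocks_lo,
					   uint16_t blocks_hi,
					   unsigned int blkbits)
{
	uint64_t blocks;

	/* a block is assumed to be at least 512 bytes and at most 64 KiB */
	assert(blkbits >= 9 && blkbits <= 16);

	/* widen to 64 bits before shifting so the high part is not lost */
	blocks = (uint64_t)blocks_lo | ((uint64_t)blocks_hi << 32);
	return blocks << (blkbits - 9);
}
```

With 4 KiB blocks (blkbits = 12), one block corresponds to eight 512-byte sectors, and a count with only blocks_hi = 1 set corresponds to 2^32 blocks.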