
EROFS reading notes: cannot find valid erofs superblock


Overview
========

EROFS file-system stands for Enhanced Read-Only File System. Unlike other
read-only file systems, it is designed for flexibility and scalability
while remaining simple and highly performant.

It is designed as a better filesystem solution for the following scenarios:

 - read-only storage media or

 - part of a fully trusted read-only solution, which means it needs to be
   immutable and bit-for-bit identical to the official golden image for
   their releases due to security and other considerations and

 - hoping to save some extra storage space with guaranteed end-to-end performance
   by using reduced metadata and transparent file compression, especially
   for embedded devices with limited memory (e.g., smartphones);

Here are the main features of EROFS:

 - Little endian on-disk design;

 - Currently 4KB block size (nobh) and therefore maximum 16TB address space;

 - Metadata & data could be mixed by design;

 - 2 inode versions for different requirements:

   =====================  ============  =====================================
                          compact (v1)  extended (v2)
   =====================  ============  =====================================
   Inode metadata size    32 bytes      64 bytes
   Max file size          4 GB          16 EB (also limited by max. vol size)
   Max uids/gids          65536         4294967296
   File change time       no            yes (64 + 32-bit timestamp)
   Max hardlinks          65536         4294967296
   Metadata reserved      4 bytes       14 bytes
   =====================  ============  =====================================

 - Support extended attributes (xattrs) as an option;

 - Support xattr inline and tail-end data inline for all files;

 - Support POSIX.1e ACLs by using xattrs;

 - Support transparent data compression as an option:
   LZ4 algorithm with the fixed-sized output compression for high performance.

The following git tree provides the file system user-space tools under
development (ex, formatting tool mkfs.erofs):

- git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git

Bugs and patches are welcome, please kindly help us and send to the following
linux-erofs mailing list:

- linux-erofs mailing list   <linux-erofs@lists.ozlabs.org>

Mount options
=============

===================    =========================================================
(no)user_xattr         Setup Extended User Attributes. Note: xattr is enabled
                       by default if CONFIG_EROFS_FS_XATTR is selected.
(no)acl                Setup POSIX Access Control List. Note: acl is enabled
                       by default if CONFIG_EROFS_FS_POSIX_ACL is selected.
cache_strategy=%s      Select a strategy for cached decompression from now on:

                       ==========  =============================================
                       disabled    In-place I/O decompression only;
                       readahead   Cache the last incomplete compressed physical
                                   cluster for further reading. It still does
                                   in-place I/O decompression for the rest of the
                                   compressed physical clusters;
                       readaround  Cache both ends of incomplete compressed
                                   physical clusters for further reading.
                                   It still does in-place I/O decompression
                                   for the rest of the compressed physical
                                   clusters.
                       ==========  =============================================
dax={always,never}     Use direct access (no page cache).  See
                       Documentation/filesystems/dax.rst.
dax                    A legacy option which is an alias for ``dax=always``.
===================    =========================================================
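
For illustration, the following is a minimal user-space sketch (not taken from
the kernel tree) of how these options can be passed through mount(2); the
device and mount point paths are hypothetical and error handling is reduced
to a single perror() call::

  #include <stdio.h>
  #include <sys/mount.h>

  int main(void)
  {
          /* EROFS is read-only by definition, so MS_RDONLY is the natural flag */
          if (mount("/dev/loop0", "/mnt/erofs", "erofs", MS_RDONLY,
                    "user_xattr,acl,cache_strategy=readaround") < 0) {
                  perror("mount");
                  return 1;
          }
          return 0;
  }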

On-disk details
===============

Summary
-------
Different from other read-only file systems, an EROFS volume is designed
to be as simple as possible::

                                |-> aligned with the block size
   ____________________________________________________________
  | |SB| | ... | Metadata | ... | Data | Metadata | ... | Data |
  |_|__|_|_____|__________|_____|______|__________|_____|______|
  0 +1K

All data areas should be aligned with the block size, but metadata areas
may not. All metadata can now be observed in two different spaces (views):

 1. Inode metadata space

    Each valid inode should be aligned with an inode slot, which is a fixed
    value (32 bytes) and designed to be kept in line with compact inode size.

    Each inode can be directly found with the following formula:
         inode offset = meta_blkaddr * block_size + 32 * nid

    ::

                                 |-> aligned with 8B
                                            |-> followed closely
     + meta_blkaddr blocks                                      |-> another slot
       _____________________________________________________________________
     |  ...   | inode |  xattrs  | extents  | data inline | ... | inode ...
     |________|_______|(optional)|(optional)|__(optional)_|_____|__________
              |-> aligned with the inode slot size
                   .                   .
                 .                         .
               .                              .
             .                                    .
           .                                         .
         .                                              .
       .____________________________________________________|-> aligned with 4B
       | xattr_ibody_header | shared xattrs | inline xattrs |
       |____________________|_______________|_______________|
       |->    12 bytes    <-|->x * 4 bytes<-|               .
                           .                .                 .
                     .                      .                   .
                .                           .                     .
            ._______________________________.______________________.
            | id | id | id | id |  ... | id | ent | ... | ent| ... |
            |____|____|____|____|______|____|_____|_____|____|_____|
                                            |-> aligned with 4B
                                                        |-> aligned with 4B

    Inodes can be 32 or 64 bytes, which can be distinguished by a common
    field that all inode versions have -- i_format::

        __________________               __________________
       |     i_format     |             |     i_format     |
       |__________________|             |__________________|
       |        ...       |             |        ...       |
       |                  |             |                  |
       |__________________| 32 bytes    |                  |
                                        |                  |
                                        |__________________| 64 bytes

    Xattrs, extents, and inline data follow the corresponding inode with
    proper alignment, and they can be optional for different data mappings.
    _currently_ 5 data layouts are supported in total:

    ==  ====================================================================
     0  flat file data without data inline (no extent);
     1  fixed-sized output data compression (with non-compacted indexes);
     2  flat file data with tail packing data inline (no extent);
     3  fixed-sized output data compression (with compacted indexes, v5.3+);
     4  chunk-based file (v5.15+).
    ==  ====================================================================

    The size of the optional xattrs is indicated by i_xattr_icount in the
    inode header. Large xattrs or xattrs shared by many different files can
    be stored in the shared xattrs metadata area rather than inlined right
    after the inode.

 2. Shared xattrs metadata space

    The shared xattrs space is similar to the above inode space: it starts at
    a specific block indicated by xattr_blkaddr and is organized entry by
    entry with proper alignment.

    Each shared xattr can also be directly located with the following formula
    (see the C sketch after the diagram below):
         xattr offset = xattr_blkaddr * block_size + 4 * xattr_id

::

                           |-> aligned by  4 bytes
    + xattr_blkaddr blocks                     |-> aligned with 4 bytes
     _________________________________________________________________________
    |  ...   | xattr_entry |  xattr data | ... |  xattr_entry | xattr data  ...
    |________|_____________|_____________|_____|______________|_______________
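
Both lookup formulas above translate directly into a few lines of C. The
following is a minimal sketch, assuming the 4KB block size used by current
EROFS images; meta_blkaddr and xattr_blkaddr come from the on-disk superblock,
while nid and xattr_id identify the inode and the shared xattr respectively::

  #include <stdint.h>

  #define EROFS_BLKSIZ     4096   /* 4KB block size */
  #define EROFS_ISLOT_SIZE 32     /* inode slot == compact inode size */

  /* inode offset = meta_blkaddr * block_size + 32 * nid */
  static uint64_t erofs_inode_off(uint32_t meta_blkaddr, uint64_t nid)
  {
          return (uint64_t)meta_blkaddr * EROFS_BLKSIZ + EROFS_ISLOT_SIZE * nid;
  }

  /* xattr offset = xattr_blkaddr * block_size + 4 * xattr_id */
  static uint64_t erofs_shared_xattr_off(uint32_t xattr_blkaddr, uint32_t xattr_id)
  {
          return (uint64_t)xattr_blkaddr * EROFS_BLKSIZ + 4ULL * xattr_id;
  }

The kernel implements the first formula as iloc() in internal.h, which is
shown later in this article.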

Directories
-----------
All directories are now organized in a compact on-disk format. Note that
each directory block is divided into index and name areas in order to support
random file lookup, and all directory entries are _strictly_ recorded in
alphabetical order so that an improved prefix binary search algorithm can be
used (refer to the related source code).

::

                  ___________________________
                 /                           |
                /              ______________|________________
               /              /              | nameoff1       | nameoffN-1
  ____________.______________._______________v________________v__________
 | dirent | dirent | ... | dirent | filename | filename | ... | filename |
 |___.0___|____1___|_____|___N-1__|____0_____|____1_____|_____|___N-1____|
      \                           ^
       \                          |                           * could have
        \                         |                             trailing '\0'
         \________________________| nameoff0
                             Directory block

Note that apart from the offset of the first filename, nameoff0 also indicates
the total number of directory entries in this block, since there is no need to
introduce another on-disk field at all.
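
As a minimal sketch of that property (ignoring the on-disk little-endian
conversion for brevity, and using a local struct with the same 12-byte layout
as struct erofs_dirent)::

  #include <stdint.h>

  /* same 12-byte layout as struct erofs_dirent in erofs_fs.h */
  struct dirent_ondisk {
          uint64_t nid;
          uint16_t nameoff;
          uint8_t  file_type;
          uint8_t  reserved;
  } __attribute__((packed));

  /* nameoff0 == N * sizeof(dirent), so it also encodes the entry count N */
  static unsigned int erofs_dirent_count(const void *dirblk)
  {
          const struct dirent_ondisk *de = dirblk;

          return de->nameoff / sizeof(struct dirent_ondisk);
  }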

Chunk-based file
----------------
In order to support chunk-based data deduplication, a new inode data layout has
been supported since Linux v5.15: files are split into equal-sized data chunks,
with the ``extents`` area of the inode metadata indicating how to get the chunk
data: these can simply be a 4-byte block address array or take the 8-byte chunk
index form (see struct erofs_inode_chunk_index in erofs_fs.h for more details).

By the way, chunk-based files are all uncompressed for now.
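
A minimal sketch of the simpler 4-byte block address form described above; the
chunk_size parameter and the entry array are conceptual placeholders rather
than real kernel fields::

  #include <stdint.h>

  /*
   * The file is split into equal-sized chunks; the inode's "extents" area
   * holds one entry per chunk. In the simplest form each entry is just a
   * 4-byte block address.
   */
  static uint32_t chunk_blkaddr(const uint32_t *blkaddrs,
                                uint64_t file_off, uint32_t chunk_size)
  {
          return blkaddrs[file_off / chunk_size];
  }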

Data compression
----------------
EROFS implements LZ4 fixed-sized output compression, which generates fixed-sized
compressed data blocks from variable-sized input, in contrast to other existing
fixed-sized input solutions. Relatively higher compression ratios can be achieved
with fixed-sized output compression since nowadays popular data compression
algorithms are mostly LZ77-based, and such a fixed-sized output approach can
benefit from the historical dictionary (aka. sliding window).

In detail, the original (uncompressed) data is turned into several variable-sized
extents and, at the same time, compressed into physical clusters (pclusters).
In order to record each variable-sized extent, logical clusters (lclusters) are
introduced as the basic unit of compress indexes to indicate whether a new
extent is generated within the range (HEAD) or not (NONHEAD). Lclusters are now
fixed in block size, as illustrated below::

          |<-    variable-sized extent    ->|<-       VLE         ->|
        clusterofs                        clusterofs              clusterofs
          |                                 |                       |
 _________v_________________________________v_______________________v________
 ... |    .         |              |        .     |              |  .   ...
 ____|____._________|______________|________.___ _|______________|__.________
     |-> lcluster <-|-> lcluster <-|-> lcluster <-|-> lcluster <-|
          (HEAD)        (NONHEAD)       (HEAD)        (NONHEAD)    .
           .             CBLKCNT            .                    .
            .                               .                  .
             .                              .                .
       _______._____________________________.______________._________________
          ... |              |              |              | ...
       _______|______________|______________|______________|_________________
              |->      big pcluster       <-|-> pcluster <-|

A physical cluster can be seen as a container of physical compressed blocks
which contains compressed data. Previously, only lcluster-sized (4KB) pclusters
were supported. After the big pcluster feature was introduced (available since
Linux v5.13), a pcluster can be a multiple of the lcluster size.

For each HEAD lcluster, clusterofs is recorded to indicate where a new extent
starts and blkaddr is used to seek the compressed data. For each NONHEAD
lcluster, delta0 and delta1 are available instead of blkaddr to indicate the
distance to its HEAD lcluster and the next HEAD lcluster. A PLAIN lcluster is
also a HEAD lcluster except that its data is uncompressed. See the comments
around "struct z_erofs_vle_decompressed_index" in erofs_fs.h for more details.

If big pcluster is enabled, the pcluster size in lclusters needs to be recorded
as well. The delta0 of the first NONHEAD lcluster stores the compressed block
count with a special flag, turning it into a special NONHEAD lcluster called
CBLKCNT. It's easy to see that its delta0 is constantly 1, as illustrated
below::

   __________________________________________________________
  | HEAD |  NONHEAD  | NONHEAD | ... | NONHEAD | HEAD | HEAD |
  |__:___|_(CBLKCNT)_|_________|_____|_________|__:___|____:_|
     |<----- a big pcluster (with CBLKCNT) ------>|<--  -->|
           a lcluster-sized pcluster (without CBLKCNT) ^

If another HEAD follows a HEAD lcluster, there is no room to record CBLKCNT,
but it's easy to know that the size of such a pcluster is 1 lcluster as well.
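
To make the HEAD/NONHEAD distinction concrete, here is a simplified sketch of
how a compress index could be walked back to the start of its extent. The
structure below only mirrors the on-disk fields conceptually (it is not the
kernel's z_erofs code), and it ignores the CBLKCNT special case described
above::

  #include <stdint.h>

  enum lcluster_type { LC_HEAD, LC_NONHEAD };

  struct lcluster_index {
          enum lcluster_type type;
          uint16_t clusterofs;    /* HEAD: where the new extent starts */
          uint16_t delta0;        /* NONHEAD: distance back to its HEAD */
  };

  /* lclusters are fixed at the block size (e.g. 4KB) */
  static uint64_t extent_start(const struct lcluster_index *idx,
                               uint64_t lcn, uint32_t lclustersize)
  {
          if (idx[lcn].type == LC_NONHEAD)
                  lcn -= idx[lcn].delta0;         /* walk back to the HEAD */
          return lcn * lclustersize + idx[lcn].clusterofs;
  }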

drivers/staging/erofs/erofs_fs.h

  /* SPDX-License-Identifier: GPL-2.0 OR Apache-2.0
   *
   * linux/drivers/staging/erofs/erofs_fs.h
   *
   * Copyright (C) 2017-2018 HUAWEI, Inc.
   *             http://www.huawei.com/
   * Created by Gao Xiang <gaoxiang25@huawei.com>
   *
   * This file is dual-licensed; you may select either the GNU General Public
   * License version 2 or Apache License, Version 2.0. See the file COPYING
   * in the main directory of the Linux distribution for more details.
   */
  #ifndef __EROFS_FS_H
  #define __EROFS_FS_H

  /* Enhanced(Extended) ROM File System */
  #define EROFS_SUPER_MAGIC_V1    0xE0F5E1E2
  #define EROFS_SUPER_OFFSET      1024

  /*
   * Any bits that aren't in EROFS_ALL_REQUIREMENTS should be
   * incompatible with this kernel version.
   */
  #define EROFS_ALL_REQUIREMENTS  0

  struct erofs_super_block {
  /*  0 */__le32 magic;           /* in the little endian */
  /*  4 */__le32 checksum;        /* crc32c(super_block) */
  /*  8 */__le32 features;        /* (aka. feature_compat) */
  /* 12 */__u8 blkszbits;         /* support block_size == PAGE_SIZE only */
  /* 13 */__u8 reserved;

  /* 14 */__le16 root_nid;
  /* 16 */__le64 inos;            /* total valid ino # (== f_files - f_favail) */

  /* 24 */__le64 build_time;      /* inode v1 time derivation */
  /* 32 */__le32 build_time_nsec;
  /* 36 */__le32 blocks;          /* used for statfs */
  /* 40 */__le32 meta_blkaddr;
  /* 44 */__le32 xattr_blkaddr;
  /* 48 */__u8 uuid[16];          /* 128-bit uuid for volume */
  /* 64 */__u8 volume_name[16];   /* volume name */
  /* 80 */__le32 requirements;    /* (aka. feature_incompat) */

  /* 84 */__u8 reserved2[44];
  } __packed;                     /* 128 bytes */

  #define __EROFS_BIT(_prefix, _cur, _pre) enum { \
          _prefix ## _cur ## _BIT = _prefix ## _pre ## _BIT + \
                  _prefix ## _pre ## _BITS }

  /*
   * erofs inode data mapping:
   * 0 - inode plain without inline data A:
   *     inode, [xattrs], ... | ... | no-holed data
   * 1 - inode VLE compression B:
   *     inode, [xattrs], extents ... | ...
   * 2 - inode plain with inline data C:
   *     inode, [xattrs], last_inline_data, ... | ... | no-holed data
   * 3~7 - reserved
   */
  enum {
          EROFS_INODE_LAYOUT_PLAIN,
          EROFS_INODE_LAYOUT_COMPRESSION,
          EROFS_INODE_LAYOUT_INLINE,
          EROFS_INODE_LAYOUT_MAX
  };

  #define EROFS_I_VERSION_BITS            1
  #define EROFS_I_DATA_MAPPING_BITS       3

  #define EROFS_I_VERSION_BIT             0
  __EROFS_BIT(EROFS_I_, DATA_MAPPING, VERSION);

  #define EROFS_I_ALL \
          ((1 << (EROFS_I_DATA_MAPPING_BIT + EROFS_I_DATA_MAPPING_BITS)) - 1)

  struct erofs_inode_v1 {
  /*  0 */__le16 i_advise;

  /* 1 header + n-1 * 4 bytes inline xattr to keep continuity */
  /*  2 */__le16 i_xattr_icount;
  /*  4 */__le16 i_mode;
  /*  6 */__le16 i_nlink;
  /*  8 */__le32 i_size;
  /* 12 */__le32 i_reserved;
  /* 16 */union {
                  /* file total compressed blocks for data mapping 1 */
                  __le32 compressed_blocks;
                  __le32 raw_blkaddr;

                  /* for device files, used to indicate old/new device # */
                  __le32 rdev;
          } i_u __packed;
  /* 20 */__le32 i_ino;           /* only used for 32-bit stat compatibility */
  /* 24 */__le16 i_uid;
  /* 26 */__le16 i_gid;
  /* 28 */__le32 i_checksum;
  } __packed;

  /* 32 bytes on-disk inode */
  #define EROFS_INODE_LAYOUT_V1   0
  /* 64 bytes on-disk inode */
  #define EROFS_INODE_LAYOUT_V2   1

  struct erofs_inode_v2 {
          __le16 i_advise;

          /* 1 header + n-1 * 4 bytes inline xattr to keep continuity */
          __le16 i_xattr_icount;
          __le16 i_mode;
          __le16 i_reserved;      /* 8 bytes */
          __le64 i_size;          /* 16 bytes */
          union {
                  /* file total compressed blocks for data mapping 1 */
                  __le32 compressed_blocks;
                  __le32 raw_blkaddr;

                  /* for device files, used to indicate old/new device # */
                  __le32 rdev;
          } i_u __packed;

          /* only used for 32-bit stat compatibility */
          __le32 i_ino;           /* 24 bytes */

          __le32 i_uid;
          __le32 i_gid;
          __le64 i_ctime;         /* 32 bytes */
          __le32 i_ctime_nsec;
          __le32 i_nlink;
          __u8   i_reserved2[12];
          __le32 i_checksum;      /* 64 bytes */
  } __packed;

  #define EROFS_MAX_SHARED_XATTRS         (128)
  /* h_shared_count between 129 ... 255 are special # */
  #define EROFS_SHARED_XATTR_EXTENT       (255)

  /*
   * inline xattrs (n == i_xattr_icount):
   * erofs_xattr_ibody_header(1) + (n - 1) * 4 bytes
   *          12 bytes           /                   \
   *                            /                     \
   *                           /-----------------------\
   *                           |  erofs_xattr_entries+ |
   *                           +-----------------------+
   * inline xattrs must starts in erofs_xattr_ibody_header,
   * for read-only fs, no need to introduce h_refcount
   */
  struct erofs_xattr_ibody_header {
          __le32 h_checksum;
          __u8   h_shared_count;
          __u8   h_reserved[7];
          __le32 h_shared_xattrs[0];      /* shared xattr id array */
  } __packed;

  /* Name indexes */
  #define EROFS_XATTR_INDEX_USER              1
  #define EROFS_XATTR_INDEX_POSIX_ACL_ACCESS  2
  #define EROFS_XATTR_INDEX_POSIX_ACL_DEFAULT 3
  #define EROFS_XATTR_INDEX_TRUSTED           4
  #define EROFS_XATTR_INDEX_LUSTRE            5
  #define EROFS_XATTR_INDEX_SECURITY          6

  /* xattr entry (for both inline & shared xattrs) */
  struct erofs_xattr_entry {
          __u8   e_name_len;      /* length of name */
          __u8   e_name_index;    /* attribute name index */
          __le16 e_value_size;    /* size of attribute value */
          /* followed by e_name and e_value */
          char   e_name[0];       /* attribute name */
  } __packed;

  #define ondisk_xattr_ibody_size(count) ({ \
          u32 __count = le16_to_cpu(count); \
          ((__count) == 0) ? 0 : \
          sizeof(struct erofs_xattr_ibody_header) + \
                  sizeof(__u32) * ((__count) - 1); })

  #define EROFS_XATTR_ALIGN(size) round_up(size, sizeof(struct erofs_xattr_entry))
  #define EROFS_XATTR_ENTRY_SIZE(entry) EROFS_XATTR_ALIGN( \
          sizeof(struct erofs_xattr_entry) + \
          (entry)->e_name_len + le16_to_cpu((entry)->e_value_size))

  /* have to be aligned with 8 bytes on disk */
  struct erofs_extent_header {
          __le32 eh_checksum;
          __le32 eh_reserved[3];
  } __packed;

  /*
   * Z_EROFS Variable-sized Logical Extent cluster type:
   *    0 - literal (uncompressed) cluster
   *    1 - compressed cluster (for the head logical cluster)
   *    2 - compressed cluster (for the other logical clusters)
   *
   * In detail,
   *    0 - literal (uncompressed) cluster,
   *        di_advise = 0
   *        di_clusterofs = the literal data offset of the cluster
   *        di_blkaddr = the blkaddr of the literal cluster
   *
   *    1 - compressed cluster (for the head logical cluster)
   *        di_advise = 1
   *        di_clusterofs = the decompressed data offset of the cluster
   *        di_blkaddr = the blkaddr of the compressed cluster
   *
   *    2 - compressed cluster (for the other logical clusters)
   *        di_advise = 2
   *        di_clusterofs =
   *           the decompressed data offset in its own head cluster
   *        di_u.delta[0] = distance to its corresponding head cluster
   *        di_u.delta[1] = distance to its corresponding tail cluster
   *        (di_advise could be 0, 1 or 2)
   */
  #define Z_EROFS_VLE_DI_CLUSTER_TYPE_BITS        2
  #define Z_EROFS_VLE_DI_CLUSTER_TYPE_BIT         0

  struct z_erofs_vle_decompressed_index {
          __le16 di_advise;
          /* where to decompress in the head cluster */
          __le16 di_clusterofs;

          union {
                  /* for the head cluster */
                  __le32 blkaddr;
                  /*
                   * for the rest clusters
                   * eg. for 4k page-sized cluster, maximum 4K*64k = 256M)
                   * [0] - pointing to the head cluster
                   * [1] - pointing to the tail cluster
                   */
                  __le16 delta[2];
          } di_u __packed;                /* 8 bytes */
  } __packed;

  #define Z_EROFS_VLE_EXTENT_ALIGN(size) round_up(size, \
          sizeof(struct z_erofs_vle_decompressed_index))

  /* dirent sorts in alphabet order, thus we can do binary search */
  struct erofs_dirent {
          __le64 nid;     /*  0, node number */
          __le16 nameoff; /*  8, start offset of file name */
          __u8 file_type; /* 10, file type */
          __u8 reserved;  /* 11, reserved */
  } __packed;

  /* file types used in inode_info->flags */
  enum {
          EROFS_FT_UNKNOWN,
          EROFS_FT_REG_FILE,
          EROFS_FT_DIR,
          EROFS_FT_CHRDEV,
          EROFS_FT_BLKDEV,
          EROFS_FT_FIFO,
          EROFS_FT_SOCK,
          EROFS_FT_SYMLINK,
          EROFS_FT_MAX
  };

  #define EROFS_NAME_LEN  255

  /* check the EROFS on-disk layout strictly at compile time */
  static inline void erofs_check_ondisk_layout_definitions(void)
  {
          BUILD_BUG_ON(sizeof(struct erofs_super_block) != 128);
          BUILD_BUG_ON(sizeof(struct erofs_inode_v1) != 32);
          BUILD_BUG_ON(sizeof(struct erofs_inode_v2) != 64);

          BUILD_BUG_ON(sizeof(struct erofs_xattr_ibody_header) != 12);
          BUILD_BUG_ON(sizeof(struct erofs_xattr_entry) != 4);
          BUILD_BUG_ON(sizeof(struct erofs_extent_header) != 16);
          BUILD_BUG_ON(sizeof(struct z_erofs_vle_decompressed_index) != 8);
          BUILD_BUG_ON(sizeof(struct erofs_dirent) != 12);
  }

  #endif
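
The "cannot find valid erofs superblock" message from the title is what the
erofs driver reports when the magic check against this structure fails at
mount time. A minimal user-space sketch of the same check (the image file name
is hypothetical, and a little-endian host is assumed for the magic comparison)::

  #include <stdio.h>
  #include <stdint.h>
  #include <string.h>

  #define EROFS_SUPER_OFFSET   1024
  #define EROFS_SUPER_MAGIC_V1 0xE0F5E1E2

  int main(void)
  {
          unsigned char sb[128];  /* struct erofs_super_block is 128 bytes */
          uint32_t magic;
          FILE *f = fopen("erofs.img", "rb");

          if (!f || fseek(f, EROFS_SUPER_OFFSET, SEEK_SET) ||
              fread(sb, sizeof(sb), 1, f) != 1) {
                  perror("read superblock");
                  return 1;
          }
          memcpy(&magic, sb, sizeof(magic));      /* __le32 magic at offset 0 */
          fclose(f);
          if (magic != EROFS_SUPER_MAGIC_V1) {
                  fprintf(stderr, "cannot find valid erofs superblock\n");
                  return 1;
          }
          printf("valid erofs superblock found\n");
          return 0;
  }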

internal.h

  1. /* SPDX-License-Identifier: GPL-2.0
  2. *
  3. * linux/drivers/staging/erofs/internal.h
  4. *
  5. * Copyright (C) 2017-2018 HUAWEI, Inc.
  6. * http://www.huawei.com/
  7. * Created by Gao Xiang <gaoxiang25@huawei.com>
  8. *
  9. * This file is subject to the terms and conditions of the GNU General Public
  10. * License. See the file COPYING in the main directory of the Linux
  11. * distribution for more details.
  12. */
  13. #ifndef __INTERNAL_H
  14. #define __INTERNAL_H
  15. #include <linux/fs.h>
  16. #include <linux/dcache.h>
  17. #include <linux/mm.h>
  18. #include <linux/pagemap.h>
  19. #include <linux/bio.h>
  20. #include <linux/buffer_head.h>
  21. #include <linux/cleancache.h>
  22. #include <linux/slab.h>
  23. #include <linux/vmalloc.h>
  24. #include "erofs_fs.h"
  25. /* redefine pr_fmt "erofs: " */
  26. #undef pr_fmt
  27. #define pr_fmt(fmt) "erofs: " fmt
  28. #define errln(x, ...) pr_err(x "\n", ##__VA_ARGS__)
  29. #define infoln(x, ...) pr_info(x "\n", ##__VA_ARGS__)
  30. #ifdef CONFIG_EROFS_FS_DEBUG
  31. #define debugln(x, ...) pr_debug(x "\n", ##__VA_ARGS__)
  32. #define dbg_might_sleep might_sleep
  33. #define DBG_BUGON BUG_ON
  34. #else
  35. #define debugln(x, ...) ((void)0)
  36. #define dbg_might_sleep() ((void)0)
  37. #define DBG_BUGON(x) ((void)(x))
  38. #endif
  39. #ifdef CONFIG_EROFS_FAULT_INJECTION
  40. enum {
  41. FAULT_KMALLOC,
  42. FAULT_MAX,
  43. };
  44. extern char *erofs_fault_name[FAULT_MAX];
  45. #define IS_FAULT_SET(fi, type) ((fi)->inject_type & (1 << (type)))
  46. struct erofs_fault_info {
  47. atomic_t inject_ops;
  48. unsigned int inject_rate;
  49. unsigned int inject_type;
  50. };
  51. #endif
  52. #ifdef CONFIG_EROFS_FS_ZIP_CACHE_BIPOLAR
  53. #define EROFS_FS_ZIP_CACHE_LVL (2)
  54. #elif defined(EROFS_FS_ZIP_CACHE_UNIPOLAR)
  55. #define EROFS_FS_ZIP_CACHE_LVL (1)
  56. #else
  57. #define EROFS_FS_ZIP_CACHE_LVL (0)
  58. #endif
  59. #if (!defined(EROFS_FS_HAS_MANAGED_CACHE) && (EROFS_FS_ZIP_CACHE_LVL > 0))
  60. #define EROFS_FS_HAS_MANAGED_CACHE
  61. #endif
  62. /* EROFS_SUPER_MAGIC_V1 to represent the whole file system */
  63. #define EROFS_SUPER_MAGIC EROFS_SUPER_MAGIC_V1
  64. typedef u64 erofs_nid_t;
  65. struct erofs_sb_info {
  66. /* list for all registered superblocks, mainly for shrinker */
  67. struct list_head list;
  68. struct mutex umount_mutex;
  69. u32 blocks;
  70. u32 meta_blkaddr;
  71. #ifdef CONFIG_EROFS_FS_XATTR
  72. u32 xattr_blkaddr;
  73. #endif
  74. /* inode slot unit size in bit shift */
  75. unsigned char islotbits;
  76. #ifdef CONFIG_EROFS_FS_ZIP
  77. /* cluster size in bit shift */
  78. unsigned char clusterbits;
  79. /* the dedicated workstation for compression */
  80. struct radix_tree_root workstn_tree;
  81. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  82. struct inode *managed_cache;
  83. #endif
  84. #endif
  85. u32 build_time_nsec;
  86. u64 build_time;
  87. /* what we really care is nid, rather than ino.. */
  88. erofs_nid_t root_nid;
  89. /* used for statfs, f_files - f_favail */
  90. u64 inos;
  91. u8 uuid[16]; /* 128-bit uuid for volume */
  92. u8 volume_name[16]; /* volume name */
  93. u32 requirements;
  94. char *dev_name;
  95. unsigned int mount_opt;
  96. unsigned int shrinker_run_no;
  97. #ifdef CONFIG_EROFS_FAULT_INJECTION
  98. struct erofs_fault_info fault_info; /* For fault injection */
  99. #endif
  100. };
  101. #ifdef CONFIG_EROFS_FAULT_INJECTION
  102. #define erofs_show_injection_info(type) \
  103. infoln("inject %s in %s of %pS", erofs_fault_name[type], \
  104. __func__, __builtin_return_address(0))
  105. static inline bool time_to_inject(struct erofs_sb_info *sbi, int type)
  106. {
  107. struct erofs_fault_info *ffi = &sbi->fault_info;
  108. if (!ffi->inject_rate)
  109. return false;
  110. if (!IS_FAULT_SET(ffi, type))
  111. return false;
  112. atomic_inc(&ffi->inject_ops);
  113. if (atomic_read(&ffi->inject_ops) >= ffi->inject_rate) {
  114. atomic_set(&ffi->inject_ops, 0);
  115. return true;
  116. }
  117. return false;
  118. }
  119. #endif
  120. static inline void *erofs_kmalloc(struct erofs_sb_info *sbi,
  121. size_t size, gfp_t flags)
  122. {
  123. #ifdef CONFIG_EROFS_FAULT_INJECTION
  124. if (time_to_inject(sbi, FAULT_KMALLOC)) {
  125. erofs_show_injection_info(FAULT_KMALLOC);
  126. return NULL;
  127. }
  128. #endif
  129. return kmalloc(size, flags);
  130. }
  131. #define EROFS_SB(sb) ((struct erofs_sb_info *)(sb)->s_fs_info)
  132. #define EROFS_I_SB(inode) ((struct erofs_sb_info *)(inode)->i_sb->s_fs_info)
  133. /* Mount flags set via mount options or defaults */
  134. #define EROFS_MOUNT_XATTR_USER 0x00000010
  135. #define EROFS_MOUNT_POSIX_ACL 0x00000020
  136. #define EROFS_MOUNT_FAULT_INJECTION 0x00000040
  137. #define clear_opt(sbi, option) ((sbi)->mount_opt &= ~EROFS_MOUNT_##option)
  138. #define set_opt(sbi, option) ((sbi)->mount_opt |= EROFS_MOUNT_##option)
  139. #define test_opt(sbi, option) ((sbi)->mount_opt & EROFS_MOUNT_##option)
  140. #ifdef CONFIG_EROFS_FS_ZIP
  141. #define erofs_workstn_lock(sbi) xa_lock(&(sbi)->workstn_tree)
  142. #define erofs_workstn_unlock(sbi) xa_unlock(&(sbi)->workstn_tree)
  143. /* basic unit of the workstation of a super_block */
  144. struct erofs_workgroup {
  145. /* the workgroup index in the workstation */
  146. pgoff_t index;
  147. /* overall workgroup reference count */
  148. atomic_t refcount;
  149. };
  150. #define EROFS_LOCKED_MAGIC (INT_MIN | 0xE0F510CCL)
  151. #if defined(CONFIG_SMP)
  152. static inline bool erofs_workgroup_try_to_freeze(struct erofs_workgroup *grp,
  153. int val)
  154. {
  155. preempt_disable();
  156. if (val != atomic_cmpxchg(&grp->refcount, val, EROFS_LOCKED_MAGIC)) {
  157. preempt_enable();
  158. return false;
  159. }
  160. return true;
  161. }
  162. static inline void erofs_workgroup_unfreeze(struct erofs_workgroup *grp,
  163. int orig_val)
  164. {
  165. /*
  166. * other observers should notice all modifications
  167. * in the freezing period.
  168. */
  169. smp_mb();
  170. atomic_set(&grp->refcount, orig_val);
  171. preempt_enable();
  172. }
  173. static inline int erofs_wait_on_workgroup_freezed(struct erofs_workgroup *grp)
  174. {
  175. return atomic_cond_read_relaxed(&grp->refcount,
  176. VAL != EROFS_LOCKED_MAGIC);
  177. }
  178. #else
  179. static inline bool erofs_workgroup_try_to_freeze(struct erofs_workgroup *grp,
  180. int val)
  181. {
  182. preempt_disable();
  183. /* no need to spin on UP platforms, let's just disable preemption. */
  184. if (val != atomic_read(&grp->refcount)) {
  185. preempt_enable();
  186. return false;
  187. }
  188. return true;
  189. }
  190. static inline void erofs_workgroup_unfreeze(struct erofs_workgroup *grp,
  191. int orig_val)
  192. {
  193. preempt_enable();
  194. }
  195. static inline int erofs_wait_on_workgroup_freezed(struct erofs_workgroup *grp)
  196. {
  197. int v = atomic_read(&grp->refcount);
  198. /* workgroup is never freezed on uniprocessor systems */
  199. DBG_BUGON(v == EROFS_LOCKED_MAGIC);
  200. return v;
  201. }
  202. #endif
  203. static inline bool erofs_workgroup_get(struct erofs_workgroup *grp, int *ocnt)
  204. {
  205. int o;
  206. repeat:
  207. o = erofs_wait_on_workgroup_freezed(grp);
  208. if (unlikely(o <= 0))
  209. return -1;
  210. if (unlikely(atomic_cmpxchg(&grp->refcount, o, o + 1) != o))
  211. goto repeat;
  212. *ocnt = o;
  213. return 0;
  214. }
  215. #define __erofs_workgroup_get(grp) atomic_inc(&(grp)->refcount)
  216. #define __erofs_workgroup_put(grp) atomic_dec(&(grp)->refcount)
  217. extern int erofs_workgroup_put(struct erofs_workgroup *grp);
  218. extern struct erofs_workgroup *erofs_find_workgroup(
  219. struct super_block *sb, pgoff_t index, bool *tag);
  220. extern int erofs_register_workgroup(struct super_block *sb,
  221. struct erofs_workgroup *grp, bool tag);
  222. extern unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi,
  223. unsigned long nr_shrink, bool cleanup);
  224. static inline void erofs_workstation_cleanup_all(struct super_block *sb)
  225. {
  226. erofs_shrink_workstation(EROFS_SB(sb), ~0UL, true);
  227. }
  228. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  229. #define EROFS_UNALLOCATED_CACHED_PAGE ((void *)0x5F0EF00D)
  230. extern int erofs_try_to_free_all_cached_pages(struct erofs_sb_info *sbi,
  231. struct erofs_workgroup *egrp);
  232. extern int erofs_try_to_free_cached_page(struct address_space *mapping,
  233. struct page *page);
  234. #endif
  235. #endif
  236. /* we strictly follow PAGE_SIZE and no buffer head yet */
  237. #define LOG_BLOCK_SIZE PAGE_SHIFT
  238. #undef LOG_SECTORS_PER_BLOCK
  239. #define LOG_SECTORS_PER_BLOCK (PAGE_SHIFT - 9)
  240. #undef SECTORS_PER_BLOCK
  241. #define SECTORS_PER_BLOCK (1 << LOG_SECTORS_PER_BLOCK) /* 512-byte sectors per block */
  242. #define EROFS_BLKSIZ (1 << LOG_BLOCK_SIZE)
  243. #if (EROFS_BLKSIZ % 4096 || !EROFS_BLKSIZ)
  244. #error erofs cannot be used in this platform
  245. #endif
  246. #define ROOT_NID(sb) ((sb)->root_nid)
  247. #ifdef CONFIG_EROFS_FS_ZIP
  248. /* hard limit of pages per compressed cluster */
  249. #define Z_EROFS_CLUSTER_MAX_PAGES (CONFIG_EROFS_FS_CLUSTER_PAGE_LIMIT)
  250. /* page count of a compressed cluster */
  251. #define erofs_clusterpages(sbi) ((1 << (sbi)->clusterbits) / PAGE_SIZE)
  252. #endif
  253. typedef u64 erofs_off_t;
  254. /* data type for filesystem-wide blocks number */
  255. typedef u32 erofs_blk_t;
  256. #define erofs_blknr(addr) ((addr) / EROFS_BLKSIZ)
  257. #define erofs_blkoff(addr) ((addr) % EROFS_BLKSIZ)
  258. #define blknr_to_addr(nr) ((erofs_off_t)(nr) * EROFS_BLKSIZ)
  259. static inline erofs_off_t iloc(struct erofs_sb_info *sbi, erofs_nid_t nid)
  260. {
  261. return blknr_to_addr(sbi->meta_blkaddr) + (nid << sbi->islotbits);
  262. }
  263. /* atomic flag definitions */
  264. #define EROFS_V_EA_INITED_BIT 0
  265. /* bitlock definitions (arranged in reverse order) */
  266. #define EROFS_V_BL_XATTR_BIT (BITS_PER_LONG - 1)
  267. struct erofs_vnode {
  268. erofs_nid_t nid;
  269. /* atomic flags (including bitlocks) */
  270. unsigned long flags;
  271. unsigned char data_mapping_mode;
  272. /* inline size in bytes */
  273. unsigned char inode_isize;
  274. unsigned short xattr_isize;
  275. unsigned xattr_shared_count;
  276. unsigned *xattr_shared_xattrs;
  277. erofs_blk_t raw_blkaddr;
  278. /* the corresponding vfs inode */
  279. struct inode vfs_inode;
  280. };
  281. #define EROFS_V(ptr) \
  282. container_of(ptr, struct erofs_vnode, vfs_inode)
  283. #define __inode_advise(x, bit, bits) \
  284. (((x) >> (bit)) & ((1 << (bits)) - 1))
  285. #define __inode_version(advise) \
  286. __inode_advise(advise, EROFS_I_VERSION_BIT, \
  287. EROFS_I_VERSION_BITS)
  288. #define __inode_data_mapping(advise) \
  289. __inode_advise(advise, EROFS_I_DATA_MAPPING_BIT,\
  290. EROFS_I_DATA_MAPPING_BITS)
  291. static inline unsigned long inode_datablocks(struct inode *inode)
  292. {
  293. /* since i_size cannot be changed */
  294. return DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
  295. }
  296. static inline bool is_inode_layout_plain(struct inode *inode)
  297. {
  298. return EROFS_V(inode)->data_mapping_mode == EROFS_INODE_LAYOUT_PLAIN;
  299. }
  300. static inline bool is_inode_layout_compression(struct inode *inode)
  301. {
  302. return EROFS_V(inode)->data_mapping_mode ==
  303. EROFS_INODE_LAYOUT_COMPRESSION;
  304. }
  305. static inline bool is_inode_layout_inline(struct inode *inode)
  306. {
  307. return EROFS_V(inode)->data_mapping_mode == EROFS_INODE_LAYOUT_INLINE;
  308. }
  309. extern const struct super_operations erofs_sops;
  310. extern const struct inode_operations erofs_dir_iops;
  311. extern const struct file_operations erofs_dir_fops;
  312. extern const struct address_space_operations erofs_raw_access_aops;
  313. #ifdef CONFIG_EROFS_FS_ZIP
  314. extern const struct address_space_operations z_erofs_vle_normalaccess_aops;
  315. #endif
  316. /*
  317. * Logical to physical block mapping, used by erofs_map_blocks()
  318. *
  319. * Different with other file systems, it is used for 2 access modes:
  320. *
  321. * 1) RAW access mode:
  322. *
  323. * Users pass a valid (m_lblk, m_lofs -- usually 0) pair,
  324. * and get the valid m_pblk, m_pofs and the longest m_len(in bytes).
  325. *
  326. * Note that m_lblk in the RAW access mode refers to the number of
  327. * the compressed ondisk block rather than the uncompressed
  328. * in-memory block for the compressed file.
  329. *
  330. * m_pofs equals to m_lofs except for the inline data page.
  331. *
  332. * 2) Normal access mode:
  333. *
  334. * If the inode is not compressed, it has no difference with
  335. * the RAW access mode. However, if the inode is compressed,
  336. * users should pass a valid (m_lblk, m_lofs) pair, and get
  337. * the needed m_pblk, m_pofs, m_len to get the compressed data
  338. * and the updated m_lblk, m_lofs which indicates the start
  339. * of the corresponding uncompressed data in the file.
  340. */
  341. enum {
  342. BH_Zipped = BH_PrivateStart,
  343. };
  344. /* Has a disk mapping */
  345. #define EROFS_MAP_MAPPED (1 << BH_Mapped)
  346. /* Located in metadata (could be copied from bd_inode) */
  347. #define EROFS_MAP_META (1 << BH_Meta)
  348. /* The extent has been compressed */
  349. #define EROFS_MAP_ZIPPED (1 << BH_Zipped)
  350. struct erofs_map_blocks {
  351. erofs_off_t m_pa, m_la;
  352. u64 m_plen, m_llen;
  353. unsigned int m_flags;
  354. };
  355. /* Flags used by erofs_map_blocks() */
  356. #define EROFS_GET_BLOCKS_RAW 0x0001
  357. /* data.c */
  358. static inline struct bio *prepare_bio(
  359. struct super_block *sb,
  360. erofs_blk_t blkaddr, unsigned nr_pages,
  361. bio_end_io_t endio)
  362. {
  363. gfp_t gfp = GFP_NOIO;
  364. struct bio *bio = bio_alloc(gfp, nr_pages);
  365. if (unlikely(bio == NULL) &&
  366. (current->flags & PF_MEMALLOC)) {
  367. do {
  368. nr_pages /= 2;
  369. if (unlikely(!nr_pages)) {
  370. bio = bio_alloc(gfp | __GFP_NOFAIL, 1);
  371. BUG_ON(bio == NULL);
  372. break;
  373. }
  374. bio = bio_alloc(gfp, nr_pages);
  375. } while (bio == NULL);
  376. }
  377. bio->bi_end_io = endio;
  378. bio_set_dev(bio, sb->s_bdev);
  379. bio->bi_iter.bi_sector = blkaddr << LOG_SECTORS_PER_BLOCK;
  380. return bio;
  381. }
  382. static inline void __submit_bio(struct bio *bio, unsigned op, unsigned op_flags)
  383. {
  384. bio_set_op_attrs(bio, op, op_flags);
  385. submit_bio(bio);
  386. }
  387. extern struct page *erofs_get_meta_page(struct super_block *sb,
  388. erofs_blk_t blkaddr, bool prio);
  389. extern int erofs_map_blocks(struct inode *, struct erofs_map_blocks *, int);
  390. extern int erofs_map_blocks_iter(struct inode *, struct erofs_map_blocks *,
  391. struct page **, int);
  392. struct erofs_map_blocks_iter {
  393. struct erofs_map_blocks map;
  394. struct page *mpage;
  395. };
  396. static inline struct page *
  397. erofs_get_inline_page(struct inode *inode,
  398. erofs_blk_t blkaddr)
  399. {
  400. return erofs_get_meta_page(inode->i_sb,
  401. blkaddr, S_ISDIR(inode->i_mode));
  402. }
  403. /* inode.c */
  404. extern struct inode *erofs_iget(struct super_block *sb,
  405. erofs_nid_t nid, bool dir);
  406. /* dir.c */
  407. int erofs_namei(struct inode *dir, struct qstr *name,
  408. erofs_nid_t *nid, unsigned *d_type);
  409. /* xattr.c */
  410. #ifdef CONFIG_EROFS_FS_XATTR
  411. extern const struct xattr_handler *erofs_xattr_handlers[];
  412. #endif
  413. /* symlink */
  414. #ifdef CONFIG_EROFS_FS_XATTR
  415. extern const struct inode_operations erofs_symlink_xattr_iops;
  416. extern const struct inode_operations erofs_fast_symlink_xattr_iops;
  417. extern const struct inode_operations erofs_special_inode_operations;
  418. #endif
  419. static inline void set_inode_fast_symlink(struct inode *inode)
  420. {
  421. #ifdef CONFIG_EROFS_FS_XATTR
  422. inode->i_op = &erofs_fast_symlink_xattr_iops;
  423. #else
  424. inode->i_op = &simple_symlink_inode_operations;
  425. #endif
  426. }
  427. static inline bool is_inode_fast_symlink(struct inode *inode)
  428. {
  429. #ifdef CONFIG_EROFS_FS_XATTR
  430. return inode->i_op == &erofs_fast_symlink_xattr_iops;
  431. #else
  432. return inode->i_op == &simple_symlink_inode_operations;
  433. #endif
  434. }
  435. static inline void *erofs_vmap(struct page **pages, unsigned int count)
  436. {
  437. #ifdef CONFIG_EROFS_FS_USE_VM_MAP_RAM
  438. int i = 0;
  439. while (1) {
  440. void *addr = vm_map_ram(pages, count, -1, PAGE_KERNEL);
  441. /* retry two more times (totally 3 times) */
  442. if (addr != NULL || ++i >= 3)
  443. return addr;
  444. vm_unmap_aliases();
  445. }
  446. return NULL;
  447. #else
  448. return vmap(pages, count, VM_MAP, PAGE_KERNEL);
  449. #endif
  450. }
  451. static inline void erofs_vunmap(const void *mem, unsigned int count)
  452. {
  453. #ifdef CONFIG_EROFS_FS_USE_VM_MAP_RAM
  454. vm_unmap_ram(mem, count);
  455. #else
  456. vunmap(mem);
  457. #endif
  458. }
  459. /* utils.c */
  460. extern struct page *erofs_allocpage(struct list_head *pool, gfp_t gfp);
  461. extern void erofs_register_super(struct super_block *sb);
  462. extern void erofs_unregister_super(struct super_block *sb);
  463. extern unsigned long erofs_shrink_count(struct shrinker *shrink,
  464. struct shrink_control *sc);
  465. extern unsigned long erofs_shrink_scan(struct shrinker *shrink,
  466. struct shrink_control *sc);
  467. #ifndef lru_to_page
  468. #define lru_to_page(head) (list_entry((head)->prev, struct page, lru))
  469. #endif
  470. #endif

 lz4defs.h

  1. #ifndef __LZ4DEFS_H__
  2. #define __LZ4DEFS_H__
  3. /*
  4. * lz4defs.h -- common and architecture specific defines for the kernel usage
  5. * LZ4 - Fast LZ compression algorithm
  6. * Copyright (C) 2011-2016, Yann Collet.
  7. * BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
  8. * Redistribution and use in source and binary forms, with or without
  9. * modification, are permitted provided that the following conditions are
  10. * met:
  11. * * Redistributions of source code must retain the above copyright
  12. * notice, this list of conditions and the following disclaimer.
  13. * * Redistributions in binary form must reproduce the above
  14. * copyright notice, this list of conditions and the following disclaimer
  15. * in the documentation and/or other materials provided with the
  16. * distribution.
  17. * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
  18. * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
  19. * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
  20. * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
  21. * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
  22. * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
  23. * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
  24. * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
  25. * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
  26. * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
  27. * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  28. * You can contact the author at :
  29. * - LZ4 homepage : http://www.lz4.org
  30. * - LZ4 source repository : https://github.com/lz4/lz4
  31. *
  32. * Changed for kernel usage by:
  33. * Sven Schmidt <4sschmid@informatik.uni-hamburg.de>
  34. */
  35. #include <asm/unaligned.h>
  36. #include <linux/string.h> /* memset, memcpy */
  37. #define FORCE_INLINE __always_inline
  38. /*-************************************
  39. * Basic Types
  40. **************************************/
  41. #include <linux/types.h>
  42. typedef uint8_t BYTE;
  43. typedef uint16_t U16;
  44. typedef uint32_t U32;
  45. typedef int32_t S32;
  46. typedef uint64_t U64;
  47. typedef uintptr_t uptrval;
  48. /*-************************************
  49. * Architecture specifics
  50. **************************************/
  51. #if defined(CONFIG_64BIT)
  52. #define LZ4_ARCH64 1
  53. #else
  54. #define LZ4_ARCH64 0
  55. #endif
  56. #if defined(__LITTLE_ENDIAN)
  57. #define LZ4_LITTLE_ENDIAN 1
  58. #else
  59. #define LZ4_LITTLE_ENDIAN 0
  60. #endif
  61. /*-************************************
  62. * Constants
  63. **************************************/
  64. #define MINMATCH 4
  65. #define WILDCOPYLENGTH 8
  66. #define LASTLITERALS 5
  67. #define MFLIMIT (WILDCOPYLENGTH + MINMATCH)
  68. /* Increase this value ==> compression run slower on incompressible data */
  69. #define LZ4_SKIPTRIGGER 6
  70. #define HASH_UNIT sizeof(size_t)
  71. #define KB (1 << 10)
  72. #define MB (1 << 20)
  73. #define GB (1U << 30)
  74. #define MAXD_LOG 16
  75. #define MAX_DISTANCE ((1 << MAXD_LOG) - 1)
  76. #define STEPSIZE sizeof(size_t)
  77. #define ML_BITS 4
  78. #define ML_MASK ((1U << ML_BITS) - 1)
  79. #define RUN_BITS (8 - ML_BITS)
  80. #define RUN_MASK ((1U << RUN_BITS) - 1)
  81. /*-************************************
  82. * Reading and writing into memory
  83. **************************************/
  84. static FORCE_INLINE U16 LZ4_read16(const void *ptr)
  85. {
  86. return get_unaligned((const U16 *)ptr);
  87. }
  88. static FORCE_INLINE U32 LZ4_read32(const void *ptr)
  89. {
  90. return get_unaligned((const U32 *)ptr);
  91. }
  92. static FORCE_INLINE size_t LZ4_read_ARCH(const void *ptr)
  93. {
  94. return get_unaligned((const size_t *)ptr);
  95. }
  96. static FORCE_INLINE void LZ4_write16(void *memPtr, U16 value)
  97. {
  98. put_unaligned(value, (U16 *)memPtr);
  99. }
  100. static FORCE_INLINE void LZ4_write32(void *memPtr, U32 value)
  101. {
  102. put_unaligned(value, (U32 *)memPtr);
  103. }
  104. static FORCE_INLINE U16 LZ4_readLE16(const void *memPtr)
  105. {
  106. return get_unaligned_le16(memPtr);
  107. }
  108. static FORCE_INLINE void LZ4_writeLE16(void *memPtr, U16 value)
  109. {
  110. return put_unaligned_le16(value, memPtr);
  111. }
  112. static FORCE_INLINE void LZ4_copy8(void *dst, const void *src)
  113. {
  114. #if LZ4_ARCH64
  115. U64 a = get_unaligned((const U64 *)src);
  116. put_unaligned(a, (U64 *)dst);
  117. #else
  118. U32 a = get_unaligned((const U32 *)src);
  119. U32 b = get_unaligned((const U32 *)src + 1);
  120. put_unaligned(a, (U32 *)dst);
  121. put_unaligned(b, (U32 *)dst + 1);
  122. #endif
  123. }
  124. /*
  125. * customized variant of memcpy,
  126. * which can overwrite up to 7 bytes beyond dstEnd
  127. */
  128. static FORCE_INLINE void LZ4_wildCopy(void *dstPtr,
  129. const void *srcPtr, void *dstEnd)
  130. {
  131. BYTE *d = (BYTE *)dstPtr;
  132. const BYTE *s = (const BYTE *)srcPtr;
  133. BYTE *const e = (BYTE *)dstEnd;
  134. do {
  135. LZ4_copy8(d, s);
  136. d += 8;
  137. s += 8;
  138. } while (d < e);
  139. }
  140. static FORCE_INLINE unsigned int LZ4_NbCommonBytes(register size_t val)
  141. {
  142. #if LZ4_LITTLE_ENDIAN
  143. return __ffs(val) >> 3;
  144. #else
  145. return (BITS_PER_LONG - 1 - __fls(val)) >> 3;
  146. #endif
  147. }
  148. static FORCE_INLINE unsigned int LZ4_count(
  149. const BYTE *pIn,
  150. const BYTE *pMatch,
  151. const BYTE *pInLimit)
  152. {
  153. const BYTE *const pStart = pIn;
  154. while (likely(pIn < pInLimit - (STEPSIZE - 1))) {
  155. size_t const diff = LZ4_read_ARCH(pMatch) ^ LZ4_read_ARCH(pIn);
  156. if (!diff) {
  157. pIn += STEPSIZE;
  158. pMatch += STEPSIZE;
  159. continue;
  160. }
  161. pIn += LZ4_NbCommonBytes(diff);
  162. return (unsigned int)(pIn - pStart);
  163. }
  164. #if LZ4_ARCH64
  165. if ((pIn < (pInLimit - 3))
  166. && (LZ4_read32(pMatch) == LZ4_read32(pIn))) {
  167. pIn += 4;
  168. pMatch += 4;
  169. }
  170. #endif
  171. if ((pIn < (pInLimit - 1))
  172. && (LZ4_read16(pMatch) == LZ4_read16(pIn))) {
  173. pIn += 2;
  174. pMatch += 2;
  175. }
  176. if ((pIn < pInLimit) && (*pMatch == *pIn))
  177. pIn++;
  178. return (unsigned int)(pIn - pStart);
  179. }
  180. typedef enum { noLimit = 0, limitedOutput = 1 } limitedOutput_directive;
  181. typedef enum { byPtr, byU32, byU16 } tableType_t;
  182. typedef enum { noDict = 0, withPrefix64k, usingExtDict } dict_directive;
  183. typedef enum { noDictIssue = 0, dictSmall } dictIssue_directive;
  184. typedef enum { endOnOutputSize = 0, endOnInputSize = 1 } endCondition_directive;
  185. typedef enum { full = 0, partial = 1 } earlyEnd_directive;
  186. #endif

xattr.h

  1. /* SPDX-License-Identifier: GPL-2.0
  2. *
  3. * linux/drivers/staging/erofs/xattr.h
  4. *
  5. * Copyright (C) 2017-2018 HUAWEI, Inc.
  6. * http://www.huawei.com/
  7. * Created by Gao Xiang <gaoxiang25@huawei.com>
  8. *
  9. * This file is subject to the terms and conditions of the GNU General Public
  10. * License. See the file COPYING in the main directory of the Linux
  11. * distribution for more details.
  12. */
  13. #ifndef __EROFS_XATTR_H
  14. #define __EROFS_XATTR_H
  15. #include "internal.h"
  16. #include <linux/posix_acl_xattr.h>
  17. #include <linux/xattr.h>
  18. /* Attribute not found */
  19. #define ENOATTR ENODATA
  20. static inline unsigned inlinexattr_header_size(struct inode *inode)
  21. {
  22. return sizeof(struct erofs_xattr_ibody_header)
  23. + sizeof(u32) * EROFS_V(inode)->xattr_shared_count;
  24. }
  25. static inline erofs_blk_t
  26. xattrblock_addr(struct erofs_sb_info *sbi, unsigned xattr_id)
  27. {
  28. #ifdef CONFIG_EROFS_FS_XATTR
  29. return sbi->xattr_blkaddr +
  30. xattr_id * sizeof(__u32) / EROFS_BLKSIZ;
  31. #else
  32. return 0;
  33. #endif
  34. }
  35. static inline unsigned
  36. xattrblock_offset(struct erofs_sb_info *sbi, unsigned xattr_id)
  37. {
  38. return (xattr_id * sizeof(__u32)) % EROFS_BLKSIZ;
  39. }
  40. extern const struct xattr_handler erofs_xattr_user_handler;
  41. extern const struct xattr_handler erofs_xattr_trusted_handler;
  42. #ifdef CONFIG_EROFS_FS_SECURITY
  43. extern const struct xattr_handler erofs_xattr_security_handler;
  44. #endif
  45. static inline const struct xattr_handler *erofs_xattr_handler(unsigned index)
  46. {
  47. static const struct xattr_handler *xattr_handler_map[] = {
  48. [EROFS_XATTR_INDEX_USER] = &erofs_xattr_user_handler,
  49. #ifdef CONFIG_EROFS_FS_POSIX_ACL
  50. [EROFS_XATTR_INDEX_POSIX_ACL_ACCESS] = &posix_acl_access_xattr_handler,
  51. [EROFS_XATTR_INDEX_POSIX_ACL_DEFAULT] =
  52. &posix_acl_default_xattr_handler,
  53. #endif
  54. [EROFS_XATTR_INDEX_TRUSTED] = &erofs_xattr_trusted_handler,
  55. #ifdef CONFIG_EROFS_FS_SECURITY
  56. [EROFS_XATTR_INDEX_SECURITY] = &erofs_xattr_security_handler,
  57. #endif
  58. };
  59. return index && index < ARRAY_SIZE(xattr_handler_map) ?
  60. xattr_handler_map[index] : NULL;
  61. }
  62. #ifdef CONFIG_EROFS_FS_XATTR
  63. extern const struct inode_operations erofs_generic_xattr_iops;
  64. extern const struct inode_operations erofs_dir_xattr_iops;
  65. int erofs_getxattr(struct inode *, int, const char *, void *, size_t);
  66. ssize_t erofs_listxattr(struct dentry *, char *, size_t);
  67. #else
  68. static int __maybe_unused erofs_getxattr(struct inode *inode, int index,
  69. const char *name,
  70. void *buffer, size_t buffer_size)
  71. {
  72. return -ENOTSUPP;
  73. }
  74. static ssize_t __maybe_unused erofs_listxattr(struct dentry *dentry,
  75. char *buffer, size_t buffer_size)
  76. {
  77. return -ENOTSUPP;
  78. }
  79. #endif
  80. #endif

unzip_vle.h

  1. /* SPDX-License-Identifier: GPL-2.0
  2. *
  3. * linux/drivers/staging/erofs/unzip_vle.h
  4. *
  5. * Copyright (C) 2018 HUAWEI, Inc.
  6. * http://www.huawei.com/
  7. * Created by Gao Xiang <gaoxiang25@huawei.com>
  8. *
  9. * This file is subject to the terms and conditions of the GNU General Public
  10. * License. See the file COPYING in the main directory of the Linux
  11. * distribution for more details.
  12. */
  13. #ifndef __EROFS_FS_UNZIP_VLE_H
  14. #define __EROFS_FS_UNZIP_VLE_H
  15. #include "internal.h"
  16. #include "unzip_pagevec.h"
  17. /*
  18. * - 0x5A110C8D ('sallocated', Z_EROFS_MAPPING_STAGING) -
  19. * used for temporary allocated pages (via erofs_allocpage),
  20. * in order to seperate those from NULL mapping (eg. truncated pages)
  21. */
  22. #define Z_EROFS_MAPPING_STAGING ((void *)0x5A110C8D)
  23. #define z_erofs_is_stagingpage(page) \
  24. ((page)->mapping == Z_EROFS_MAPPING_STAGING)
  25. static inline bool z_erofs_gather_if_stagingpage(struct list_head *page_pool,
  26. struct page *page)
  27. {
  28. if (z_erofs_is_stagingpage(page)) {
  29. list_add(&page->lru, page_pool);
  30. return true;
  31. }
  32. return false;
  33. }
  34. /*
  35. * Structure fields follow one of the following exclusion rules.
  36. *
  37. * I: Modifiable by initialization/destruction paths and read-only
  38. * for everyone else.
  39. *
  40. */
  41. #define Z_EROFS_VLE_INLINE_PAGEVECS 3
  42. struct z_erofs_vle_work {
  43. struct mutex lock;
  44. /* I: decompression offset in page */
  45. unsigned short pageofs;
  46. unsigned short nr_pages;
  47. /* L: queued pages in pagevec[] */
  48. unsigned vcnt;
  49. union {
  50. /* L: pagevec */
  51. erofs_vtptr_t pagevec[Z_EROFS_VLE_INLINE_PAGEVECS];
  52. struct rcu_head rcu;
  53. };
  54. };
  55. #define Z_EROFS_VLE_WORKGRP_FMT_PLAIN 0
  56. #define Z_EROFS_VLE_WORKGRP_FMT_LZ4 1
  57. #define Z_EROFS_VLE_WORKGRP_FMT_MASK 1
  58. typedef struct z_erofs_vle_workgroup *z_erofs_vle_owned_workgrp_t;
  59. struct z_erofs_vle_workgroup {
  60. struct erofs_workgroup obj;
  61. struct z_erofs_vle_work work;
  62. /* next owned workgroup */
  63. z_erofs_vle_owned_workgrp_t next;
  64. /* compressed pages (including multi-usage pages) */
  65. struct page *compressed_pages[Z_EROFS_CLUSTER_MAX_PAGES];
  66. unsigned int llen, flags;
  67. };
  68. /* let's avoid the valid 32-bit kernel addresses */
  69. /* the chained workgroup has't submitted io (still open) */
  70. #define Z_EROFS_VLE_WORKGRP_TAIL ((void *)0x5F0ECAFE)
  71. /* the chained workgroup has already submitted io */
  72. #define Z_EROFS_VLE_WORKGRP_TAIL_CLOSED ((void *)0x5F0EDEAD)
  73. #define Z_EROFS_VLE_WORKGRP_NIL (NULL)
  74. #define z_erofs_vle_workgrp_fmt(grp) \
  75. ((grp)->flags & Z_EROFS_VLE_WORKGRP_FMT_MASK)
  76. static inline void z_erofs_vle_set_workgrp_fmt(
  77. struct z_erofs_vle_workgroup *grp,
  78. unsigned int fmt)
  79. {
  80. grp->flags = fmt | (grp->flags & ~Z_EROFS_VLE_WORKGRP_FMT_MASK);
  81. }
  82. /* definitions if multiref is disabled */
  83. #define z_erofs_vle_grab_primary_work(grp) (&(grp)->work)
  84. #define z_erofs_vle_grab_work(grp, pageofs) (&(grp)->work)
  85. #define z_erofs_vle_work_workgroup(wrk, primary) \
  86. ((primary) ? container_of(wrk, \
  87. struct z_erofs_vle_workgroup, work) : \
  88. ({ BUG(); (void *)NULL; }))
  89. #define Z_EROFS_WORKGROUP_SIZE sizeof(struct z_erofs_vle_workgroup)
  90. struct z_erofs_vle_unzip_io {
  91. atomic_t pending_bios;
  92. z_erofs_vle_owned_workgrp_t head;
  93. union {
  94. wait_queue_head_t wait;
  95. struct work_struct work;
  96. } u;
  97. };
  98. struct z_erofs_vle_unzip_io_sb {
  99. struct z_erofs_vle_unzip_io io;
  100. struct super_block *sb;
  101. };
  102. #define Z_EROFS_ONLINEPAGE_COUNT_BITS 2
  103. #define Z_EROFS_ONLINEPAGE_COUNT_MASK ((1 << Z_EROFS_ONLINEPAGE_COUNT_BITS) - 1)
  104. #define Z_EROFS_ONLINEPAGE_INDEX_SHIFT (Z_EROFS_ONLINEPAGE_COUNT_BITS)
  105. /*
  106. * waiters (aka. ongoing_packs): # to unlock the page
  107. * sub-index: 0 - for partial page, >= 1 full page sub-index
  108. */
  109. typedef atomic_t z_erofs_onlinepage_t;
  110. /* type punning */
  111. union z_erofs_onlinepage_converter {
  112. z_erofs_onlinepage_t *o;
  113. unsigned long *v;
  114. };
  115. static inline unsigned z_erofs_onlinepage_index(struct page *page)
  116. {
  117. union z_erofs_onlinepage_converter u;
  118. BUG_ON(!PagePrivate(page));
  119. u.v = &page_private(page);
  120. return atomic_read(u.o) >> Z_EROFS_ONLINEPAGE_INDEX_SHIFT;
  121. }
  122. static inline void z_erofs_onlinepage_init(struct page *page)
  123. {
  124. union {
  125. z_erofs_onlinepage_t o;
  126. unsigned long v;
  127. /* keep from being unlocked in advance */
  128. } u = { .o = ATOMIC_INIT(1) };
  129. set_page_private(page, u.v);
  130. smp_wmb();
  131. SetPagePrivate(page);
  132. }
  133. static inline void z_erofs_onlinepage_fixup(struct page *page,
  134. uintptr_t index, bool down)
  135. {
  136. union z_erofs_onlinepage_converter u = { .v = &page_private(page) };
  137. int orig, orig_index, val;
  138. repeat:
  139. orig = atomic_read(u.o);
  140. orig_index = orig >> Z_EROFS_ONLINEPAGE_INDEX_SHIFT;
  141. if (orig_index) {
  142. if (!index)
  143. return;
  144. DBG_BUGON(orig_index != index);
  145. }
  146. val = (index << Z_EROFS_ONLINEPAGE_INDEX_SHIFT) |
  147. ((orig & Z_EROFS_ONLINEPAGE_COUNT_MASK) + (unsigned int)down);
  148. if (atomic_cmpxchg(u.o, orig, val) != orig)
  149. goto repeat;
  150. }
  151. static inline void z_erofs_onlinepage_endio(struct page *page)
  152. {
  153. union z_erofs_onlinepage_converter u;
  154. unsigned v;
  155. BUG_ON(!PagePrivate(page));
  156. u.v = &page_private(page);
  157. v = atomic_dec_return(u.o);
  158. if (!(v & Z_EROFS_ONLINEPAGE_COUNT_MASK)) {
  159. ClearPagePrivate(page);
  160. if (!PageError(page))
  161. SetPageUptodate(page);
  162. unlock_page(page);
  163. }
  164. debugln("%s, page %p value %x", __func__, page, atomic_read(u.o));
  165. }
  166. #define Z_EROFS_VLE_VMAP_ONSTACK_PAGES \
  167. min_t(unsigned int, THREAD_SIZE / 8 / sizeof(struct page *), 96U)
  168. #define Z_EROFS_VLE_VMAP_GLOBAL_PAGES 2048
  169. /* unzip_vle_lz4.c */
  170. extern int z_erofs_vle_plain_copy(struct page **compressed_pages,
  171. unsigned clusterpages, struct page **pages,
  172. unsigned nr_pages, unsigned short pageofs);
  173. extern int z_erofs_vle_unzip_fast_percpu(struct page **compressed_pages,
  174. unsigned clusterpages, struct page **pages,
  175. unsigned int outlen, unsigned short pageofs);
  176. extern int z_erofs_vle_unzip_vmap(struct page **compressed_pages,
  177. unsigned clusterpages, void *vaddr, unsigned llen,
  178. unsigned short pageofs, bool overlapped);
  179. #endif
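
The z_erofs_onlinepage_* helpers above keep one atomic word per page that packs a 2-bit waiter count in the low bits and a page sub-index above it; the page is unlocked only once the count drops back to zero. Below is a minimal userspace sketch of that bit layout (plain integers instead of atomics and struct page, illustrative values only), not the kernel implementation itself:

#include <stdio.h>

#define COUNT_BITS	2
#define COUNT_MASK	((1u << COUNT_BITS) - 1)
#define INDEX_SHIFT	COUNT_BITS

static unsigned int pack(unsigned int index, unsigned int waiters)
{
	return (index << INDEX_SHIFT) | (waiters & COUNT_MASK);
}

int main(void)
{
	/* like z_erofs_onlinepage_init(): start with a single waiter, sub-index 0 */
	unsigned int v = pack(0, 1);

	/* like z_erofs_onlinepage_fixup(..., 3, true): record sub-index 3, add a waiter */
	v = pack(3, (v & COUNT_MASK) + 1);

	/* each z_erofs_onlinepage_endio() call drops one waiter */
	while (v & COUNT_MASK) {
		--v;
		printf("index=%u waiters=%u\n",
		       v >> INDEX_SHIFT, v & COUNT_MASK);
	}
	/* the count reached zero: the kernel code unlocks the page at this point */
	return 0;
}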

xattr.h
 

  1. /* SPDX-License-Identifier: GPL-2.0
  2. *
  3. * linux/drivers/staging/erofs/xattr.h
  4. *
  5. * Copyright (C) 2017-2018 HUAWEI, Inc.
  6. * http://www.huawei.com/
  7. * Created by Gao Xiang <gaoxiang25@huawei.com>
  8. *
  9. * This file is subject to the terms and conditions of the GNU General Public
  10. * License. See the file COPYING in the main directory of the Linux
  11. * distribution for more details.
  12. */
  13. #ifndef __EROFS_XATTR_H
  14. #define __EROFS_XATTR_H
  15. #include "internal.h"
  16. #include <linux/posix_acl_xattr.h>
  17. #include <linux/xattr.h>
  18. /* Attribute not found */
  19. #define ENOATTR ENODATA
  20. static inline unsigned inlinexattr_header_size(struct inode *inode)
  21. {
  22. return sizeof(struct erofs_xattr_ibody_header)
  23. + sizeof(u32) * EROFS_V(inode)->xattr_shared_count;
  24. }
  25. static inline erofs_blk_t
  26. xattrblock_addr(struct erofs_sb_info *sbi, unsigned xattr_id)
  27. {
  28. #ifdef CONFIG_EROFS_FS_XATTR
  29. return sbi->xattr_blkaddr +
  30. xattr_id * sizeof(__u32) / EROFS_BLKSIZ;
  31. #else
  32. return 0;
  33. #endif
  34. }
  35. static inline unsigned
  36. xattrblock_offset(struct erofs_sb_info *sbi, unsigned xattr_id)
  37. {
  38. return (xattr_id * sizeof(__u32)) % EROFS_BLKSIZ;
  39. }
  40. extern const struct xattr_handler erofs_xattr_user_handler;
  41. extern const struct xattr_handler erofs_xattr_trusted_handler;
  42. #ifdef CONFIG_EROFS_FS_SECURITY
  43. extern const struct xattr_handler erofs_xattr_security_handler;
  44. #endif
  45. static inline const struct xattr_handler *erofs_xattr_handler(unsigned index)
  46. {
  47. static const struct xattr_handler *xattr_handler_map[] = {
  48. [EROFS_XATTR_INDEX_USER] = &erofs_xattr_user_handler,
  49. #ifdef CONFIG_EROFS_FS_POSIX_ACL
  50. [EROFS_XATTR_INDEX_POSIX_ACL_ACCESS] = &posix_acl_access_xattr_handler,
  51. [EROFS_XATTR_INDEX_POSIX_ACL_DEFAULT] =
  52. &posix_acl_default_xattr_handler,
  53. #endif
  54. [EROFS_XATTR_INDEX_TRUSTED] = &erofs_xattr_trusted_handler,
  55. #ifdef CONFIG_EROFS_FS_SECURITY
  56. [EROFS_XATTR_INDEX_SECURITY] = &erofs_xattr_security_handler,
  57. #endif
  58. };
  59. return index && index < ARRAY_SIZE(xattr_handler_map) ?
  60. xattr_handler_map[index] : NULL;
  61. }
  62. #ifdef CONFIG_EROFS_FS_XATTR
  63. extern const struct inode_operations erofs_generic_xattr_iops;
  64. extern const struct inode_operations erofs_dir_xattr_iops;
  65. int erofs_getxattr(struct inode *, int, const char *, void *, size_t);
  66. ssize_t erofs_listxattr(struct dentry *, char *, size_t);
  67. #else
  68. static int __maybe_unused erofs_getxattr(struct inode *inode, int index,
  69. const char *name,
  70. void *buffer, size_t buffer_size)
  71. {
  72. return -ENOTSUPP;
  73. }
  74. static ssize_t __maybe_unused erofs_listxattr(struct dentry *dentry,
  75. char *buffer, size_t buffer_size)
  76. {
  77. return -ENOTSUPP;
  78. }
  79. #endif
  80. #endif
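
xattrblock_addr() and xattrblock_offset() above turn a shared xattr id into a metadata block address plus an in-block offset: each shared entry occupies a 4-byte slot starting at sbi->xattr_blkaddr. A minimal userspace sketch of that arithmetic, assuming the 4KB EROFS_BLKSIZ; the values passed in main() are illustrative only:

#include <stdio.h>

#define EROFS_BLKSIZ	4096u

static void locate_shared_xattr(unsigned int xattr_blkaddr, unsigned int id)
{
	/* one 4-byte (__le32) slot per shared xattr id */
	unsigned int blk = xattr_blkaddr + id * 4 / EROFS_BLKSIZ;
	unsigned int off = (id * 4) % EROFS_BLKSIZ;

	printf("xattr id %u -> block %u, offset %u\n", id, blk, off);
}

int main(void)
{
	locate_shared_xattr(16, 0);	/* first slot of the shared xattr area */
	locate_shared_xattr(16, 1024);	/* exactly one block later */
	return 0;
}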

data.c

  1. // SPDX-License-Identifier: GPL-2.0
  2. /*
  3. * linux/drivers/staging/erofs/data.c
  4. *
  5. * Copyright (C) 2017-2018 HUAWEI, Inc.
  6. * http://www.huawei.com/
  7. * Created by Gao Xiang <gaoxiang25@huawei.com>
  8. *
  9. * This file is subject to the terms and conditions of the GNU General Public
  10. * License. See the file COPYING in the main directory of the Linux
  11. * distribution for more details.
  12. */
  13. #include "internal.h"
  14. #include <linux/prefetch.h>
  15. #include <trace/events/erofs.h>
  16. static inline void read_endio(struct bio *bio)
  17. {
  18. int i;
  19. struct bio_vec *bvec;
  20. const blk_status_t err = bio->bi_status;
  21. bio_for_each_segment_all(bvec, bio, i) {
  22. struct page *page = bvec->bv_page;
  23. /* page is already locked */
  24. DBG_BUGON(PageUptodate(page));
  25. if (unlikely(err))
  26. SetPageError(page);
  27. else
  28. SetPageUptodate(page);
  29. unlock_page(page);
  30. /* page could be reclaimed now */
  31. }
  32. bio_put(bio);
  33. }
  34. /* prio -- true is used for dir */
  35. struct page *erofs_get_meta_page(struct super_block *sb,
  36. erofs_blk_t blkaddr, bool prio)
  37. {
  38. struct inode *bd_inode = sb->s_bdev->bd_inode;
  39. struct address_space *mapping = bd_inode->i_mapping;
  40. struct page *page;
  41. repeat:
  42. page = find_or_create_page(mapping, blkaddr,
  43. /*
  44. * Prefer looping in the allocator rather than here,
  45. * at least that code knows what it's doing.
  46. */
  47. mapping_gfp_constraint(mapping, ~__GFP_FS) | __GFP_NOFAIL);
  48. BUG_ON(!page || !PageLocked(page));
  49. if (!PageUptodate(page)) {
  50. struct bio *bio;
  51. int err;
  52. bio = prepare_bio(sb, blkaddr, 1, read_endio);
  53. err = bio_add_page(bio, page, PAGE_SIZE, 0);
  54. BUG_ON(err != PAGE_SIZE);
  55. __submit_bio(bio, REQ_OP_READ,
  56. REQ_META | (prio ? REQ_PRIO : 0));
  57. lock_page(page);
  58. /* the page has been truncated by others? */
  59. if (unlikely(page->mapping != mapping)) {
  60. unlock_page(page);
  61. put_page(page);
  62. goto repeat;
  63. }
  64. /* more likely a read error */
  65. if (unlikely(!PageUptodate(page))) {
  66. unlock_page(page);
  67. put_page(page);
  68. page = ERR_PTR(-EIO);
  69. }
  70. }
  71. return page;
  72. }
  73. static int erofs_map_blocks_flatmode(struct inode *inode,
  74. struct erofs_map_blocks *map,
  75. int flags)
  76. {
  77. int err = 0;
  78. erofs_blk_t nblocks, lastblk;
  79. u64 offset = map->m_la;
  80. struct erofs_vnode *vi = EROFS_V(inode);
  81. trace_erofs_map_blocks_flatmode_enter(inode, map, flags);
  82. nblocks = DIV_ROUND_UP(inode->i_size, PAGE_SIZE);
  83. lastblk = nblocks - is_inode_layout_inline(inode);
  84. if (unlikely(offset >= inode->i_size)) {
  85. /* leave out-of-bound access unmapped */
  86. map->m_flags = 0;
  87. map->m_plen = 0;
  88. goto out;
  89. }
  90. /* there is no hole in flatmode */
  91. map->m_flags = EROFS_MAP_MAPPED;
  92. if (offset < blknr_to_addr(lastblk)) {
  93. map->m_pa = blknr_to_addr(vi->raw_blkaddr) + map->m_la;
  94. map->m_plen = blknr_to_addr(lastblk) - offset;
  95. } else if (is_inode_layout_inline(inode)) {
  96. /* 2 - inode inline B: inode, [xattrs], inline last blk... */
  97. struct erofs_sb_info *sbi = EROFS_SB(inode->i_sb);
  98. map->m_pa = iloc(sbi, vi->nid) + vi->inode_isize +
  99. vi->xattr_isize + erofs_blkoff(map->m_la);
  100. map->m_plen = inode->i_size - offset;
  101. /* inline data should be located within one meta block */
  102. if (erofs_blkoff(map->m_pa) + map->m_plen > PAGE_SIZE) {
  103. DBG_BUGON(1);
  104. err = -EIO;
  105. goto err_out;
  106. }
  107. map->m_flags |= EROFS_MAP_META;
  108. } else {
  109. errln("internal error @ nid: %llu (size %llu), m_la 0x%llx",
  110. vi->nid, inode->i_size, map->m_la);
  111. DBG_BUGON(1);
  112. err = -EIO;
  113. goto err_out;
  114. }
  115. out:
  116. map->m_llen = map->m_plen;
  117. err_out:
  118. trace_erofs_map_blocks_flatmode_exit(inode, map, flags, 0);
  119. return err;
  120. }
  121. #ifdef CONFIG_EROFS_FS_ZIP
  122. extern int z_erofs_map_blocks_iter(struct inode *,
  123. struct erofs_map_blocks *, struct page **, int);
  124. #endif
  125. int erofs_map_blocks_iter(struct inode *inode,
  126. struct erofs_map_blocks *map,
  127. struct page **mpage_ret, int flags)
  128. {
  129. /* by default, reading raw data never uses erofs_map_blocks_iter */
  130. if (unlikely(!is_inode_layout_compression(inode))) {
  131. if (*mpage_ret != NULL)
  132. put_page(*mpage_ret);
  133. *mpage_ret = NULL;
  134. return erofs_map_blocks(inode, map, flags);
  135. }
  136. #ifdef CONFIG_EROFS_FS_ZIP
  137. return z_erofs_map_blocks_iter(inode, map, mpage_ret, flags);
  138. #else
  139. /* data compression is not available */
  140. return -ENOTSUPP;
  141. #endif
  142. }
  143. int erofs_map_blocks(struct inode *inode,
  144. struct erofs_map_blocks *map, int flags)
  145. {
  146. if (unlikely(is_inode_layout_compression(inode))) {
  147. struct page *mpage = NULL;
  148. int err;
  149. err = erofs_map_blocks_iter(inode, map, &mpage, flags);
  150. if (mpage != NULL)
  151. put_page(mpage);
  152. return err;
  153. }
  154. return erofs_map_blocks_flatmode(inode, map, flags);
  155. }
  156. static inline struct bio *erofs_read_raw_page(
  157. struct bio *bio,
  158. struct address_space *mapping,
  159. struct page *page,
  160. erofs_off_t *last_block,
  161. unsigned nblocks,
  162. bool ra)
  163. {
  164. struct inode *inode = mapping->host;
  165. erofs_off_t current_block = (erofs_off_t)page->index;
  166. int err;
  167. DBG_BUGON(!nblocks);
  168. if (PageUptodate(page)) {
  169. err = 0;
  170. goto has_updated;
  171. }
  172. if (cleancache_get_page(page) == 0) {
  173. err = 0;
  174. SetPageUptodate(page);
  175. goto has_updated;
  176. }
  177. /* note that for the readpage case, bio is also NULL */
  178. if (bio != NULL &&
  179. /* not continuous */
  180. *last_block + 1 != current_block) {
  181. submit_bio_retry:
  182. __submit_bio(bio, REQ_OP_READ, 0);
  183. bio = NULL;
  184. }
  185. if (bio == NULL) {
  186. struct erofs_map_blocks map = {
  187. .m_la = blknr_to_addr(current_block),
  188. };
  189. erofs_blk_t blknr;
  190. unsigned blkoff;
  191. err = erofs_map_blocks(inode, &map, EROFS_GET_BLOCKS_RAW);
  192. if (unlikely(err))
  193. goto err_out;
  194. /* zero out the holed page */
  195. if (unlikely(!(map.m_flags & EROFS_MAP_MAPPED))) {
  196. zero_user_segment(page, 0, PAGE_SIZE);
  197. SetPageUptodate(page);
  198. /* imply err = 0, see erofs_map_blocks */
  199. goto has_updated;
  200. }
  201. /* for RAW access mode, m_plen must be equal to m_llen */
  202. DBG_BUGON(map.m_plen != map.m_llen);
  203. blknr = erofs_blknr(map.m_pa);
  204. blkoff = erofs_blkoff(map.m_pa);
  205. /* deal with inline page */
  206. if (map.m_flags & EROFS_MAP_META) {
  207. void *vsrc, *vto;
  208. struct page *ipage;
  209. DBG_BUGON(map.m_plen > PAGE_SIZE);
  210. ipage = erofs_get_meta_page(inode->i_sb, blknr, 0);
  211. if (IS_ERR(ipage)) {
  212. err = PTR_ERR(ipage);
  213. goto err_out;
  214. }
  215. vsrc = kmap_atomic(ipage);
  216. vto = kmap_atomic(page);
  217. memcpy(vto, vsrc + blkoff, map.m_plen);
  218. memset(vto + map.m_plen, 0, PAGE_SIZE - map.m_plen);
  219. kunmap_atomic(vto);
  220. kunmap_atomic(vsrc);
  221. flush_dcache_page(page);
  222. SetPageUptodate(page);
  223. /* TODO: could we unlock the page earlier? */
  224. unlock_page(ipage);
  225. put_page(ipage);
  226. /* imply err = 0, see erofs_map_blocks */
  227. goto has_updated;
  228. }
  229. /* pa must be block-aligned for raw reading */
  230. DBG_BUGON(erofs_blkoff(map.m_pa));
  231. /* max # of continuous pages */
  232. if (nblocks > DIV_ROUND_UP(map.m_plen, PAGE_SIZE))
  233. nblocks = DIV_ROUND_UP(map.m_plen, PAGE_SIZE);
  234. if (nblocks > BIO_MAX_PAGES)
  235. nblocks = BIO_MAX_PAGES;
  236. bio = prepare_bio(inode->i_sb, blknr, nblocks, read_endio);
  237. }
  238. err = bio_add_page(bio, page, PAGE_SIZE, 0);
  239. /* out of the extent or bio is full */
  240. if (err < PAGE_SIZE)
  241. goto submit_bio_retry;
  242. *last_block = current_block;
  243. /* submit in advance in case it is followed by too many gaps */
  244. if (unlikely(bio->bi_vcnt >= bio->bi_max_vecs)) {
  245. /* err should reassign to 0 after submitting */
  246. err = 0;
  247. goto submit_bio_out;
  248. }
  249. return bio;
  250. err_out:
  251. /* for sync reading, set page error immediately */
  252. if (!ra) {
  253. SetPageError(page);
  254. ClearPageUptodate(page);
  255. }
  256. has_updated:
  257. unlock_page(page);
  258. /* if updated manually, the continuous pages have a gap */
  259. if (bio != NULL)
  260. submit_bio_out:
  261. __submit_bio(bio, REQ_OP_READ, 0);
  262. return unlikely(err) ? ERR_PTR(err) : NULL;
  263. }
  264. /*
  265. * since we don't have write or truncate flows, no inode
  266. * locking needs to be held at the moment.
  267. */
  268. static int erofs_raw_access_readpage(struct file *file, struct page *page)
  269. {
  270. erofs_off_t last_block;
  271. struct bio *bio;
  272. trace_erofs_readpage(page, true);
  273. bio = erofs_read_raw_page(NULL, page->mapping,
  274. page, &last_block, 1, false);
  275. if (IS_ERR(bio))
  276. return PTR_ERR(bio);
  277. DBG_BUGON(bio); /* since we have only one bio -- must be NULL */
  278. return 0;
  279. }
  280. static int erofs_raw_access_readpages(struct file *filp,
  281. struct address_space *mapping,
  282. struct list_head *pages, unsigned nr_pages)
  283. {
  284. erofs_off_t last_block;
  285. struct bio *bio = NULL;
  286. gfp_t gfp = readahead_gfp_mask(mapping);
  287. struct page *page = list_last_entry(pages, struct page, lru);
  288. trace_erofs_readpages(mapping->host, page, nr_pages, true);
  289. for (; nr_pages; --nr_pages) {
  290. page = list_entry(pages->prev, struct page, lru);
  291. prefetchw(&page->flags);
  292. list_del(&page->lru);
  293. if (!add_to_page_cache_lru(page, mapping, page->index, gfp)) {
  294. bio = erofs_read_raw_page(bio, mapping, page,
  295. &last_block, nr_pages, true);
  296. /* all page errors are ignored during readahead */
  297. if (IS_ERR(bio)) {
  298. pr_err("%s, readahead error at page %lu of nid %llu\n",
  299. __func__, page->index,
  300. EROFS_V(mapping->host)->nid);
  301. bio = NULL;
  302. }
  303. }
  304. /* pages could still be locked */
  305. put_page(page);
  306. }
  307. DBG_BUGON(!list_empty(pages));
  308. /* the rare case (end in gaps) */
  309. if (unlikely(bio != NULL))
  310. __submit_bio(bio, REQ_OP_READ, 0);
  311. return 0;
  312. }
  313. /* for uncompressed (aligned) files and raw access for other files */
  314. const struct address_space_operations erofs_raw_access_aops = {
  315. .readpage = erofs_raw_access_readpage,
  316. .readpages = erofs_raw_access_readpages,
  317. };
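
erofs_map_blocks_flatmode() above implements the uncompressed mapping rule: whole blocks are read straight from raw_blkaddr, and if the inode is tail-packed the last partial block is fetched from the metadata area right after the inode and its xattrs (the real code additionally checks that this tail fits in one meta block). A simplified userspace sketch of that rule; the struct and constants here are illustrative, not the on-disk definitions:

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define BLKSIZ	4096ull

struct flat_inode {
	uint64_t size;		/* i_size */
	uint32_t raw_blkaddr;	/* start block of plain data */
	bool tail_inline;	/* last partial block stored inline after the inode? */
};

/* returns the physical byte address backing logical offset 'la' */
static uint64_t map_flat(const struct flat_inode *fi, uint64_t la,
			 uint64_t inode_loc /* iloc + inode_isize + xattr_isize */)
{
	uint64_t nblocks = (fi->size + BLKSIZ - 1) / BLKSIZ;
	uint64_t lastblk = nblocks - (fi->tail_inline ? 1 : 0);

	if (la < lastblk * BLKSIZ)
		return (uint64_t)fi->raw_blkaddr * BLKSIZ + la;
	/* tail-end inline data follows the inode (and its xattrs) */
	return inode_loc + (la % BLKSIZ);
}

int main(void)
{
	struct flat_inode fi = { .size = 5000, .raw_blkaddr = 100,
				 .tail_inline = true };

	printf("la 0    -> pa %llu\n", (unsigned long long)map_flat(&fi, 0, 123456));
	printf("la 4096 -> pa %llu\n", (unsigned long long)map_flat(&fi, 4096, 123456));
	return 0;
}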

dir.c

  1. // SPDX-License-Identifier: GPL-2.0
  2. /*
  3. * linux/drivers/staging/erofs/dir.c
  4. *
  5. * Copyright (C) 2017-2018 HUAWEI, Inc.
  6. * http://www.huawei.com/
  7. * Created by Gao Xiang <gaoxiang25@huawei.com>
  8. *
  9. * This file is subject to the terms and conditions of the GNU General Public
  10. * License. See the file COPYING in the main directory of the Linux
  11. * distribution for more details.
  12. */
  13. #include "internal.h"
  14. static const unsigned char erofs_filetype_table[EROFS_FT_MAX] = {
  15. [EROFS_FT_UNKNOWN] = DT_UNKNOWN,
  16. [EROFS_FT_REG_FILE] = DT_REG,
  17. [EROFS_FT_DIR] = DT_DIR,
  18. [EROFS_FT_CHRDEV] = DT_CHR,
  19. [EROFS_FT_BLKDEV] = DT_BLK,
  20. [EROFS_FT_FIFO] = DT_FIFO,
  21. [EROFS_FT_SOCK] = DT_SOCK,
  22. [EROFS_FT_SYMLINK] = DT_LNK,
  23. };
  24. static void debug_one_dentry(unsigned char d_type, const char *de_name,
  25. unsigned int de_namelen)
  26. {
  27. #ifdef CONFIG_EROFS_FS_DEBUG
  28. /* since the on-disk name may not have a trailing '\0' */
  29. unsigned char dbg_namebuf[EROFS_NAME_LEN + 1];
  30. memcpy(dbg_namebuf, de_name, de_namelen);
  31. dbg_namebuf[de_namelen] = '\0';
  32. debugln("found dirent %s de_len %u d_type %d", dbg_namebuf,
  33. de_namelen, d_type);
  34. #endif
  35. }
  36. static int erofs_fill_dentries(struct dir_context *ctx,
  37. void *dentry_blk, unsigned *ofs,
  38. unsigned nameoff, unsigned maxsize)
  39. {
  40. struct erofs_dirent *de = dentry_blk;
  41. const struct erofs_dirent *end = dentry_blk + nameoff;
  42. de = dentry_blk + *ofs;
  43. while (de < end) {
  44. const char *de_name;
  45. unsigned int de_namelen;
  46. unsigned char d_type;
  47. if (de->file_type < EROFS_FT_MAX)
  48. d_type = erofs_filetype_table[de->file_type];
  49. else
  50. d_type = DT_UNKNOWN;
  51. nameoff = le16_to_cpu(de->nameoff);
  52. de_name = (char *)dentry_blk + nameoff;
  53. /* the last dirent in the block? */
  54. if (de + 1 >= end)
  55. de_namelen = strnlen(de_name, maxsize - nameoff);
  56. else
  57. de_namelen = le16_to_cpu(de[1].nameoff) - nameoff;
  58. /* a corrupted entry is found */
  59. if (unlikely(nameoff + de_namelen > maxsize ||
  60. de_namelen > EROFS_NAME_LEN)) {
  61. DBG_BUGON(1);
  62. return -EIO;
  63. }
  64. debug_one_dentry(d_type, de_name, de_namelen);
  65. if (!dir_emit(ctx, de_name, de_namelen,
  66. le64_to_cpu(de->nid), d_type))
  67. /* stopped for some reason */
  68. return 1;
  69. ++de;
  70. *ofs += sizeof(struct erofs_dirent);
  71. }
  72. *ofs = maxsize;
  73. return 0;
  74. }
  75. static int erofs_readdir(struct file *f, struct dir_context *ctx)
  76. {
  77. struct inode *dir = file_inode(f);
  78. struct address_space *mapping = dir->i_mapping;
  79. const size_t dirsize = i_size_read(dir);
  80. unsigned i = ctx->pos / EROFS_BLKSIZ;
  81. unsigned ofs = ctx->pos % EROFS_BLKSIZ;
  82. int err = 0;
  83. bool initial = true;
  84. while (ctx->pos < dirsize) {
  85. struct page *dentry_page;
  86. struct erofs_dirent *de;
  87. unsigned nameoff, maxsize;
  88. dentry_page = read_mapping_page(mapping, i, NULL);
  89. if (dentry_page == ERR_PTR(-ENOMEM)) {
  90. err = -ENOMEM;
  91. break;
  92. } else if (IS_ERR(dentry_page)) {
  93. errln("fail to readdir of logical block %u of nid %llu",
  94. i, EROFS_V(dir)->nid);
  95. err = PTR_ERR(dentry_page);
  96. break;
  97. }
  98. lock_page(dentry_page);
  99. de = (struct erofs_dirent *)kmap(dentry_page);
  100. nameoff = le16_to_cpu(de->nameoff);
  101. if (unlikely(nameoff < sizeof(struct erofs_dirent) ||
  102. nameoff >= PAGE_SIZE)) {
  103. errln("%s, invalid de[0].nameoff %u",
  104. __func__, nameoff);
  105. err = -EIO;
  106. goto skip_this;
  107. }
  108. maxsize = min_t(unsigned, dirsize - ctx->pos + ofs, PAGE_SIZE);
  109. /* search dirents from an arbitrary position */
  110. if (unlikely(initial)) {
  111. initial = false;
  112. ofs = roundup(ofs, sizeof(struct erofs_dirent));
  113. if (unlikely(ofs >= nameoff))
  114. goto skip_this;
  115. }
  116. err = erofs_fill_dentries(ctx, de, &ofs, nameoff, maxsize);
  117. skip_this:
  118. kunmap(dentry_page);
  119. unlock_page(dentry_page);
  120. put_page(dentry_page);
  121. ctx->pos = blknr_to_addr(i) + ofs;
  122. if (unlikely(err))
  123. break;
  124. ++i;
  125. ofs = 0;
  126. }
  127. return err < 0 ? err : 0;
  128. }
  129. const struct file_operations erofs_dir_fops = {
  130. .llseek = generic_file_llseek,
  131. .read = generic_read_dir,
  132. .iterate = erofs_readdir,
  133. };
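
erofs_fill_dentries() and erofs_readdir() above rely on the directory block layout: a run of fixed-size dirents followed by the packed names, where de[0].nameoff doubles as the entry count (divided by the dirent size) and consecutive nameoff values give each name length. The userspace sketch below builds and walks one such block; the struct only mirrors the 12-byte on-disk dirent for illustration, and the last name length is taken from the writer's cursor rather than strnlen() as the kernel does:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct dirent_sketch {		/* illustrative mirror of the on-disk erofs_dirent */
	uint64_t nid;
	uint16_t nameoff;
	uint8_t file_type;
	uint8_t reserved;
} __attribute__((packed));

int main(void)
{
	unsigned char blk[4096] = {0};
	struct dirent_sketch *de = (void *)blk;
	const char *names[] = { ".", "..", "hello" };
	unsigned int nameoff = 3 * sizeof(*de), end, ndirents, i;

	/* writer side: fixed dirents first, names packed back to back after them */
	for (i = 0; i < 3; i++) {
		de[i].nid = i + 1;
		de[i].nameoff = nameoff;
		memcpy(blk + nameoff, names[i], strlen(names[i]));
		nameoff += strlen(names[i]);
	}

	/* reader side: the entry count and name lengths come from the offsets alone */
	ndirents = de[0].nameoff / sizeof(*de);
	for (i = 0; i < ndirents; i++) {
		end = (i + 1 < ndirents) ? de[i + 1].nameoff : nameoff;
		printf("nid %llu name '%.*s'\n",
		       (unsigned long long)de[i].nid,
		       (int)(end - de[i].nameoff),
		       (char *)blk + de[i].nameoff);
	}
	return 0;
}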

inode.c

  1. // SPDX-License-Identifier: GPL-2.0
  2. /*
  3. * linux/drivers/staging/erofs/inode.c
  4. *
  5. * Copyright (C) 2017-2018 HUAWEI, Inc.
  6. * http://www.huawei.com/
  7. * Created by Gao Xiang <gaoxiang25@huawei.com>
  8. *
  9. * This file is subject to the terms and conditions of the GNU General Public
  10. * License. See the file COPYING in the main directory of the Linux
  11. * distribution for more details.
  12. */
  13. #include "xattr.h"
  14. #include <trace/events/erofs.h>
  15. /*
  16. * if inode is successfully read, return its inode page (or sometimes
  17. * the inode payload page if it's an extended inode) in order to fill
  18. * inline data if possible.
  19. */
  20. static struct page *read_inode(struct inode *inode, unsigned int *ofs)
  21. {
  22. struct super_block *sb = inode->i_sb;
  23. struct erofs_sb_info *sbi = EROFS_SB(sb);
  24. struct erofs_vnode *vi = EROFS_V(inode);
  25. const erofs_off_t inode_loc = iloc(sbi, vi->nid);
  26. erofs_blk_t blkaddr;
  27. struct page *page;
  28. struct erofs_inode_v1 *v1;
  29. struct erofs_inode_v2 *v2, *copied = NULL;
  30. unsigned int ifmt;
  31. int err;
  32. blkaddr = erofs_blknr(inode_loc);
  33. *ofs = erofs_blkoff(inode_loc);
  34. debugln("%s, reading inode nid %llu at %u of blkaddr %u",
  35. __func__, vi->nid, *ofs, blkaddr);
  36. page = erofs_get_meta_page(sb, blkaddr, false);
  37. if (IS_ERR(page)) {
  38. errln("failed to get inode (nid: %llu) page, err %ld",
  39. vi->nid, PTR_ERR(page));
  40. return page;
  41. }
  42. v1 = page_address(page) + *ofs;
  43. ifmt = le16_to_cpu(v1->i_advise);
  44. if (ifmt & ~EROFS_I_ALL) {
  45. errln("unsupported i_format %u of nid %llu", ifmt, vi->nid);
  46. err = -EOPNOTSUPP;
  47. goto err_out;
  48. }
  49. vi->data_mapping_mode = __inode_data_mapping(ifmt);
  50. if (unlikely(vi->data_mapping_mode >= EROFS_INODE_LAYOUT_MAX)) {
  51. errln("unknown data mapping mode %u of nid %llu",
  52. vi->data_mapping_mode, vi->nid);
  53. err = -EOPNOTSUPP;
  54. goto err_out;
  55. }
  56. switch (__inode_version(ifmt)) {
  57. case EROFS_INODE_LAYOUT_V2:
  58. vi->inode_isize = sizeof(struct erofs_inode_v2);
  59. /* check if the inode crosses a page boundary */
  60. if (*ofs + vi->inode_isize <= PAGE_SIZE) {
  61. *ofs += vi->inode_isize;
  62. v2 = (struct erofs_inode_v2 *)v1;
  63. } else {
  64. const unsigned int gotten = PAGE_SIZE - *ofs;
  65. copied = kmalloc(vi->inode_isize, GFP_NOFS);
  66. if (!copied) {
  67. err = -ENOMEM;
  68. goto err_out;
  69. }
  70. memcpy(copied, v1, gotten);
  71. unlock_page(page);
  72. put_page(page);
  73. page = erofs_get_meta_page(sb, blkaddr + 1, false);
  74. if (IS_ERR(page)) {
  75. errln("failed to get inode payload page (nid: %llu), err %ld",
  76. vi->nid, PTR_ERR(page));
  77. kfree(copied);
  78. return page;
  79. }
  80. *ofs = vi->inode_isize - gotten;
  81. memcpy((u8 *)copied + gotten, page_address(page), *ofs);
  82. v2 = copied;
  83. }
  84. vi->xattr_isize = ondisk_xattr_ibody_size(v2->i_xattr_icount);
  85. inode->i_mode = le16_to_cpu(v2->i_mode);
  86. if (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
  87. S_ISLNK(inode->i_mode)) {
  88. vi->raw_blkaddr = le32_to_cpu(v2->i_u.raw_blkaddr);
  89. } else if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode)) {
  90. inode->i_rdev =
  91. new_decode_dev(le32_to_cpu(v2->i_u.rdev));
  92. } else if (S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode)) {
  93. inode->i_rdev = 0;
  94. } else {
  95. goto bogusimode;
  96. }
  97. i_uid_write(inode, le32_to_cpu(v2->i_uid));
  98. i_gid_write(inode, le32_to_cpu(v2->i_gid));
  99. set_nlink(inode, le32_to_cpu(v2->i_nlink));
  100. /* extended inode has its own timestamp */
  101. inode->i_ctime.tv_sec = le64_to_cpu(v2->i_ctime);
  102. inode->i_ctime.tv_nsec = le32_to_cpu(v2->i_ctime_nsec);
  103. inode->i_size = le64_to_cpu(v2->i_size);
  104. kfree(copied);
  105. break;
  106. case EROFS_INODE_LAYOUT_V1:
  107. vi->inode_isize = sizeof(struct erofs_inode_v1);
  108. *ofs += vi->inode_isize;
  109. vi->xattr_isize = ondisk_xattr_ibody_size(v1->i_xattr_icount);
  110. inode->i_mode = le16_to_cpu(v1->i_mode);
  111. if (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
  112. S_ISLNK(inode->i_mode)) {
  113. vi->raw_blkaddr = le32_to_cpu(v1->i_u.raw_blkaddr);
  114. } else if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode)) {
  115. inode->i_rdev =
  116. new_decode_dev(le32_to_cpu(v1->i_u.rdev));
  117. } else if (S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode)) {
  118. inode->i_rdev = 0;
  119. } else {
  120. goto bogusimode;
  121. }
  122. i_uid_write(inode, le16_to_cpu(v1->i_uid));
  123. i_gid_write(inode, le16_to_cpu(v1->i_gid));
  124. set_nlink(inode, le16_to_cpu(v1->i_nlink));
  125. /* use build time for compact inodes */
  126. inode->i_ctime.tv_sec = sbi->build_time;
  127. inode->i_ctime.tv_nsec = sbi->build_time_nsec;
  128. inode->i_size = le32_to_cpu(v1->i_size);
  129. break;
  130. default:
  131. errln("unsupported on-disk inode version %u of nid %llu",
  132. __inode_version(ifmt), vi->nid);
  133. err = -EOPNOTSUPP;
  134. goto err_out;
  135. }
  136. inode->i_mtime.tv_sec = inode->i_ctime.tv_sec;
  137. inode->i_atime.tv_sec = inode->i_ctime.tv_sec;
  138. inode->i_mtime.tv_nsec = inode->i_ctime.tv_nsec;
  139. inode->i_atime.tv_nsec = inode->i_ctime.tv_nsec;
  140. /* measure inode.i_blocks as the generic filesystem */
  141. inode->i_blocks = ((inode->i_size - 1) >> 9) + 1;
  142. return page;
  143. bogusimode:
  144. errln("bogus i_mode (%o) @ nid %llu", inode->i_mode, vi->nid);
  145. err = -EIO;
  146. err_out:
  147. DBG_BUGON(1);
  148. kfree(copied);
  149. unlock_page(page);
  150. put_page(page);
  151. return ERR_PTR(err);
  152. }
  153. /*
  154. * try_lock can be required since locking order is:
  155. * file data(fs_inode)
  156. * meta(bd_inode)
  157. * but the majority of callers is "iget",
  158. * in that case we are pretty sure there is no deadlock since
  159. * no data operations exist. However, I tend to
  160. * use try_lock since it adds little overhead and
  161. * will succeed immediately.
  162. */
  163. static int fill_inline_data(struct inode *inode, void *data, unsigned m_pofs)
  164. {
  165. struct erofs_vnode *vi = EROFS_V(inode);
  166. struct erofs_sb_info *sbi = EROFS_I_SB(inode);
  167. int mode = vi->data_mapping_mode;
  168. DBG_BUGON(mode >= EROFS_INODE_LAYOUT_MAX);
  169. /* should be inode inline C */
  170. if (mode != EROFS_INODE_LAYOUT_INLINE)
  171. return 0;
  172. /* fast symlink (following ext4) */
  173. if (S_ISLNK(inode->i_mode) && inode->i_size < PAGE_SIZE) {
  174. char *lnk = erofs_kmalloc(sbi, inode->i_size + 1, GFP_KERNEL);
  175. if (unlikely(lnk == NULL))
  176. return -ENOMEM;
  177. m_pofs += vi->xattr_isize;
  178. /* inline symlink data shouldn't cross a page boundary either */
  179. if (unlikely(m_pofs + inode->i_size > PAGE_SIZE)) {
  180. DBG_BUGON(1);
  181. kfree(lnk);
  182. return -EIO;
  183. }
  184. /* get in-page inline data */
  185. memcpy(lnk, data + m_pofs, inode->i_size);
  186. lnk[inode->i_size] = '\0';
  187. inode->i_link = lnk;
  188. set_inode_fast_symlink(inode);
  189. }
  190. return -EAGAIN;
  191. }
  192. static int fill_inode(struct inode *inode, int isdir)
  193. {
  194. struct page *page;
  195. unsigned int ofs;
  196. int err = 0;
  197. trace_erofs_fill_inode(inode, isdir);
  198. /* read inode base data from disk */
  199. page = read_inode(inode, &ofs);
  200. if (IS_ERR(page)) {
  201. return PTR_ERR(page);
  202. } else {
  203. /* setup the new inode */
  204. if (S_ISREG(inode->i_mode)) {
  205. #ifdef CONFIG_EROFS_FS_XATTR
  206. inode->i_op = &erofs_generic_xattr_iops;
  207. #endif
  208. inode->i_fop = &generic_ro_fops;
  209. } else if (S_ISDIR(inode->i_mode)) {
  210. inode->i_op =
  211. #ifdef CONFIG_EROFS_FS_XATTR
  212. &erofs_dir_xattr_iops;
  213. #else
  214. &erofs_dir_iops;
  215. #endif
  216. inode->i_fop = &erofs_dir_fops;
  217. } else if (S_ISLNK(inode->i_mode)) {
  218. /* by default, page_get_link is used for symlink */
  219. inode->i_op =
  220. #ifdef CONFIG_EROFS_FS_XATTR
  221. &erofs_symlink_xattr_iops,
  222. #else
  223. &page_symlink_inode_operations;
  224. #endif
  225. inode_nohighmem(inode);
  226. } else if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode) ||
  227. S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode)) {
  228. #ifdef CONFIG_EROFS_FS_XATTR
  229. inode->i_op = &erofs_special_inode_operations;
  230. #endif
  231. init_special_inode(inode, inode->i_mode, inode->i_rdev);
  232. } else {
  233. err = -EIO;
  234. goto out_unlock;
  235. }
  236. if (is_inode_layout_compression(inode)) {
  237. #ifdef CONFIG_EROFS_FS_ZIP
  238. inode->i_mapping->a_ops =
  239. &z_erofs_vle_normalaccess_aops;
  240. #else
  241. err = -ENOTSUPP;
  242. #endif
  243. goto out_unlock;
  244. }
  245. inode->i_mapping->a_ops = &erofs_raw_access_aops;
  246. /* fill last page if inline data is available */
  247. fill_inline_data(inode, page_address(page), ofs);
  248. }
  249. out_unlock:
  250. unlock_page(page);
  251. put_page(page);
  252. return err;
  253. }
  254. struct inode *erofs_iget(struct super_block *sb,
  255. erofs_nid_t nid, bool isdir)
  256. {
  257. struct inode *inode = iget_locked(sb, nid);
  258. if (unlikely(inode == NULL))
  259. return ERR_PTR(-ENOMEM);
  260. if (inode->i_state & I_NEW) {
  261. int err;
  262. struct erofs_vnode *vi = EROFS_V(inode);
  263. vi->nid = nid;
  264. err = fill_inode(inode, isdir);
  265. if (likely(!err))
  266. unlock_new_inode(inode);
  267. else {
  268. iget_failed(inode);
  269. inode = ERR_PTR(err);
  270. }
  271. }
  272. return inode;
  273. }
  274. #ifdef CONFIG_EROFS_FS_XATTR
  275. const struct inode_operations erofs_generic_xattr_iops = {
  276. .listxattr = erofs_listxattr,
  277. };
  278. #endif
  279. #ifdef CONFIG_EROFS_FS_XATTR
  280. const struct inode_operations erofs_symlink_xattr_iops = {
  281. .get_link = page_get_link,
  282. .listxattr = erofs_listxattr,
  283. };
  284. #endif
  285. const struct inode_operations erofs_special_inode_operations = {
  286. #ifdef CONFIG_EROFS_FS_XATTR
  287. .listxattr = erofs_listxattr,
  288. #endif
  289. };
  290. #ifdef CONFIG_EROFS_FS_XATTR
  291. const struct inode_operations erofs_fast_symlink_xattr_iops = {
  292. .get_link = simple_get_link,
  293. .listxattr = erofs_listxattr,
  294. };
  295. #endif
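
read_inode() above locates an inode purely from its nid: iloc() (defined in internal.h, which is not reproduced here) resolves to meta_blkaddr plus one 32-byte slot per nid, so an extended (64-byte) inode may straddle a block boundary, which is why the function stitches it together from two metadata pages. A small userspace sketch of that addressing with illustrative values:

#include <stdint.h>
#include <stdio.h>

#define BLKSIZ		4096u
#define ISLOT_BITS	5u	/* ffs(sizeof(struct erofs_inode_v1)) - 1, i.e. 32-byte slots */

int main(void)
{
	uint32_t meta_blkaddr = 2;	/* illustrative superblock value */
	uint64_t nid = 127;		/* last slot of the first metadata block */

	uint64_t iloc = (uint64_t)meta_blkaddr * BLKSIZ + (nid << ISLOT_BITS);
	uint32_t blkaddr = iloc / BLKSIZ;
	uint32_t ofs = iloc % BLKSIZ;

	printf("nid %llu -> blkaddr %u, offset %u\n",
	       (unsigned long long)nid, blkaddr, ofs);
	if (ofs + 64 > BLKSIZ)
		printf("an extended (64-byte) inode here crosses into block %u\n",
		       blkaddr + 1);
	return 0;
}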

namei.c

  1. // SPDX-License-Identifier: GPL-2.0
  2. /*
  3. * linux/drivers/staging/erofs/namei.c
  4. *
  5. * Copyright (C) 2017-2018 HUAWEI, Inc.
  6. * http://www.huawei.com/
  7. * Created by Gao Xiang <gaoxiang25@huawei.com>
  8. *
  9. * This file is subject to the terms and conditions of the GNU General Public
  10. * License. See the file COPYING in the main directory of the Linux
  11. * distribution for more details.
  12. */
  13. #include "internal.h"
  14. #include "xattr.h"
  15. #include <trace/events/erofs.h>
  16. struct erofs_qstr {
  17. const unsigned char *name;
  18. const unsigned char *end;
  19. };
  20. /* assumes that qn->end is accurate and that qn has a trailing '\0' */
  21. static inline int dirnamecmp(const struct erofs_qstr *qn,
  22. const struct erofs_qstr *qd,
  23. unsigned int *matched)
  24. {
  25. unsigned int i = *matched;
  26. /*
  27. * on-disk error, let's only BUG_ON in the debugging mode.
  28. * otherwise, it will return 1 to just skip the invalid name
  29. * and go on (in consideration of the lookup performance).
  30. */
  31. DBG_BUGON(qd->name > qd->end);
  32. /* qd may not have a trailing '\0' */
  33. /* however, accessing it is absolutely safe while < qd->end */
  34. while (qd->name + i < qd->end && qd->name[i] != '\0') {
  35. if (qn->name[i] != qd->name[i]) {
  36. *matched = i;
  37. return qn->name[i] > qd->name[i] ? 1 : -1;
  38. }
  39. ++i;
  40. }
  41. *matched = i;
  42. /* See comments in __d_alloc on the terminating NUL character */
  43. return qn->name[i] == '\0' ? 0 : 1;
  44. }
  45. #define nameoff_from_disk(off, sz) (le16_to_cpu(off) & ((sz) - 1))
  46. static struct erofs_dirent *find_target_dirent(struct erofs_qstr *name,
  47. u8 *data,
  48. unsigned int dirblksize,
  49. const int ndirents)
  50. {
  51. int head, back;
  52. unsigned int startprfx, endprfx;
  53. struct erofs_dirent *const de = (struct erofs_dirent *)data;
  54. /* since the 1st dirent has been evaluated previously */
  55. head = 1;
  56. back = ndirents - 1;
  57. startprfx = endprfx = 0;
  58. while (head <= back) {
  59. const int mid = head + (back - head) / 2;
  60. const int nameoff = nameoff_from_disk(de[mid].nameoff,
  61. dirblksize);
  62. unsigned int matched = min(startprfx, endprfx);
  63. struct erofs_qstr dname = {
  64. .name = data + nameoff,
  65. .end = unlikely(mid >= ndirents - 1) ?
  66. data + dirblksize :
  67. data + nameoff_from_disk(de[mid + 1].nameoff,
  68. dirblksize)
  69. };
  70. /* string comparison without already matched prefix */
  71. int ret = dirnamecmp(name, &dname, &matched);
  72. if (unlikely(!ret)) {
  73. return de + mid;
  74. } else if (ret > 0) {
  75. head = mid + 1;
  76. startprfx = matched;
  77. } else {
  78. back = mid - 1;
  79. endprfx = matched;
  80. }
  81. }
  82. return ERR_PTR(-ENOENT);
  83. }
  84. static struct page *find_target_block_classic(struct inode *dir,
  85. struct erofs_qstr *name,
  86. int *_ndirents)
  87. {
  88. unsigned int startprfx, endprfx;
  89. int head, back;
  90. struct address_space *const mapping = dir->i_mapping;
  91. struct page *candidate = ERR_PTR(-ENOENT);
  92. startprfx = endprfx = 0;
  93. head = 0;
  94. back = inode_datablocks(dir) - 1;
  95. while (head <= back) {
  96. const int mid = head + (back - head) / 2;
  97. struct page *page = read_mapping_page(mapping, mid, NULL);
  98. if (!IS_ERR(page)) {
  99. struct erofs_dirent *de = kmap_atomic(page);
  100. const int nameoff = nameoff_from_disk(de->nameoff,
  101. EROFS_BLKSIZ);
  102. const int ndirents = nameoff / sizeof(*de);
  103. int diff;
  104. unsigned int matched;
  105. struct erofs_qstr dname;
  106. if (unlikely(!ndirents)) {
  107. DBG_BUGON(1);
  108. kunmap_atomic(de);
  109. put_page(page);
  110. page = ERR_PTR(-EIO);
  111. goto out;
  112. }
  113. matched = min(startprfx, endprfx);
  114. dname.name = (u8 *)de + nameoff;
  115. if (ndirents == 1)
  116. dname.end = (u8 *)de + EROFS_BLKSIZ;
  117. else
  118. dname.end = (u8 *)de +
  119. nameoff_from_disk(de[1].nameoff,
  120. EROFS_BLKSIZ);
  121. /* string comparison without already matched prefix */
  122. diff = dirnamecmp(name, &dname, &matched);
  123. kunmap_atomic(de);
  124. if (unlikely(!diff)) {
  125. *_ndirents = 0;
  126. goto out;
  127. } else if (diff > 0) {
  128. head = mid + 1;
  129. startprfx = matched;
  130. if (likely(!IS_ERR(candidate)))
  131. put_page(candidate);
  132. candidate = page;
  133. *_ndirents = ndirents;
  134. } else {
  135. put_page(page);
  136. back = mid - 1;
  137. endprfx = matched;
  138. }
  139. continue;
  140. }
  141. out: /* free if the candidate is valid */
  142. if (!IS_ERR(candidate))
  143. put_page(candidate);
  144. return page;
  145. }
  146. return candidate;
  147. }
  148. int erofs_namei(struct inode *dir,
  149. struct qstr *name,
  150. erofs_nid_t *nid, unsigned int *d_type)
  151. {
  152. int ndirents;
  153. struct page *page;
  154. void *data;
  155. struct erofs_dirent *de;
  156. struct erofs_qstr qn;
  157. if (unlikely(!dir->i_size))
  158. return -ENOENT;
  159. qn.name = name->name;
  160. qn.end = name->name + name->len;
  161. ndirents = 0;
  162. page = find_target_block_classic(dir, &qn, &ndirents);
  163. if (unlikely(IS_ERR(page)))
  164. return PTR_ERR(page);
  165. data = kmap_atomic(page);
  166. /* the target page has been mapped */
  167. if (ndirents)
  168. de = find_target_dirent(&qn, data, EROFS_BLKSIZ, ndirents);
  169. else
  170. de = (struct erofs_dirent *)data;
  171. if (likely(!IS_ERR(de))) {
  172. *nid = le64_to_cpu(de->nid);
  173. *d_type = de->file_type;
  174. }
  175. kunmap_atomic(data);
  176. put_page(page);
  177. return PTR_ERR_OR_ZERO(de);
  178. }
  179. /* NOTE: i_mutex is already held by vfs */
  180. static struct dentry *erofs_lookup(struct inode *dir,
  181. struct dentry *dentry, unsigned int flags)
  182. {
  183. int err;
  184. erofs_nid_t nid;
  185. unsigned d_type;
  186. struct inode *inode;
  187. DBG_BUGON(!d_really_is_negative(dentry));
  188. /* the dentry must be unhashed in lookup, so there is no need to worry about it */
  189. DBG_BUGON(!d_unhashed(dentry));
  190. trace_erofs_lookup(dir, dentry, flags);
  191. /* file name exceeds fs limit */
  192. if (unlikely(dentry->d_name.len > EROFS_NAME_LEN))
  193. return ERR_PTR(-ENAMETOOLONG);
  194. /* false uninitialized warnings on gcc 4.8.x */
  195. err = erofs_namei(dir, &dentry->d_name, &nid, &d_type);
  196. if (err == -ENOENT) {
  197. /* negative dentry */
  198. inode = NULL;
  199. goto negative_out;
  200. } else if (unlikely(err))
  201. return ERR_PTR(err);
  202. debugln("%s, %s (nid %llu) found, d_type %u", __func__,
  203. dentry->d_name.name, nid, d_type);
  204. inode = erofs_iget(dir->i_sb, nid, d_type == EROFS_FT_DIR);
  205. if (IS_ERR(inode))
  206. return ERR_CAST(inode);
  207. negative_out:
  208. return d_splice_alias(inode, dentry);
  209. }
  210. const struct inode_operations erofs_dir_iops = {
  211. .lookup = erofs_lookup,
  212. };
  213. const struct inode_operations erofs_dir_xattr_iops = {
  214. .lookup = erofs_lookup,
  215. #ifdef CONFIG_EROFS_FS_XATTR
  216. .listxattr = erofs_listxattr,
  217. #endif
  218. };
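
find_target_dirent() above is a binary search that never re-compares characters already known to match on both boundaries: each probe starts from min(startprfx, endprfx). The userspace sketch below reproduces that idea over ordinary NUL-terminated strings (the on-disk names are not terminated, so the kernel version also carries an explicit end pointer); the sample array is illustrative:

#include <stdio.h>

/* compare starting at *matched characters, reporting how far the match went */
static int cmp_from(const char *a, const char *b, unsigned int *matched)
{
	unsigned int i = *matched;

	while (a[i] && a[i] == b[i])
		++i;
	*matched = i;
	return (unsigned char)a[i] - (unsigned char)b[i];
}

static int search(const char *const names[], int n, const char *want)
{
	int head = 0, back = n - 1;
	unsigned int startprfx = 0, endprfx = 0;

	while (head <= back) {
		int mid = head + (back - head) / 2;
		unsigned int matched = startprfx < endprfx ? startprfx : endprfx;
		int ret = cmp_from(want, names[mid], &matched);

		if (!ret)
			return mid;
		if (ret > 0) {
			head = mid + 1;
			startprfx = matched;	/* lower boundary shares this prefix */
		} else {
			back = mid - 1;
			endprfx = matched;	/* upper boundary shares this prefix */
		}
	}
	return -1;
}

int main(void)
{
	const char *const names[] = { "bar", "baz", "foo", "foobar", "qux" };

	printf("foobar at index %d\n", search(names, 5, "foobar"));
	return 0;
}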

super.c

  1. // SPDX-License-Identifier: GPL-2.0
  2. /*
  3. * linux/drivers/staging/erofs/super.c
  4. *
  5. * Copyright (C) 2017-2018 HUAWEI, Inc.
  6. * http://www.huawei.com/
  7. * Created by Gao Xiang <gaoxiang25@huawei.com>
  8. *
  9. * This file is subject to the terms and conditions of the GNU General Public
  10. * License. See the file COPYING in the main directory of the Linux
  11. * distribution for more details.
  12. */
  13. #include <linux/module.h>
  14. #include <linux/buffer_head.h>
  15. #include <linux/statfs.h>
  16. #include <linux/parser.h>
  17. #include <linux/seq_file.h>
  18. #include "internal.h"
  19. #define CREATE_TRACE_POINTS
  20. #include <trace/events/erofs.h>
  21. static struct kmem_cache *erofs_inode_cachep __read_mostly;
  22. static void init_once(void *ptr)
  23. {
  24. struct erofs_vnode *vi = ptr;
  25. inode_init_once(&vi->vfs_inode);
  26. }
  27. static int erofs_init_inode_cache(void)
  28. {
  29. erofs_inode_cachep = kmem_cache_create("erofs_inode",
  30. sizeof(struct erofs_vnode), 0,
  31. SLAB_RECLAIM_ACCOUNT, init_once);
  32. return erofs_inode_cachep != NULL ? 0 : -ENOMEM;
  33. }
  34. static void erofs_exit_inode_cache(void)
  35. {
  36. kmem_cache_destroy(erofs_inode_cachep);
  37. }
  38. static struct inode *alloc_inode(struct super_block *sb)
  39. {
  40. struct erofs_vnode *vi =
  41. kmem_cache_alloc(erofs_inode_cachep, GFP_KERNEL);
  42. if (vi == NULL)
  43. return NULL;
  44. /* zero out everything except vfs_inode */
  45. memset(vi, 0, offsetof(struct erofs_vnode, vfs_inode));
  46. return &vi->vfs_inode;
  47. }
  48. static void i_callback(struct rcu_head *head)
  49. {
  50. struct inode *inode = container_of(head, struct inode, i_rcu);
  51. struct erofs_vnode *vi = EROFS_V(inode);
  52. /* be careful RCU symlink path (see ext4_inode_info->i_data)! */
  53. if (is_inode_fast_symlink(inode))
  54. kfree(inode->i_link);
  55. kfree(vi->xattr_shared_xattrs);
  56. kmem_cache_free(erofs_inode_cachep, vi);
  57. }
  58. static void destroy_inode(struct inode *inode)
  59. {
  60. call_rcu(&inode->i_rcu, i_callback);
  61. }
  62. static bool check_layout_compatibility(struct super_block *sb,
  63. struct erofs_super_block *layout)
  64. {
  65. const unsigned int requirements = le32_to_cpu(layout->requirements);
  66. EROFS_SB(sb)->requirements = requirements;
  67. /* check if current kernel meets all mandatory requirements */
  68. if (requirements & (~EROFS_ALL_REQUIREMENTS)) {
  69. errln("unidentified requirements %x, please upgrade kernel version",
  70. requirements & ~EROFS_ALL_REQUIREMENTS);
  71. return false;
  72. }
  73. return true;
  74. }
  75. static int superblock_read(struct super_block *sb)
  76. {
  77. struct erofs_sb_info *sbi;
  78. struct buffer_head *bh;
  79. struct erofs_super_block *layout;
  80. unsigned blkszbits;
  81. int ret;
  82. bh = sb_bread(sb, 0);
  83. if (bh == NULL) {
  84. errln("cannot read erofs superblock");
  85. return -EIO;
  86. }
  87. sbi = EROFS_SB(sb);
  88. layout = (struct erofs_super_block *)((u8 *)bh->b_data
  89. + EROFS_SUPER_OFFSET);
  90. ret = -EINVAL;
  91. if (le32_to_cpu(layout->magic) != EROFS_SUPER_MAGIC_V1) {
  92. errln("cannot find valid erofs superblock");
  93. goto out;
  94. }
  95. blkszbits = layout->blkszbits;
  96. /* 9(512 bytes) + LOG_SECTORS_PER_BLOCK == LOG_BLOCK_SIZE */
  97. if (unlikely(blkszbits != LOG_BLOCK_SIZE)) {
  98. errln("blksize %u isn't supported on this platform",
  99. 1 << blkszbits);
  100. goto out;
  101. }
  102. if (!check_layout_compatibility(sb, layout))
  103. goto out;
  104. sbi->blocks = le32_to_cpu(layout->blocks);
  105. sbi->meta_blkaddr = le32_to_cpu(layout->meta_blkaddr);
  106. #ifdef CONFIG_EROFS_FS_XATTR
  107. sbi->xattr_blkaddr = le32_to_cpu(layout->xattr_blkaddr);
  108. #endif
  109. sbi->islotbits = ffs(sizeof(struct erofs_inode_v1)) - 1;
  110. #ifdef CONFIG_EROFS_FS_ZIP
  111. sbi->clusterbits = 12;
  112. if (1 << (sbi->clusterbits - 12) > Z_EROFS_CLUSTER_MAX_PAGES)
  113. errln("clusterbits %u is not supported on this kernel",
  114. sbi->clusterbits);
  115. #endif
  116. sbi->root_nid = le16_to_cpu(layout->root_nid);
  117. sbi->inos = le64_to_cpu(layout->inos);
  118. sbi->build_time = le64_to_cpu(layout->build_time);
  119. sbi->build_time_nsec = le32_to_cpu(layout->build_time_nsec);
  120. memcpy(&sb->s_uuid, layout->uuid, sizeof(layout->uuid));
  121. memcpy(sbi->volume_name, layout->volume_name,
  122. sizeof(layout->volume_name));
  123. ret = 0;
  124. out:
  125. brelse(bh);
  126. return ret;
  127. }
  128. #ifdef CONFIG_EROFS_FAULT_INJECTION
  129. char *erofs_fault_name[FAULT_MAX] = {
  130. [FAULT_KMALLOC] = "kmalloc",
  131. };
  132. static void erofs_build_fault_attr(struct erofs_sb_info *sbi,
  133. unsigned int rate)
  134. {
  135. struct erofs_fault_info *ffi = &sbi->fault_info;
  136. if (rate) {
  137. atomic_set(&ffi->inject_ops, 0);
  138. ffi->inject_rate = rate;
  139. ffi->inject_type = (1 << FAULT_MAX) - 1;
  140. } else {
  141. memset(ffi, 0, sizeof(struct erofs_fault_info));
  142. }
  143. }
  144. #endif
  145. static void default_options(struct erofs_sb_info *sbi)
  146. {
  147. #ifdef CONFIG_EROFS_FS_XATTR
  148. set_opt(sbi, XATTR_USER);
  149. #endif
  150. #ifdef CONFIG_EROFS_FS_POSIX_ACL
  151. set_opt(sbi, POSIX_ACL);
  152. #endif
  153. }
  154. enum {
  155. Opt_user_xattr,
  156. Opt_nouser_xattr,
  157. Opt_acl,
  158. Opt_noacl,
  159. Opt_fault_injection,
  160. Opt_err
  161. };
  162. static match_table_t erofs_tokens = {
  163. {Opt_user_xattr, "user_xattr"},
  164. {Opt_nouser_xattr, "nouser_xattr"},
  165. {Opt_acl, "acl"},
  166. {Opt_noacl, "noacl"},
  167. {Opt_fault_injection, "fault_injection=%u"},
  168. {Opt_err, NULL}
  169. };
  170. static int parse_options(struct super_block *sb, char *options)
  171. {
  172. substring_t args[MAX_OPT_ARGS];
  173. char *p;
  174. int arg = 0;
  175. if (!options)
  176. return 0;
  177. while ((p = strsep(&options, ",")) != NULL) {
  178. int token;
  179. if (!*p)
  180. continue;
  181. args[0].to = args[0].from = NULL;
  182. token = match_token(p, erofs_tokens, args);
  183. switch (token) {
  184. #ifdef CONFIG_EROFS_FS_XATTR
  185. case Opt_user_xattr:
  186. set_opt(EROFS_SB(sb), XATTR_USER);
  187. break;
  188. case Opt_nouser_xattr:
  189. clear_opt(EROFS_SB(sb), XATTR_USER);
  190. break;
  191. #else
  192. case Opt_user_xattr:
  193. infoln("user_xattr options not supported");
  194. break;
  195. case Opt_nouser_xattr:
  196. infoln("nouser_xattr options not supported");
  197. break;
  198. #endif
  199. #ifdef CONFIG_EROFS_FS_POSIX_ACL
  200. case Opt_acl:
  201. set_opt(EROFS_SB(sb), POSIX_ACL);
  202. break;
  203. case Opt_noacl:
  204. clear_opt(EROFS_SB(sb), POSIX_ACL);
  205. break;
  206. #else
  207. case Opt_acl:
  208. infoln("acl options not supported");
  209. break;
  210. case Opt_noacl:
  211. infoln("noacl options not supported");
  212. break;
  213. #endif
  214. case Opt_fault_injection:
  215. if (args->from && match_int(args, &arg))
  216. return -EINVAL;
  217. #ifdef CONFIG_EROFS_FAULT_INJECTION
  218. erofs_build_fault_attr(EROFS_SB(sb), arg);
  219. set_opt(EROFS_SB(sb), FAULT_INJECTION);
  220. #else
  221. infoln("FAULT_INJECTION was not selected");
  222. #endif
  223. break;
  224. default:
  225. errln("Unrecognized mount option \"%s\" "
  226. "or missing value", p);
  227. return -EINVAL;
  228. }
  229. }
  230. return 0;
  231. }
  232. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  233. static const struct address_space_operations managed_cache_aops;
  234. static int managed_cache_releasepage(struct page *page, gfp_t gfp_mask)
  235. {
  236. int ret = 1; /* 0 - busy */
  237. struct address_space *const mapping = page->mapping;
  238. DBG_BUGON(!PageLocked(page));
  239. DBG_BUGON(mapping->a_ops != &managed_cache_aops);
  240. if (PagePrivate(page))
  241. ret = erofs_try_to_free_cached_page(mapping, page);
  242. return ret;
  243. }
  244. static void managed_cache_invalidatepage(struct page *page,
  245. unsigned int offset, unsigned int length)
  246. {
  247. const unsigned int stop = length + offset;
  248. DBG_BUGON(!PageLocked(page));
  249. /* Check for potential overflow in debug mode */
  250. DBG_BUGON(stop > PAGE_SIZE || stop < length);
  251. if (offset == 0 && stop == PAGE_SIZE)
  252. while (!managed_cache_releasepage(page, GFP_NOFS))
  253. cond_resched();
  254. }
  255. static const struct address_space_operations managed_cache_aops = {
  256. .releasepage = managed_cache_releasepage,
  257. .invalidatepage = managed_cache_invalidatepage,
  258. };
  259. static struct inode *erofs_init_managed_cache(struct super_block *sb)
  260. {
  261. struct inode *inode = new_inode(sb);
  262. if (unlikely(inode == NULL))
  263. return ERR_PTR(-ENOMEM);
  264. set_nlink(inode, 1);
  265. inode->i_size = OFFSET_MAX;
  266. inode->i_mapping->a_ops = &managed_cache_aops;
  267. mapping_set_gfp_mask(inode->i_mapping,
  268. GFP_NOFS | __GFP_HIGHMEM |
  269. __GFP_MOVABLE | __GFP_NOFAIL);
  270. return inode;
  271. }
  272. #endif
  273. static int erofs_read_super(struct super_block *sb,
  274. const char *dev_name, void *data, int silent)
  275. {
  276. struct inode *inode;
  277. struct erofs_sb_info *sbi;
  278. int err = -EINVAL;
  279. infoln("read_super, device -> %s", dev_name);
  280. infoln("options -> %s", (char *)data);
  281. if (unlikely(!sb_set_blocksize(sb, EROFS_BLKSIZ))) {
  282. errln("failed to set erofs blksize");
  283. goto err;
  284. }
  285. sbi = kzalloc(sizeof(struct erofs_sb_info), GFP_KERNEL);
  286. if (unlikely(sbi == NULL)) {
  287. err = -ENOMEM;
  288. goto err;
  289. }
  290. sb->s_fs_info = sbi;
  291. err = superblock_read(sb);
  292. if (err)
  293. goto err_sbread;
  294. sb->s_magic = EROFS_SUPER_MAGIC;
  295. sb->s_flags |= SB_RDONLY | SB_NOATIME;
  296. sb->s_maxbytes = MAX_LFS_FILESIZE;
  297. sb->s_time_gran = 1;
  298. sb->s_op = &erofs_sops;
  299. #ifdef CONFIG_EROFS_FS_XATTR
  300. sb->s_xattr = erofs_xattr_handlers;
  301. #endif
  302. /* set erofs default mount options */
  303. default_options(sbi);
  304. err = parse_options(sb, data);
  305. if (err)
  306. goto err_parseopt;
  307. if (!silent)
  308. infoln("root inode @ nid %llu", ROOT_NID(sbi));
  309. #ifdef CONFIG_EROFS_FS_ZIP
  310. INIT_RADIX_TREE(&sbi->workstn_tree, GFP_ATOMIC);
  311. #endif
  312. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  313. sbi->managed_cache = erofs_init_managed_cache(sb);
  314. if (IS_ERR(sbi->managed_cache)) {
  315. err = PTR_ERR(sbi->managed_cache);
  316. goto err_init_managed_cache;
  317. }
  318. #endif
  319. /* get the root inode */
  320. inode = erofs_iget(sb, ROOT_NID(sbi), true);
  321. if (IS_ERR(inode)) {
  322. err = PTR_ERR(inode);
  323. goto err_iget;
  324. }
  325. if (!S_ISDIR(inode->i_mode)) {
  326. errln("rootino(nid %llu) is not a directory(i_mode %o)",
  327. ROOT_NID(sbi), inode->i_mode);
  328. err = -EINVAL;
  329. goto err_isdir;
  330. }
  331. sb->s_root = d_make_root(inode);
  332. if (sb->s_root == NULL) {
  333. err = -ENOMEM;
  334. goto err_makeroot;
  335. }
  336. /* save the device name to sbi */
  337. sbi->dev_name = __getname();
  338. if (sbi->dev_name == NULL) {
  339. err = -ENOMEM;
  340. goto err_devname;
  341. }
  342. snprintf(sbi->dev_name, PATH_MAX, "%s", dev_name);
  343. sbi->dev_name[PATH_MAX - 1] = '\0';
  344. erofs_register_super(sb);
  345. if (!silent)
  346. infoln("mounted on %s with opts: %s.", dev_name,
  347. (char *)data);
  348. return 0;
  349. /*
  350. * please add a label for each exit point and use
  351. * the following naming convention, so that new features
  352. * can be integrated easily without renaming labels.
  353. */
  354. err_devname:
  355. dput(sb->s_root);
  356. err_makeroot:
  357. err_isdir:
  358. if (sb->s_root == NULL)
  359. iput(inode);
  360. err_iget:
  361. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  362. iput(sbi->managed_cache);
  363. err_init_managed_cache:
  364. #endif
  365. err_parseopt:
  366. err_sbread:
  367. sb->s_fs_info = NULL;
  368. kfree(sbi);
  369. err:
  370. return err;
  371. }
  372. /*
  373. * could be triggered after deactivate_locked_super()
  374. * is called, which covers both umount and failed initialization.
  375. */
  376. static void erofs_put_super(struct super_block *sb)
  377. {
  378. struct erofs_sb_info *sbi = EROFS_SB(sb);
  379. /* for cases which failed in "read_super" */
  380. if (sbi == NULL)
  381. return;
  382. WARN_ON(sb->s_magic != EROFS_SUPER_MAGIC);
  383. infoln("unmounted for %s", sbi->dev_name);
  384. __putname(sbi->dev_name);
  385. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  386. iput(sbi->managed_cache);
  387. #endif
  388. mutex_lock(&sbi->umount_mutex);
  389. #ifdef CONFIG_EROFS_FS_ZIP
  390. erofs_workstation_cleanup_all(sb);
  391. #endif
  392. erofs_unregister_super(sb);
  393. mutex_unlock(&sbi->umount_mutex);
  394. kfree(sbi);
  395. sb->s_fs_info = NULL;
  396. }
  397. struct erofs_mount_private {
  398. const char *dev_name;
  399. char *options;
  400. };
  401. /* support mount_bdev() with options */
  402. static int erofs_fill_super(struct super_block *sb,
  403. void *_priv, int silent)
  404. {
  405. struct erofs_mount_private *priv = _priv;
  406. return erofs_read_super(sb, priv->dev_name,
  407. priv->options, silent);
  408. }
  409. static struct dentry *erofs_mount(
  410. struct file_system_type *fs_type, int flags,
  411. const char *dev_name, void *data)
  412. {
  413. struct erofs_mount_private priv = {
  414. .dev_name = dev_name,
  415. .options = data
  416. };
  417. return mount_bdev(fs_type, flags, dev_name,
  418. &priv, erofs_fill_super);
  419. }
  420. static void erofs_kill_sb(struct super_block *sb)
  421. {
  422. kill_block_super(sb);
  423. }
  424. static struct shrinker erofs_shrinker_info = {
  425. .scan_objects = erofs_shrink_scan,
  426. .count_objects = erofs_shrink_count,
  427. .seeks = DEFAULT_SEEKS,
  428. };
  429. static struct file_system_type erofs_fs_type = {
  430. .owner = THIS_MODULE,
  431. .name = "erofs",
  432. .mount = erofs_mount,
  433. .kill_sb = erofs_kill_sb,
  434. .fs_flags = FS_REQUIRES_DEV,
  435. };
  436. MODULE_ALIAS_FS("erofs");
  437. #ifdef CONFIG_EROFS_FS_ZIP
  438. extern int z_erofs_init_zip_subsystem(void);
  439. extern void z_erofs_exit_zip_subsystem(void);
  440. #endif
  441. static int __init erofs_module_init(void)
  442. {
  443. int err;
  444. erofs_check_ondisk_layout_definitions();
  445. infoln("initializing erofs " EROFS_VERSION);
  446. err = erofs_init_inode_cache();
  447. if (err)
  448. goto icache_err;
  449. err = register_shrinker(&erofs_shrinker_info);
  450. if (err)
  451. goto shrinker_err;
  452. #ifdef CONFIG_EROFS_FS_ZIP
  453. err = z_erofs_init_zip_subsystem();
  454. if (err)
  455. goto zip_err;
  456. #endif
  457. err = register_filesystem(&erofs_fs_type);
  458. if (err)
  459. goto fs_err;
  460. infoln("successfully initialized erofs");
  461. return 0;
  462. fs_err:
  463. #ifdef CONFIG_EROFS_FS_ZIP
  464. z_erofs_exit_zip_subsystem();
  465. zip_err:
  466. #endif
  467. unregister_shrinker(&erofs_shrinker_info);
  468. shrinker_err:
  469. erofs_exit_inode_cache();
  470. icache_err:
  471. return err;
  472. }
  473. static void __exit erofs_module_exit(void)
  474. {
  475. unregister_filesystem(&erofs_fs_type);
  476. #ifdef CONFIG_EROFS_FS_ZIP
  477. z_erofs_exit_zip_subsystem();
  478. #endif
  479. unregister_shrinker(&erofs_shrinker_info);
  480. erofs_exit_inode_cache();
  481. infoln("successfully finalized erofs");
  482. }
  483. /* get filesystem statistics */
  484. static int erofs_statfs(struct dentry *dentry, struct kstatfs *buf)
  485. {
  486. struct super_block *sb = dentry->d_sb;
  487. struct erofs_sb_info *sbi = EROFS_SB(sb);
  488. u64 id = huge_encode_dev(sb->s_bdev->bd_dev);
  489. buf->f_type = sb->s_magic;
  490. buf->f_bsize = EROFS_BLKSIZ;
  491. buf->f_blocks = sbi->blocks;
  492. buf->f_bfree = buf->f_bavail = 0;
  493. buf->f_files = ULLONG_MAX;
  494. buf->f_ffree = ULLONG_MAX - sbi->inos;
  495. buf->f_namelen = EROFS_NAME_LEN;
  496. buf->f_fsid.val[0] = (u32)id;
  497. buf->f_fsid.val[1] = (u32)(id >> 32);
  498. return 0;
  499. }
  500. static int erofs_show_options(struct seq_file *seq, struct dentry *root)
  501. {
  502. struct erofs_sb_info *sbi __maybe_unused = EROFS_SB(root->d_sb);
  503. #ifdef CONFIG_EROFS_FS_XATTR
  504. if (test_opt(sbi, XATTR_USER))
  505. seq_puts(seq, ",user_xattr");
  506. else
  507. seq_puts(seq, ",nouser_xattr");
  508. #endif
  509. #ifdef CONFIG_EROFS_FS_POSIX_ACL
  510. if (test_opt(sbi, POSIX_ACL))
  511. seq_puts(seq, ",acl");
  512. else
  513. seq_puts(seq, ",noacl");
  514. #endif
  515. #ifdef CONFIG_EROFS_FAULT_INJECTION
  516. if (test_opt(sbi, FAULT_INJECTION))
  517. seq_printf(seq, ",fault_injection=%u",
  518. sbi->fault_info.inject_rate);
  519. #endif
  520. return 0;
  521. }
  522. static int erofs_remount(struct super_block *sb, int *flags, char *data)
  523. {
  524. DBG_BUGON(!sb_rdonly(sb));
  525. *flags |= SB_RDONLY;
  526. return 0;
  527. }
  528. const struct super_operations erofs_sops = {
  529. .put_super = erofs_put_super,
  530. .alloc_inode = alloc_inode,
  531. .destroy_inode = destroy_inode,
  532. .statfs = erofs_statfs,
  533. .show_options = erofs_show_options,
  534. .remount_fs = erofs_remount,
  535. };
  536. module_init(erofs_module_init);
  537. module_exit(erofs_module_exit);
  538. MODULE_DESCRIPTION("Enhanced ROM File System");
  539. MODULE_AUTHOR("Gao Xiang, Yu Chao, Miao Xie, CONSUMER BG, HUAWEI Inc.");
  540. MODULE_LICENSE("GPL");
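
Since erofs_mount() and parse_options() above accept user_xattr/acl (and optionally fault_injection) as the fs-specific data string, mounting an image from userspace boils down to a plain mount(2) call. A small example; the device and mount-point paths are placeholders, and the image itself would be created beforehand with mkfs.erofs:

#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
	/* read-only by design; fs-specific options go in the data string */
	if (mount("/dev/loop0", "/mnt/erofs", "erofs",
		  MS_RDONLY | MS_NOATIME, "user_xattr,acl") != 0) {
		perror("mount");
		return 1;
	}
	puts("mounted");
	return 0;
}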

unzip_lz4.c

  1. // SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause
  2. /*
  3. * linux/drivers/staging/erofs/unzip_lz4.c
  4. *
  5. * Copyright (C) 2018 HUAWEI, Inc.
  6. * http://www.huawei.com/
  7. * Created by Gao Xiang <gaoxiang25@huawei.com>
  8. *
  9. * Original code taken from 'linux/lib/lz4/lz4_decompress.c'
  10. */
  11. /*
  12. * LZ4 - Fast LZ compression algorithm
  13. * Copyright (C) 2011 - 2016, Yann Collet.
  14. * BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
  15. * Redistribution and use in source and binary forms, with or without
  16. * modification, are permitted provided that the following conditions are
  17. * met:
  18. * * Redistributions of source code must retain the above copyright
  19. * notice, this list of conditions and the following disclaimer.
  20. * * Redistributions in binary form must reproduce the above
  21. * copyright notice, this list of conditions and the following disclaimer
  22. * in the documentation and/or other materials provided with the
  23. * distribution.
  24. * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
  25. * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
  26. * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
  27. * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
  28. * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
  29. * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
  30. * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
  31. * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
  32. * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
  33. * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
  34. * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  35. * You can contact the author at :
  36. * - LZ4 homepage : http://www.lz4.org
  37. * - LZ4 source repository : https://github.com/lz4/lz4
  38. *
  39. * Changed for kernel usage by:
  40. * Sven Schmidt <4sschmid@informatik.uni-hamburg.de>
  41. */
  42. #include "internal.h"
  43. #include <asm/unaligned.h>
  44. #include "lz4defs.h"
  45. /*
  46. * no public API meets our requirement yet.
  47. * see: <required buffer size for LZ4_decompress_safe_partial>
  48. * https://groups.google.com/forum/#!topic/lz4c/_3kkz5N6n00
  49. */
  50. static FORCE_INLINE int customized_lz4_decompress_safe_partial(
  51. const void * const source,
  52. void * const dest,
  53. int inputSize,
  54. int outputSize)
  55. {
  56. /* Local Variables */
  57. const BYTE *ip = (const BYTE *) source;
  58. const BYTE * const iend = ip + inputSize;
  59. BYTE *op = (BYTE *) dest;
  60. BYTE * const oend = op + outputSize;
  61. BYTE *cpy;
  62. static const unsigned int dec32table[] = { 0, 1, 2, 1, 4, 4, 4, 4 };
  63. static const int dec64table[] = { 0, 0, 0, -1, 0, 1, 2, 3 };
  64. /* Empty output buffer */
  65. if (unlikely(outputSize == 0))
  66. return ((inputSize == 1) && (*ip == 0)) ? 0 : -1;
  67. /* Main Loop : decode sequences */
  68. while (1) {
  69. size_t length;
  70. const BYTE *match;
  71. size_t offset;
  72. /* get literal length */
  73. unsigned int const token = *ip++;
  74. length = token>>ML_BITS;
  75. if (length == RUN_MASK) {
  76. unsigned int s;
  77. do {
  78. s = *ip++;
  79. length += s;
  80. } while ((ip < iend - RUN_MASK) & (s == 255));
  81. if (unlikely((size_t)(op + length) < (size_t)(op))) {
  82. /* overflow detection */
  83. goto _output_error;
  84. }
  85. if (unlikely((size_t)(ip + length) < (size_t)(ip))) {
  86. /* overflow detection */
  87. goto _output_error;
  88. }
  89. }
  90. /* copy literals */
  91. cpy = op + length;
  92. if ((cpy > oend - WILDCOPYLENGTH) ||
  93. (ip + length > iend - (2 + 1 + LASTLITERALS))) {
  94. if (cpy > oend) {
  95. memcpy(op, ip, length = oend - op);
  96. op += length;
  97. break;
  98. }
  99. if (unlikely(ip + length > iend)) {
  100. /*
  101. * Error :
  102. * read attempt beyond
  103. * end of input buffer
  104. */
  105. goto _output_error;
  106. }
  107. memcpy(op, ip, length);
  108. ip += length;
  109. op += length;
  110. if (ip > iend - 2)
  111. break;
  112. /* Necessarily EOF, due to parsing restrictions */
  113. /* break; */
  114. } else {
  115. LZ4_wildCopy(op, ip, cpy);
  116. ip += length;
  117. op = cpy;
  118. }
  119. /* get offset */
  120. offset = LZ4_readLE16(ip);
  121. ip += 2;
  122. match = op - offset;
  123. if (unlikely(match < (const BYTE *)dest)) {
  124. /* Error : offset outside buffers */
  125. goto _output_error;
  126. }
  127. /* get matchlength */
  128. length = token & ML_MASK;
  129. if (length == ML_MASK) {
  130. unsigned int s;
  131. do {
  132. s = *ip++;
  133. if (ip > iend - LASTLITERALS)
  134. goto _output_error;
  135. length += s;
  136. } while (s == 255);
  137. if (unlikely((size_t)(op + length) < (size_t)op)) {
  138. /* overflow detection */
  139. goto _output_error;
  140. }
  141. }
  142. length += MINMATCH;
  143. /* copy match within block */
  144. cpy = op + length;
  145. if (unlikely(cpy >= oend - WILDCOPYLENGTH)) {
  146. if (cpy >= oend) {
  147. while (op < oend)
  148. *op++ = *match++;
  149. break;
  150. }
  151. goto __match;
  152. }
  153. /* costs ~1%; silence an msan warning when offset == 0 */
  154. LZ4_write32(op, (U32)offset);
  155. if (unlikely(offset < 8)) {
  156. const int dec64 = dec64table[offset];
  157. op[0] = match[0];
  158. op[1] = match[1];
  159. op[2] = match[2];
  160. op[3] = match[3];
  161. match += dec32table[offset];
  162. memcpy(op + 4, match, 4);
  163. match -= dec64;
  164. } else {
  165. LZ4_copy8(op, match);
  166. match += 8;
  167. }
  168. op += 8;
  169. if (unlikely(cpy > oend - 12)) {
  170. BYTE * const oCopyLimit = oend - (WILDCOPYLENGTH - 1);
  171. if (op < oCopyLimit) {
  172. LZ4_wildCopy(op, match, oCopyLimit);
  173. match += oCopyLimit - op;
  174. op = oCopyLimit;
  175. }
  176. __match:
  177. while (op < cpy)
  178. *op++ = *match++;
  179. } else {
  180. LZ4_copy8(op, match);
  181. if (length > 16)
  182. LZ4_wildCopy(op + 8, match + 8, cpy);
  183. }
  184. op = cpy; /* correction */
  185. }
  186. DBG_BUGON((void *)ip - source > inputSize);
  187. DBG_BUGON((void *)op - dest > outputSize);
  188. /* Nb of output bytes decoded */
  189. return (int) ((void *)op - dest);
  190. /* Overflow error detected */
  191. _output_error:
  192. return -ERANGE;
  193. }
  194. int z_erofs_unzip_lz4(void *in, void *out, size_t inlen, size_t outlen)
  195. {
  196. int ret = customized_lz4_decompress_safe_partial(in,
  197. out, inlen, outlen);
  198. if (ret >= 0)
  199. return ret;
  200. /*
  201. * LZ4_decompress_safe will return an error code
  202. * (< 0) if decompression failed
  203. */
  204. errln("%s, failed to decompress, in[%p, %zu] outlen[%p, %zu]",
  205. __func__, in, inlen, out, outlen);
  206. WARN_ON(1);
  207. print_hex_dump(KERN_DEBUG, "raw data [in]: ", DUMP_PREFIX_OFFSET,
  208. 16, 1, in, inlen, true);
  209. print_hex_dump(KERN_DEBUG, "raw data [out]: ", DUMP_PREFIX_OFFSET,
  210. 16, 1, out, outlen, true);
  211. return -EIO;
  212. }
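
customized_lz4_decompress_safe_partial() above is an optimized (wild-copy, 8 bytes at a time) decoder tuned for the EROFS situation: the output length is known exactly (the decompressed part of the cluster), while only an upper bound on the input length (the fixed-size compressed cluster) is available, which is why the stock kernel LZ4 API could not be used directly at the time. For readers who want the block format without the optimizations, here is a byte-at-a-time sketch of the same LZ4 block layout (token byte, literal run, 16-bit little-endian match offset, match run with a minimum match of 4). The function name and bounds checks are illustrative only and not part of the driver:

/*
 * Minimal, unoptimized LZ4 block decoder for illustration.
 * Returns the number of bytes produced, or -1 on malformed
 * or oversized input.
 */
#include <stddef.h>
#include <stdint.h>

int lz4_block_decode(const uint8_t *in, size_t inlen,
		     uint8_t *out, size_t outcap)
{
	size_t ip = 0, op = 0;

	while (ip < inlen) {
		uint8_t token = in[ip++];
		size_t len = token >> 4;	/* literal length (high nibble) */

		if (len == 15) {		/* extended by 255-valued bytes */
			uint8_t s;
			do {
				if (ip >= inlen)
					return -1;
				s = in[ip++];
				len += s;
			} while (s == 255);
		}
		if (ip + len > inlen || op + len > outcap)
			return -1;
		for (size_t i = 0; i < len; i++)	/* copy literals */
			out[op++] = in[ip++];

		if (ip == inlen)	/* last sequence carries literals only */
			break;

		if (ip + 2 > inlen)
			return -1;
		size_t offset = in[ip] | ((size_t)in[ip + 1] << 8);
		ip += 2;
		if (offset == 0 || offset > op)
			return -1;	/* match points outside the output */

		len = token & 15;	/* match length (low nibble) */
		if (len == 15) {
			uint8_t s;
			do {
				if (ip >= inlen)
					return -1;
				s = in[ip++];
				len += s;
			} while (s == 255);
		}
		len += 4;		/* MINMATCH */
		if (op + len > outcap)
			return -1;
		for (size_t i = 0; i < len; i++, op++)	/* overlap-safe copy */
			out[op] = out[op - offset];
	}
	return (int)op;
}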

unzip_vle.c

  1. // SPDX-License-Identifier: GPL-2.0
  2. /*
  3. * linux/drivers/staging/erofs/unzip_vle.c
  4. *
  5. * Copyright (C) 2018 HUAWEI, Inc.
  6. * http://www.huawei.com/
  7. * Created by Gao Xiang <gaoxiang25@huawei.com>
  8. *
  9. * This file is subject to the terms and conditions of the GNU General Public
  10. * License. See the file COPYING in the main directory of the Linux
  11. * distribution for more details.
  12. */
  13. #include "unzip_vle.h"
  14. #include <linux/prefetch.h>
  15. static struct workqueue_struct *z_erofs_workqueue __read_mostly;
  16. static struct kmem_cache *z_erofs_workgroup_cachep __read_mostly;
  17. void z_erofs_exit_zip_subsystem(void)
  18. {
  19. destroy_workqueue(z_erofs_workqueue);
  20. kmem_cache_destroy(z_erofs_workgroup_cachep);
  21. }
  22. static inline int init_unzip_workqueue(void)
  23. {
  24. const unsigned onlinecpus = num_possible_cpus();
  25. /*
  26. * we don't need too many threads; limiting the number of
  27. * threads could improve scheduling performance.
  28. */
  29. z_erofs_workqueue = alloc_workqueue("erofs_unzipd",
  30. WQ_UNBOUND | WQ_HIGHPRI | WQ_CPU_INTENSIVE,
  31. onlinecpus + onlinecpus / 4);
  32. return z_erofs_workqueue != NULL ? 0 : -ENOMEM;
  33. }
  34. int z_erofs_init_zip_subsystem(void)
  35. {
  36. z_erofs_workgroup_cachep =
  37. kmem_cache_create("erofs_compress",
  38. Z_EROFS_WORKGROUP_SIZE, 0,
  39. SLAB_RECLAIM_ACCOUNT, NULL);
  40. if (z_erofs_workgroup_cachep != NULL) {
  41. if (!init_unzip_workqueue())
  42. return 0;
  43. kmem_cache_destroy(z_erofs_workgroup_cachep);
  44. }
  45. return -ENOMEM;
  46. }
  47. enum z_erofs_vle_work_role {
  48. Z_EROFS_VLE_WORK_SECONDARY,
  49. Z_EROFS_VLE_WORK_PRIMARY,
  50. /*
  51. * The current work was the tail of an existing chain, and the previous
  52. * processed chained works are all decided to be hooked up to it.
  53. * A new chain should be created for the remaining unprocessed works,
  54. * therefore different from Z_EROFS_VLE_WORK_PRIMARY_FOLLOWED,
  55. * the next work cannot reuse the whole page in the following scenario:
  56. * ________________________________________________________________
  57. * | tail (partial) page | head (partial) page |
  58. * | (belongs to the next work) | (belongs to the current work) |
  59. * |_______PRIMARY_FOLLOWED_______|________PRIMARY_HOOKED___________|
  60. */
  61. Z_EROFS_VLE_WORK_PRIMARY_HOOKED,
  62. /*
  63. * The current work has been linked with the processed chained works,
  64. * and could be also linked with the potential remaining works, which
  65. * means if the processing page is the tail partial page of the work,
  66. * the current work can safely use the whole page (since the next work
  67. * is under control) for in-place decompression, as illustrated below:
  68. * ________________________________________________________________
  69. * | tail (partial) page | head (partial) page |
  70. * | (of the current work) | (of the previous work) |
  71. * | PRIMARY_FOLLOWED or | |
  72. * |_____PRIMARY_HOOKED____|____________PRIMARY_FOLLOWED____________|
  73. *
  74. * [ (*) the above page can be used for the current work itself. ]
  75. */
  76. Z_EROFS_VLE_WORK_PRIMARY_FOLLOWED,
  77. Z_EROFS_VLE_WORK_MAX
  78. };
  79. struct z_erofs_vle_work_builder {
  80. enum z_erofs_vle_work_role role;
  81. /*
  82. * 'hosted = false' means that the current workgroup doesn't belong to
  83. * the owned chained workgroups. In other words, it is none of our
  84. * business to submit this workgroup.
  85. */
  86. bool hosted;
  87. struct z_erofs_vle_workgroup *grp;
  88. struct z_erofs_vle_work *work;
  89. struct z_erofs_pagevec_ctor vector;
  90. /* pages used for reading the compressed data */
  91. struct page **compressed_pages;
  92. unsigned compressed_deficit;
  93. };
  94. #define VLE_WORK_BUILDER_INIT() \
  95. { .work = NULL, .role = Z_EROFS_VLE_WORK_PRIMARY_FOLLOWED }
  96. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  97. static bool grab_managed_cache_pages(struct address_space *mapping,
  98. erofs_blk_t start,
  99. struct page **compressed_pages,
  100. int clusterblks,
  101. bool reserve_allocation)
  102. {
  103. bool noio = true;
  104. unsigned int i;
  105. /* TODO: optimize by introducing find_get_pages_range */
  106. for (i = 0; i < clusterblks; ++i) {
  107. struct page *page, *found;
  108. if (READ_ONCE(compressed_pages[i]) != NULL)
  109. continue;
  110. page = found = find_get_page(mapping, start + i);
  111. if (found == NULL) {
  112. noio = false;
  113. if (!reserve_allocation)
  114. continue;
  115. page = EROFS_UNALLOCATED_CACHED_PAGE;
  116. }
  117. if (NULL == cmpxchg(compressed_pages + i, NULL, page))
  118. continue;
  119. if (found != NULL)
  120. put_page(found);
  121. }
  122. return noio;
  123. }
  124. /* called by erofs_shrinker to get rid of all compressed_pages */
  125. int erofs_try_to_free_all_cached_pages(struct erofs_sb_info *sbi,
  126. struct erofs_workgroup *egrp)
  127. {
  128. struct z_erofs_vle_workgroup *const grp =
  129. container_of(egrp, struct z_erofs_vle_workgroup, obj);
  130. struct address_space *const mapping = sbi->managed_cache->i_mapping;
  131. const int clusterpages = erofs_clusterpages(sbi);
  132. int i;
  133. /*
  134. * refcount of workgroup is now frozen to 1,
  135. * therefore no need to worry about available decompression users.
  136. */
  137. for (i = 0; i < clusterpages; ++i) {
  138. struct page *page = grp->compressed_pages[i];
  139. if (page == NULL || page->mapping != mapping)
  140. continue;
  141. /* block other users from reclaiming or migrating the page */
  142. if (!trylock_page(page))
  143. return -EBUSY;
  144. /* barrier is implied in the following 'unlock_page' */
  145. WRITE_ONCE(grp->compressed_pages[i], NULL);
  146. set_page_private(page, 0);
  147. ClearPagePrivate(page);
  148. unlock_page(page);
  149. put_page(page);
  150. }
  151. return 0;
  152. }
  153. int erofs_try_to_free_cached_page(struct address_space *mapping,
  154. struct page *page)
  155. {
  156. struct erofs_sb_info *const sbi = EROFS_SB(mapping->host->i_sb);
  157. const unsigned int clusterpages = erofs_clusterpages(sbi);
  158. struct z_erofs_vle_workgroup *grp;
  159. int ret = 0; /* 0 - busy */
  160. /* prevent the workgroup from being freed */
  161. rcu_read_lock();
  162. grp = (void *)page_private(page);
  163. if (erofs_workgroup_try_to_freeze(&grp->obj, 1)) {
  164. unsigned int i;
  165. for (i = 0; i < clusterpages; ++i) {
  166. if (grp->compressed_pages[i] == page) {
  167. WRITE_ONCE(grp->compressed_pages[i], NULL);
  168. ret = 1;
  169. break;
  170. }
  171. }
  172. erofs_workgroup_unfreeze(&grp->obj, 1);
  173. }
  174. rcu_read_unlock();
  175. if (ret) {
  176. ClearPagePrivate(page);
  177. put_page(page);
  178. }
  179. return ret;
  180. }
  181. #endif
  182. /* page_type must be Z_EROFS_PAGE_TYPE_EXCLUSIVE */
  183. static inline bool try_to_reuse_as_compressed_page(
  184. struct z_erofs_vle_work_builder *b,
  185. struct page *page)
  186. {
  187. while (b->compressed_deficit) {
  188. --b->compressed_deficit;
  189. if (NULL == cmpxchg(b->compressed_pages++, NULL, page))
  190. return true;
  191. }
  192. return false;
  193. }
  194. /* callers must be with work->lock held */
  195. static int z_erofs_vle_work_add_page(struct z_erofs_vle_work_builder *builder,
  196. struct page *page,
  197. enum z_erofs_page_type type,
  198. bool pvec_safereuse)
  199. {
  200. int ret;
  201. /* give priority for the compressed data storage */
  202. if (builder->role >= Z_EROFS_VLE_WORK_PRIMARY &&
  203. type == Z_EROFS_PAGE_TYPE_EXCLUSIVE &&
  204. try_to_reuse_as_compressed_page(builder, page))
  205. return 0;
  206. ret = z_erofs_pagevec_ctor_enqueue(&builder->vector, page, type,
  207. pvec_safereuse);
  208. builder->work->vcnt += (unsigned)ret;
  209. return ret ? 0 : -EAGAIN;
  210. }
  211. static enum z_erofs_vle_work_role
  212. try_to_claim_workgroup(struct z_erofs_vle_workgroup *grp,
  213. z_erofs_vle_owned_workgrp_t *owned_head,
  214. bool *hosted)
  215. {
  216. DBG_BUGON(*hosted == true);
  217. /* let's claim the following types of workgroup */
  218. retry:
  219. if (grp->next == Z_EROFS_VLE_WORKGRP_NIL) {
  220. /* type 1, nil workgroup */
  221. if (Z_EROFS_VLE_WORKGRP_NIL != cmpxchg(&grp->next,
  222. Z_EROFS_VLE_WORKGRP_NIL, *owned_head))
  223. goto retry;
  224. *owned_head = grp;
  225. *hosted = true;
  226. /* lucky, I am the followee :) */
  227. return Z_EROFS_VLE_WORK_PRIMARY_FOLLOWED;
  228. } else if (grp->next == Z_EROFS_VLE_WORKGRP_TAIL) {
  229. /*
  230. * type 2, link to the end of an existing open chain,
  231. * be careful that its submission itself is governed
  232. * by the original owned chain.
  233. */
  234. if (Z_EROFS_VLE_WORKGRP_TAIL != cmpxchg(&grp->next,
  235. Z_EROFS_VLE_WORKGRP_TAIL, *owned_head))
  236. goto retry;
  237. *owned_head = Z_EROFS_VLE_WORKGRP_TAIL;
  238. return Z_EROFS_VLE_WORK_PRIMARY_HOOKED;
  239. }
  240. return Z_EROFS_VLE_WORK_PRIMARY; /* :( better luck next time */
  241. }
  242. static struct z_erofs_vle_work *
  243. z_erofs_vle_work_lookup(struct super_block *sb,
  244. pgoff_t idx, unsigned pageofs,
  245. struct z_erofs_vle_workgroup **grp_ret,
  246. enum z_erofs_vle_work_role *role,
  247. z_erofs_vle_owned_workgrp_t *owned_head,
  248. bool *hosted)
  249. {
  250. bool tag, primary;
  251. struct erofs_workgroup *egrp;
  252. struct z_erofs_vle_workgroup *grp;
  253. struct z_erofs_vle_work *work;
  254. egrp = erofs_find_workgroup(sb, idx, &tag);
  255. if (egrp == NULL) {
  256. *grp_ret = NULL;
  257. return NULL;
  258. }
  259. *grp_ret = grp = container_of(egrp,
  260. struct z_erofs_vle_workgroup, obj);
  261. work = z_erofs_vle_grab_work(grp, pageofs);
  262. /* if multiref is disabled, `primary' is always true */
  263. primary = true;
  264. if (work->pageofs != pageofs) {
  265. DBG_BUGON(1);
  266. erofs_workgroup_put(egrp);
  267. return ERR_PTR(-EIO);
  268. }
  269. /*
  270. * lock must be taken first to avoid grp->next == NIL between
  271. * claiming workgroup and adding pages:
  272. * grp->next != NIL
  273. * grp->next = NIL
  274. * mutex_unlock_all
  275. * mutex_lock(&work->lock)
  276. * add all pages to pagevec
  277. *
  278. * [correct locking case 1]:
  279. * mutex_lock(grp->work[a])
  280. * ...
  281. * mutex_lock(grp->work[b]) mutex_lock(grp->work[c])
  282. * ... *role = SECONDARY
  283. * add all pages to pagevec
  284. * ...
  285. * mutex_unlock(grp->work[c])
  286. * mutex_lock(grp->work[c])
  287. * ...
  288. * grp->next = NIL
  289. * mutex_unlock_all
  290. *
  291. * [correct locking case 2]:
  292. * mutex_lock(grp->work[b])
  293. * ...
  294. * mutex_lock(grp->work[a])
  295. * ...
  296. * mutex_lock(grp->work[c])
  297. * ...
  298. * grp->next = NIL
  299. * mutex_unlock_all
  300. * mutex_lock(grp->work[a])
  301. * *role = PRIMARY_OWNER
  302. * add all pages to pagevec
  303. * ...
  304. */
  305. mutex_lock(&work->lock);
  306. *hosted = false;
  307. if (!primary)
  308. *role = Z_EROFS_VLE_WORK_SECONDARY;
  309. else /* claim the workgroup if possible */
  310. *role = try_to_claim_workgroup(grp, owned_head, hosted);
  311. return work;
  312. }
  313. static struct z_erofs_vle_work *
  314. z_erofs_vle_work_register(struct super_block *sb,
  315. struct z_erofs_vle_workgroup **grp_ret,
  316. struct erofs_map_blocks *map,
  317. pgoff_t index, unsigned pageofs,
  318. enum z_erofs_vle_work_role *role,
  319. z_erofs_vle_owned_workgrp_t *owned_head,
  320. bool *hosted)
  321. {
  322. bool newgrp = false;
  323. struct z_erofs_vle_workgroup *grp = *grp_ret;
  324. struct z_erofs_vle_work *work;
  325. /* if multiref is disabled, grp should never be nullptr */
  326. if (unlikely(grp)) {
  327. DBG_BUGON(1);
  328. return ERR_PTR(-EINVAL);
  329. }
  330. /* no available workgroup, let's allocate one */
  331. grp = kmem_cache_zalloc(z_erofs_workgroup_cachep, GFP_NOFS);
  332. if (unlikely(grp == NULL))
  333. return ERR_PTR(-ENOMEM);
  334. grp->obj.index = index;
  335. grp->llen = map->m_llen;
  336. z_erofs_vle_set_workgrp_fmt(grp,
  337. (map->m_flags & EROFS_MAP_ZIPPED) ?
  338. Z_EROFS_VLE_WORKGRP_FMT_LZ4 :
  339. Z_EROFS_VLE_WORKGRP_FMT_PLAIN);
  340. atomic_set(&grp->obj.refcount, 1);
  341. /* new workgrps have been claimed as type 1 */
  342. WRITE_ONCE(grp->next, *owned_head);
  343. /* primary and followed work for all new workgrps */
  344. *role = Z_EROFS_VLE_WORK_PRIMARY_FOLLOWED;
  345. /* it should be submitted by ourselves */
  346. *hosted = true;
  347. newgrp = true;
  348. work = z_erofs_vle_grab_primary_work(grp);
  349. work->pageofs = pageofs;
  350. mutex_init(&work->lock);
  351. if (newgrp) {
  352. int err = erofs_register_workgroup(sb, &grp->obj, 0);
  353. if (err) {
  354. kmem_cache_free(z_erofs_workgroup_cachep, grp);
  355. return ERR_PTR(-EAGAIN);
  356. }
  357. }
  358. *owned_head = *grp_ret = grp;
  359. mutex_lock(&work->lock);
  360. return work;
  361. }
  362. static inline void __update_workgrp_llen(struct z_erofs_vle_workgroup *grp,
  363. unsigned int llen)
  364. {
  365. while (1) {
  366. unsigned int orig_llen = grp->llen;
  367. if (orig_llen >= llen || orig_llen ==
  368. cmpxchg(&grp->llen, orig_llen, llen))
  369. break;
  370. }
  371. }
  372. #define builder_is_hooked(builder) \
  373. ((builder)->role >= Z_EROFS_VLE_WORK_PRIMARY_HOOKED)
  374. #define builder_is_followed(builder) \
  375. ((builder)->role >= Z_EROFS_VLE_WORK_PRIMARY_FOLLOWED)
  376. static int z_erofs_vle_work_iter_begin(struct z_erofs_vle_work_builder *builder,
  377. struct super_block *sb,
  378. struct erofs_map_blocks *map,
  379. z_erofs_vle_owned_workgrp_t *owned_head)
  380. {
  381. const unsigned clusterpages = erofs_clusterpages(EROFS_SB(sb));
  382. const erofs_blk_t index = erofs_blknr(map->m_pa);
  383. const unsigned pageofs = map->m_la & ~PAGE_MASK;
  384. struct z_erofs_vle_workgroup *grp;
  385. struct z_erofs_vle_work *work;
  386. DBG_BUGON(builder->work != NULL);
  387. /* must be Z_EROFS_WORK_TAIL or the next chained work */
  388. DBG_BUGON(*owned_head == Z_EROFS_VLE_WORKGRP_NIL);
  389. DBG_BUGON(*owned_head == Z_EROFS_VLE_WORKGRP_TAIL_CLOSED);
  390. DBG_BUGON(erofs_blkoff(map->m_pa));
  391. repeat:
  392. work = z_erofs_vle_work_lookup(sb, index,
  393. pageofs, &grp, &builder->role, owned_head, &builder->hosted);
  394. if (work != NULL) {
  395. __update_workgrp_llen(grp, map->m_llen);
  396. goto got_it;
  397. }
  398. work = z_erofs_vle_work_register(sb, &grp, map, index, pageofs,
  399. &builder->role, owned_head, &builder->hosted);
  400. if (unlikely(work == ERR_PTR(-EAGAIN)))
  401. goto repeat;
  402. if (unlikely(IS_ERR(work)))
  403. return PTR_ERR(work);
  404. got_it:
  405. z_erofs_pagevec_ctor_init(&builder->vector,
  406. Z_EROFS_VLE_INLINE_PAGEVECS, work->pagevec, work->vcnt);
  407. if (builder->role >= Z_EROFS_VLE_WORK_PRIMARY) {
  408. /* enable possibly in-place decompression */
  409. builder->compressed_pages = grp->compressed_pages;
  410. builder->compressed_deficit = clusterpages;
  411. } else {
  412. builder->compressed_pages = NULL;
  413. builder->compressed_deficit = 0;
  414. }
  415. builder->grp = grp;
  416. builder->work = work;
  417. return 0;
  418. }
  419. /*
  420. * keep in mind that referenced workgroups will only be freed
  421. * after an RCU grace period, so rcu_read_lock() can
  422. * prevent a workgroup from being freed.
  423. */
  424. static void z_erofs_rcu_callback(struct rcu_head *head)
  425. {
  426. struct z_erofs_vle_work *work = container_of(head,
  427. struct z_erofs_vle_work, rcu);
  428. struct z_erofs_vle_workgroup *grp =
  429. z_erofs_vle_work_workgroup(work, true);
  430. kmem_cache_free(z_erofs_workgroup_cachep, grp);
  431. }
  432. void erofs_workgroup_free_rcu(struct erofs_workgroup *grp)
  433. {
  434. struct z_erofs_vle_workgroup *const vgrp = container_of(grp,
  435. struct z_erofs_vle_workgroup, obj);
  436. struct z_erofs_vle_work *const work = &vgrp->work;
  437. call_rcu(&work->rcu, z_erofs_rcu_callback);
  438. }
  439. static void __z_erofs_vle_work_release(struct z_erofs_vle_workgroup *grp,
  440. struct z_erofs_vle_work *work __maybe_unused)
  441. {
  442. erofs_workgroup_put(&grp->obj);
  443. }
  444. void z_erofs_vle_work_release(struct z_erofs_vle_work *work)
  445. {
  446. struct z_erofs_vle_workgroup *grp =
  447. z_erofs_vle_work_workgroup(work, true);
  448. __z_erofs_vle_work_release(grp, work);
  449. }
  450. static inline bool
  451. z_erofs_vle_work_iter_end(struct z_erofs_vle_work_builder *builder)
  452. {
  453. struct z_erofs_vle_work *work = builder->work;
  454. if (work == NULL)
  455. return false;
  456. z_erofs_pagevec_ctor_exit(&builder->vector, false);
  457. mutex_unlock(&work->lock);
  458. /*
  459. * if all pending pages are added, don't hold work reference
  460. * any longer if the current work isn't hosted by ourselves.
  461. */
  462. if (!builder->hosted)
  463. __z_erofs_vle_work_release(builder->grp, work);
  464. builder->work = NULL;
  465. builder->grp = NULL;
  466. return true;
  467. }
  468. static inline struct page *__stagingpage_alloc(struct list_head *pagepool,
  469. gfp_t gfp)
  470. {
  471. struct page *page = erofs_allocpage(pagepool, gfp);
  472. if (unlikely(page == NULL))
  473. return NULL;
  474. page->mapping = Z_EROFS_MAPPING_STAGING;
  475. return page;
  476. }
  477. struct z_erofs_vle_frontend {
  478. struct inode *const inode;
  479. struct z_erofs_vle_work_builder builder;
  480. struct erofs_map_blocks_iter m_iter;
  481. z_erofs_vle_owned_workgrp_t owned_head;
  482. bool initial;
  483. #if (EROFS_FS_ZIP_CACHE_LVL >= 2)
  484. erofs_off_t cachedzone_la;
  485. #endif
  486. };
  487. #define VLE_FRONTEND_INIT(__i) { \
  488. .inode = __i, \
  489. .m_iter = { \
  490. { .m_llen = 0, .m_plen = 0 }, \
  491. .mpage = NULL \
  492. }, \
  493. .builder = VLE_WORK_BUILDER_INIT(), \
  494. .owned_head = Z_EROFS_VLE_WORKGRP_TAIL, \
  495. .initial = true, }
  496. static int z_erofs_do_read_page(struct z_erofs_vle_frontend *fe,
  497. struct page *page,
  498. struct list_head *page_pool)
  499. {
  500. struct super_block *const sb = fe->inode->i_sb;
  501. struct erofs_sb_info *const sbi __maybe_unused = EROFS_SB(sb);
  502. struct erofs_map_blocks_iter *const m = &fe->m_iter;
  503. struct erofs_map_blocks *const map = &m->map;
  504. struct z_erofs_vle_work_builder *const builder = &fe->builder;
  505. const loff_t offset = page_offset(page);
  506. bool tight = builder_is_hooked(builder);
  507. struct z_erofs_vle_work *work = builder->work;
  508. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  509. struct address_space *const mngda = sbi->managed_cache->i_mapping;
  510. struct z_erofs_vle_workgroup *grp;
  511. bool noio_outoforder;
  512. #endif
  513. enum z_erofs_page_type page_type;
  514. unsigned cur, end, spiltted, index;
  515. int err = 0;
  516. /* register locked file pages as online pages in pack */
  517. z_erofs_onlinepage_init(page);
  518. spiltted = 0;
  519. end = PAGE_SIZE;
  520. repeat:
  521. cur = end - 1;
  522. /* lucky, within the range of the current map_blocks */
  523. if (offset + cur >= map->m_la &&
  524. offset + cur < map->m_la + map->m_llen) {
  525. /* didn't get a valid unzip work previously (very rare) */
  526. if (!builder->work)
  527. goto restart_now;
  528. goto hitted;
  529. }
  530. /* go ahead the next map_blocks */
  531. debugln("%s: [out-of-range] pos %llu", __func__, offset + cur);
  532. if (z_erofs_vle_work_iter_end(builder))
  533. fe->initial = false;
  534. map->m_la = offset + cur;
  535. map->m_llen = 0;
  536. err = erofs_map_blocks_iter(fe->inode, map, &m->mpage, 0);
  537. if (unlikely(err))
  538. goto err_out;
  539. restart_now:
  540. if (unlikely(!(map->m_flags & EROFS_MAP_MAPPED)))
  541. goto hitted;
  542. DBG_BUGON(map->m_plen != 1 << sbi->clusterbits);
  543. DBG_BUGON(erofs_blkoff(map->m_pa));
  544. err = z_erofs_vle_work_iter_begin(builder, sb, map, &fe->owned_head);
  545. if (unlikely(err))
  546. goto err_out;
  547. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  548. grp = fe->builder.grp;
  549. /* let's do out-of-order decompression for noio */
  550. noio_outoforder = grab_managed_cache_pages(mngda,
  551. erofs_blknr(map->m_pa),
  552. grp->compressed_pages, erofs_blknr(map->m_plen),
  553. /* compressed page caching selection strategy */
  554. fe->initial | (EROFS_FS_ZIP_CACHE_LVL >= 2 ?
  555. map->m_la < fe->cachedzone_la : 0));
  556. if (noio_outoforder && builder_is_followed(builder))
  557. builder->role = Z_EROFS_VLE_WORK_PRIMARY;
  558. #endif
  559. tight &= builder_is_hooked(builder);
  560. work = builder->work;
  561. hitted:
  562. cur = end - min_t(unsigned, offset + end - map->m_la, end);
  563. if (unlikely(!(map->m_flags & EROFS_MAP_MAPPED))) {
  564. zero_user_segment(page, cur, end);
  565. goto next_part;
  566. }
  567. /* let's derive page type */
  568. page_type = cur ? Z_EROFS_VLE_PAGE_TYPE_HEAD :
  569. (!spiltted ? Z_EROFS_PAGE_TYPE_EXCLUSIVE :
  570. (tight ? Z_EROFS_PAGE_TYPE_EXCLUSIVE :
  571. Z_EROFS_VLE_PAGE_TYPE_TAIL_SHARED));
  572. if (cur)
  573. tight &= builder_is_followed(builder);
  574. retry:
  575. err = z_erofs_vle_work_add_page(builder, page, page_type,
  576. builder_is_followed(builder));
  577. /* should allocate an additional staging page for pagevec */
  578. if (err == -EAGAIN) {
  579. struct page *const newpage =
  580. __stagingpage_alloc(page_pool, GFP_NOFS);
  581. err = z_erofs_vle_work_add_page(builder,
  582. newpage, Z_EROFS_PAGE_TYPE_EXCLUSIVE, true);
  583. if (likely(!err))
  584. goto retry;
  585. }
  586. if (unlikely(err))
  587. goto err_out;
  588. index = page->index - map->m_la / PAGE_SIZE;
  589. /* FIXME! avoid the last redundant fixup & endio */
  590. z_erofs_onlinepage_fixup(page, index, true);
  591. /* bump up the number of spiltted parts of a page */
  592. ++spiltted;
  593. /* also update nr_pages */
  594. work->nr_pages = max_t(pgoff_t, work->nr_pages, index + 1);
  595. next_part:
  596. /* can be used for verification */
  597. map->m_llen = offset + cur - map->m_la;
  598. end = cur;
  599. if (end > 0)
  600. goto repeat;
  601. out:
  602. /* FIXME! avoid the last redundant fixup & endio */
  603. z_erofs_onlinepage_endio(page);
  604. debugln("%s, finish page: %pK spiltted: %u map->m_llen %llu",
  605. __func__, page, spiltted, map->m_llen);
  606. return err;
  607. /* if some error occurred while processing this page */
  608. err_out:
  609. SetPageError(page);
  610. goto out;
  611. }
  612. static void z_erofs_vle_unzip_kickoff(void *ptr, int bios)
  613. {
  614. tagptr1_t t = tagptr_init(tagptr1_t, ptr);
  615. struct z_erofs_vle_unzip_io *io = tagptr_unfold_ptr(t);
  616. bool background = tagptr_unfold_tags(t);
  617. if (!background) {
  618. unsigned long flags;
  619. spin_lock_irqsave(&io->u.wait.lock, flags);
  620. if (!atomic_add_return(bios, &io->pending_bios))
  621. wake_up_locked(&io->u.wait);
  622. spin_unlock_irqrestore(&io->u.wait.lock, flags);
  623. return;
  624. }
  625. if (!atomic_add_return(bios, &io->pending_bios))
  626. queue_work(z_erofs_workqueue, &io->u.work);
  627. }
  628. static inline void z_erofs_vle_read_endio(struct bio *bio)
  629. {
  630. const blk_status_t err = bio->bi_status;
  631. unsigned i;
  632. struct bio_vec *bvec;
  633. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  634. struct address_space *mngda = NULL;
  635. #endif
  636. bio_for_each_segment_all(bvec, bio, i) {
  637. struct page *page = bvec->bv_page;
  638. bool cachemngd = false;
  639. DBG_BUGON(PageUptodate(page));
  640. DBG_BUGON(!page->mapping);
  641. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  642. if (unlikely(mngda == NULL && !z_erofs_is_stagingpage(page))) {
  643. struct inode *const inode = page->mapping->host;
  644. struct super_block *const sb = inode->i_sb;
  645. mngda = EROFS_SB(sb)->managed_cache->i_mapping;
  646. }
  647. /*
  648. * If mngda has not been obtained yet, it stays NULL;
  649. * however, page->mapping is never NULL if everything works properly.
  650. */
  651. cachemngd = (page->mapping == mngda);
  652. #endif
  653. if (unlikely(err))
  654. SetPageError(page);
  655. else if (cachemngd)
  656. SetPageUptodate(page);
  657. if (cachemngd)
  658. unlock_page(page);
  659. }
  660. z_erofs_vle_unzip_kickoff(bio->bi_private, -1);
  661. bio_put(bio);
  662. }
  663. static struct page *z_pagemap_global[Z_EROFS_VLE_VMAP_GLOBAL_PAGES];
  664. static DEFINE_MUTEX(z_pagemap_global_lock);
  665. static int z_erofs_vle_unzip(struct super_block *sb,
  666. struct z_erofs_vle_workgroup *grp,
  667. struct list_head *page_pool)
  668. {
  669. struct erofs_sb_info *const sbi = EROFS_SB(sb);
  670. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  671. struct address_space *const mngda = sbi->managed_cache->i_mapping;
  672. #endif
  673. const unsigned clusterpages = erofs_clusterpages(sbi);
  674. struct z_erofs_pagevec_ctor ctor;
  675. unsigned int nr_pages;
  676. unsigned int sparsemem_pages = 0;
  677. struct page *pages_onstack[Z_EROFS_VLE_VMAP_ONSTACK_PAGES];
  678. struct page **pages, **compressed_pages, *page;
  679. unsigned i, llen;
  680. enum z_erofs_page_type page_type;
  681. bool overlapped;
  682. struct z_erofs_vle_work *work;
  683. void *vout;
  684. int err;
  685. might_sleep();
  686. work = z_erofs_vle_grab_primary_work(grp);
  687. DBG_BUGON(!READ_ONCE(work->nr_pages));
  688. mutex_lock(&work->lock);
  689. nr_pages = work->nr_pages;
  690. if (likely(nr_pages <= Z_EROFS_VLE_VMAP_ONSTACK_PAGES))
  691. pages = pages_onstack;
  692. else if (nr_pages <= Z_EROFS_VLE_VMAP_GLOBAL_PAGES &&
  693. mutex_trylock(&z_pagemap_global_lock))
  694. pages = z_pagemap_global;
  695. else {
  696. repeat:
  697. pages = kvmalloc_array(nr_pages,
  698. sizeof(struct page *), GFP_KERNEL);
  699. /* fallback to global pagemap for the lowmem scenario */
  700. if (unlikely(pages == NULL)) {
  701. if (nr_pages > Z_EROFS_VLE_VMAP_GLOBAL_PAGES)
  702. goto repeat;
  703. else {
  704. mutex_lock(&z_pagemap_global_lock);
  705. pages = z_pagemap_global;
  706. }
  707. }
  708. }
  709. for (i = 0; i < nr_pages; ++i)
  710. pages[i] = NULL;
  711. err = 0;
  712. z_erofs_pagevec_ctor_init(&ctor,
  713. Z_EROFS_VLE_INLINE_PAGEVECS, work->pagevec, 0);
  714. for (i = 0; i < work->vcnt; ++i) {
  715. unsigned pagenr;
  716. page = z_erofs_pagevec_ctor_dequeue(&ctor, &page_type);
  717. /* all pages in pagevec ought to be valid */
  718. DBG_BUGON(page == NULL);
  719. DBG_BUGON(page->mapping == NULL);
  720. if (z_erofs_gather_if_stagingpage(page_pool, page))
  721. continue;
  722. if (page_type == Z_EROFS_VLE_PAGE_TYPE_HEAD)
  723. pagenr = 0;
  724. else
  725. pagenr = z_erofs_onlinepage_index(page);
  726. DBG_BUGON(pagenr >= nr_pages);
  727. /*
  728. * currently EROFS doesn't support multiref(dedup),
  729. * so error out here if a multiref page is found.
  730. */
  731. if (pages[pagenr]) {
  732. DBG_BUGON(1);
  733. SetPageError(pages[pagenr]);
  734. z_erofs_onlinepage_endio(pages[pagenr]);
  735. err = -EIO;
  736. }
  737. pages[pagenr] = page;
  738. }
  739. sparsemem_pages = i;
  740. z_erofs_pagevec_ctor_exit(&ctor, true);
  741. overlapped = false;
  742. compressed_pages = grp->compressed_pages;
  743. for (i = 0; i < clusterpages; ++i) {
  744. unsigned pagenr;
  745. page = compressed_pages[i];
  746. /* all compressed pages ought to be valid */
  747. DBG_BUGON(page == NULL);
  748. DBG_BUGON(page->mapping == NULL);
  749. if (!z_erofs_is_stagingpage(page)) {
  750. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  751. if (page->mapping == mngda) {
  752. if (unlikely(!PageUptodate(page)))
  753. err = -EIO;
  754. continue;
  755. }
  756. #endif
  757. /*
  758. * only non-head pages can be selected
  759. * for in-place decompression
  760. */
  761. pagenr = z_erofs_onlinepage_index(page);
  762. DBG_BUGON(pagenr >= nr_pages);
  763. if (pages[pagenr]) {
  764. DBG_BUGON(1);
  765. SetPageError(pages[pagenr]);
  766. z_erofs_onlinepage_endio(pages[pagenr]);
  767. err = -EIO;
  768. }
  769. ++sparsemem_pages;
  770. pages[pagenr] = page;
  771. overlapped = true;
  772. }
  773. /* PG_error needs checking for inplaced and staging pages */
  774. if (unlikely(PageError(page))) {
  775. DBG_BUGON(PageUptodate(page));
  776. err = -EIO;
  777. }
  778. }
  779. if (unlikely(err))
  780. goto out;
  781. llen = (nr_pages << PAGE_SHIFT) - work->pageofs;
  782. if (z_erofs_vle_workgrp_fmt(grp) == Z_EROFS_VLE_WORKGRP_FMT_PLAIN) {
  783. err = z_erofs_vle_plain_copy(compressed_pages, clusterpages,
  784. pages, nr_pages, work->pageofs);
  785. goto out;
  786. }
  787. if (llen > grp->llen)
  788. llen = grp->llen;
  789. err = z_erofs_vle_unzip_fast_percpu(compressed_pages, clusterpages,
  790. pages, llen, work->pageofs);
  791. if (err != -ENOTSUPP)
  792. goto out;
  793. if (sparsemem_pages >= nr_pages)
  794. goto skip_allocpage;
  795. for (i = 0; i < nr_pages; ++i) {
  796. if (pages[i] != NULL)
  797. continue;
  798. pages[i] = __stagingpage_alloc(page_pool, GFP_NOFS);
  799. }
  800. skip_allocpage:
  801. vout = erofs_vmap(pages, nr_pages);
  802. if (!vout) {
  803. err = -ENOMEM;
  804. goto out;
  805. }
  806. err = z_erofs_vle_unzip_vmap(compressed_pages,
  807. clusterpages, vout, llen, work->pageofs, overlapped);
  808. erofs_vunmap(vout, nr_pages);
  809. out:
  810. /* must handle all compressed pages before ending pages */
  811. for (i = 0; i < clusterpages; ++i) {
  812. page = compressed_pages[i];
  813. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  814. if (page->mapping == mngda)
  815. continue;
  816. #endif
  817. /* recycle all individual staging pages */
  818. (void)z_erofs_gather_if_stagingpage(page_pool, page);
  819. WRITE_ONCE(compressed_pages[i], NULL);
  820. }
  821. for (i = 0; i < nr_pages; ++i) {
  822. page = pages[i];
  823. if (!page)
  824. continue;
  825. DBG_BUGON(page->mapping == NULL);
  826. /* recycle all individual staging pages */
  827. if (z_erofs_gather_if_stagingpage(page_pool, page))
  828. continue;
  829. if (unlikely(err < 0))
  830. SetPageError(page);
  831. z_erofs_onlinepage_endio(page);
  832. }
  833. if (pages == z_pagemap_global)
  834. mutex_unlock(&z_pagemap_global_lock);
  835. else if (unlikely(pages != pages_onstack))
  836. kvfree(pages);
  837. work->nr_pages = 0;
  838. work->vcnt = 0;
  839. /* all work locks MUST be taken before the following line */
  840. WRITE_ONCE(grp->next, Z_EROFS_VLE_WORKGRP_NIL);
  841. /* all work locks SHOULD be released right now */
  842. mutex_unlock(&work->lock);
  843. z_erofs_vle_work_release(work);
  844. return err;
  845. }
  846. static void z_erofs_vle_unzip_all(struct super_block *sb,
  847. struct z_erofs_vle_unzip_io *io,
  848. struct list_head *page_pool)
  849. {
  850. z_erofs_vle_owned_workgrp_t owned = io->head;
  851. while (owned != Z_EROFS_VLE_WORKGRP_TAIL_CLOSED) {
  852. struct z_erofs_vle_workgroup *grp;
  853. /* it is impossible that 'owned' equals Z_EROFS_VLE_WORKGRP_TAIL here */
  854. DBG_BUGON(owned == Z_EROFS_VLE_WORKGRP_TAIL);
  855. /* no possible that 'owned' equals NULL */
  856. DBG_BUGON(owned == Z_EROFS_VLE_WORKGRP_NIL);
  857. grp = owned;
  858. owned = READ_ONCE(grp->next);
  859. z_erofs_vle_unzip(sb, grp, page_pool);
  860. }
  861. }
  862. static void z_erofs_vle_unzip_wq(struct work_struct *work)
  863. {
  864. struct z_erofs_vle_unzip_io_sb *iosb = container_of(work,
  865. struct z_erofs_vle_unzip_io_sb, io.u.work);
  866. LIST_HEAD(page_pool);
  867. DBG_BUGON(iosb->io.head == Z_EROFS_VLE_WORKGRP_TAIL_CLOSED);
  868. z_erofs_vle_unzip_all(iosb->sb, &iosb->io, &page_pool);
  869. put_pages_list(&page_pool);
  870. kvfree(iosb);
  871. }
  872. static inline struct z_erofs_vle_unzip_io *
  873. prepare_io_handler(struct super_block *sb,
  874. struct z_erofs_vle_unzip_io *io,
  875. bool background)
  876. {
  877. struct z_erofs_vle_unzip_io_sb *iosb;
  878. if (!background) {
  879. /* waitqueue available for foreground io */
  880. BUG_ON(io == NULL);
  881. init_waitqueue_head(&io->u.wait);
  882. atomic_set(&io->pending_bios, 0);
  883. goto out;
  884. }
  885. if (io != NULL)
  886. BUG();
  887. else {
  888. /* allocate extra io descriptor for background io */
  889. iosb = kvzalloc(sizeof(struct z_erofs_vle_unzip_io_sb),
  890. GFP_KERNEL | __GFP_NOFAIL);
  891. BUG_ON(iosb == NULL);
  892. io = &iosb->io;
  893. }
  894. iosb->sb = sb;
  895. INIT_WORK(&io->u.work, z_erofs_vle_unzip_wq);
  896. out:
  897. io->head = Z_EROFS_VLE_WORKGRP_TAIL_CLOSED;
  898. return io;
  899. }
  900. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  901. /* true - unlocked (noio), false - locked (need submit io) */
  902. static inline bool recover_managed_page(struct z_erofs_vle_workgroup *grp,
  903. struct page *page)
  904. {
  905. wait_on_page_locked(page);
  906. if (PagePrivate(page) && PageUptodate(page))
  907. return true;
  908. lock_page(page);
  909. ClearPageError(page);
  910. if (unlikely(!PagePrivate(page))) {
  911. set_page_private(page, (unsigned long)grp);
  912. SetPagePrivate(page);
  913. }
  914. if (unlikely(PageUptodate(page))) {
  915. unlock_page(page);
  916. return true;
  917. }
  918. return false;
  919. }
  920. #define __FSIO_1 1
  921. #else
  922. #define __FSIO_1 0
  923. #endif
  924. static bool z_erofs_vle_submit_all(struct super_block *sb,
  925. z_erofs_vle_owned_workgrp_t owned_head,
  926. struct list_head *pagepool,
  927. struct z_erofs_vle_unzip_io *fg_io,
  928. bool force_fg)
  929. {
  930. struct erofs_sb_info *const sbi = EROFS_SB(sb);
  931. const unsigned clusterpages = erofs_clusterpages(sbi);
  932. const gfp_t gfp = GFP_NOFS;
  933. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  934. struct address_space *const mngda = sbi->managed_cache->i_mapping;
  935. struct z_erofs_vle_workgroup *lstgrp_noio = NULL, *lstgrp_io = NULL;
  936. #endif
  937. struct z_erofs_vle_unzip_io *ios[1 + __FSIO_1];
  938. struct bio *bio;
  939. tagptr1_t bi_private;
  940. /* since bio will be NULL, no need to initialize last_index */
  941. pgoff_t uninitialized_var(last_index);
  942. bool force_submit = false;
  943. unsigned nr_bios;
  944. if (unlikely(owned_head == Z_EROFS_VLE_WORKGRP_TAIL))
  945. return false;
  946. /*
  947. * force_fg == 1, (io, fg_io[0]) no io, (io, fg_io[1]) need submit io
  948. * force_fg == 0, (io, fg_io[0]) no io; (io[1], bg_io) need submit io
  949. */
  950. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  951. ios[0] = prepare_io_handler(sb, fg_io + 0, false);
  952. #endif
  953. if (force_fg) {
  954. ios[__FSIO_1] = prepare_io_handler(sb, fg_io + __FSIO_1, false);
  955. bi_private = tagptr_fold(tagptr1_t, ios[__FSIO_1], 0);
  956. } else {
  957. ios[__FSIO_1] = prepare_io_handler(sb, NULL, true);
  958. bi_private = tagptr_fold(tagptr1_t, ios[__FSIO_1], 1);
  959. }
  960. nr_bios = 0;
  961. force_submit = false;
  962. bio = NULL;
  963. /* by default, all need io submission */
  964. ios[__FSIO_1]->head = owned_head;
  965. do {
  966. struct z_erofs_vle_workgroup *grp;
  967. struct page **compressed_pages, *oldpage, *page;
  968. pgoff_t first_index;
  969. unsigned i = 0;
  970. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  971. unsigned int noio = 0;
  972. bool cachemngd;
  973. #endif
  974. int err;
  975. /* 'owned_head' should never equal any of the following */
  976. DBG_BUGON(owned_head == Z_EROFS_VLE_WORKGRP_TAIL_CLOSED);
  977. DBG_BUGON(owned_head == Z_EROFS_VLE_WORKGRP_NIL);
  978. grp = owned_head;
  979. /* close the main owned chain at first */
  980. owned_head = cmpxchg(&grp->next, Z_EROFS_VLE_WORKGRP_TAIL,
  981. Z_EROFS_VLE_WORKGRP_TAIL_CLOSED);
  982. first_index = grp->obj.index;
  983. compressed_pages = grp->compressed_pages;
  984. force_submit |= (first_index != last_index + 1);
  985. repeat:
  986. /* fulfill all compressed pages */
  987. oldpage = page = READ_ONCE(compressed_pages[i]);
  988. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  989. cachemngd = false;
  990. if (page == EROFS_UNALLOCATED_CACHED_PAGE) {
  991. cachemngd = true;
  992. goto do_allocpage;
  993. } else if (page != NULL) {
  994. if (page->mapping != mngda)
  995. BUG_ON(PageUptodate(page));
  996. else if (recover_managed_page(grp, page)) {
  997. /* page is uptodate, skip io submission */
  998. force_submit = true;
  999. ++noio;
  1000. goto skippage;
  1001. }
  1002. } else {
  1003. do_allocpage:
  1004. #else
  1005. if (page != NULL)
  1006. BUG_ON(PageUptodate(page));
  1007. else {
  1008. #endif
  1009. page = __stagingpage_alloc(pagepool, gfp);
  1010. if (oldpage != cmpxchg(compressed_pages + i,
  1011. oldpage, page)) {
  1012. list_add(&page->lru, pagepool);
  1013. goto repeat;
  1014. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  1015. } else if (cachemngd && !add_to_page_cache_lru(page,
  1016. mngda, first_index + i, gfp)) {
  1017. set_page_private(page, (unsigned long)grp);
  1018. SetPagePrivate(page);
  1019. #endif
  1020. }
  1021. }
  1022. if (bio != NULL && force_submit) {
  1023. submit_bio_retry:
  1024. __submit_bio(bio, REQ_OP_READ, 0);
  1025. bio = NULL;
  1026. }
  1027. if (bio == NULL) {
  1028. bio = prepare_bio(sb, first_index + i,
  1029. BIO_MAX_PAGES, z_erofs_vle_read_endio);
  1030. bio->bi_private = tagptr_cast_ptr(bi_private);
  1031. ++nr_bios;
  1032. }
  1033. err = bio_add_page(bio, page, PAGE_SIZE, 0);
  1034. if (err < PAGE_SIZE)
  1035. goto submit_bio_retry;
  1036. force_submit = false;
  1037. last_index = first_index + i;
  1038. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  1039. skippage:
  1040. #endif
  1041. if (++i < clusterpages)
  1042. goto repeat;
  1043. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  1044. if (noio < clusterpages) {
  1045. lstgrp_io = grp;
  1046. } else {
  1047. z_erofs_vle_owned_workgrp_t iogrp_next =
  1048. owned_head == Z_EROFS_VLE_WORKGRP_TAIL ?
  1049. Z_EROFS_VLE_WORKGRP_TAIL_CLOSED :
  1050. owned_head;
  1051. if (lstgrp_io == NULL)
  1052. ios[1]->head = iogrp_next;
  1053. else
  1054. WRITE_ONCE(lstgrp_io->next, iogrp_next);
  1055. if (lstgrp_noio == NULL)
  1056. ios[0]->head = grp;
  1057. else
  1058. WRITE_ONCE(lstgrp_noio->next, grp);
  1059. lstgrp_noio = grp;
  1060. }
  1061. #endif
  1062. } while (owned_head != Z_EROFS_VLE_WORKGRP_TAIL);
  1063. if (bio != NULL)
  1064. __submit_bio(bio, REQ_OP_READ, 0);
  1065. #ifndef EROFS_FS_HAS_MANAGED_CACHE
  1066. BUG_ON(!nr_bios);
  1067. #else
  1068. if (lstgrp_noio != NULL)
  1069. WRITE_ONCE(lstgrp_noio->next, Z_EROFS_VLE_WORKGRP_TAIL_CLOSED);
  1070. if (!force_fg && !nr_bios) {
  1071. kvfree(container_of(ios[1],
  1072. struct z_erofs_vle_unzip_io_sb, io));
  1073. return true;
  1074. }
  1075. #endif
  1076. z_erofs_vle_unzip_kickoff(tagptr_cast_ptr(bi_private), nr_bios);
  1077. return true;
  1078. }
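
One detail worth calling out in the submission path above: bio->bi_private does not hold a plain pointer but a tagptr1_t, i.e. the address of the z_erofs_vle_unzip_io descriptor with a one-bit tag folded into its low bit (0 = foreground/waitqueue, 1 = background/workqueue), which z_erofs_vle_unzip_kickoff() unfolds again on completion. A standalone sketch of the tagged-pointer trick, with made-up names standing in for the driver's tagptr helpers:

/* Stash one tag bit in the low bit of an aligned pointer. */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

struct io_desc {
	int pending_bios;	/* alignment > 1 keeps bit 0 free */
};

static void *tag_fold(struct io_desc *p, unsigned int tag)
{
	assert(((uintptr_t)p & 1) == 0 && tag <= 1);
	return (void *)((uintptr_t)p | tag);
}

static struct io_desc *tag_ptr(void *t)
{
	return (struct io_desc *)((uintptr_t)t & ~(uintptr_t)1);
}

static unsigned int tag_bit(void *t)
{
	return (unsigned int)((uintptr_t)t & 1);
}

int main(void)
{
	struct io_desc io = { .pending_bios = 3 };
	void *bi_private = tag_fold(&io, 1);	/* 1 == background io */

	printf("background=%u pending=%d\n",
	       tag_bit(bi_private), tag_ptr(bi_private)->pending_bios);
	return 0;
}
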
  1079. static void z_erofs_submit_and_unzip(struct z_erofs_vle_frontend *f,
  1080. struct list_head *pagepool,
  1081. bool force_fg)
  1082. {
  1083. struct super_block *sb = f->inode->i_sb;
  1084. struct z_erofs_vle_unzip_io io[1 + __FSIO_1];
  1085. if (!z_erofs_vle_submit_all(sb, f->owned_head, pagepool, io, force_fg))
  1086. return;
  1087. #ifdef EROFS_FS_HAS_MANAGED_CACHE
  1088. z_erofs_vle_unzip_all(sb, &io[0], pagepool);
  1089. #endif
  1090. if (!force_fg)
  1091. return;
  1092. /* wait until all bios are completed */
  1093. wait_event(io[__FSIO_1].u.wait,
  1094. !atomic_read(&io[__FSIO_1].pending_bios));
  1095. /* let's do synchronous decompression */
  1096. z_erofs_vle_unzip_all(sb, &io[__FSIO_1], pagepool);
  1097. }
  1098. static int z_erofs_vle_normalaccess_readpage(struct file *file,
  1099. struct page *page)
  1100. {
  1101. struct inode *const inode = page->mapping->host;
  1102. struct z_erofs_vle_frontend f = VLE_FRONTEND_INIT(inode);
  1103. int err;
  1104. LIST_HEAD(pagepool);
  1105. #if (EROFS_FS_ZIP_CACHE_LVL >= 2)
  1106. f.cachedzone_la = page->index << PAGE_SHIFT;
  1107. #endif
  1108. err = z_erofs_do_read_page(&f, page, &pagepool);
  1109. (void)z_erofs_vle_work_iter_end(&f.builder);
  1110. /* if some compressed cluster ready, need submit them anyway */
  1111. z_erofs_submit_and_unzip(&f, &pagepool, true);
  1112. if (err)
  1113. errln("%s, failed to read, err [%d]", __func__, err);
  1114. if (f.m_iter.mpage != NULL)
  1115. put_page(f.m_iter.mpage);
  1116. /* clean up the remaining free pages */
  1117. put_pages_list(&pagepool);
  1118. return err;
  1119. }
  1120. static inline int __z_erofs_vle_normalaccess_readpages(
  1121. struct file *filp,
  1122. struct address_space *mapping,
  1123. struct list_head *pages, unsigned nr_pages, bool sync)
  1124. {
  1125. struct inode *const inode = mapping->host;
  1126. struct z_erofs_vle_frontend f = VLE_FRONTEND_INIT(inode);
  1127. gfp_t gfp = mapping_gfp_constraint(mapping, GFP_KERNEL);
  1128. struct page *head = NULL;
  1129. LIST_HEAD(pagepool);
  1130. #if (EROFS_FS_ZIP_CACHE_LVL >= 2)
  1131. f.cachedzone_la = lru_to_page(pages)->index << PAGE_SHIFT;
  1132. #endif
  1133. for (; nr_pages; --nr_pages) {
  1134. struct page *page = lru_to_page(pages);
  1135. prefetchw(&page->flags);
  1136. list_del(&page->lru);
  1137. if (add_to_page_cache_lru(page, mapping, page->index, gfp)) {
  1138. list_add(&page->lru, &pagepool);
  1139. continue;
  1140. }
  1141. set_page_private(page, (unsigned long)head);
  1142. head = page;
  1143. }
  1144. while (head != NULL) {
  1145. struct page *page = head;
  1146. int err;
  1147. /* traversal in reverse order */
  1148. head = (void *)page_private(page);
  1149. err = z_erofs_do_read_page(&f, page, &pagepool);
  1150. if (err) {
  1151. struct erofs_vnode *vi = EROFS_V(inode);
  1152. errln("%s, readahead error at page %lu of nid %llu",
  1153. __func__, page->index, vi->nid);
  1154. }
  1155. put_page(page);
  1156. }
  1157. (void)z_erofs_vle_work_iter_end(&f.builder);
  1158. z_erofs_submit_and_unzip(&f, &pagepool, sync);
  1159. if (f.m_iter.mpage != NULL)
  1160. put_page(f.m_iter.mpage);
  1161. /* clean up the remaining free pages */
  1162. put_pages_list(&pagepool);
  1163. return 0;
  1164. }
  1165. static int z_erofs_vle_normalaccess_readpages(
  1166. struct file *filp,
  1167. struct address_space *mapping,
  1168. struct list_head *pages, unsigned nr_pages)
  1169. {
  1170. return __z_erofs_vle_normalaccess_readpages(filp,
  1171. mapping, pages, nr_pages,
  1172. nr_pages < 4 /* sync */);
  1173. }
  1174. const struct address_space_operations z_erofs_vle_normalaccess_aops = {
  1175. .readpage = z_erofs_vle_normalaccess_readpage,
  1176. .readpages = z_erofs_vle_normalaccess_readpages,
  1177. };
  1178. #define __vle_cluster_advise(x, bit, bits) \
  1179. ((le16_to_cpu(x) >> (bit)) & ((1 << (bits)) - 1))
  1180. #define __vle_cluster_type(advise) __vle_cluster_advise(advise, \
  1181. Z_EROFS_VLE_DI_CLUSTER_TYPE_BIT, Z_EROFS_VLE_DI_CLUSTER_TYPE_BITS)
  1182. enum {
  1183. Z_EROFS_VLE_CLUSTER_TYPE_PLAIN,
  1184. Z_EROFS_VLE_CLUSTER_TYPE_HEAD,
  1185. Z_EROFS_VLE_CLUSTER_TYPE_NONHEAD,
  1186. Z_EROFS_VLE_CLUSTER_TYPE_RESERVED,
  1187. Z_EROFS_VLE_CLUSTER_TYPE_MAX
  1188. };
  1189. #define vle_cluster_type(di) \
  1190. __vle_cluster_type((di)->di_advise)
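
vle_cluster_type() above is nothing more than a shift-and-mask on the little-endian di_advise word: the cluster type occupies Z_EROFS_VLE_DI_CLUSTER_TYPE_BITS bits starting at Z_EROFS_VLE_DI_CLUSTER_TYPE_BIT. A tiny sketch of the extraction, with assumed bit positions (type in the low two bits) used purely for illustration:

/* Extract an assumed 2-bit cluster type from a 16-bit 'advise' word;
 * the bit position/width below are illustrative, not the on-disk spec. */
#include <stdint.h>
#include <stdio.h>

#define CLUSTER_TYPE_BIT	0
#define CLUSTER_TYPE_BITS	2

static unsigned int cluster_type(uint16_t advise)
{
	return (advise >> CLUSTER_TYPE_BIT) & ((1u << CLUSTER_TYPE_BITS) - 1);
}

int main(void)
{
	uint16_t di_advise = 0x0002;	/* made-up index word */

	/* 0 PLAIN, 1 HEAD, 2 NONHEAD per the enum above */
	printf("cluster type = %u\n", cluster_type(di_advise));
	return 0;
}
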
  1191. static inline unsigned
  1192. vle_compressed_index_clusterofs(unsigned clustersize,
  1193. struct z_erofs_vle_decompressed_index *di)
  1194. {
  1195. debugln("%s, vle=%pK, advise=%x (type %u), clusterofs=%x blkaddr=%x",
  1196. __func__, di, di->di_advise, vle_cluster_type(di),
  1197. di->di_clusterofs, di->di_u.blkaddr);
  1198. switch (vle_cluster_type(di)) {
  1199. case Z_EROFS_VLE_CLUSTER_TYPE_NONHEAD:
  1200. break;
  1201. case Z_EROFS_VLE_CLUSTER_TYPE_PLAIN:
  1202. case Z_EROFS_VLE_CLUSTER_TYPE_HEAD:
  1203. return di->di_clusterofs;
  1204. default:
  1205. BUG_ON(1);
  1206. }
  1207. return clustersize;
  1208. }
  1209. static inline erofs_blk_t
  1210. vle_extent_blkaddr(struct inode *inode, pgoff_t index)
  1211. {
  1212. struct erofs_sb_info *sbi = EROFS_I_SB(inode);
  1213. struct erofs_vnode *vi = EROFS_V(inode);
  1214. unsigned ofs = Z_EROFS_VLE_EXTENT_ALIGN(vi->inode_isize +
  1215. vi->xattr_isize) + sizeof(struct erofs_extent_header) +
  1216. index * sizeof(struct z_erofs_vle_decompressed_index);
  1217. return erofs_blknr(iloc(sbi, vi->nid) + ofs);
  1218. }
  1219. static inline unsigned int
  1220. vle_extent_blkoff(struct inode *inode, pgoff_t index)
  1221. {
  1222. struct erofs_sb_info *sbi = EROFS_I_SB(inode);
  1223. struct erofs_vnode *vi = EROFS_V(inode);
  1224. unsigned ofs = Z_EROFS_VLE_EXTENT_ALIGN(vi->inode_isize +
  1225. vi->xattr_isize) + sizeof(struct erofs_extent_header) +
  1226. index * sizeof(struct z_erofs_vle_decompressed_index);
  1227. return erofs_blkoff(iloc(sbi, vi->nid) + ofs);
  1228. }
  1229. /*
  1230. * Variable-sized Logical Extent (Fixed Physical Cluster) Compression Mode
  1231. * ---
  1232. * VLE compression mode attempts to compress a number of logical data into
  1233. * a physical cluster with a fixed size.
  1234. * VLE compression mode uses "struct z_erofs_vle_decompressed_index".
  1235. */
  1236. static erofs_off_t vle_get_logical_extent_head(
  1237. struct inode *inode,
  1238. struct page **page_iter,
  1239. void **kaddr_iter,
  1240. unsigned lcn, /* logical cluster number */
  1241. erofs_blk_t *pcn,
  1242. unsigned *flags)
  1243. {
  1244. /* for extent meta */
  1245. struct page *page = *page_iter;
  1246. erofs_blk_t blkaddr = vle_extent_blkaddr(inode, lcn);
  1247. struct z_erofs_vle_decompressed_index *di;
  1248. unsigned long long ofs;
  1249. const unsigned int clusterbits = EROFS_SB(inode->i_sb)->clusterbits;
  1250. const unsigned int clustersize = 1 << clusterbits;
  1251. unsigned int delta0;
  1252. if (page->index != blkaddr) {
  1253. kunmap_atomic(*kaddr_iter);
  1254. unlock_page(page);
  1255. put_page(page);
  1256. *page_iter = page = erofs_get_meta_page(inode->i_sb,
  1257. blkaddr, false);
  1258. *kaddr_iter = kmap_atomic(page);
  1259. }
  1260. di = *kaddr_iter + vle_extent_blkoff(inode, lcn);
  1261. switch (vle_cluster_type(di)) {
  1262. case Z_EROFS_VLE_CLUSTER_TYPE_NONHEAD:
  1263. delta0 = le16_to_cpu(di->di_u.delta[0]);
  1264. DBG_BUGON(!delta0);
  1265. DBG_BUGON(lcn < delta0);
  1266. ofs = vle_get_logical_extent_head(inode,
  1267. page_iter, kaddr_iter,
  1268. lcn - delta0, pcn, flags);
  1269. break;
  1270. case Z_EROFS_VLE_CLUSTER_TYPE_PLAIN:
  1271. *flags ^= EROFS_MAP_ZIPPED;
  1272. case Z_EROFS_VLE_CLUSTER_TYPE_HEAD:
  1273. /* clustersize should be a power of two */
  1274. ofs = ((unsigned long long)lcn << clusterbits) +
  1275. (le16_to_cpu(di->di_clusterofs) & (clustersize - 1));
  1276. *pcn = le32_to_cpu(di->di_u.blkaddr);
  1277. break;
  1278. default:
  1279. BUG_ON(1);
  1280. }
  1281. return ofs;
  1282. }
  1283. int z_erofs_map_blocks_iter(struct inode *inode,
  1284. struct erofs_map_blocks *map,
  1285. struct page **mpage_ret, int flags)
  1286. {
  1287. /* logical extent (start, end) offset */
  1288. unsigned long long ofs, end;
  1289. struct z_erofs_vle_decompressed_index *di;
  1290. erofs_blk_t e_blkaddr, pcn;
  1291. unsigned lcn, logical_cluster_ofs, cluster_type;
  1292. u32 ofs_rem;
  1293. struct page *mpage = *mpage_ret;
  1294. void *kaddr;
  1295. bool initial;
  1296. const unsigned int clusterbits = EROFS_SB(inode->i_sb)->clusterbits;
  1297. const unsigned int clustersize = 1 << clusterbits;
  1298. int err = 0;
  1299. /* if both m_(l,p)len are 0, regularize l_lblk, l_lofs, etc... */
  1300. initial = !map->m_llen;
  1301. /* when trying to read beyond EOF, leave it unmapped */
  1302. if (unlikely(map->m_la >= inode->i_size)) {
  1303. BUG_ON(!initial);
  1304. map->m_llen = map->m_la + 1 - inode->i_size;
  1305. map->m_la = inode->i_size - 1;
  1306. map->m_flags = 0;
  1307. goto out;
  1308. }
  1309. debugln("%s, m_la %llu m_llen %llu --- start", __func__,
  1310. map->m_la, map->m_llen);
  1311. ofs = map->m_la + map->m_llen;
  1312. /* clustersize should be a power of two */
  1313. lcn = ofs >> clusterbits;
  1314. ofs_rem = ofs & (clustersize - 1);
  1315. e_blkaddr = vle_extent_blkaddr(inode, lcn);
  1316. if (mpage == NULL || mpage->index != e_blkaddr) {
  1317. if (mpage != NULL)
  1318. put_page(mpage);
  1319. mpage = erofs_get_meta_page(inode->i_sb, e_blkaddr, false);
  1320. *mpage_ret = mpage;
  1321. } else {
  1322. lock_page(mpage);
  1323. DBG_BUGON(!PageUptodate(mpage));
  1324. }
  1325. kaddr = kmap_atomic(mpage);
  1326. di = kaddr + vle_extent_blkoff(inode, lcn);
  1327. debugln("%s, lcn %u e_blkaddr %u e_blkoff %u", __func__, lcn,
  1328. e_blkaddr, vle_extent_blkoff(inode, lcn));
  1329. logical_cluster_ofs = vle_compressed_index_clusterofs(clustersize, di);
  1330. if (!initial) {
  1331. /* [walking mode] 'map' has been already initialized */
  1332. map->m_llen += logical_cluster_ofs;
  1333. goto unmap_out;
  1334. }
  1335. /* by default, compressed */
  1336. map->m_flags |= EROFS_MAP_ZIPPED;
  1337. end = (u64)(lcn + 1) * clustersize;
  1338. cluster_type = vle_cluster_type(di);
  1339. switch (cluster_type) {
  1340. case Z_EROFS_VLE_CLUSTER_TYPE_PLAIN:
  1341. if (ofs_rem >= logical_cluster_ofs)
  1342. map->m_flags ^= EROFS_MAP_ZIPPED;
  1343. /* fallthrough */
  1344. case Z_EROFS_VLE_CLUSTER_TYPE_HEAD:
  1345. if (ofs_rem == logical_cluster_ofs) {
  1346. pcn = le32_to_cpu(di->di_u.blkaddr);
  1347. goto exact_hitted;
  1348. }
  1349. if (ofs_rem > logical_cluster_ofs) {
  1350. ofs = lcn * clustersize | logical_cluster_ofs;
  1351. pcn = le32_to_cpu(di->di_u.blkaddr);
  1352. break;
  1353. }
  1354. /* logical cluster number should be >= 1 */
  1355. if (unlikely(!lcn)) {
  1356. errln("invalid logical cluster 0 at nid %llu",
  1357. EROFS_V(inode)->nid);
  1358. err = -EIO;
  1359. goto unmap_out;
  1360. }
  1361. end = (lcn-- * clustersize) | logical_cluster_ofs;
  1362. /* fallthrough */
  1363. case Z_EROFS_VLE_CLUSTER_TYPE_NONHEAD:
  1364. /* get the corresponding first chunk */
  1365. ofs = vle_get_logical_extent_head(inode, mpage_ret,
  1366. &kaddr, lcn, &pcn, &map->m_flags);
  1367. mpage = *mpage_ret;
  1368. break;
  1369. default:
  1370. errln("unknown cluster type %u at offset %llu of nid %llu",
  1371. cluster_type, ofs, EROFS_V(inode)->nid);
  1372. err = -EIO;
  1373. goto unmap_out;
  1374. }
  1375. map->m_la = ofs;
  1376. exact_hitted:
  1377. map->m_llen = end - ofs;
  1378. map->m_plen = clustersize;
  1379. map->m_pa = blknr_to_addr(pcn);
  1380. map->m_flags |= EROFS_MAP_MAPPED;
  1381. unmap_out:
  1382. kunmap_atomic(kaddr);
  1383. unlock_page(mpage);
  1384. out:
  1385. debugln("%s, m_la %llu m_pa %llu m_llen %llu m_plen %llu m_flags 0%o",
  1386. __func__, map->m_la, map->m_pa,
  1387. map->m_llen, map->m_plen, map->m_flags);
  1388. /* aggressively BUG_ON iff CONFIG_EROFS_FS_DEBUG is on */
  1389. DBG_BUGON(err < 0);
  1390. return err;
  1391. }
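
A note on vle_extent_blkaddr()/vle_extent_blkoff() used throughout z_erofs_map_blocks_iter(): both evaluate the same byte position of a logical cluster's decompressed index and differ only in whether they return the block number or the in-block offset. The indexes live right after the Z_EROFS_VLE_EXTENT_ALIGN-ed inode core plus inline xattrs and an extent header, one fixed-size entry per logical cluster. Below is a user-space sketch of that arithmetic; every size in it is an illustrative stand-in for the driver's real structures:

/* Locate the on-disk index entry for logical cluster 'lcn'.
 * All sizes below are made-up stand-ins; only the arithmetic
 * mirrors vle_extent_blkaddr()/vle_extent_blkoff() above.
 */
#include <stdint.h>
#include <stdio.h>

#define BLKSIZ		4096u			/* EROFS block size */
#define EXTENT_ALIGN(x)	(((x) + 7u) & ~7u)	/* assumed 8-byte alignment */

struct layout {
	uint64_t iloc;		/* byte address of the on-disk inode */
	uint32_t inode_isize;	/* core inode size */
	uint32_t xattr_isize;	/* inline xattr area size */
	uint32_t eh_size;	/* sizeof(struct erofs_extent_header) */
	uint32_t di_size;	/* sizeof(z_erofs_vle_decompressed_index) */
};

static uint64_t index_byte_addr(const struct layout *l, uint32_t lcn)
{
	return l->iloc + EXTENT_ALIGN(l->inode_isize + l->xattr_isize) +
	       l->eh_size + (uint64_t)lcn * l->di_size;
}

int main(void)
{
	/* made-up example inode layout */
	struct layout l = { .iloc = 40960, .inode_isize = 32,
			    .xattr_isize = 12, .eh_size = 16, .di_size = 8 };
	uint32_t lcn = 100;
	uint64_t addr = index_byte_addr(&l, lcn);

	/* the blkaddr/blkoff split done by the two helpers */
	printf("lcn %u -> blk %llu, off %llu\n", lcn,
	       (unsigned long long)(addr / BLKSIZ),
	       (unsigned long long)(addr % BLKSIZ));
	return 0;
}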

unzip_vle_lz4.c

// SPDX-License-Identifier: GPL-2.0
/*
 * linux/drivers/staging/erofs/unzip_vle_lz4.c
 *
 * Copyright (C) 2018 HUAWEI, Inc.
 * http://www.huawei.com/
 * Created by Gao Xiang <gaoxiang25@huawei.com>
 *
 * This file is subject to the terms and conditions of the GNU General Public
 * License. See the file COPYING in the main directory of the Linux
 * distribution for more details.
 */
#include "unzip_vle.h"

#if Z_EROFS_CLUSTER_MAX_PAGES > Z_EROFS_VLE_INLINE_PAGEVECS
#define EROFS_PERCPU_NR_PAGES	Z_EROFS_CLUSTER_MAX_PAGES
#else
#define EROFS_PERCPU_NR_PAGES	Z_EROFS_VLE_INLINE_PAGEVECS
#endif

static struct {
	char data[PAGE_SIZE * EROFS_PERCPU_NR_PAGES];
} erofs_pcpubuf[NR_CPUS];

int z_erofs_vle_plain_copy(struct page **compressed_pages,
			   unsigned clusterpages,
			   struct page **pages,
			   unsigned nr_pages,
			   unsigned short pageofs)
{
	unsigned i, j;
	void *src = NULL;
	const unsigned righthalf = PAGE_SIZE - pageofs;
	char *percpu_data;
	bool mirrored[Z_EROFS_CLUSTER_MAX_PAGES] = { 0 };

	preempt_disable();
	percpu_data = erofs_pcpubuf[smp_processor_id()].data;

	j = 0;
	for (i = 0; i < nr_pages; j = i++) {
		struct page *page = pages[i];
		void *dst;

		if (page == NULL) {
			if (src != NULL) {
				if (!mirrored[j])
					kunmap_atomic(src);
				src = NULL;
			}
			continue;
		}

		dst = kmap_atomic(page);

		for (; j < clusterpages; ++j) {
			if (compressed_pages[j] != page)
				continue;

			DBG_BUGON(mirrored[j]);
			memcpy(percpu_data + j * PAGE_SIZE, dst, PAGE_SIZE);
			mirrored[j] = true;
			break;
		}

		if (i) {
			if (src == NULL)
				src = mirrored[i - 1] ?
					percpu_data + (i - 1) * PAGE_SIZE :
					kmap_atomic(compressed_pages[i - 1]);

			memcpy(dst, src + righthalf, pageofs);

			if (!mirrored[i - 1])
				kunmap_atomic(src);

			if (unlikely(i >= clusterpages)) {
				kunmap_atomic(dst);
				break;
			}
		}

		if (!righthalf)
			src = NULL;
		else {
			src = mirrored[i] ? percpu_data + i * PAGE_SIZE :
				kmap_atomic(compressed_pages[i]);

			memcpy(dst + pageofs, src, righthalf);
		}

		kunmap_atomic(dst);
	}

	if (src != NULL && !mirrored[j])
		kunmap_atomic(src);

	preempt_enable();
	return 0;
}

extern int z_erofs_unzip_lz4(void *in, void *out, size_t inlen, size_t outlen);

int z_erofs_vle_unzip_fast_percpu(struct page **compressed_pages,
				  unsigned clusterpages,
				  struct page **pages,
				  unsigned outlen,
				  unsigned short pageofs)
{
	void *vin, *vout;
	unsigned nr_pages, i, j;
	int ret;

	if (outlen + pageofs > EROFS_PERCPU_NR_PAGES * PAGE_SIZE)
		return -ENOTSUPP;

	nr_pages = DIV_ROUND_UP(outlen + pageofs, PAGE_SIZE);

	if (clusterpages == 1) {
		vin = kmap_atomic(compressed_pages[0]);
	} else {
		vin = erofs_vmap(compressed_pages, clusterpages);
		if (!vin)
			return -ENOMEM;
	}

	preempt_disable();
	vout = erofs_pcpubuf[smp_processor_id()].data;

	ret = z_erofs_unzip_lz4(vin, vout + pageofs,
				clusterpages * PAGE_SIZE, outlen);
	if (ret < 0)
		goto out;
	ret = 0;

	for (i = 0; i < nr_pages; ++i) {
		j = min((unsigned)PAGE_SIZE - pageofs, outlen);

		if (pages[i] != NULL) {
			if (clusterpages == 1 &&
			    pages[i] == compressed_pages[0]) {
				memcpy(vin + pageofs, vout + pageofs, j);
			} else {
				void *dst = kmap_atomic(pages[i]);

				memcpy(dst + pageofs, vout + pageofs, j);
				kunmap_atomic(dst);
			}
		}
		vout += PAGE_SIZE;
		outlen -= j;
		pageofs = 0;
	}

out:
	preempt_enable();

	if (clusterpages == 1)
		kunmap_atomic(vin);
	else
		erofs_vunmap(vin, clusterpages);

	return ret;
}

int z_erofs_vle_unzip_vmap(struct page **compressed_pages,
			   unsigned clusterpages,
			   void *vout,
			   unsigned llen,
			   unsigned short pageofs,
			   bool overlapped)
{
	void *vin;
	unsigned i;
	int ret;

	if (overlapped) {
		preempt_disable();
		vin = erofs_pcpubuf[smp_processor_id()].data;

		for (i = 0; i < clusterpages; ++i) {
			void *t = kmap_atomic(compressed_pages[i]);

			memcpy(vin + PAGE_SIZE * i, t, PAGE_SIZE);
			kunmap_atomic(t);
		}
	} else if (clusterpages == 1)
		vin = kmap_atomic(compressed_pages[0]);
	else {
		vin = erofs_vmap(compressed_pages, clusterpages);
	}

	ret = z_erofs_unzip_lz4(vin, vout + pageofs,
				clusterpages * PAGE_SIZE, llen);
	if (ret > 0)
		ret = 0;

	if (!overlapped) {
		if (clusterpages == 1)
			kunmap_atomic(vin);
		else {
			erofs_vunmap(vin, clusterpages);
		}
	} else
		preempt_enable();

	return ret;
}
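z_erofs_unzip_lz4(), declared extern above, is the kernel-internal LZ4 helper used by both decompression paths: the compressed input always spans whole physical clusters (clusterpages * PAGE_SIZE), while the requested output length is the logical extent size. As a rough user-space analogue (not the driver's actual code path), liblz4's LZ4_decompress_safe_partial() can be driven the same way; the sketch below assumes liblz4 1.9 or later, and the buffers and sizes are synthetic.

/*
 * User-space analogue of the fixed-sized output decompression done by
 * z_erofs_unzip_lz4() above: compress a sample buffer, then decode it
 * back up to a requested output length. Build with: cc lz4_demo.c -llz4
 */
#include <lz4.h>
#include <stdio.h>
#include <string.h>

#define CLUSTER_SIZE 4096

int main(void)
{
	char plain[CLUSTER_SIZE * 2];		/* logical (uncompressed) data */
	char compressed[CLUSTER_SIZE];		/* one physical cluster */
	char out[sizeof(plain)];

	memset(plain, 'A', sizeof(plain));	/* highly compressible sample */

	int csize = LZ4_compress_default(plain, compressed,
					 sizeof(plain), sizeof(compressed));
	if (csize <= 0)
		return 1;

	/* decode a fixed-size compressed run up to the wanted output length */
	int n = LZ4_decompress_safe_partial(compressed, out,
					    csize, sizeof(plain), sizeof(out));
	printf("decoded %d bytes (expected %zu)\n", n, sizeof(plain));
	return n == (int)sizeof(plain) ? 0 : 1;
}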

utils.c

// SPDX-License-Identifier: GPL-2.0
/*
 * linux/drivers/staging/erofs/utils.c
 *
 * Copyright (C) 2018 HUAWEI, Inc.
 * http://www.huawei.com/
 * Created by Gao Xiang <gaoxiang25@huawei.com>
 *
 * This file is subject to the terms and conditions of the GNU General Public
 * License. See the file COPYING in the main directory of the Linux
 * distribution for more details.
 */
#include "internal.h"
#include <linux/pagevec.h>

struct page *erofs_allocpage(struct list_head *pool, gfp_t gfp)
{
	struct page *page;

	if (!list_empty(pool)) {
		page = lru_to_page(pool);
		list_del(&page->lru);
	} else {
		page = alloc_pages(gfp | __GFP_NOFAIL, 0);
	}
	return page;
}

/* global shrink count (for all mounted EROFS instances) */
static atomic_long_t erofs_global_shrink_cnt;

#ifdef CONFIG_EROFS_FS_ZIP

/* radix_tree and the future XArray both don't use tagptr_t yet */
struct erofs_workgroup *erofs_find_workgroup(
	struct super_block *sb, pgoff_t index, bool *tag)
{
	struct erofs_sb_info *sbi = EROFS_SB(sb);
	struct erofs_workgroup *grp;
	int oldcount;

repeat:
	rcu_read_lock();
	grp = radix_tree_lookup(&sbi->workstn_tree, index);
	if (grp != NULL) {
		*tag = radix_tree_exceptional_entry(grp);
		grp = (void *)((unsigned long)grp &
			~RADIX_TREE_EXCEPTIONAL_ENTRY);

		if (erofs_workgroup_get(grp, &oldcount)) {
			/* prefer to relax rcu read side */
			rcu_read_unlock();
			goto repeat;
		}

		/* decrease refcount added by erofs_workgroup_put */
		if (unlikely(oldcount == 1))
			atomic_long_dec(&erofs_global_shrink_cnt);
		DBG_BUGON(index != grp->index);
	}
	rcu_read_unlock();
	return grp;
}

int erofs_register_workgroup(struct super_block *sb,
			     struct erofs_workgroup *grp,
			     bool tag)
{
	struct erofs_sb_info *sbi;
	int err;

	/* grp shouldn't be broken or used before */
	if (unlikely(atomic_read(&grp->refcount) != 1)) {
		DBG_BUGON(1);
		return -EINVAL;
	}

	err = radix_tree_preload(GFP_NOFS);
	if (err)
		return err;

	sbi = EROFS_SB(sb);
	erofs_workstn_lock(sbi);

	if (tag)
		grp = (void *)((unsigned long)grp |
			1UL << RADIX_TREE_EXCEPTIONAL_SHIFT);

	/*
	 * Bump up the reference count before making this workgroup
	 * visible to other users in order to avoid a potential UAF
	 * when not serialized by erofs_workstn_lock.
	 */
	__erofs_workgroup_get(grp);

	err = radix_tree_insert(&sbi->workstn_tree,
				grp->index, grp);
	if (unlikely(err))
		/*
		 * it's safe to decrease since the workgroup isn't visible
		 * and refcount >= 2 (it cannot be frozen).
		 */
		__erofs_workgroup_put(grp);

	erofs_workstn_unlock(sbi);
	radix_tree_preload_end();
	return err;
}

extern void erofs_workgroup_free_rcu(struct erofs_workgroup *grp);

static void __erofs_workgroup_free(struct erofs_workgroup *grp)
{
	atomic_long_dec(&erofs_global_shrink_cnt);
	erofs_workgroup_free_rcu(grp);
}

int erofs_workgroup_put(struct erofs_workgroup *grp)
{
	int count = atomic_dec_return(&grp->refcount);

	if (count == 1)
		atomic_long_inc(&erofs_global_shrink_cnt);
	else if (!count)
		__erofs_workgroup_free(grp);
	return count;
}

#ifdef EROFS_FS_HAS_MANAGED_CACHE
/* for the cache-managed case, customized reclaim paths exist */
static void erofs_workgroup_unfreeze_final(struct erofs_workgroup *grp)
{
	erofs_workgroup_unfreeze(grp, 0);
	__erofs_workgroup_free(grp);
}

bool erofs_try_to_release_workgroup(struct erofs_sb_info *sbi,
				    struct erofs_workgroup *grp,
				    bool cleanup)
{
	void *entry;

	/*
	 * When managed cache is enabled, the refcount of workgroups
	 * themselves could be < 0 (frozen), so there is no guarantee
	 * that every refcount is > 0.
	 */
	if (!erofs_workgroup_try_to_freeze(grp, 1))
		return false;

	/*
	 * Note that all cached pages should be unlinked
	 * before the workgroup is deleted from the radix tree.
	 * Otherwise some cached pages of an orphaned old workgroup
	 * could still be linked after the new one becomes available.
	 */
	if (erofs_try_to_free_all_cached_pages(sbi, grp)) {
		erofs_workgroup_unfreeze(grp, 1);
		return false;
	}

	/*
	 * It is impossible to fail after the workgroup is frozen,
	 * however in order to avoid some race conditions, add a
	 * DBG_BUGON to observe this in advance.
	 */
	entry = radix_tree_delete(&sbi->workstn_tree, grp->index);
	DBG_BUGON((void *)((unsigned long)entry &
		~RADIX_TREE_EXCEPTIONAL_ENTRY) != grp);

	/*
	 * If managed cache is enabled, the last refcount
	 * should indicate the related workstation.
	 */
	erofs_workgroup_unfreeze_final(grp);
	return true;
}

#else
/* for the nocache case, there is no customized reclaim path at all */
bool erofs_try_to_release_workgroup(struct erofs_sb_info *sbi,
				    struct erofs_workgroup *grp,
				    bool cleanup)
{
	int cnt = atomic_read(&grp->refcount);
	void *entry;

	DBG_BUGON(cnt <= 0);
	DBG_BUGON(cleanup && cnt != 1);

	if (cnt > 1)
		return false;

	entry = radix_tree_delete(&sbi->workstn_tree, grp->index);
	DBG_BUGON((void *)((unsigned long)entry &
		~RADIX_TREE_EXCEPTIONAL_ENTRY) != grp);

	/* (rarely) it could be grabbed again when freeing */
	erofs_workgroup_put(grp);
	return true;
}

#endif

unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi,
				       unsigned long nr_shrink,
				       bool cleanup)
{
	pgoff_t first_index = 0;
	void *batch[PAGEVEC_SIZE];
	unsigned freed = 0;
	int i, found;

repeat:
	erofs_workstn_lock(sbi);

	found = radix_tree_gang_lookup(&sbi->workstn_tree,
				       batch, first_index, PAGEVEC_SIZE);

	for (i = 0; i < found; ++i) {
		struct erofs_workgroup *grp = (void *)
			((unsigned long)batch[i] &
				~RADIX_TREE_EXCEPTIONAL_ENTRY);

		first_index = grp->index + 1;

		/* try to shrink each valid workgroup */
		if (!erofs_try_to_release_workgroup(sbi, grp, cleanup))
			continue;

		++freed;
		if (unlikely(!--nr_shrink))
			break;
	}
	erofs_workstn_unlock(sbi);

	if (i && nr_shrink)
		goto repeat;

	return freed;
}

#endif

/* protected by 'erofs_sb_list_lock' */
static unsigned int shrinker_run_no;

/* protects the mounted 'erofs_sb_list' */
static DEFINE_SPINLOCK(erofs_sb_list_lock);
static LIST_HEAD(erofs_sb_list);

void erofs_register_super(struct super_block *sb)
{
	struct erofs_sb_info *sbi = EROFS_SB(sb);

	mutex_init(&sbi->umount_mutex);

	spin_lock(&erofs_sb_list_lock);
	list_add(&sbi->list, &erofs_sb_list);
	spin_unlock(&erofs_sb_list_lock);
}

void erofs_unregister_super(struct super_block *sb)
{
	spin_lock(&erofs_sb_list_lock);
	list_del(&EROFS_SB(sb)->list);
	spin_unlock(&erofs_sb_list_lock);
}

unsigned long erofs_shrink_count(struct shrinker *shrink,
				 struct shrink_control *sc)
{
	return atomic_long_read(&erofs_global_shrink_cnt);
}

unsigned long erofs_shrink_scan(struct shrinker *shrink,
				struct shrink_control *sc)
{
	struct erofs_sb_info *sbi;
	struct list_head *p;

	unsigned long nr = sc->nr_to_scan;
	unsigned int run_no;
	unsigned long freed = 0;

	spin_lock(&erofs_sb_list_lock);
	do
		run_no = ++shrinker_run_no;
	while (run_no == 0);

	/* Iterate over all mounted superblocks and try to shrink them */
	p = erofs_sb_list.next;
	while (p != &erofs_sb_list) {
		sbi = list_entry(p, struct erofs_sb_info, list);

		/*
		 * We move the ones we do to the end of the list, so we stop
		 * when we see one we have already done.
		 */
		if (sbi->shrinker_run_no == run_no)
			break;

		if (!mutex_trylock(&sbi->umount_mutex)) {
			p = p->next;
			continue;
		}

		spin_unlock(&erofs_sb_list_lock);
		sbi->shrinker_run_no = run_no;

#ifdef CONFIG_EROFS_FS_ZIP
		freed += erofs_shrink_workstation(sbi, nr - freed, false);
#endif

		spin_lock(&erofs_sb_list_lock);
		/* Get the next list element before we move this one */
		p = p->next;

		/*
		 * Move this one to the end of the list to provide some
		 * fairness.
		 */
		list_move_tail(&sbi->list, &erofs_sb_list);
		mutex_unlock(&sbi->umount_mutex);

		if (freed >= nr)
			break;
	}
	spin_unlock(&erofs_sb_list_lock);
	return freed;
}
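erofs_shrink_count() and erofs_shrink_scan() are the two halves of a memory shrinker; the registration itself is not part of utils.c (in this driver it lives in super.c, which this article does not reproduce). The snippet below is only a sketch of how such callbacks are typically wired up on kernels of this era (register_shrinker/unregister_shrinker API); the init/exit helper names are hypothetical.

#include <linux/shrinker.h>

/* erofs_shrink_count()/erofs_shrink_scan() are defined in utils.c above */
static struct shrinker erofs_shrinker_info = {
	.count_objects	= erofs_shrink_count,
	.scan_objects	= erofs_shrink_scan,
	.seeks		= DEFAULT_SEEKS,
};

/* hypothetical module hooks, shown only to illustrate the wiring */
static int __init erofs_shrinker_example_init(void)
{
	/* make the callbacks above visible to memory reclaim */
	return register_shrinker(&erofs_shrinker_info);
}

static void __exit erofs_shrinker_example_exit(void)
{
	unregister_shrinker(&erofs_shrinker_info);
}

Under memory pressure the core MM first calls count_objects() to learn how many workgroups are reclaimable (the global erofs_global_shrink_cnt), then scan_objects() to actually release up to sc->nr_to_scan of them via erofs_shrink_workstation().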

xattr.c

// SPDX-License-Identifier: GPL-2.0
/*
 * linux/drivers/staging/erofs/xattr.c
 *
 * Copyright (C) 2017-2018 HUAWEI, Inc.
 * http://www.huawei.com/
 * Created by Gao Xiang <gaoxiang25@huawei.com>
 *
 * This file is subject to the terms and conditions of the GNU General Public
 * License. See the file COPYING in the main directory of the Linux
 * distribution for more details.
 */
#include <linux/security.h>
#include "xattr.h"

struct xattr_iter {
	struct super_block *sb;
	struct page *page;
	void *kaddr;

	erofs_blk_t blkaddr;
	unsigned ofs;
};

static inline void xattr_iter_end(struct xattr_iter *it, bool atomic)
{
	/* the only user of kunmap() is 'init_inode_xattrs' */
	if (unlikely(!atomic))
		kunmap(it->page);
	else
		kunmap_atomic(it->kaddr);

	unlock_page(it->page);
	put_page(it->page);
}

static inline void xattr_iter_end_final(struct xattr_iter *it)
{
	if (!it->page)
		return;

	xattr_iter_end(it, true);
}

static int init_inode_xattrs(struct inode *inode)
{
	struct erofs_vnode *const vi = EROFS_V(inode);
	struct xattr_iter it;
	unsigned i;
	struct erofs_xattr_ibody_header *ih;
	struct erofs_sb_info *sbi;
	bool atomic_map;
	int ret = 0;

	/* in most cases, the xattrs of this inode have already been initialized */
	if (test_bit(EROFS_V_EA_INITED_BIT, &vi->flags))
		return 0;

	if (wait_on_bit_lock(&vi->flags, EROFS_V_BL_XATTR_BIT, TASK_KILLABLE))
		return -ERESTARTSYS;

	/* someone has initialized xattrs for us? */
	if (test_bit(EROFS_V_EA_INITED_BIT, &vi->flags))
		goto out_unlock;

	/*
	 * bypass all xattr operations if ->xattr_isize is not greater than
	 * sizeof(struct erofs_xattr_ibody_header), in detail:
	 * 1) if it is not large enough to contain erofs_xattr_ibody_header,
	 *    ->xattr_isize should be 0 (which means no xattrs);
	 * 2) if it is exactly sizeof(erofs_xattr_ibody_header), the on-disk
	 *    layout is currently undefined (it may be used later with some
	 *    new sb feature).
	 */
	if (vi->xattr_isize == sizeof(struct erofs_xattr_ibody_header)) {
		errln("xattr_isize %d of nid %llu is not supported yet",
		      vi->xattr_isize, vi->nid);
		ret = -ENOTSUPP;
		goto out_unlock;
	} else if (vi->xattr_isize < sizeof(struct erofs_xattr_ibody_header)) {
		if (unlikely(vi->xattr_isize)) {
			DBG_BUGON(1);
			ret = -EIO;
			goto out_unlock;	/* xattr ondisk layout error */
		}
		ret = -ENOATTR;
		goto out_unlock;
	}

	sbi = EROFS_I_SB(inode);
	it.blkaddr = erofs_blknr(iloc(sbi, vi->nid) + vi->inode_isize);
	it.ofs = erofs_blkoff(iloc(sbi, vi->nid) + vi->inode_isize);

	it.page = erofs_get_inline_page(inode, it.blkaddr);
	if (IS_ERR(it.page)) {
		ret = PTR_ERR(it.page);
		goto out_unlock;
	}

	/* read in shared xattr array (non-atomic, see kmalloc below) */
	it.kaddr = kmap(it.page);
	atomic_map = false;

	ih = (struct erofs_xattr_ibody_header *)(it.kaddr + it.ofs);

	vi->xattr_shared_count = ih->h_shared_count;
	vi->xattr_shared_xattrs = kmalloc_array(vi->xattr_shared_count,
						sizeof(uint), GFP_KERNEL);
	if (!vi->xattr_shared_xattrs) {
		xattr_iter_end(&it, atomic_map);
		ret = -ENOMEM;
		goto out_unlock;
	}

	/* let's skip ibody header */
	it.ofs += sizeof(struct erofs_xattr_ibody_header);

	for (i = 0; i < vi->xattr_shared_count; ++i) {
		if (unlikely(it.ofs >= EROFS_BLKSIZ)) {
			/* cannot be unaligned */
			BUG_ON(it.ofs != EROFS_BLKSIZ);
			xattr_iter_end(&it, atomic_map);

			it.page = erofs_get_meta_page(inode->i_sb,
				++it.blkaddr, S_ISDIR(inode->i_mode));
			if (IS_ERR(it.page)) {
				kfree(vi->xattr_shared_xattrs);
				vi->xattr_shared_xattrs = NULL;
				ret = PTR_ERR(it.page);
				goto out_unlock;
			}

			it.kaddr = kmap_atomic(it.page);
			atomic_map = true;
			it.ofs = 0;
		}
		vi->xattr_shared_xattrs[i] =
			le32_to_cpu(*(__le32 *)(it.kaddr + it.ofs));
		it.ofs += sizeof(__le32);
	}
	xattr_iter_end(&it, atomic_map);

	set_bit(EROFS_V_EA_INITED_BIT, &vi->flags);

out_unlock:
	clear_and_wake_up_bit(EROFS_V_BL_XATTR_BIT, &vi->flags);
	return ret;
}

struct xattr_iter_handlers {
	int (*entry)(struct xattr_iter *, struct erofs_xattr_entry *);
	int (*name)(struct xattr_iter *, unsigned, char *, unsigned);
	int (*alloc_buffer)(struct xattr_iter *, unsigned);
	void (*value)(struct xattr_iter *, unsigned, char *, unsigned);
};

static inline int xattr_iter_fixup(struct xattr_iter *it)
{
	if (it->ofs < EROFS_BLKSIZ)
		return 0;

	xattr_iter_end(it, true);

	it->blkaddr += erofs_blknr(it->ofs);

	it->page = erofs_get_meta_page(it->sb, it->blkaddr, false);
	if (IS_ERR(it->page)) {
		int err = PTR_ERR(it->page);

		it->page = NULL;
		return err;
	}

	it->kaddr = kmap_atomic(it->page);
	it->ofs = erofs_blkoff(it->ofs);
	return 0;
}

static int inline_xattr_iter_begin(struct xattr_iter *it,
				   struct inode *inode)
{
	struct erofs_vnode *const vi = EROFS_V(inode);
	struct erofs_sb_info *const sbi = EROFS_SB(inode->i_sb);
	unsigned xattr_header_sz, inline_xattr_ofs;

	xattr_header_sz = inlinexattr_header_size(inode);
	if (unlikely(xattr_header_sz >= vi->xattr_isize)) {
		BUG_ON(xattr_header_sz > vi->xattr_isize);
		return -ENOATTR;
	}

	inline_xattr_ofs = vi->inode_isize + xattr_header_sz;

	it->blkaddr = erofs_blknr(iloc(sbi, vi->nid) + inline_xattr_ofs);
	it->ofs = erofs_blkoff(iloc(sbi, vi->nid) + inline_xattr_ofs);

	it->page = erofs_get_inline_page(inode, it->blkaddr);
	if (IS_ERR(it->page))
		return PTR_ERR(it->page);

	it->kaddr = kmap_atomic(it->page);
	return vi->xattr_isize - xattr_header_sz;
}

static int xattr_foreach(struct xattr_iter *it,
	const struct xattr_iter_handlers *op, unsigned int *tlimit)
{
	struct erofs_xattr_entry entry;
	unsigned value_sz, processed, slice;
	int err;

	/* 0. fixup blkaddr, ofs, ipage */
	err = xattr_iter_fixup(it);
	if (err)
		return err;

	/*
	 * 1. read the xattr entry into memory; since EROFS_XATTR_ALIGN
	 *    is applied, the whole entry must reside within one page
	 */
	entry = *(struct erofs_xattr_entry *)(it->kaddr + it->ofs);
	if (tlimit != NULL) {
		unsigned entry_sz = EROFS_XATTR_ENTRY_SIZE(&entry);

		BUG_ON(*tlimit < entry_sz);
		*tlimit -= entry_sz;
	}

	it->ofs += sizeof(struct erofs_xattr_entry);
	value_sz = le16_to_cpu(entry.e_value_size);

	/* handle entry */
	err = op->entry(it, &entry);
	if (err) {
		it->ofs += entry.e_name_len + value_sz;
		goto out;
	}

	/* 2. handle xattr name (ofs will finally be at the end of name) */
	processed = 0;

	while (processed < entry.e_name_len) {
		if (it->ofs >= EROFS_BLKSIZ) {
			BUG_ON(it->ofs > EROFS_BLKSIZ);

			err = xattr_iter_fixup(it);
			if (err)
				goto out;
			it->ofs = 0;
		}

		slice = min_t(unsigned, PAGE_SIZE - it->ofs,
			entry.e_name_len - processed);

		/* handle name */
		err = op->name(it, processed, it->kaddr + it->ofs, slice);
		if (err) {
			it->ofs += entry.e_name_len - processed + value_sz;
			goto out;
		}

		it->ofs += slice;
		processed += slice;
	}

	/* 3. handle xattr value */
	processed = 0;

	if (op->alloc_buffer != NULL) {
		err = op->alloc_buffer(it, value_sz);
		if (err) {
			it->ofs += value_sz;
			goto out;
		}
	}

	while (processed < value_sz) {
		if (it->ofs >= EROFS_BLKSIZ) {
			BUG_ON(it->ofs > EROFS_BLKSIZ);

			err = xattr_iter_fixup(it);
			if (err)
				goto out;
			it->ofs = 0;
		}

		slice = min_t(unsigned, PAGE_SIZE - it->ofs,
			value_sz - processed);
		op->value(it, processed, it->kaddr + it->ofs, slice);
		it->ofs += slice;
		processed += slice;
	}

out:
	/* we assume that ofs is aligned with 4 bytes */
	it->ofs = EROFS_XATTR_ALIGN(it->ofs);
	return err;
}

struct getxattr_iter {
	struct xattr_iter it;

	char *buffer;
	int buffer_size, index;
	struct qstr name;
};

static int xattr_entrymatch(struct xattr_iter *_it,
	struct erofs_xattr_entry *entry)
{
	struct getxattr_iter *it = container_of(_it, struct getxattr_iter, it);

	return (it->index != entry->e_name_index ||
		it->name.len != entry->e_name_len) ? -ENOATTR : 0;
}

static int xattr_namematch(struct xattr_iter *_it,
	unsigned processed, char *buf, unsigned len)
{
	struct getxattr_iter *it = container_of(_it, struct getxattr_iter, it);

	return memcmp(buf, it->name.name + processed, len) ? -ENOATTR : 0;
}

static int xattr_checkbuffer(struct xattr_iter *_it,
	unsigned value_sz)
{
	struct getxattr_iter *it = container_of(_it, struct getxattr_iter, it);
	int err = it->buffer_size < value_sz ? -ERANGE : 0;

	it->buffer_size = value_sz;
	return it->buffer == NULL ? 1 : err;
}

static void xattr_copyvalue(struct xattr_iter *_it,
	unsigned processed, char *buf, unsigned len)
{
	struct getxattr_iter *it = container_of(_it, struct getxattr_iter, it);

	memcpy(it->buffer + processed, buf, len);
}

static const struct xattr_iter_handlers find_xattr_handlers = {
	.entry = xattr_entrymatch,
	.name = xattr_namematch,
	.alloc_buffer = xattr_checkbuffer,
	.value = xattr_copyvalue
};

static int inline_getxattr(struct inode *inode, struct getxattr_iter *it)
{
	int ret;
	unsigned remaining;

	ret = inline_xattr_iter_begin(&it->it, inode);
	if (ret < 0)
		return ret;

	remaining = ret;
	while (remaining) {
		ret = xattr_foreach(&it->it, &find_xattr_handlers, &remaining);
		if (ret >= 0)
			break;

		if (ret != -ENOATTR)	/* -ENOMEM, -EIO, etc. */
			break;
	}
	xattr_iter_end_final(&it->it);

	return ret < 0 ? ret : it->buffer_size;
}

static int shared_getxattr(struct inode *inode, struct getxattr_iter *it)
{
	struct erofs_vnode *const vi = EROFS_V(inode);
	struct erofs_sb_info *const sbi = EROFS_SB(inode->i_sb);
	unsigned i;
	int ret = -ENOATTR;

	for (i = 0; i < vi->xattr_shared_count; ++i) {
		erofs_blk_t blkaddr =
			xattrblock_addr(sbi, vi->xattr_shared_xattrs[i]);

		it->it.ofs = xattrblock_offset(sbi, vi->xattr_shared_xattrs[i]);

		if (!i || blkaddr != it->it.blkaddr) {
			if (i)
				xattr_iter_end(&it->it, true);

			it->it.page = erofs_get_meta_page(inode->i_sb,
				blkaddr, false);
			if (IS_ERR(it->it.page))
				return PTR_ERR(it->it.page);

			it->it.kaddr = kmap_atomic(it->it.page);
			it->it.blkaddr = blkaddr;
		}

		ret = xattr_foreach(&it->it, &find_xattr_handlers, NULL);
		if (ret >= 0)
			break;

		if (ret != -ENOATTR)	/* -ENOMEM, -EIO, etc. */
			break;
	}
	if (vi->xattr_shared_count)
		xattr_iter_end_final(&it->it);

	return ret < 0 ? ret : it->buffer_size;
}

static bool erofs_xattr_user_list(struct dentry *dentry)
{
	return test_opt(EROFS_SB(dentry->d_sb), XATTR_USER);
}

static bool erofs_xattr_trusted_list(struct dentry *dentry)
{
	return capable(CAP_SYS_ADMIN);
}

int erofs_getxattr(struct inode *inode, int index,
	const char *name,
	void *buffer, size_t buffer_size)
{
	int ret;
	struct getxattr_iter it;

	if (unlikely(name == NULL))
		return -EINVAL;

	ret = init_inode_xattrs(inode);
	if (ret)
		return ret;

	it.index = index;

	it.name.len = strlen(name);
	if (it.name.len > EROFS_NAME_LEN)
		return -ERANGE;
	it.name.name = name;

	it.buffer = buffer;
	it.buffer_size = buffer_size;

	it.it.sb = inode->i_sb;
	ret = inline_getxattr(inode, &it);
	if (ret == -ENOATTR)
		ret = shared_getxattr(inode, &it);
	return ret;
}

static int erofs_xattr_generic_get(const struct xattr_handler *handler,
		struct dentry *unused, struct inode *inode,
		const char *name, void *buffer, size_t size)
{
	struct erofs_sb_info *const sbi = EROFS_I_SB(inode);

	switch (handler->flags) {
	case EROFS_XATTR_INDEX_USER:
		if (!test_opt(sbi, XATTR_USER))
			return -EOPNOTSUPP;
		break;
	case EROFS_XATTR_INDEX_TRUSTED:
		if (!capable(CAP_SYS_ADMIN))
			return -EPERM;
		break;
	case EROFS_XATTR_INDEX_SECURITY:
		break;
	default:
		return -EINVAL;
	}

	return erofs_getxattr(inode, handler->flags, name, buffer, size);
}

const struct xattr_handler erofs_xattr_user_handler = {
	.prefix	= XATTR_USER_PREFIX,
	.flags	= EROFS_XATTR_INDEX_USER,
	.list	= erofs_xattr_user_list,
	.get	= erofs_xattr_generic_get,
};

const struct xattr_handler erofs_xattr_trusted_handler = {
	.prefix	= XATTR_TRUSTED_PREFIX,
	.flags	= EROFS_XATTR_INDEX_TRUSTED,
	.list	= erofs_xattr_trusted_list,
	.get	= erofs_xattr_generic_get,
};

#ifdef CONFIG_EROFS_FS_SECURITY
const struct xattr_handler __maybe_unused erofs_xattr_security_handler = {
	.prefix	= XATTR_SECURITY_PREFIX,
	.flags	= EROFS_XATTR_INDEX_SECURITY,
	.get	= erofs_xattr_generic_get,
};
#endif

const struct xattr_handler *erofs_xattr_handlers[] = {
	&erofs_xattr_user_handler,
#ifdef CONFIG_EROFS_FS_POSIX_ACL
	&posix_acl_access_xattr_handler,
	&posix_acl_default_xattr_handler,
#endif
	&erofs_xattr_trusted_handler,
#ifdef CONFIG_EROFS_FS_SECURITY
	&erofs_xattr_security_handler,
#endif
	NULL,
};

struct listxattr_iter {
	struct xattr_iter it;

	struct dentry *dentry;
	char *buffer;
	int buffer_size, buffer_ofs;
};

static int xattr_entrylist(struct xattr_iter *_it,
	struct erofs_xattr_entry *entry)
{
	struct listxattr_iter *it =
		container_of(_it, struct listxattr_iter, it);
	unsigned prefix_len;
	const char *prefix;

	const struct xattr_handler *h =
		erofs_xattr_handler(entry->e_name_index);

	if (h == NULL || (h->list != NULL && !h->list(it->dentry)))
		return 1;

	/* Note that at least one of 'prefix' and 'name' should be non-NULL */
	prefix = h->prefix != NULL ? h->prefix : h->name;
	prefix_len = strlen(prefix);

	if (it->buffer == NULL) {
		it->buffer_ofs += prefix_len + entry->e_name_len + 1;
		return 1;
	}

	if (it->buffer_ofs + prefix_len
		+ entry->e_name_len + 1 > it->buffer_size)
		return -ERANGE;

	memcpy(it->buffer + it->buffer_ofs, prefix, prefix_len);
	it->buffer_ofs += prefix_len;
	return 0;
}

static int xattr_namelist(struct xattr_iter *_it,
	unsigned processed, char *buf, unsigned len)
{
	struct listxattr_iter *it =
		container_of(_it, struct listxattr_iter, it);

	memcpy(it->buffer + it->buffer_ofs, buf, len);
	it->buffer_ofs += len;
	return 0;
}

static int xattr_skipvalue(struct xattr_iter *_it,
	unsigned value_sz)
{
	struct listxattr_iter *it =
		container_of(_it, struct listxattr_iter, it);

	it->buffer[it->buffer_ofs++] = '\0';
	return 1;
}

static const struct xattr_iter_handlers list_xattr_handlers = {
	.entry = xattr_entrylist,
	.name = xattr_namelist,
	.alloc_buffer = xattr_skipvalue,
	.value = NULL
};

static int inline_listxattr(struct listxattr_iter *it)
{
	int ret;
	unsigned remaining;

	ret = inline_xattr_iter_begin(&it->it, d_inode(it->dentry));
	if (ret < 0)
		return ret;

	remaining = ret;
	while (remaining) {
		ret = xattr_foreach(&it->it, &list_xattr_handlers, &remaining);
		if (ret < 0)
			break;
	}
	xattr_iter_end_final(&it->it);
	return ret < 0 ? ret : it->buffer_ofs;
}

static int shared_listxattr(struct listxattr_iter *it)
{
	struct inode *const inode = d_inode(it->dentry);
	struct erofs_vnode *const vi = EROFS_V(inode);
	struct erofs_sb_info *const sbi = EROFS_I_SB(inode);
	unsigned i;
	int ret = 0;

	for (i = 0; i < vi->xattr_shared_count; ++i) {
		erofs_blk_t blkaddr =
			xattrblock_addr(sbi, vi->xattr_shared_xattrs[i]);

		it->it.ofs = xattrblock_offset(sbi, vi->xattr_shared_xattrs[i]);
		if (!i || blkaddr != it->it.blkaddr) {
			if (i)
				xattr_iter_end(&it->it, true);

			it->it.page = erofs_get_meta_page(inode->i_sb,
				blkaddr, false);
			if (IS_ERR(it->it.page))
				return PTR_ERR(it->it.page);

			it->it.kaddr = kmap_atomic(it->it.page);
			it->it.blkaddr = blkaddr;
		}

		ret = xattr_foreach(&it->it, &list_xattr_handlers, NULL);
		if (ret < 0)
			break;
	}
	if (vi->xattr_shared_count)
		xattr_iter_end_final(&it->it);

	return ret < 0 ? ret : it->buffer_ofs;
}

ssize_t erofs_listxattr(struct dentry *dentry,
	char *buffer, size_t buffer_size)
{
	int ret;
	struct listxattr_iter it;

	ret = init_inode_xattrs(d_inode(dentry));
	if (ret == -ENOATTR)
		return 0;
	if (ret)
		return ret;

	it.dentry = dentry;
	it.buffer = buffer;
	it.buffer_size = buffer_size;
	it.buffer_ofs = 0;

	it.it.sb = dentry->d_sb;

	ret = inline_listxattr(&it);
	if (ret < 0 && ret != -ENOATTR)
		return ret;
	return shared_listxattr(&it);
}
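From user space, the handlers above are reached through the generic xattr system calls. Here is a small illustrative program (not part of the driver) that lists and reads the attributes of a file on a mounted EROFS image; the mount point and file path are hypothetical.

/*
 * List every xattr name of a file and fetch each value's size via the
 * generic listxattr()/getxattr() syscalls. Path below is hypothetical.
 */
#include <stdio.h>
#include <string.h>
#include <sys/xattr.h>

int main(void)
{
	const char *path = "/mnt/erofs/some_file";	/* hypothetical */
	char list[1024], value[256];

	ssize_t n = listxattr(path, list, sizeof(list));
	if (n < 0) {
		perror("listxattr");
		return 1;
	}

	/* the returned list is a sequence of NUL-terminated names */
	for (ssize_t off = 0; off < n; off += strlen(list + off) + 1) {
		const char *name = list + off;
		ssize_t vlen = getxattr(path, name, value, sizeof(value));

		printf("%s (%zd bytes)\n", name, vlen);
	}
	return 0;
}

Attributes under the trusted. prefix will only be listed for a privileged process, matching erofs_xattr_trusted_list() above.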
