<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/fs/btrfs/sysfs.c, branch v6.7</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>btrfs: sysfs: show temp_fsid feature</title>
<updated>2023-10-12T14:44:18+00:00</updated>
<author>
<name>Anand Jain</name>
<email>anand.jain@oracle.com</email>
</author>
<published>2023-10-04T15:00:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f3623740068e548b7c6fdb42171c118189d0e03f'/>
<id>f3623740068e548b7c6fdb42171c118189d0e03f</id>
<content type='text'>
This adds sysfs objects to indicate temp_fsid feature support and
its status.

  /sys/fs/btrfs/features/temp_fsid
  /sys/fs/btrfs/&lt;UUID&gt;/temp_fsid

For example:

   Consider two cloned and mounted devices.

      $ blkid /dev/sdc[1-2]
      /dev/sdc1: UUID="509ad44b-ad2a-4a8a-bc8d-fe69db7220d5" ..
      /dev/sdc2: UUID="509ad44b-ad2a-4a8a-bc8d-fe69db7220d5" ..

   One gets actual fsid, and the other gets the temp_fsid when
   mounted.

      $ btrfs filesystem show -m
      Label: none  uuid: 509ad44b-ad2a-4a8a-bc8d-fe69db7220d5
	      Total devices 1 FS bytes used 54.14MiB
	      devid    1 size 300.00MiB used 144.00MiB path /dev/sdc1

      Label: none  uuid: 33bad74e-c91b-43a5-aef8-b3cab97ae63a
	      Total devices 1 FS bytes used 54.14MiB
	      devid    1 size 300.00MiB used 144.00MiB path /dev/sdc2

   Their sysfs as below.

      $ cat /sys/fs/btrfs/features/temp_fsid
      0

      $ cat /sys/fs/btrfs/509ad44b-ad2a-4a8a-bc8d-fe69db7220d5/temp_fsid
      0

      $ cat /sys/fs/btrfs/33bad74e-c91b-43a5-aef8-b3cab97ae63a/temp_fsid
      1

Signed-off-by: Anand Jain &lt;anand.jain@oracle.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This adds sysfs objects to indicate temp_fsid feature support and
its status.

  /sys/fs/btrfs/features/temp_fsid
  /sys/fs/btrfs/&lt;UUID&gt;/temp_fsid

For example:

   Consider two cloned and mounted devices.

      $ blkid /dev/sdc[1-2]
      /dev/sdc1: UUID="509ad44b-ad2a-4a8a-bc8d-fe69db7220d5" ..
      /dev/sdc2: UUID="509ad44b-ad2a-4a8a-bc8d-fe69db7220d5" ..

   One gets actual fsid, and the other gets the temp_fsid when
   mounted.

      $ btrfs filesystem show -m
      Label: none  uuid: 509ad44b-ad2a-4a8a-bc8d-fe69db7220d5
	      Total devices 1 FS bytes used 54.14MiB
	      devid    1 size 300.00MiB used 144.00MiB path /dev/sdc1

      Label: none  uuid: 33bad74e-c91b-43a5-aef8-b3cab97ae63a
	      Total devices 1 FS bytes used 54.14MiB
	      devid    1 size 300.00MiB used 144.00MiB path /dev/sdc2

   Their sysfs as below.

      $ cat /sys/fs/btrfs/features/temp_fsid
      0

      $ cat /sys/fs/btrfs/509ad44b-ad2a-4a8a-bc8d-fe69db7220d5/temp_fsid
      0

      $ cat /sys/fs/btrfs/33bad74e-c91b-43a5-aef8-b3cab97ae63a/temp_fsid
      1

Signed-off-by: Anand Jain &lt;anand.jain@oracle.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: add and use helpers for reading and writing fs_info-&gt;generation</title>
<updated>2023-10-12T14:44:17+00:00</updated>
<author>
<name>Filipe Manana</name>
<email>fdmanana@suse.com</email>
</author>
<published>2023-10-04T10:38:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4a4f8fe2b0230c22aba40c9f8ea7b9c6fcfc8417'/>
<id>4a4f8fe2b0230c22aba40c9f8ea7b9c6fcfc8417</id>
<content type='text'>
Currently the generation field of struct btrfs_fs_info is always modified
while holding fs_info-&gt;trans_lock locked. Most readers will access this
field without taking that lock but while holding a transaction handle,
which is safe to do due to the transaction life cycle.

However there are other readers that are neither holding the lock nor
holding a transaction handle open:

1) When reading an inode from disk, at btrfs_read_locked_inode();

2) When reading the generation to expose it to sysfs, at
   btrfs_generation_show();

3) Early in the fsync path, at skip_inode_logging();

4) When creating a hole at btrfs_cont_expand(), during write paths,
   truncate and reflinking;

5) In the fs_info ioctl (btrfs_ioctl_fs_info());

6) While mounting the filesystem, in the open_ctree() path. In these
   cases it's safe to directly read fs_info-&gt;generation as no one
   can concurrently start a transaction and update fs_info-&gt;generation.

In case of the fsync path, races here should be harmless, and in the worst
case they may cause a fsync to log an inode when it's not really needed,
so nothing bad from a functional perspective. In the other cases it's not
so clear if functional problems may arise, though in case 1 rare things
like a load/store tearing [1] may cause the BTRFS_INODE_NEEDS_FULL_SYNC
flag not being set on an inode and therefore result in incorrect logging
later on in case a fsync call is made.

To avoid data race warnings from tools like KCSAN and other issues such
as load and store tearing (amongst others, see [1]), create helpers to
access the generation field of struct btrfs_fs_info using READ_ONCE() and
WRITE_ONCE(), and use these helpers where needed.

[1] https://lwn.net/Articles/793253/

Signed-off-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently the generation field of struct btrfs_fs_info is always modified
while holding fs_info-&gt;trans_lock locked. Most readers will access this
field without taking that lock but while holding a transaction handle,
which is safe to do due to the transaction life cycle.

However there are other readers that are neither holding the lock nor
holding a transaction handle open:

1) When reading an inode from disk, at btrfs_read_locked_inode();

2) When reading the generation to expose it to sysfs, at
   btrfs_generation_show();

3) Early in the fsync path, at skip_inode_logging();

4) When creating a hole at btrfs_cont_expand(), during write paths,
   truncate and reflinking;

5) In the fs_info ioctl (btrfs_ioctl_fs_info());

6) While mounting the filesystem, in the open_ctree() path. In these
   cases it's safe to directly read fs_info-&gt;generation as no one
   can concurrently start a transaction and update fs_info-&gt;generation.

In case of the fsync path, races here should be harmless, and in the worst
case they may cause a fsync to log an inode when it's not really needed,
so nothing bad from a functional perspective. In the other cases it's not
so clear if functional problems may arise, though in case 1 rare things
like a load/store tearing [1] may cause the BTRFS_INODE_NEEDS_FULL_SYNC
flag not being set on an inode and therefore result in incorrect logging
later on in case a fsync call is made.

To avoid data race warnings from tools like KCSAN and other issues such
as load and store tearing (amongst others, see [1]), create helpers to
access the generation field of struct btrfs_fs_info using READ_ONCE() and
WRITE_ONCE(), and use these helpers where needed.

[1] https://lwn.net/Articles/793253/

Signed-off-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: sysfs: add simple_quota incompat feature entry</title>
<updated>2023-10-12T14:44:10+00:00</updated>
<author>
<name>Boris Burkov</name>
<email>boris@bur.io</email>
</author>
<published>2023-07-13T20:39:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a744986ac4dbe42496410780800fbbc00afaa910'/>
<id>a744986ac4dbe42496410780800fbbc00afaa910</id>
<content type='text'>
Add an entry in the features directory for the new incompat flag

Signed-off-by: Boris Burkov &lt;boris@bur.io&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add an entry in the features directory for the new incompat flag

Signed-off-by: Boris Burkov &lt;boris@bur.io&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: sysfs: expose quota mode via sysfs</title>
<updated>2023-10-12T14:44:10+00:00</updated>
<author>
<name>Boris Burkov</name>
<email>boris@bur.io</email>
</author>
<published>2023-04-27T17:58:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0182764a21b2923d4c238fed9a54db82b65d33b6'/>
<id>0182764a21b2923d4c238fed9a54db82b65d33b6</id>
<content type='text'>
Add a new sysfs file /sys/fs/btrfs/&lt;uuid&gt;/qgroups/mode
which prints out the mode qgroups is running in. The possible modes are
qgroup, and squota.

If quotas are not enabled, then the qgroups directory will not exist,
so don't handle that mode.

Signed-off-by: Boris Burkov &lt;boris@bur.io&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a new sysfs file /sys/fs/btrfs/&lt;uuid&gt;/qgroups/mode
which prints out the mode qgroups is running in. The possible modes are
qgroup, and squota.

If quotas are not enabled, then the qgroups directory will not exist,
so don't handle that mode.

Signed-off-by: Boris Burkov &lt;boris@bur.io&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: sysfs: announce presence of raid-stripe-tree</title>
<updated>2023-10-12T14:44:09+00:00</updated>
<author>
<name>Johannes Thumshirn</name>
<email>johannes.thumshirn@wdc.com</email>
</author>
<published>2023-09-14T16:07:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9f9918a8017b7925da2fad16b4ccf6a14630f03e'/>
<id>9f9918a8017b7925da2fad16b4ccf6a14630f03e</id>
<content type='text'>
If a filesystem with a raid-stripe-tree is mounted, show the RST feature
in sysfs, currently still under the CONFIG_BTRFS_DEBUG option.

Reviewed-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Signed-off-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If a filesystem with a raid-stripe-tree is mounted, show the RST feature
in sysfs, currently still under the CONFIG_BTRFS_DEBUG option.

Reviewed-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Signed-off-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: sysfs: show if ACL support has been compiled in</title>
<updated>2023-08-21T12:52:12+00:00</updated>
<author>
<name>Anand Jain</name>
<email>anand.jain@oracle.com</email>
</author>
<published>2023-06-20T08:55:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=070bb0011ccfd00c7de25dce2489fbe9cce5ba44'/>
<id>070bb0011ccfd00c7de25dce2489fbe9cce5ba44</id>
<content type='text'>
ACL support depends on the compile-time configuration option
CONFIG_BTRFS_FS_POSIX_ACL. Prior to mounting a btrfs filesystem, it is not
possible to determine whether ACL support has been compiled in. To address
this, add a sysfs interface, /sys/fs/btrfs/features/acl, and check for ACL
support in the system's btrfs.

  To determine ACL support:

  Return 0 indicates ACL is not supported:
    $ cat /sys/fs/btrfs/features/acl
    0

  Return 1 indicates ACL is supported:
    $ cat /sys/fs/btrfs/features/acl
    1

IMO, this is a better approach, so that we also know if kernel is older.

  On an older kernel
    $ ls /sys/fs/btrfs/features/acl
    ls: cannot access '/sys/fs/btrfs/features/acl': No such file or directory

    mount a btrfs filesystem
    $ cat /proc/self/mounts | grep btrfs | grep -q noacl
    $ echo $?
    0

Signed-off-by: Anand Jain &lt;anand.jain@oracle.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ACL support depends on the compile-time configuration option
CONFIG_BTRFS_FS_POSIX_ACL. Prior to mounting a btrfs filesystem, it is not
possible to determine whether ACL support has been compiled in. To address
this, add a sysfs interface, /sys/fs/btrfs/features/acl, and check for ACL
support in the system's btrfs.

  To determine ACL support:

  Return 0 indicates ACL is not supported:
    $ cat /sys/fs/btrfs/features/acl
    0

  Return 1 indicates ACL is supported:
    $ cat /sys/fs/btrfs/features/acl
    1

IMO, this is a better approach, so that we also know if kernel is older.

  On an older kernel
    $ ls /sys/fs/btrfs/features/acl
    ls: cannot access '/sys/fs/btrfs/features/acl': No such file or directory

    mount a btrfs filesystem
    $ cat /proc/self/mounts | grep btrfs | grep -q noacl
    $ echo $?
    0

Signed-off-by: Anand Jain &lt;anand.jain@oracle.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: sysfs: relax bg_reclaim_threshold for debugging purposes</title>
<updated>2023-04-17T16:01:18+00:00</updated>
<author>
<name>Naohiro Aota</name>
<email>naohiro.aota@wdc.com</email>
</author>
<published>2023-03-13T07:46:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d1cc579383191fe6c3f1eeac45eb50b8976d38de'/>
<id>d1cc579383191fe6c3f1eeac45eb50b8976d38de</id>
<content type='text'>
Currently, /sys/fs/btrfs/&lt;UUID&gt;/bg_reclaim_threshold is limited to 0
(disable) or [50 .. 100]%, so we need to fill 50% of a device to start the
auto reclaim process. It is cumbersome to do so when we want to shake out
possible race issues of normal write vs reclaim.

Relax the threshold check under the BTRFS_DEBUG option.

Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Signed-off-by: Naohiro Aota &lt;naohiro.aota@wdc.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, /sys/fs/btrfs/&lt;UUID&gt;/bg_reclaim_threshold is limited to 0
(disable) or [50 .. 100]%, so we need to fill 50% of a device to start the
auto reclaim process. It is cumbersome to do so when we want to shake out
possible race issues of normal write vs reclaim.

Relax the threshold check under the BTRFS_DEBUG option.

Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Signed-off-by: Naohiro Aota &lt;naohiro.aota@wdc.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: sysfs: add size class stats</title>
<updated>2023-03-01T18:27:20+00:00</updated>
<author>
<name>Boris Burkov</name>
<email>boris@bur.io</email>
</author>
<published>2023-02-15T20:59:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fcd9531b305288dc2848d38567d466af4ee147b2'/>
<id>fcd9531b305288dc2848d38567d466af4ee147b2</id>
<content type='text'>
Make it possible to see the distribution of size classes for block
groups. Helpful for testing and debugging the allocator w.r.t. to size
classes.

The new stats can be found at the path:

  /sys/fs/btrfs/&lt;FSID&gt;/allocation/&lt;bg-type&gt;/size_class

but they will only be non-zero for bg-type = data.

Signed-off-by: Boris Burkov &lt;boris@bur.io&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Make it possible to see the distribution of size classes for block
groups. Helpful for testing and debugging the allocator w.r.t. to size
classes.

The new stats can be found at the path:

  /sys/fs/btrfs/&lt;FSID&gt;/allocation/&lt;bg-type&gt;/size_class

but they will only be non-zero for bg-type = data.

Signed-off-by: Boris Burkov &lt;boris@bur.io&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: make kobj_type structures constant</title>
<updated>2023-02-15T18:38:55+00:00</updated>
<author>
<name>Thomas Weißschuh</name>
<email>linux@weissschuh.net</email>
</author>
<published>2023-02-10T02:13:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=964a54e5e1a0d70cd80bd5a0885a1938463625b1'/>
<id>964a54e5e1a0d70cd80bd5a0885a1938463625b1</id>
<content type='text'>
Since commit ee6d3dd4ed48 ("driver core: make kobj_type constant.")
the driver core allows the usage of const struct kobj_type.

Take advantage of this to constify the structure definitions to prevent
modification at runtime.

Reviewed-by: Anand Jain &lt;anand.jain@oracle.com&gt;
Signed-off-by: Thomas Weißschuh &lt;linux@weissschuh.net&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since commit ee6d3dd4ed48 ("driver core: make kobj_type constant.")
the driver core allows the usage of const struct kobj_type.

Take advantage of this to constify the structure definitions to prevent
modification at runtime.

Reviewed-by: Anand Jain &lt;anand.jain@oracle.com&gt;
Signed-off-by: Thomas Weißschuh &lt;linux@weissschuh.net&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: sysfs: update fs features directory asynchronously</title>
<updated>2023-02-13T16:50:35+00:00</updated>
<author>
<name>Qu Wenruo</name>
<email>wqu@suse.com</email>
</author>
<published>2023-01-13T11:11:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b7625f461da6734a21c38ba6e7558538a116a2e3'/>
<id>b7625f461da6734a21c38ba6e7558538a116a2e3</id>
<content type='text'>
[BUG]
Since the introduction of per-fs feature sysfs interface
(/sys/fs/btrfs/&lt;UUID&gt;/features/), the content of that directory is never
updated.

Thus for the following case, that directory will not show the new
features like RAID56:

  # mkfs.btrfs -f $dev1 $dev2 $dev3
  # mount $dev1 $mnt
  # btrfs balance start -f -mconvert=raid5 $mnt
  # ls /sys/fs/btrfs/$uuid/features/
  extended_iref  free_space_tree  no_holes  skinny_metadata

While after unmount and mount, we got the correct features:

  # umount $mnt
  # mount $dev1 $mnt
  # ls /sys/fs/btrfs/$uuid/features/
  extended_iref  free_space_tree  no_holes  raid56 skinny_metadata

[CAUSE]
Because we never really try to update the content of per-fs features/
directory.

We had an attempt to update the features directory dynamically in commit
14e46e04958d ("btrfs: synchronize incompat feature bits with sysfs
files"), but unfortunately it get reverted in commit e410e34fad91
("Revert "btrfs: synchronize incompat feature bits with sysfs files"").
The problem in the original patch is, in the context of
btrfs_create_chunk(), we can not afford to update the sysfs group.

The exported but never utilized function, btrfs_sysfs_feature_update()
is the leftover of such attempt.  As even if we go sysfs_update_group(),
new files will need extra memory allocation, and we have no way to
specify the sysfs update to go GFP_NOFS.

[FIX]
This patch will address the old problem by doing asynchronous sysfs
update in the cleaner thread.

This involves the following changes:

- Make __btrfs_(set|clear)_fs_(incompat|compat_ro) helpers to set
  BTRFS_FS_FEATURE_CHANGED flag when needed

- Update btrfs_sysfs_feature_update() to use sysfs_update_group()
  And drop unnecessary arguments.

- Call btrfs_sysfs_feature_update() in cleaner_kthread
  If we have the BTRFS_FS_FEATURE_CHANGED flag set.

- Wake up cleaner_kthread in btrfs_commit_transaction if we have
  BTRFS_FS_FEATURE_CHANGED flag

By this, all the previously dangerous call sites like
btrfs_create_chunk() need no new changes, as above helpers would
have already set the BTRFS_FS_FEATURE_CHANGED flag.

The real work happens at cleaner_kthread, thus we pay the cost of
delaying the update to sysfs directory, but the delayed time should be
small enough that end user can not distinguish though it might get
delayed if the cleaner thread is busy with removing subvolumes or
defrag.

CC: stable@vger.kernel.org # 4.14+
Reviewed-by: Anand Jain &lt;anand.jain@oracle.com&gt;
Signed-off-by: Qu Wenruo &lt;wqu@suse.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[BUG]
Since the introduction of per-fs feature sysfs interface
(/sys/fs/btrfs/&lt;UUID&gt;/features/), the content of that directory is never
updated.

Thus for the following case, that directory will not show the new
features like RAID56:

  # mkfs.btrfs -f $dev1 $dev2 $dev3
  # mount $dev1 $mnt
  # btrfs balance start -f -mconvert=raid5 $mnt
  # ls /sys/fs/btrfs/$uuid/features/
  extended_iref  free_space_tree  no_holes  skinny_metadata

While after unmount and mount, we got the correct features:

  # umount $mnt
  # mount $dev1 $mnt
  # ls /sys/fs/btrfs/$uuid/features/
  extended_iref  free_space_tree  no_holes  raid56 skinny_metadata

[CAUSE]
Because we never really try to update the content of per-fs features/
directory.

We had an attempt to update the features directory dynamically in commit
14e46e04958d ("btrfs: synchronize incompat feature bits with sysfs
files"), but unfortunately it get reverted in commit e410e34fad91
("Revert "btrfs: synchronize incompat feature bits with sysfs files"").
The problem in the original patch is, in the context of
btrfs_create_chunk(), we can not afford to update the sysfs group.

The exported but never utilized function, btrfs_sysfs_feature_update()
is the leftover of such attempt.  As even if we go sysfs_update_group(),
new files will need extra memory allocation, and we have no way to
specify the sysfs update to go GFP_NOFS.

[FIX]
This patch will address the old problem by doing asynchronous sysfs
update in the cleaner thread.

This involves the following changes:

- Make __btrfs_(set|clear)_fs_(incompat|compat_ro) helpers to set
  BTRFS_FS_FEATURE_CHANGED flag when needed

- Update btrfs_sysfs_feature_update() to use sysfs_update_group()
  And drop unnecessary arguments.

- Call btrfs_sysfs_feature_update() in cleaner_kthread
  If we have the BTRFS_FS_FEATURE_CHANGED flag set.

- Wake up cleaner_kthread in btrfs_commit_transaction if we have
  BTRFS_FS_FEATURE_CHANGED flag

By this, all the previously dangerous call sites like
btrfs_create_chunk() need no new changes, as above helpers would
have already set the BTRFS_FS_FEATURE_CHANGED flag.

The real work happens at cleaner_kthread, thus we pay the cost of
delaying the update to sysfs directory, but the delayed time should be
small enough that end user can not distinguish though it might get
delayed if the cleaner thread is busy with removing subvolumes or
defrag.

CC: stable@vger.kernel.org # 4.14+
Reviewed-by: Anand Jain &lt;anand.jain@oracle.com&gt;
Signed-off-by: Qu Wenruo &lt;wqu@suse.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
