<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch/s390/kernel/process.c, branch v6.7</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>s390: include linux/io.h instead of asm/io.h</title>
<updated>2023-07-03T09:19:40+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>hca@linux.ibm.com</email>
</author>
<published>2023-06-22T08:46:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b378a982614360686f45c3e6b63fd5d1acd02d08'/>
<id>b378a982614360686f45c3e6b63fd5d1acd02d08</id>
<content type='text'>
Include linux/io.h instead of asm/io.h everywhere. linux/io.h includes
asm/io.h, so this shouldn't cause any problems. Instead this might help for
some randconfig build errors which were reported due to some undefined io
related functions.

Also move the changed include so it stays grouped together with other
includes from the same directory.

For ctcm_mpc.c also remove not needed comments (actually questions).

Acked-by: Christian Borntraeger &lt;borntraeger@linux.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Include linux/io.h instead of asm/io.h everywhere. linux/io.h includes
asm/io.h, so this shouldn't cause any problems. Instead this might help for
some randconfig build errors which were reported due to some undefined io
related functions.

Also move the changed include so it stays grouped together with other
includes from the same directory.

For ctcm_mpc.c also remove not needed comments (actually questions).

Acked-by: Christian Borntraeger &lt;borntraeger@linux.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390: trim ancient junk from copy_thread()</title>
<updated>2023-03-13T08:16:42+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2023-03-06T00:55:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fb77914a692d550a5bb0c7f71eac40e6da9c0e6d'/>
<id>fb77914a692d550a5bb0c7f71eac40e6da9c0e6d</id>
<content type='text'>
Setting and -&gt;psw.addr in childregs of kernel thread is a rudiment of
the old kernel_thread()/kernel_execve() implementation.  Mainline hadn't
been using them since 2012.

And clarify the assignments to frame-&gt;sf.gprs - the array stores
grp6..gpr15 values to be set by __switch_to(), so frame-&gt;sf.gprs[5]
actually affects grp11, etc.

Better spell that as frame-&gt;sf.gprs[11 - 6]...

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Link: https://lore.kernel.org/r/ZAU6BYFisE8evmYf@ZenIV
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Setting and -&gt;psw.addr in childregs of kernel thread is a rudiment of
the old kernel_thread()/kernel_execve() implementation.  Mainline hadn't
been using them since 2012.

And clarify the assignments to frame-&gt;sf.gprs - the array stores
grp6..gpr15 values to be set by __switch_to(), so frame-&gt;sf.gprs[5]
actually affects grp11, etc.

Better spell that as frame-&gt;sf.gprs[11 - 6]...

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Link: https://lore.kernel.org/r/ZAU6BYFisE8evmYf@ZenIV
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/mm: start kernel with DAT enabled</title>
<updated>2023-01-13T13:15:05+00:00</updated>
<author>
<name>Alexander Gordeev</name>
<email>agordeev@linux.ibm.com</email>
</author>
<published>2022-12-13T10:35:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bb1520d581a3a46e2d6e12bb74604ace33404de5'/>
<id>bb1520d581a3a46e2d6e12bb74604ace33404de5</id>
<content type='text'>
The setup of the kernel virtual address space is spread
throughout the sources, boot stages and config options
like this:

1. The available physical memory regions are queried
   and stored as mem_detect information for later use
   in the decompressor.

2. Based on the physical memory availability the virtual
   memory layout is established in the decompressor;

3. If CONFIG_KASAN is disabled the kernel paging setup
   code populates kernel pgtables and turns DAT mode on.
   It uses the information stored at step [1].

4. If CONFIG_KASAN is enabled the kernel early boot
   kasan setup populates kernel pgtables and turns DAT
   mode on. It uses the information stored at step [1].

   The kasan setup creates early_pg_dir directory and
   directly overwrites swapper_pg_dir entries to make
   shadow memory pages available.

Move the kernel virtual memory setup to the decompressor
and start the kernel with DAT turned on right from the
very first istruction. That completely eliminates the
boot phase when the kernel runs in DAT-off mode, simplies
the overall design and consolidates pgtables setup.

The identity mapping is created in the decompressor, while
kasan shadow mappings are still created by the early boot
kernel code.

Share with decompressor the existing kasan memory allocator.
It decreases the size of a newly requested memory block from
pgalloc_pos and ensures that kernel image is not overwritten.
pgalloc_low and pgalloc_pos pointers are made preserved boot
variables for that.

Use the bootdata infrastructure to setup swapper_pg_dir
and invalid_pg_dir directories used by the kernel later.
The interim early_pg_dir directory established by the
kasan initialization code gets eliminated as result.

As the kernel runs in DAT-on mode only the PSW_KERNEL_BITS
define gets PSW_MASK_DAT bit by default. Additionally, the
setup_lowcore_dat_off() and setup_lowcore_dat_on() routines
get merged, since there is no DAT-off mode stage anymore.

The memory mappings are created with RW+X protection that
allows the early boot code setting up all necessary data
and services for the kernel being booted. Just before the
paging is enabled the memory protection is changed to
RO+X for text, RO+NX for read-only data and RW+NX for
kernel data and the identity mapping.

Reviewed-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The setup of the kernel virtual address space is spread
throughout the sources, boot stages and config options
like this:

1. The available physical memory regions are queried
   and stored as mem_detect information for later use
   in the decompressor.

2. Based on the physical memory availability the virtual
   memory layout is established in the decompressor;

3. If CONFIG_KASAN is disabled the kernel paging setup
   code populates kernel pgtables and turns DAT mode on.
   It uses the information stored at step [1].

4. If CONFIG_KASAN is enabled the kernel early boot
   kasan setup populates kernel pgtables and turns DAT
   mode on. It uses the information stored at step [1].

   The kasan setup creates early_pg_dir directory and
   directly overwrites swapper_pg_dir entries to make
   shadow memory pages available.

Move the kernel virtual memory setup to the decompressor
and start the kernel with DAT turned on right from the
very first istruction. That completely eliminates the
boot phase when the kernel runs in DAT-off mode, simplies
the overall design and consolidates pgtables setup.

The identity mapping is created in the decompressor, while
kasan shadow mappings are still created by the early boot
kernel code.

Share with decompressor the existing kasan memory allocator.
It decreases the size of a newly requested memory block from
pgalloc_pos and ensures that kernel image is not overwritten.
pgalloc_low and pgalloc_pos pointers are made preserved boot
variables for that.

Use the bootdata infrastructure to setup swapper_pg_dir
and invalid_pg_dir directories used by the kernel later.
The interim early_pg_dir directory established by the
kasan initialization code gets eliminated as result.

As the kernel runs in DAT-on mode only the PSW_KERNEL_BITS
define gets PSW_MASK_DAT bit by default. Additionally, the
setup_lowcore_dat_off() and setup_lowcore_dat_on() routines
get merged, since there is no DAT-off mode stage anymore.

The memory mappings are created with RW+X protection that
allows the early boot code setting up all necessary data
and services for the kernel being booted. Just before the
paging is enabled the memory protection is changed to
RO+X for text, RO+NX for read-only data and RW+NX for
kernel data and the identity mapping.

Reviewed-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>treewide: use get_random_u32_below() instead of deprecated function</title>
<updated>2022-11-18T01:15:15+00:00</updated>
<author>
<name>Jason A. Donenfeld</name>
<email>Jason@zx2c4.com</email>
</author>
<published>2022-10-10T02:44:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8032bf1233a74627ce69b803608e650f3f35971c'/>
<id>8032bf1233a74627ce69b803608e650f3f35971c</id>
<content type='text'>
This is a simple mechanical transformation done by:

@@
expression E;
@@
- prandom_u32_max
+ get_random_u32_below
  (E)

Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Acked-by: Darrick J. Wong &lt;djwong@kernel.org&gt; # for xfs
Reviewed-by: SeongJae Park &lt;sj@kernel.org&gt; # for damon
Reviewed-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt; # for infiniband
Reviewed-by: Russell King (Oracle) &lt;rmk+kernel@armlinux.org.uk&gt; # for arm
Acked-by: Ulf Hansson &lt;ulf.hansson@linaro.org&gt; # for mmc
Signed-off-by: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is a simple mechanical transformation done by:

@@
expression E;
@@
- prandom_u32_max
+ get_random_u32_below
  (E)

Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Acked-by: Darrick J. Wong &lt;djwong@kernel.org&gt; # for xfs
Reviewed-by: SeongJae Park &lt;sj@kernel.org&gt; # for damon
Reviewed-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt; # for infiniband
Reviewed-by: Russell King (Oracle) &lt;rmk+kernel@armlinux.org.uk&gt; # for arm
Acked-by: Ulf Hansson &lt;ulf.hansson@linaro.org&gt; # for mmc
Signed-off-by: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>treewide: use get_random_{u8,u16}() when possible, part 2</title>
<updated>2022-10-11T23:42:58+00:00</updated>
<author>
<name>Jason A. Donenfeld</name>
<email>Jason@zx2c4.com</email>
</author>
<published>2022-10-05T15:23:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f743f16c548b1a2633e8b6034058d6475d7f26a3'/>
<id>f743f16c548b1a2633e8b6034058d6475d7f26a3</id>
<content type='text'>
Rather than truncate a 32-bit value to a 16-bit value or an 8-bit value,
simply use the get_random_{u8,u16}() functions, which are faster than
wasting the additional bytes from a 32-bit value. This was done by hand,
identifying all of the places where one of the random integer functions
was used in a non-32-bit context.

Reviewed-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Yury Norov &lt;yury.norov@gmail.com&gt;
Acked-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Acked-by: Heiko Carstens &lt;hca@linux.ibm.com&gt; # for s390
Signed-off-by: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Rather than truncate a 32-bit value to a 16-bit value or an 8-bit value,
simply use the get_random_{u8,u16}() functions, which are faster than
wasting the additional bytes from a 32-bit value. This was done by hand,
identifying all of the places where one of the random integer functions
was used in a non-32-bit context.

Reviewed-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Yury Norov &lt;yury.norov@gmail.com&gt;
Acked-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Acked-by: Heiko Carstens &lt;hca@linux.ibm.com&gt; # for s390
Signed-off-by: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>treewide: use prandom_u32_max() when possible, part 1</title>
<updated>2022-10-11T23:42:55+00:00</updated>
<author>
<name>Jason A. Donenfeld</name>
<email>Jason@zx2c4.com</email>
</author>
<published>2022-10-05T14:43:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=81895a65ec63ee1daec3255dc1a06675d2fbe915'/>
<id>81895a65ec63ee1daec3255dc1a06675d2fbe915</id>
<content type='text'>
Rather than incurring a division or requesting too many random bytes for
the given range, use the prandom_u32_max() function, which only takes
the minimum required bytes from the RNG and avoids divisions. This was
done mechanically with this coccinelle script:

@basic@
expression E;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
typedef u64;
@@
(
- ((T)get_random_u32() % (E))
+ prandom_u32_max(E)
|
- ((T)get_random_u32() &amp; ((E) - 1))
+ prandom_u32_max(E * XXX_MAKE_SURE_E_IS_POW2)
|
- ((u64)(E) * get_random_u32() &gt;&gt; 32)
+ prandom_u32_max(E)
|
- ((T)get_random_u32() &amp; ~PAGE_MASK)
+ prandom_u32_max(PAGE_SIZE)
)

@multi_line@
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
identifier RAND;
expression E;
@@

-       RAND = get_random_u32();
        ... when != RAND
-       RAND %= (E);
+       RAND = prandom_u32_max(E);

// Find a potential literal
@literal_mask@
expression LITERAL;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
position p;
@@

        ((T)get_random_u32()@p &amp; (LITERAL))

// Add one to the literal.
@script:python add_one@
literal &lt;&lt; literal_mask.LITERAL;
RESULT;
@@

value = None
if literal.startswith('0x'):
        value = int(literal, 16)
elif literal[0] in '123456789':
        value = int(literal, 10)
if value is None:
        print("I don't know how to handle %s" % (literal))
        cocci.include_match(False)
elif value == 2**32 - 1 or value == 2**31 - 1 or value == 2**24 - 1 or value == 2**16 - 1 or value == 2**8 - 1:
        print("Skipping 0x%x for cleanup elsewhere" % (value))
        cocci.include_match(False)
elif value &amp; (value + 1) != 0:
        print("Skipping 0x%x because it's not a power of two minus one" % (value))
        cocci.include_match(False)
elif literal.startswith('0x'):
        coccinelle.RESULT = cocci.make_expr("0x%x" % (value + 1))
else:
        coccinelle.RESULT = cocci.make_expr("%d" % (value + 1))

// Replace the literal mask with the calculated result.
@plus_one@
expression literal_mask.LITERAL;
position literal_mask.p;
expression add_one.RESULT;
identifier FUNC;
@@

-       (FUNC()@p &amp; (LITERAL))
+       prandom_u32_max(RESULT)

@collapse_ret@
type T;
identifier VAR;
expression E;
@@

 {
-       T VAR;
-       VAR = (E);
-       return VAR;
+       return E;
 }

@drop_var@
type T;
identifier VAR;
@@

 {
-       T VAR;
        ... when != VAR
 }

Reviewed-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Yury Norov &lt;yury.norov@gmail.com&gt;
Reviewed-by: KP Singh &lt;kpsingh@kernel.org&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt; # for ext4 and sbitmap
Reviewed-by: Christoph Böhmwalder &lt;christoph.boehmwalder@linbit.com&gt; # for drbd
Acked-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Acked-by: Heiko Carstens &lt;hca@linux.ibm.com&gt; # for s390
Acked-by: Ulf Hansson &lt;ulf.hansson@linaro.org&gt; # for mmc
Acked-by: Darrick J. Wong &lt;djwong@kernel.org&gt; # for xfs
Signed-off-by: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Rather than incurring a division or requesting too many random bytes for
the given range, use the prandom_u32_max() function, which only takes
the minimum required bytes from the RNG and avoids divisions. This was
done mechanically with this coccinelle script:

@basic@
expression E;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
typedef u64;
@@
(
- ((T)get_random_u32() % (E))
+ prandom_u32_max(E)
|
- ((T)get_random_u32() &amp; ((E) - 1))
+ prandom_u32_max(E * XXX_MAKE_SURE_E_IS_POW2)
|
- ((u64)(E) * get_random_u32() &gt;&gt; 32)
+ prandom_u32_max(E)
|
- ((T)get_random_u32() &amp; ~PAGE_MASK)
+ prandom_u32_max(PAGE_SIZE)
)

@multi_line@
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
identifier RAND;
expression E;
@@

-       RAND = get_random_u32();
        ... when != RAND
-       RAND %= (E);
+       RAND = prandom_u32_max(E);

// Find a potential literal
@literal_mask@
expression LITERAL;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
position p;
@@

        ((T)get_random_u32()@p &amp; (LITERAL))

// Add one to the literal.
@script:python add_one@
literal &lt;&lt; literal_mask.LITERAL;
RESULT;
@@

value = None
if literal.startswith('0x'):
        value = int(literal, 16)
elif literal[0] in '123456789':
        value = int(literal, 10)
if value is None:
        print("I don't know how to handle %s" % (literal))
        cocci.include_match(False)
elif value == 2**32 - 1 or value == 2**31 - 1 or value == 2**24 - 1 or value == 2**16 - 1 or value == 2**8 - 1:
        print("Skipping 0x%x for cleanup elsewhere" % (value))
        cocci.include_match(False)
elif value &amp; (value + 1) != 0:
        print("Skipping 0x%x because it's not a power of two minus one" % (value))
        cocci.include_match(False)
elif literal.startswith('0x'):
        coccinelle.RESULT = cocci.make_expr("0x%x" % (value + 1))
else:
        coccinelle.RESULT = cocci.make_expr("%d" % (value + 1))

// Replace the literal mask with the calculated result.
@plus_one@
expression literal_mask.LITERAL;
position literal_mask.p;
expression add_one.RESULT;
identifier FUNC;
@@

-       (FUNC()@p &amp; (LITERAL))
+       prandom_u32_max(RESULT)

@collapse_ret@
type T;
identifier VAR;
expression E;
@@

 {
-       T VAR;
-       VAR = (E);
-       return VAR;
+       return E;
 }

@drop_var@
type T;
identifier VAR;
@@

 {
-       T VAR;
        ... when != VAR
 }

Reviewed-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Yury Norov &lt;yury.norov@gmail.com&gt;
Reviewed-by: KP Singh &lt;kpsingh@kernel.org&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt; # for ext4 and sbitmap
Reviewed-by: Christoph Böhmwalder &lt;christoph.boehmwalder@linbit.com&gt; # for drbd
Acked-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Acked-by: Heiko Carstens &lt;hca@linux.ibm.com&gt; # for s390
Acked-by: Ulf Hansson &lt;ulf.hansson@linaro.org&gt; # for mmc
Acked-by: Darrick J. Wong &lt;djwong@kernel.org&gt; # for xfs
Signed-off-by: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390: fix double free of GS and RI CBs on fork() failure</title>
<updated>2022-08-25T13:12:32+00:00</updated>
<author>
<name>Brian Foster</name>
<email>bfoster@redhat.com</email>
</author>
<published>2022-08-16T15:54:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=13cccafe0edcd03bf1c841de8ab8a1c8e34f77d9'/>
<id>13cccafe0edcd03bf1c841de8ab8a1c8e34f77d9</id>
<content type='text'>
The pointers for guarded storage and runtime instrumentation control
blocks are stored in the thread_struct of the associated task. These
pointers are initially copied on fork() via arch_dup_task_struct()
and then cleared via copy_thread() before fork() returns. If fork()
happens to fail after the initial task dup and before copy_thread(),
the newly allocated task and associated thread_struct memory are
freed via free_task() -&gt; arch_release_task_struct(). This results in
a double free of the guarded storage and runtime info structs
because the fields in the failed task still refer to memory
associated with the source task.

This problem can manifest as a BUG_ON() in set_freepointer() (with
CONFIG_SLAB_FREELIST_HARDENED enabled) or KASAN splat (if enabled)
when running trinity syscall fuzz tests on s390x. To avoid this
problem, clear the associated pointer fields in
arch_dup_task_struct() immediately after the new task is copied.
Note that the RI flag is still cleared in copy_thread() because it
resides in thread stack memory and that is where stack info is
copied.

Signed-off-by: Brian Foster &lt;bfoster@redhat.com&gt;
Fixes: 8d9047f8b967c ("s390/runtime instrumentation: simplify task exit handling")
Fixes: 7b83c6297d2fc ("s390/guarded storage: simplify task exit handling")
Cc: &lt;stable@vger.kernel.org&gt; # 4.15
Reviewed-by: Gerald Schaefer &lt;gerald.schaefer@linux.ibm.com&gt;
Reviewed-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/20220816155407.537372-1-bfoster@redhat.com
Signed-off-by: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The pointers for guarded storage and runtime instrumentation control
blocks are stored in the thread_struct of the associated task. These
pointers are initially copied on fork() via arch_dup_task_struct()
and then cleared via copy_thread() before fork() returns. If fork()
happens to fail after the initial task dup and before copy_thread(),
the newly allocated task and associated thread_struct memory are
freed via free_task() -&gt; arch_release_task_struct(). This results in
a double free of the guarded storage and runtime info structs
because the fields in the failed task still refer to memory
associated with the source task.

This problem can manifest as a BUG_ON() in set_freepointer() (with
CONFIG_SLAB_FREELIST_HARDENED enabled) or KASAN splat (if enabled)
when running trinity syscall fuzz tests on s390x. To avoid this
problem, clear the associated pointer fields in
arch_dup_task_struct() immediately after the new task is copied.
Note that the RI flag is still cleared in copy_thread() because it
resides in thread stack memory and that is where stack info is
copied.

Signed-off-by: Brian Foster &lt;bfoster@redhat.com&gt;
Fixes: 8d9047f8b967c ("s390/runtime instrumentation: simplify task exit handling")
Fixes: 7b83c6297d2fc ("s390/guarded storage: simplify task exit handling")
Cc: &lt;stable@vger.kernel.org&gt; # 4.15
Reviewed-by: Gerald Schaefer &lt;gerald.schaefer@linux.ibm.com&gt;
Reviewed-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/20220816155407.537372-1-bfoster@redhat.com
Signed-off-by: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fork: Generalize PF_IO_WORKER handling</title>
<updated>2022-05-07T14:01:59+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2022-04-12T15:18:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5bd2e97c868a8a44470950ed01846cab6328e540'/>
<id>5bd2e97c868a8a44470950ed01846cab6328e540</id>
<content type='text'>
Add fn and fn_arg members into struct kernel_clone_args and test for
them in copy_thread (instead of testing for PF_KTHREAD | PF_IO_WORKER).
This allows any task that wants to be a user space task that only runs
in kernel mode to use this functionality.

The code on x86 is an exception and still retains a PF_KTHREAD test
because x86 unlikely everything else handles kthreads slightly
differently than user space tasks that start with a function.

The functions that created tasks that start with a function
have been updated to set ".fn" and ".fn_arg" instead of
".stack" and ".stack_size".  These functions are fork_idle(),
create_io_thread(), kernel_thread(), and user_mode_thread().

Link: https://lkml.kernel.org/r/20220506141512.516114-4-ebiederm@xmission.com
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add fn and fn_arg members into struct kernel_clone_args and test for
them in copy_thread (instead of testing for PF_KTHREAD | PF_IO_WORKER).
This allows any task that wants to be a user space task that only runs
in kernel mode to use this functionality.

The code on x86 is an exception and still retains a PF_KTHREAD test
because x86 unlikely everything else handles kthreads slightly
differently than user space tasks that start with a function.

The functions that created tasks that start with a function
have been updated to set ".fn" and ".fn_arg" instead of
".stack" and ".stack_size".  These functions are fork_idle(),
create_io_thread(), kernel_thread(), and user_mode_thread().

Link: https://lkml.kernel.org/r/20220506141512.516114-4-ebiederm@xmission.com
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fork: Pass struct kernel_clone_args into copy_thread</title>
<updated>2022-05-07T14:01:48+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2022-04-08T23:07:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c5febea0956fd3874e8fb59c6f84d68f128d68f8'/>
<id>c5febea0956fd3874e8fb59c6f84d68f128d68f8</id>
<content type='text'>
With io_uring we have started supporting tasks that are for most
purposes user space tasks that exclusively run code in kernel mode.

The kernel task that exec's init and tasks that exec user mode
helpers are also user mode tasks that just run kernel code
until they call kernel execve.

Pass kernel_clone_args into copy_thread so these oddball
tasks can be supported more cleanly and easily.

v2: Fix spelling of kenrel_clone_args on h8300
Link: https://lkml.kernel.org/r/20220506141512.516114-2-ebiederm@xmission.com
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With io_uring we have started supporting tasks that are for most
purposes user space tasks that exclusively run code in kernel mode.

The kernel task that exec's init and tasks that exec user mode
helpers are also user mode tasks that just run kernel code
until they call kernel execve.

Pass kernel_clone_args into copy_thread so these oddball
tasks can be supported more cleanly and easily.

v2: Fix spelling of kenrel_clone_args on h8300
Link: https://lkml.kernel.org/r/20220506141512.516114-2-ebiederm@xmission.com
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/exit: remove dead reference to do_exit from copy_thread</title>
<updated>2021-12-16T18:58:06+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2021-12-08T20:25:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=893d4d9c62ecb32bad7843fc31ec9eb740600758'/>
<id>893d4d9c62ecb32bad7843fc31ec9eb740600758</id>
<content type='text'>
My s390 assembly is not particularly good so I have read the history
of the reference to do_exit copy_thread and have been able to
verify that do_exit is not used.

The general argument is that s390 has been changed to use the generic
kernel_thread and kernel_execve and the generic versions do not call
do_exit.  So it is strange to see a do_exit reference sitting there.

The history of the do_exit reference in s390's version of copy_thread
seems conclusive that the do_exit reference is something that lingers
and should have been removed several years ago.

Up through 8d19f15a60be ("[PATCH] s390 update (1/27): arch.")  the
s390 code made a call to the exit(2) system call when a kernel thread
finished.  Then kernel_thread_starter was added which branched
directly to the value in register 11 when the kernel thread finshed.
The value in register 11 was set in kernel_thread to
"regs.gprs[11] = (unsigned long) do_exit"

In commit 37fe5d41f640 ("s390: fold kernel_thread_helper() into
ret_from_fork()") kernel_thread_starter was moved into entry.S and
entry64.S unchanged (except for the syntax differences between inline
assemly and in the assembly file).

In commit f9a7e025dfc2 ("s390: switch to generic kernel_thread()") the
assignment to "gprs[11]" was moved into copy_thread from the old
kernel_thread.  The helper kernel_thread_starter was still being used
and was still branching to "%r11" at the end.

In commit 30dcb0996e40 ("s390: switch to saner kernel_execve()
semantics") kernel_thread_starter was changed to unconditionally
branch to sysc_tracenogo instead to %r11 which held the value of
do_exit.  Unfortunately copy_thread was not updated to stop passing
do_exit in "gprs[11]".

In commit 56e62a737028 ("s390: convert to generic entry")
kernel_thread_starter was replaced by __ret_from_fork.  And the code
still continued to pass do_exit in "gprs[11]" despite __ret_from_fork
not caring in the slightest.

Remove this dead reference to do_exit to make it clear that s390 is
not doing anything with do_exit in copy_thread.

History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git

Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Cc: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Cc: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Fixes: 30dcb0996e40 ("s390: switch to saner kernel_execve() semantics")
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Link: https://lore.kernel.org/r/20211208202532.16409-1-ebiederm@xmission.com
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
My s390 assembly is not particularly good so I have read the history
of the reference to do_exit copy_thread and have been able to
verify that do_exit is not used.

The general argument is that s390 has been changed to use the generic
kernel_thread and kernel_execve and the generic versions do not call
do_exit.  So it is strange to see a do_exit reference sitting there.

The history of the do_exit reference in s390's version of copy_thread
seems conclusive that the do_exit reference is something that lingers
and should have been removed several years ago.

Up through 8d19f15a60be ("[PATCH] s390 update (1/27): arch.")  the
s390 code made a call to the exit(2) system call when a kernel thread
finished.  Then kernel_thread_starter was added which branched
directly to the value in register 11 when the kernel thread finshed.
The value in register 11 was set in kernel_thread to
"regs.gprs[11] = (unsigned long) do_exit"

In commit 37fe5d41f640 ("s390: fold kernel_thread_helper() into
ret_from_fork()") kernel_thread_starter was moved into entry.S and
entry64.S unchanged (except for the syntax differences between inline
assemly and in the assembly file).

In commit f9a7e025dfc2 ("s390: switch to generic kernel_thread()") the
assignment to "gprs[11]" was moved into copy_thread from the old
kernel_thread.  The helper kernel_thread_starter was still being used
and was still branching to "%r11" at the end.

In commit 30dcb0996e40 ("s390: switch to saner kernel_execve()
semantics") kernel_thread_starter was changed to unconditionally
branch to sysc_tracenogo instead to %r11 which held the value of
do_exit.  Unfortunately copy_thread was not updated to stop passing
do_exit in "gprs[11]".

In commit 56e62a737028 ("s390: convert to generic entry")
kernel_thread_starter was replaced by __ret_from_fork.  And the code
still continued to pass do_exit in "gprs[11]" despite __ret_from_fork
not caring in the slightest.

Remove this dead reference to do_exit to make it clear that s390 is
not doing anything with do_exit in copy_thread.

History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git

Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Cc: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Cc: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Fixes: 30dcb0996e40 ("s390: switch to saner kernel_execve() semantics")
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Link: https://lore.kernel.org/r/20211208202532.16409-1-ebiederm@xmission.com
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
