<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch/um/kernel/skas, branch v6.7</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>um: make stub data pages size tweakable</title>
<updated>2023-04-20T21:08:43+00:00</updated>
<author>
<name>Johannes Berg</name>
<email>johannes.berg@intel.com</email>
</author>
<published>2023-04-14T13:46:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6032aca0deb9c138df122192f8ef02de1fdccf25'/>
<id>6032aca0deb9c138df122192f8ef02de1fdccf25</id>
<content type='text'>
There's a lot of code here that hard-codes that the
data is a single page, and right now that seems to
be sufficient, but to make it easier to change this
in the future, add a new STUB_DATA_PAGES constant
and use it throughout the code.

Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There's a lot of code here that hard-codes that the
data is a single page, and right now that seems to
be sufficient, but to make it easier to change this
in the future, add a new STUB_DATA_PAGES constant
and use it throughout the code.

Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kbuild: remove --include-dir MAKEFLAG from top Makefile</title>
<updated>2023-02-05T09:51:22+00:00</updated>
<author>
<name>Masahiro Yamada</name>
<email>masahiroy@kernel.org</email>
</author>
<published>2023-01-28T09:24:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=67d7c3023a672c2b73d19d6d23684df670fce648'/>
<id>67d7c3023a672c2b73d19d6d23684df670fce648</id>
<content type='text'>
I added $(srctree)/ to some included Makefiles in the following commits:

 - 3204a7fb98a3 ("kbuild: prefix $(srctree)/ to some included Makefiles")
 - d82856395505 ("kbuild: do not require sub-make for separate output tree builds")

They were a preparation for removing --include-dir flag.

I have never thought --include-dir useful. Rather, it _is_ harmful.

For example, run the following commands:

  $ make -s ARCH=x86 mrproper defconfig
  $ make ARCH=arm O=foo dtbs
  make[1]: Entering directory '/tmp/linux/foo'
    HOSTCC  scripts/basic/fixdep
  Error: kernelrelease not valid - run 'make prepare' to update it
    UPD     include/config/kernel.release
  make[1]: Leaving directory '/tmp/linux/foo'

The first command configures the source tree for x86. The next command
tries to build ARM device trees in the separate foo/ directory - this
must stop because the directory foo/ has not been configured yet.

However, due to --include-dir=$(abs_srctree), the top Makefile includes
the wrong include/config/auto.conf from the source tree and continues
building. Kbuild traverses the directory tree, but of course it does
not work correctly. The Error message is also pointless - 'make prepare'
does not help at all for fixing the issue.

This commit fixes more arch Makefile, and finally removes --include-dir
from the top Makefile.

There are more breakages under drivers/, but I do not volunteer to fix
them all. I just moved --include-dir to drivers/Makefile.

With this commit, the second command will stop with a sensible message.

  $ make -s ARCH=x86 mrproper defconfig
  $ make ARCH=arm O=foo dtbs
  make[1]: Entering directory '/tmp/linux/foo'
    SYNC    include/config/auto.conf.cmd
  ***
  *** The source tree is not clean, please run 'make ARCH=arm mrproper'
  *** in /tmp/linux
  ***
  make[2]: *** [../Makefile:646: outputmakefile] Error 1
  /tmp/linux/Makefile:770: include/config/auto.conf.cmd: No such file or directory
  make[1]: *** [/tmp/linux/Makefile:793: include/config/auto.conf.cmd] Error 2
  make[1]: Leaving directory '/tmp/linux/foo'
  make: *** [Makefile:226: __sub-make] Error 2

Signed-off-by: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I added $(srctree)/ to some included Makefiles in the following commits:

 - 3204a7fb98a3 ("kbuild: prefix $(srctree)/ to some included Makefiles")
 - d82856395505 ("kbuild: do not require sub-make for separate output tree builds")

They were a preparation for removing --include-dir flag.

I have never thought --include-dir useful. Rather, it _is_ harmful.

For example, run the following commands:

  $ make -s ARCH=x86 mrproper defconfig
  $ make ARCH=arm O=foo dtbs
  make[1]: Entering directory '/tmp/linux/foo'
    HOSTCC  scripts/basic/fixdep
  Error: kernelrelease not valid - run 'make prepare' to update it
    UPD     include/config/kernel.release
  make[1]: Leaving directory '/tmp/linux/foo'

The first command configures the source tree for x86. The next command
tries to build ARM device trees in the separate foo/ directory - this
must stop because the directory foo/ has not been configured yet.

However, due to --include-dir=$(abs_srctree), the top Makefile includes
the wrong include/config/auto.conf from the source tree and continues
building. Kbuild traverses the directory tree, but of course it does
not work correctly. The Error message is also pointless - 'make prepare'
does not help at all for fixing the issue.

This commit fixes more arch Makefile, and finally removes --include-dir
from the top Makefile.

There are more breakages under drivers/, but I do not volunteer to fix
them all. I just moved --include-dir to drivers/Makefile.

With this commit, the second command will stop with a sensible message.

  $ make -s ARCH=x86 mrproper defconfig
  $ make ARCH=arm O=foo dtbs
  make[1]: Entering directory '/tmp/linux/foo'
    SYNC    include/config/auto.conf.cmd
  ***
  *** The source tree is not clean, please run 'make ARCH=arm mrproper'
  *** in /tmp/linux
  ***
  make[2]: *** [../Makefile:646: outputmakefile] Error 1
  /tmp/linux/Makefile:770: include/config/auto.conf.cmd: No such file or directory
  make[1]: *** [/tmp/linux/Makefile:793: include/config/auto.conf.cmd] Error 2
  make[1]: Leaving directory '/tmp/linux/foo'
  make: *** [Makefile:226: __sub-make] Error 2

Signed-off-by: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'locking_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2022-01-12T01:24:45+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-01-12T01:24:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=daadb3bd0e8d3e317e36bc2c1542e86c528665e5'/>
<id>daadb3bd0e8d3e317e36bc2c1542e86c528665e5</id>
<content type='text'>
Pull locking updates from Borislav Petkov:
 "Lots of cleanups and preparation. Highlights:

   - futex: Cleanup and remove runtime futex_cmpxchg detection

   - rtmutex: Some fixes for the PREEMPT_RT locking infrastructure

   - kcsan: Share owner_on_cpu() between mutex,rtmutex and rwsem and
     annotate the racy owner-&gt;on_cpu access *once*.

   - atomic64: Dead-Code-Elemination"

[ Description above by Peter Zijlstra ]

* tag 'locking_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/atomic: atomic64: Remove unusable atomic ops
  futex: Fix additional regressions
  locking: Allow to include asm/spinlock_types.h from linux/spinlock_types_raw.h
  x86/mm: Include spinlock_t definition in pgtable.
  locking: Mark racy reads of owner-&gt;on_cpu
  locking: Make owner_on_cpu() into &lt;linux/sched.h&gt;
  lockdep/selftests: Adapt ww-tests for PREEMPT_RT
  lockdep/selftests: Skip the softirq related tests on PREEMPT_RT
  lockdep/selftests: Unbalanced migrate_disable() &amp; rcu_read_lock().
  lockdep/selftests: Avoid using local_lock_{acquire|release}().
  lockdep: Remove softirq accounting on PREEMPT_RT.
  locking/rtmutex: Add rt_mutex_lock_nest_lock() and rt_mutex_lock_killable().
  locking/rtmutex: Squash self-deadlock check for ww_rt_mutex.
  locking: Remove rt_rwlock_is_contended().
  sched: Trigger warning if -&gt;migration_disabled counter underflows.
  futex: Fix sparc32/m68k/nds32 build regression
  futex: Remove futex_cmpxchg detection
  futex: Ensure futex_atomic_cmpxchg_inatomic() is present
  kernel/locking: Use a pointer in ww_mutex_trylock().
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull locking updates from Borislav Petkov:
 "Lots of cleanups and preparation. Highlights:

   - futex: Cleanup and remove runtime futex_cmpxchg detection

   - rtmutex: Some fixes for the PREEMPT_RT locking infrastructure

   - kcsan: Share owner_on_cpu() between mutex,rtmutex and rwsem and
     annotate the racy owner-&gt;on_cpu access *once*.

   - atomic64: Dead-Code-Elemination"

[ Description above by Peter Zijlstra ]

* tag 'locking_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/atomic: atomic64: Remove unusable atomic ops
  futex: Fix additional regressions
  locking: Allow to include asm/spinlock_types.h from linux/spinlock_types_raw.h
  x86/mm: Include spinlock_t definition in pgtable.
  locking: Mark racy reads of owner-&gt;on_cpu
  locking: Make owner_on_cpu() into &lt;linux/sched.h&gt;
  lockdep/selftests: Adapt ww-tests for PREEMPT_RT
  lockdep/selftests: Skip the softirq related tests on PREEMPT_RT
  lockdep/selftests: Unbalanced migrate_disable() &amp; rcu_read_lock().
  lockdep/selftests: Avoid using local_lock_{acquire|release}().
  lockdep: Remove softirq accounting on PREEMPT_RT.
  locking/rtmutex: Add rt_mutex_lock_nest_lock() and rt_mutex_lock_killable().
  locking/rtmutex: Squash self-deadlock check for ww_rt_mutex.
  locking: Remove rt_rwlock_is_contended().
  sched: Trigger warning if -&gt;migration_disabled counter underflows.
  futex: Fix sparc32/m68k/nds32 build regression
  futex: Remove futex_cmpxchg detection
  futex: Ensure futex_atomic_cmpxchg_inatomic() is present
  kernel/locking: Use a pointer in ww_mutex_trylock().
</pre>
</div>
</content>
</entry>
<entry>
<title>um: remove set_fs</title>
<updated>2021-12-22T16:56:56+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2021-12-15T16:56:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8bb227ac34c062b466a0d5fd21f060010375880b'/>
<id>8bb227ac34c062b466a0d5fd21f060010375880b</id>
<content type='text'>
Remove address space overrides using set_fs() for User Mode Linux.
Note that just like the existing kernel access case of the uaccess
routines the new nofault kernel handlers do not actually have any
exception handling.  This is probably broken, but not change to the
status quo.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove address space overrides using set_fs() for User Mode Linux.
Note that just like the existing kernel access case of the uaccess
routines the new nofault kernel handlers do not actually have any
exception handling.  This is probably broken, but not change to the
status quo.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>futex: Remove futex_cmpxchg detection</title>
<updated>2021-11-24T23:02:28+00:00</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2021-10-26T10:03:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3297481d688a5cc2973ea58bd78e66b8639748b1'/>
<id>3297481d688a5cc2973ea58bd78e66b8639748b1</id>
<content type='text'>
Now that all architectures have a working futex implementation in any
configuration, remove the runtime detection code.

Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Russell King (Oracle) &lt;rmk+kernel@armlinux.org.uk&gt;
Acked-by: Vineet Gupta &lt;vgupta@kernel.org&gt;
Acked-by: Max Filippov &lt;jcmvbkbc@gmail.com&gt;
Acked-by: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Link: https://lore.kernel.org/r/20211026100432.1730393-2-arnd@kernel.org

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that all architectures have a working futex implementation in any
configuration, remove the runtime detection code.

Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Russell King (Oracle) &lt;rmk+kernel@armlinux.org.uk&gt;
Acked-by: Vineet Gupta &lt;vgupta@kernel.org&gt;
Acked-by: Max Filippov &lt;jcmvbkbc@gmail.com&gt;
Acked-by: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Link: https://lore.kernel.org/r/20211026100432.1730393-2-arnd@kernel.org

</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'for-linus-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml</title>
<updated>2021-09-09T20:45:26+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-09-09T20:45:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d6c338a741295c04ed84679153448b2fffd2c9cf'/>
<id>d6c338a741295c04ed84679153448b2fffd2c9cf</id>
<content type='text'>
Pull UML updates from Richard Weinberger:

 - Support for VMAP_STACK

 - Support for splice_write in hostfs

 - Fixes for virt-pci

 - Fixes for virtio_uml

 - Various fixes

* tag 'for-linus-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: fix stub location calculation
  um: virt-pci: fix uapi documentation
  um: enable VMAP_STACK
  um: virt-pci: don't do DMA from stack
  hostfs: support splice_write
  um: virtio_uml: fix memory leak on init failures
  um: virtio_uml: include linux/virtio-uml.h
  lib/logic_iomem: fix sparse warnings
  um: make PCI emulation driver init/exit static
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull UML updates from Richard Weinberger:

 - Support for VMAP_STACK

 - Support for splice_write in hostfs

 - Fixes for virt-pci

 - Fixes for virtio_uml

 - Various fixes

* tag 'for-linus-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: fix stub location calculation
  um: virt-pci: fix uapi documentation
  um: enable VMAP_STACK
  um: virt-pci: don't do DMA from stack
  hostfs: support splice_write
  um: virtio_uml: fix memory leak on init failures
  um: virtio_uml: include linux/virtio-uml.h
  lib/logic_iomem: fix sparse warnings
  um: make PCI emulation driver init/exit static
</pre>
</div>
</content>
</entry>
<entry>
<title>um: fix stub location calculation</title>
<updated>2021-08-26T20:28:03+00:00</updated>
<author>
<name>Johannes Berg</name>
<email>johannes.berg@intel.com</email>
</author>
<published>2021-07-13T21:47:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=adf9ae0d159d3dc94f58d788fc4757c8749ac0df'/>
<id>adf9ae0d159d3dc94f58d788fc4757c8749ac0df</id>
<content type='text'>
In commit 9f0b4807a44f ("um: rework userspace stubs to not hard-code
stub location") I changed stub_segv_handler() to do a calculation with
a pointer to a stack variable to find the data page that we're using
for the stack and the rest of the data. This same commit was meant to
do it as well for stub_clone_handler(), but the change inadvertently
went into commit 84b2789d6115 ("um: separate child and parent errors
in clone stub") instead.

This was reported to not be compiled correctly by gcc 5, causing the
code to crash here. I'm not sure why, perhaps it's UB because the var
isn't initialized? In any case, this trick always seemed bad, so just
create a new inline function that does the calculation in assembly.

Reported-by: subashab@codeaurora.org
Fixes: 9f0b4807a44f ("um: rework userspace stubs to not hard-code stub location")
Fixes: 84b2789d6115 ("um: separate child and parent errors in clone stub")
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In commit 9f0b4807a44f ("um: rework userspace stubs to not hard-code
stub location") I changed stub_segv_handler() to do a calculation with
a pointer to a stack variable to find the data page that we're using
for the stack and the rest of the data. This same commit was meant to
do it as well for stub_clone_handler(), but the change inadvertently
went into commit 84b2789d6115 ("um: separate child and parent errors
in clone stub") instead.

This was reported to not be compiled correctly by gcc 5, causing the
code to crash here. I'm not sure why, perhaps it's UB because the var
isn't initialized? In any case, this trick always seemed bad, so just
create a new inline function that does the calculation in assembly.

Reported-by: subashab@codeaurora.org
Fixes: 9f0b4807a44f ("um: rework userspace stubs to not hard-code stub location")
Fixes: 84b2789d6115 ("um: separate child and parent errors in clone stub")
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>asm-generic/uaccess.h: remove __strncpy_from_user/__strnlen_user</title>
<updated>2021-07-23T12:39:56+00:00</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2020-01-16T12:18:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f27180dd63e1e6eca3230b9d3fdcc33564a81117'/>
<id>f27180dd63e1e6eca3230b9d3fdcc33564a81117</id>
<content type='text'>
This is a preparation for changing over architectures to the
generic implementation one at a time. As there are no callers
of either __strncpy_from_user() or __strnlen_user(), fold these
into the strncpy_from_user() and strnlen_user() functions to make
each implementation independent of the others.

Many of these implementations have known bugs, but the intention
here is to not change behavior at all and stay compatible with
those bugs for the moment.

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is a preparation for changing over architectures to the
generic implementation one at a time. As there are no callers
of either __strncpy_from_user() or __strnlen_user(), fold these
into the strncpy_from_user() and strnlen_user() functions to make
each implementation independent of the others.

Many of these implementations have known bugs, but the intention
here is to not change behavior at all and stay compatible with
those bugs for the moment.

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'for-linus-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml</title>
<updated>2021-07-09T17:19:13+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-07-09T17:19:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=dcf3c935dd9e8e76c9922e88672fa4ad6a8a4df8'/>
<id>dcf3c935dd9e8e76c9922e88672fa4ad6a8a4df8</id>
<content type='text'>
Pull UML updates from Richard Weinberger:

 - Support for optimized routines based on the host CPU

 - Support for PCI via virtio

 - Various fixes

* tag 'for-linus-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: remove unneeded semicolon in um_arch.c
  um: Remove the repeated declaration
  um: fix error return code in winch_tramp()
  um: fix error return code in slip_open()
  um: Fix stack pointer alignment
  um: implement flush_cache_vmap/flush_cache_vunmap
  um: add a UML specific futex implementation
  um: enable the use of optimized xor routines in UML
  um: Add support for host CPU flags and alignment
  um: allow not setting extra rpaths in the linux binary
  um: virtio/pci: enable suspend/resume
  um: add PCI over virtio emulation driver
  um: irqs: allow invoking time-travel handler multiple times
  um: time-travel/signals: fix ndelay() in interrupt
  um: expose time-travel mode to userspace side
  um: export signals_enabled directly
  um: remove unused smp_sigio_handler() declaration
  lib: add iomem emulation (logic_iomem)
  um: allow disabling NO_IOMEM
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull UML updates from Richard Weinberger:

 - Support for optimized routines based on the host CPU

 - Support for PCI via virtio

 - Various fixes

* tag 'for-linus-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: remove unneeded semicolon in um_arch.c
  um: Remove the repeated declaration
  um: fix error return code in winch_tramp()
  um: fix error return code in slip_open()
  um: Fix stack pointer alignment
  um: implement flush_cache_vmap/flush_cache_vunmap
  um: add a UML specific futex implementation
  um: enable the use of optimized xor routines in UML
  um: Add support for host CPU flags and alignment
  um: allow not setting extra rpaths in the linux binary
  um: virtio/pci: enable suspend/resume
  um: add PCI over virtio emulation driver
  um: irqs: allow invoking time-travel handler multiple times
  um: time-travel/signals: fix ndelay() in interrupt
  um: expose time-travel mode to userspace side
  um: export signals_enabled directly
  um: remove unused smp_sigio_handler() declaration
  lib: add iomem emulation (logic_iomem)
  um: allow disabling NO_IOMEM
</pre>
</div>
</content>
</entry>
<entry>
<title>um: Fix stack pointer alignment</title>
<updated>2021-06-17T20:08:31+00:00</updated>
<author>
<name>YiFei Zhu</name>
<email>zhuyifei1999@gmail.com</email>
</author>
<published>2021-04-20T05:56:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=558f9b2f94dbd2d5c5c8292aa13e081cc11ea7d9'/>
<id>558f9b2f94dbd2d5c5c8292aa13e081cc11ea7d9</id>
<content type='text'>
GCC assumes that stack is aligned to 16-byte on call sites [1].
Since GCC 8, GCC began using 16-byte aligned SSE instructions to
implement assignments to structs on stack. When
CC_OPTIMIZE_FOR_PERFORMANCE is enabled, this affects
os-Linux/sigio.c, write_sigio_thread:

  struct pollfds *fds, tmp;
  tmp = current_poll;

Note that struct pollfds is exactly 16 bytes in size.
GCC 8+ generates assembly similar to:

  movdqa (%rdi),%xmm0
  movaps %xmm0,-0x50(%rbp)

This is an issue, because movaps will #GP if -0x50(%rbp) is not
aligned to 16 bytes [2], and how rbp gets assigned to is via glibc
clone thread_start, then function prologue, going though execution
trace similar to (showing only relevant instructions):

  sub    $0x10,%rsi
  mov    %rcx,0x8(%rsi)
  mov    %rdi,(%rsi)
  syscall
  pop    %rax
  pop    %rdi
  callq  *%rax
  push   %rbp
  mov    %rsp,%rbp

The stack pointer always points to the topmost element on stack,
rather then the space right above the topmost. On push, the
pointer decrements first before writing to the memory pointed to
by it. Therefore, there is no need to have the stack pointer
pointer always point to valid memory unless the stack is poped;
so the `- sizeof(void *)` in the code is unnecessary.

On the other hand, glibc reserves the 16 bytes it needs on stack
and pops itself, so by the call instruction the stack pointer
is exactly the caller-supplied sp. It then push the 16 bytes of
the return address and the saved stack pointer, so the base
pointer will be 16-byte aligned if and only if the caller
supplied sp is 16-byte aligned. Therefore, the caller must supply
a 16-byte aligned pointer, which `stack + UM_KERN_PAGE_SIZE`
already satisfies.

On a side note, musl is unaffected by this issue because it forces
16 byte alignment via `and $-16,%rsi` in its clone wrapper.
Similarly, glibc i386 is also unaffected because it has
`andl $0xfffffff0, %ecx`.

To reproduce this bug, enable CONFIG_UML_RTC and
CC_OPTIMIZE_FOR_PERFORMANCE. uml_rtc will call
add_sigio_fd which will then cause write_sigio_thread to either go
into segfault loop or panic with "Segfault with no mm".

Similarly, signal stacks will be aligned by the host kernel upon
signal delivery. `- sizeof(void *)` to sigaltstack is
unconventional and extraneous.

On a related note, initialization of longjmp buffers do require
`- sizeof(void *)`. This is to account for the return address
that would have been pushed to the stack at the call site.

The reason for uml to respect 16-byte alignment, rather than
telling GCC to assume 8-byte alignment like the host kernel since
commit d9b0cde91c60 ("x86-64, gcc: Use
-mpreferred-stack-boundary=3 if supported"), is because uml links
against libc. There is no reason to assume libc is also compiled
with that flag and assumes 8-byte alignment rather than 16-byte.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40838
[2] https://c9x.me/x86/html/file_module_x86_id_180.html

Signed-off-by: YiFei Zhu &lt;zhuyifei1999@gmail.com&gt;
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reviewed-by: Johannes Berg &lt;johannes@sipsolutions.net&gt;
Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
GCC assumes that stack is aligned to 16-byte on call sites [1].
Since GCC 8, GCC began using 16-byte aligned SSE instructions to
implement assignments to structs on stack. When
CC_OPTIMIZE_FOR_PERFORMANCE is enabled, this affects
os-Linux/sigio.c, write_sigio_thread:

  struct pollfds *fds, tmp;
  tmp = current_poll;

Note that struct pollfds is exactly 16 bytes in size.
GCC 8+ generates assembly similar to:

  movdqa (%rdi),%xmm0
  movaps %xmm0,-0x50(%rbp)

This is an issue, because movaps will #GP if -0x50(%rbp) is not
aligned to 16 bytes [2], and how rbp gets assigned to is via glibc
clone thread_start, then function prologue, going though execution
trace similar to (showing only relevant instructions):

  sub    $0x10,%rsi
  mov    %rcx,0x8(%rsi)
  mov    %rdi,(%rsi)
  syscall
  pop    %rax
  pop    %rdi
  callq  *%rax
  push   %rbp
  mov    %rsp,%rbp

The stack pointer always points to the topmost element on stack,
rather then the space right above the topmost. On push, the
pointer decrements first before writing to the memory pointed to
by it. Therefore, there is no need to have the stack pointer
pointer always point to valid memory unless the stack is poped;
so the `- sizeof(void *)` in the code is unnecessary.

On the other hand, glibc reserves the 16 bytes it needs on stack
and pops itself, so by the call instruction the stack pointer
is exactly the caller-supplied sp. It then push the 16 bytes of
the return address and the saved stack pointer, so the base
pointer will be 16-byte aligned if and only if the caller
supplied sp is 16-byte aligned. Therefore, the caller must supply
a 16-byte aligned pointer, which `stack + UM_KERN_PAGE_SIZE`
already satisfies.

On a side note, musl is unaffected by this issue because it forces
16 byte alignment via `and $-16,%rsi` in its clone wrapper.
Similarly, glibc i386 is also unaffected because it has
`andl $0xfffffff0, %ecx`.

To reproduce this bug, enable CONFIG_UML_RTC and
CC_OPTIMIZE_FOR_PERFORMANCE. uml_rtc will call
add_sigio_fd which will then cause write_sigio_thread to either go
into segfault loop or panic with "Segfault with no mm".

Similarly, signal stacks will be aligned by the host kernel upon
signal delivery. `- sizeof(void *)` to sigaltstack is
unconventional and extraneous.

On a related note, initialization of longjmp buffers do require
`- sizeof(void *)`. This is to account for the return address
that would have been pushed to the stack at the call site.

The reason for uml to respect 16-byte alignment, rather than
telling GCC to assume 8-byte alignment like the host kernel since
commit d9b0cde91c60 ("x86-64, gcc: Use
-mpreferred-stack-boundary=3 if supported"), is because uml links
against libc. There is no reason to assume libc is also compiled
with that flag and assumes 8-byte alignment rather than 16-byte.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40838
[2] https://c9x.me/x86/html/file_module_x86_id_180.html

Signed-off-by: YiFei Zhu &lt;zhuyifei1999@gmail.com&gt;
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reviewed-by: Johannes Berg &lt;johannes@sipsolutions.net&gt;
Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
</pre>
</div>
</content>
</entry>
</feed>
