<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch/sparc, branch v4.9.76</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>sparc32: Export vac_cache_size to fix build error</title>
<updated>2017-12-25T13:23:47+00:00</updated>
<author>
<name>Guenter Roeck</name>
<email>linux@roeck-us.net</email>
</author>
<published>2017-04-01T20:47:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6430e166aee863d293a9a3c083af7a64fec57d13'/>
<id>6430e166aee863d293a9a3c083af7a64fec57d13</id>
<content type='text'>
commit 9d262d95114cf2e2ac5e0ff358347fa2e214eda5 upstream.

sparc32:allmodconfig fails to build with the following error.

ERROR: "vac_cache_size" [drivers/infiniband/sw/rxe/rdma_rxe.ko] undefined!

Fixes: cb8864559631 ("infiniband: Fix alignment of mmap cookies ...")
Cc: Jason Gunthorpe &lt;jgunthorpe@obsidianresearch.com&gt;
Cc: Doug Ledford &lt;dledford@redhat.com&gt;
Signed-off-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 9d262d95114cf2e2ac5e0ff358347fa2e214eda5 upstream.

sparc32:allmodconfig fails to build with the following error.

ERROR: "vac_cache_size" [drivers/infiniband/sw/rxe/rdma_rxe.ko] undefined!

Fixes: cb8864559631 ("infiniband: Fix alignment of mmap cookies ...")
Cc: Jason Gunthorpe &lt;jgunthorpe@obsidianresearch.com&gt;
Cc: Doug Ledford &lt;dledford@redhat.com&gt;
Signed-off-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>sparc64/mm: set fields in deferred pages</title>
<updated>2017-12-14T08:28:23+00:00</updated>
<author>
<name>Pavel Tatashin</name>
<email>pasha.tatashin@oracle.com</email>
</author>
<published>2017-11-16T01:36:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7da67d1d98f609b0f24fb6c1a317523c965898b7'/>
<id>7da67d1d98f609b0f24fb6c1a317523c965898b7</id>
<content type='text'>
[ Upstream commit 2a20aa171071a334d80c4e5d5af719d8374702fc ]

Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
flags and other fields in "struct page"es are never changed prior to
first initializing struct pages by going through __init_single_page().

With deferred struct page feature enabled there is a case where we set
some fields prior to initializing:

mem_init() {
     register_page_bootmem_info();
     free_all_bootmem();
     ...
}

When register_page_bootmem_info() is called only non-deferred struct
pages are initialized.  But, this function goes through some reserved
pages which might be part of the deferred, and thus are not yet
initialized.

mem_init
register_page_bootmem_info
register_page_bootmem_info_node
 get_page_bootmem
  .. setting fields here ..
  such as: page-&gt;freelist = (void *)type;

free_all_bootmem()
free_low_memory_core_early()
 for_each_reserved_mem_region()
  reserve_bootmem_region()
   init_reserved_page() &lt;- Only if this is deferred reserved page
    __init_single_pfn()
     __init_single_page()
      memset(0) &lt;-- Loose the set fields here

We end up with similar issue as in the previous patch, where currently
we do not observe problem as memory is zeroed.  But, if flag asserts are
changed we can start hitting issues.

Also, because in this patch series we will stop zeroing struct page
memory during allocation, we must make sure that struct pages are
properly initialized prior to using them.

The deferred-reserved pages are initialized in free_all_bootmem().
Therefore, the fix is to switch the above calls.

Link: http://lkml.kernel.org/r/20171013173214.27300-4-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin &lt;pasha.tatashin@oracle.com&gt;
Reviewed-by: Steven Sistare &lt;steven.sistare@oracle.com&gt;
Reviewed-by: Daniel Jordan &lt;daniel.m.jordan@oracle.com&gt;
Reviewed-by: Bob Picco &lt;bob.picco@oracle.com&gt;
Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Alexander Potapenko &lt;glider@google.com&gt;
Cc: Andrey Ryabinin &lt;aryabinin@virtuozzo.com&gt;
Cc: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Sam Ravnborg &lt;sam@ravnborg.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 2a20aa171071a334d80c4e5d5af719d8374702fc ]

Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
flags and other fields in "struct page"es are never changed prior to
first initializing struct pages by going through __init_single_page().

With deferred struct page feature enabled there is a case where we set
some fields prior to initializing:

mem_init() {
     register_page_bootmem_info();
     free_all_bootmem();
     ...
}

When register_page_bootmem_info() is called only non-deferred struct
pages are initialized.  But, this function goes through some reserved
pages which might be part of the deferred, and thus are not yet
initialized.

mem_init
register_page_bootmem_info
register_page_bootmem_info_node
 get_page_bootmem
  .. setting fields here ..
  such as: page-&gt;freelist = (void *)type;

free_all_bootmem()
free_low_memory_core_early()
 for_each_reserved_mem_region()
  reserve_bootmem_region()
   init_reserved_page() &lt;- Only if this is deferred reserved page
    __init_single_pfn()
     __init_single_page()
      memset(0) &lt;-- Loose the set fields here

We end up with similar issue as in the previous patch, where currently
we do not observe problem as memory is zeroed.  But, if flag asserts are
changed we can start hitting issues.

Also, because in this patch series we will stop zeroing struct page
memory during allocation, we must make sure that struct pages are
properly initialized prior to using them.

The deferred-reserved pages are initialized in free_all_bootmem().
Therefore, the fix is to switch the above calls.

Link: http://lkml.kernel.org/r/20171013173214.27300-4-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin &lt;pasha.tatashin@oracle.com&gt;
Reviewed-by: Steven Sistare &lt;steven.sistare@oracle.com&gt;
Reviewed-by: Daniel Jordan &lt;daniel.m.jordan@oracle.com&gt;
Reviewed-by: Bob Picco &lt;bob.picco@oracle.com&gt;
Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Alexander Potapenko &lt;glider@google.com&gt;
Cc: Andrey Ryabinin &lt;aryabinin@virtuozzo.com&gt;
Cc: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Sam Ravnborg &lt;sam@ravnborg.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>security/keys: add CONFIG_KEYS_COMPAT to Kconfig</title>
<updated>2017-11-18T10:22:24+00:00</updated>
<author>
<name>Bilal Amarni</name>
<email>bilal.amarni@gmail.com</email>
</author>
<published>2017-06-08T13:47:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=31c8c49428204431aee6acedbffcf97989165ae5'/>
<id>31c8c49428204431aee6acedbffcf97989165ae5</id>
<content type='text'>
commit 47b2c3fff4932e6fc17ce13d51a43c6969714e20 upstream.

CONFIG_KEYS_COMPAT is defined in arch-specific Kconfigs and is missing for
several 64-bit architectures : mips, parisc, tile.

At the moment and for those architectures, calling in 32-bit userspace the
keyctl syscall would return an ENOSYS error.

This patch moves the CONFIG_KEYS_COMPAT option to security/keys/Kconfig, to
make sure the compatibility wrapper is registered by default for any 64-bit
architecture as long as it is configured with CONFIG_COMPAT.

[DH: Modified to remove arm64 compat enablement also as requested by Eric
 Biggers]

Signed-off-by: Bilal Amarni &lt;bilal.amarni@gmail.com&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Reviewed-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
cc: Eric Biggers &lt;ebiggers3@gmail.com&gt;
Signed-off-by: James Morris &lt;james.l.morris@oracle.com&gt;
Cc: James Cowgill &lt;james.cowgill@mips.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 47b2c3fff4932e6fc17ce13d51a43c6969714e20 upstream.

CONFIG_KEYS_COMPAT is defined in arch-specific Kconfigs and is missing for
several 64-bit architectures : mips, parisc, tile.

At the moment and for those architectures, calling in 32-bit userspace the
keyctl syscall would return an ENOSYS error.

This patch moves the CONFIG_KEYS_COMPAT option to security/keys/Kconfig, to
make sure the compatibility wrapper is registered by default for any 64-bit
architecture as long as it is configured with CONFIG_COMPAT.

[DH: Modified to remove arm64 compat enablement also as requested by Eric
 Biggers]

Signed-off-by: Bilal Amarni &lt;bilal.amarni@gmail.com&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Reviewed-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
cc: Eric Biggers &lt;ebiggers3@gmail.com&gt;
Signed-off-by: James Morris &lt;james.l.morris@oracle.com&gt;
Cc: James Cowgill &lt;james.cowgill@mips.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>sparc64: Migrate hvcons irq to panicked cpu</title>
<updated>2017-10-21T15:21:35+00:00</updated>
<author>
<name>Vijay Kumar</name>
<email>vijay.ac.kumar@oracle.com</email>
</author>
<published>2017-02-01T19:34:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=84a66ca775438dc3f9918a977a169f49a9700e2d'/>
<id>84a66ca775438dc3f9918a977a169f49a9700e2d</id>
<content type='text'>
[ Upstream commit 7dd4fcf5b70694dc961eb6b954673e4fc9730dbd ]

On panic, all other CPUs are stopped except the one which had
hit panic. To keep console alive, we need to migrate hvcons irq
to panicked CPU.

Signed-off-by: Vijay Kumar &lt;vijay.ac.kumar@oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 7dd4fcf5b70694dc961eb6b954673e4fc9730dbd ]

On panic, all other CPUs are stopped except the one which had
hit panic. To keep console alive, we need to migrate hvcons irq
to panicked CPU.

Signed-off-by: Vijay Kumar &lt;vijay.ac.kumar@oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sparc64: remove unnecessary log message</title>
<updated>2017-08-30T08:21:38+00:00</updated>
<author>
<name>Tushar Dave</name>
<email>tushar.n.dave@oracle.com</email>
</author>
<published>2017-08-16T18:09:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d0526eef0bf7b2e6695d3e554433fe7a64d204d9'/>
<id>d0526eef0bf7b2e6695d3e554433fe7a64d204d9</id>
<content type='text'>
[ Upstream commit 6170a506899aee3dd4934c928426505e47b1b466 ]

There is no need to log message if ATU hvapi couldn't get register.
Unlike PCI hvapi, ATU hvapi registration failure is not hard error.
Even if ATU hvapi registration fails (on system with ATU or without
ATU) system continues with legacy IOMMU. So only log message when
ATU hvapi successfully get registered.

Signed-off-by: Tushar Dave &lt;tushar.n.dave@oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 6170a506899aee3dd4934c928426505e47b1b466 ]

There is no need to log message if ATU hvapi couldn't get register.
Unlike PCI hvapi, ATU hvapi registration failure is not hard error.
Even if ATU hvapi registration fails (on system with ATU or without
ATU) system continues with legacy IOMMU. So only log message when
ATU hvapi successfully get registered.

Signed-off-by: Tushar Dave &lt;tushar.n.dave@oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sparc64: Prevent perf from running during super critical sections</title>
<updated>2017-08-13T02:31:23+00:00</updated>
<author>
<name>Rob Gardner</name>
<email>rob.gardner@oracle.com</email>
</author>
<published>2017-07-17T15:22:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6309eb77d823393ff427312bfd53e2b7f1bdcab2'/>
<id>6309eb77d823393ff427312bfd53e2b7f1bdcab2</id>
<content type='text'>
commit fc290a114fc6034b0f6a5a46e2fb7d54976cf87a upstream.

This fixes another cause of random segfaults and bus errors that may
occur while running perf with the callgraph option.

Critical sections beginning with spin_lock_irqsave() raise the interrupt
level to PIL_NORMAL_MAX (14) and intentionally do not block performance
counter interrupts, which arrive at PIL_NMI (15).

But some sections of code are "super critical" with respect to perf
because the perf_callchain_user() path accesses user space and may cause
TLB activity as well as faults as it unwinds the user stack.

One particular critical section occurs in switch_mm:

        spin_lock_irqsave(&amp;mm-&gt;context.lock, flags);
        ...
        load_secondary_context(mm);
        tsb_context_switch(mm);
        ...
        spin_unlock_irqrestore(&amp;mm-&gt;context.lock, flags);

If a perf interrupt arrives in between load_secondary_context() and
tsb_context_switch(), then perf_callchain_user() could execute with
the context ID of one process, but with an active TSB for a different
process. When the user stack is accessed, it is very likely to
incur a TLB miss, since the h/w context ID has been changed. The TLB
will then be reloaded with a translation from the TSB for one process,
but using a context ID for another process. This exposes memory from
one process to another, and since it is a mapping for stack memory,
this usually causes the new process to crash quickly.

This super critical section needs more protection than is provided
by spin_lock_irqsave() since perf interrupts must not be allowed in.

Since __tsb_context_switch already goes through the trouble of
disabling interrupts completely, we fix this by moving the secondary
context load down into this better protected region.

Orabug: 25577560

Signed-off-by: Dave Aldridge &lt;david.j.aldridge@oracle.com&gt;
Signed-off-by: Rob Gardner &lt;rob.gardner@oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit fc290a114fc6034b0f6a5a46e2fb7d54976cf87a upstream.

This fixes another cause of random segfaults and bus errors that may
occur while running perf with the callgraph option.

Critical sections beginning with spin_lock_irqsave() raise the interrupt
level to PIL_NORMAL_MAX (14) and intentionally do not block performance
counter interrupts, which arrive at PIL_NMI (15).

But some sections of code are "super critical" with respect to perf
because the perf_callchain_user() path accesses user space and may cause
TLB activity as well as faults as it unwinds the user stack.

One particular critical section occurs in switch_mm:

        spin_lock_irqsave(&amp;mm-&gt;context.lock, flags);
        ...
        load_secondary_context(mm);
        tsb_context_switch(mm);
        ...
        spin_unlock_irqrestore(&amp;mm-&gt;context.lock, flags);

If a perf interrupt arrives in between load_secondary_context() and
tsb_context_switch(), then perf_callchain_user() could execute with
the context ID of one process, but with an active TSB for a different
process. When the user stack is accessed, it is very likely to
incur a TLB miss, since the h/w context ID has been changed. The TLB
will then be reloaded with a translation from the TSB for one process,
but using a context ID for another process. This exposes memory from
one process to another, and since it is a mapping for stack memory,
this usually causes the new process to crash quickly.

This super critical section needs more protection than is provided
by spin_lock_irqsave() since perf interrupts must not be allowed in.

Since __tsb_context_switch already goes through the trouble of
disabling interrupts completely, we fix this by moving the secondary
context load down into this better protected region.

Orabug: 25577560

Signed-off-by: Dave Aldridge &lt;david.j.aldridge@oracle.com&gt;
Signed-off-by: Rob Gardner &lt;rob.gardner@oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>sparc64: Fix exception handling in UltraSPARC-III memcpy.</title>
<updated>2017-08-11T15:49:34+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2017-08-04T16:47:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b9d68cdce72d49c06c24fb2f5bee5673557b33ac'/>
<id>b9d68cdce72d49c06c24fb2f5bee5673557b33ac</id>
<content type='text'>
[ Upstream commit 0ede1c401332173ab0693121dc6cde04a4dbf131 ]

Mikael Pettersson reported that some test programs in the strace-4.18
testsuite cause an OOPS.

After some debugging it turns out that garbage values are returned
when an exception occurs, causing the fixup memset() to be run with
bogus arguments.

The problem is that two of the exception handler stubs write the
successfully copied length into the wrong register.

Fixes: ee841d0aff64 ("sparc64: Convert U3copy_{from,to}_user to accurate exception reporting.")
Reported-by: Mikael Pettersson &lt;mikpelinux@gmail.com&gt;
Tested-by: Mikael Pettersson &lt;mikpelinux@gmail.com&gt;
Reviewed-by: Sam Ravnborg &lt;sam@ravnborg.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 0ede1c401332173ab0693121dc6cde04a4dbf131 ]

Mikael Pettersson reported that some test programs in the strace-4.18
testsuite cause an OOPS.

After some debugging it turns out that garbage values are returned
when an exception occurs, causing the fixup memset() to be run with
bogus arguments.

The problem is that two of the exception handler stubs write the
successfully copied length into the wrong register.

Fixes: ee841d0aff64 ("sparc64: Convert U3copy_{from,to}_user to accurate exception reporting.")
Reported-by: Mikael Pettersson &lt;mikpelinux@gmail.com&gt;
Tested-by: Mikael Pettersson &lt;mikpelinux@gmail.com&gt;
Reviewed-by: Sam Ravnborg &lt;sam@ravnborg.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sparc64: Measure receiver forward progress to avoid send mondo timeout</title>
<updated>2017-08-11T15:49:34+00:00</updated>
<author>
<name>Jane Chu</name>
<email>jane.chu@oracle.com</email>
</author>
<published>2017-07-11T18:00:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bfafa56e6c675adaa4942c4259baacc9650073ff'/>
<id>bfafa56e6c675adaa4942c4259baacc9650073ff</id>
<content type='text'>
[ Upstream commit 9d53caec84c7c5700e7c1ed744ea584fff55f9ac ]

A large sun4v SPARC system may have moments of intensive xcall activities,
usually caused by unmapping many pages on many CPUs concurrently. This can
flood receivers with CPU mondo interrupts for an extended period, causing
some unlucky senders to hit send-mondo timeout. This problem gets worse
as cpu count increases because sometimes mappings must be invalidated on
all CPUs, and sometimes all CPUs may gang up on a single CPU.

But a busy system is not a broken system. In the above scenario, as long
as the receiver is making forward progress processing mondo interrupts,
the sender should continue to retry.

This patch implements the receiver's forward progress meter by introducing
a per cpu counter 'cpu_mondo_counter[cpu]' where 'cpu' is in the range
of 0..NR_CPUS. The receiver increments its counter as soon as it receives
a mondo and the sender tracks the receiver's counter. If the receiver has
stopped making forward progress when the retry limit is reached, the sender
declares send-mondo-timeout and panic; otherwise, the receiver is allowed
to keep making forward progress.

In addition, it's been observed that PCIe hotplug events generate Correctable
Errors that are handled by hypervisor and then OS. Hypervisor 'borrows'
a guest cpu strand briefly to provide the service. If the cpu strand is
simultaneously the only cpu targeted by a mondo, it may not be available
for the mondo in 20msec, causing SUN4V mondo timeout. It appears that 1 second
is the agreed wait time between hypervisor and guest OS, this patch makes
the adjustment.

Orabug: 25476541
Orabug: 26417466

Signed-off-by: Jane Chu &lt;jane.chu@oracle.com&gt;
Reviewed-by: Steve Sistare &lt;steven.sistare@oracle.com&gt;
Reviewed-by: Anthony Yznaga &lt;anthony.yznaga@oracle.com&gt;
Reviewed-by: Rob Gardner &lt;rob.gardner@oracle.com&gt;
Reviewed-by: Thomas Tai &lt;thomas.tai@oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 9d53caec84c7c5700e7c1ed744ea584fff55f9ac ]

A large sun4v SPARC system may have moments of intensive xcall activities,
usually caused by unmapping many pages on many CPUs concurrently. This can
flood receivers with CPU mondo interrupts for an extended period, causing
some unlucky senders to hit send-mondo timeout. This problem gets worse
as cpu count increases because sometimes mappings must be invalidated on
all CPUs, and sometimes all CPUs may gang up on a single CPU.

But a busy system is not a broken system. In the above scenario, as long
as the receiver is making forward progress processing mondo interrupts,
the sender should continue to retry.

This patch implements the receiver's forward progress meter by introducing
a per cpu counter 'cpu_mondo_counter[cpu]' where 'cpu' is in the range
of 0..NR_CPUS. The receiver increments its counter as soon as it receives
a mondo and the sender tracks the receiver's counter. If the receiver has
stopped making forward progress when the retry limit is reached, the sender
declares send-mondo-timeout and panic; otherwise, the receiver is allowed
to keep making forward progress.

In addition, it's been observed that PCIe hotplug events generate Correctable
Errors that are handled by hypervisor and then OS. Hypervisor 'borrows'
a guest cpu strand briefly to provide the service. If the cpu strand is
simultaneously the only cpu targeted by a mondo, it may not be available
for the mondo in 20msec, causing SUN4V mondo timeout. It appears that 1 second
is the agreed wait time between hypervisor and guest OS, this patch makes
the adjustment.

Orabug: 25476541
Orabug: 26417466

Signed-off-by: Jane Chu &lt;jane.chu@oracle.com&gt;
Reviewed-by: Steve Sistare &lt;steven.sistare@oracle.com&gt;
Reviewed-by: Anthony Yznaga &lt;anthony.yznaga@oracle.com&gt;
Reviewed-by: Rob Gardner &lt;rob.gardner@oracle.com&gt;
Reviewed-by: Thomas Tai &lt;thomas.tai@oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sparc64: Zero pages on allocation for mondo and error queues.</title>
<updated>2017-07-05T12:40:19+00:00</updated>
<author>
<name>Liam R. Howlett</name>
<email>Liam.Howlett@Oracle.com</email>
</author>
<published>2017-05-24T01:54:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8886196a73204d167b7f8797eb6ebf61e76794d6'/>
<id>8886196a73204d167b7f8797eb6ebf61e76794d6</id>
<content type='text'>
[ Upstream commit 7a7dc961a28b965a0d0303c2e989df17b411708b ]

Error queues use a non-zero first word to detect if the queues are full.
Using pages that have not been zeroed may result in false positive
overflow events.  These queues are set up once during boot so zeroing
all mondo and error queue pages is safe.

Note that the false positive overflow does not always occur because the
page allocation for these queues is so early in the boot cycle that
higher number CPUs get fresh pages.  It is only when traps are serviced
with lower number CPUs who were given already used pages that this issue
is exposed.

Signed-off-by: Liam R. Howlett &lt;Liam.Howlett@Oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 7a7dc961a28b965a0d0303c2e989df17b411708b ]

Error queues use a non-zero first word to detect if the queues are full.
Using pages that have not been zeroed may result in false positive
overflow events.  These queues are set up once during boot so zeroing
all mondo and error queue pages is safe.

Note that the false positive overflow does not always occur because the
page allocation for these queues is so early in the boot cycle that
higher number CPUs get fresh pages.  It is only when traps are serviced
with lower number CPUs who were given already used pages that this issue
is exposed.

Signed-off-by: Liam R. Howlett &lt;Liam.Howlett@Oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sparc64: Handle PIO &amp; MEM non-resumable errors.</title>
<updated>2017-07-05T12:40:19+00:00</updated>
<author>
<name>Liam R. Howlett</name>
<email>Liam.Howlett@Oracle.com</email>
</author>
<published>2017-05-24T01:54:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=41172b772da4b9d875ed3fb90fe0e1a86742dc2a'/>
<id>41172b772da4b9d875ed3fb90fe0e1a86742dc2a</id>
<content type='text'>
[ Upstream commit 047487241ff59374fded8c477f21453681f5995c ]

User processes trying to access an invalid memory address via PIO will
receive a SIGBUS signal instead of causing a panic.  Memory errors will
receive a SIGKILL since a SIGBUS may result in a coredump which may
attempt to repeat the faulting access.

Signed-off-by: Liam R. Howlett &lt;Liam.Howlett@Oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 047487241ff59374fded8c477f21453681f5995c ]

User processes trying to access an invalid memory address via PIO will
receive a SIGBUS signal instead of causing a panic.  Memory errors will
receive a SIGKILL since a SIGBUS may result in a coredump which may
attempt to repeat the faulting access.

Signed-off-by: Liam R. Howlett &lt;Liam.Howlett@Oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
