<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch, branch v3.2.32</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>efi: initialize efi.runtime_version to make query_variable_info/update_capsule workable</title>
<updated>2012-10-17T02:49:56+00:00</updated>
<author>
<name>Seiji Aguchi</name>
<email>seiji.aguchi@hds.com</email>
</author>
<published>2012-07-24T13:27:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=13fcf5d1c5fe491aeb0b0f34a46d88134432edb1'/>
<id>13fcf5d1c5fe491aeb0b0f34a46d88134432edb1</id>
<content type='text'>
commit d6cf86d8f23253225fe2a763d627ecf7dfee9dae upstream.

A value of efi.runtime_version is checked before calling
update_capsule()/query_variable_info() as follows.
But it isn't initialized anywhere.

&lt;snip&gt;
static efi_status_t virt_efi_query_variable_info(u32 attr,
                                                 u64 *storage_space,
                                                 u64 *remaining_space,
                                                 u64 *max_variable_size)
{
        if (efi.runtime_version &lt; EFI_2_00_SYSTEM_TABLE_REVISION)
                return EFI_UNSUPPORTED;
&lt;snip&gt;

This patch initializes a value of efi.runtime_version at boot time.

Signed-off-by: Seiji Aguchi &lt;seiji.aguchi@hds.com&gt;
Acked-by: Matthew Garrett &lt;mjg@redhat.com&gt;
Signed-off-by: Matt Fleming &lt;matt.fleming@intel.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit d6cf86d8f23253225fe2a763d627ecf7dfee9dae upstream.

A value of efi.runtime_version is checked before calling
update_capsule()/query_variable_info() as follows.
But it isn't initialized anywhere.

&lt;snip&gt;
static efi_status_t virt_efi_query_variable_info(u32 attr,
                                                 u64 *storage_space,
                                                 u64 *remaining_space,
                                                 u64 *max_variable_size)
{
        if (efi.runtime_version &lt; EFI_2_00_SYSTEM_TABLE_REVISION)
                return EFI_UNSUPPORTED;
&lt;snip&gt;

This patch initializes a value of efi.runtime_version at boot time.

Signed-off-by: Seiji Aguchi &lt;seiji.aguchi@hds.com&gt;
Acked-by: Matthew Garrett &lt;mjg@redhat.com&gt;
Signed-off-by: Matt Fleming &lt;matt.fleming@intel.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: thp: fix pmd_present for split_huge_page and PROT_NONE with THP</title>
<updated>2012-10-17T02:49:45+00:00</updated>
<author>
<name>Andrea Arcangeli</name>
<email>aarcange@redhat.com</email>
</author>
<published>2012-10-08T23:33:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ddd937a27bb51cda5a5400df8e18e9a29d7fa8ec'/>
<id>ddd937a27bb51cda5a5400df8e18e9a29d7fa8ec</id>
<content type='text'>
commit 027ef6c87853b0a9df53175063028edb4950d476 upstream.

In many places !pmd_present has been converted to pmd_none.  For pmds
that's equivalent and pmd_none is quicker so using pmd_none is better.

However (unless we delete pmd_present) we should provide an accurate
pmd_present too.  This will avoid the risk of code thinking the pmd is non
present because it's under __split_huge_page_map, see the pmd_mknotpresent
there and the comment above it.

If the page has been mprotected as PROT_NONE, it would also lead to a
pmd_present false negative in the same way as the race with
split_huge_page.

Because the PSE bit stays on at all times (both during split_huge_page and
when the _PAGE_PROTNONE bit get set), we could only check for the PSE bit,
but checking the PROTNONE bit too is still good to remember pmd_present
must always keep PROT_NONE into account.

This explains a not reproducible BUG_ON that was seldom reported on the
lists.

The same issue is in pmd_large, it would go wrong with both PROT_NONE and
if it races with split_huge_page.

Signed-off-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Johannes Weiner &lt;jweiner@redhat.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 027ef6c87853b0a9df53175063028edb4950d476 upstream.

In many places !pmd_present has been converted to pmd_none.  For pmds
that's equivalent and pmd_none is quicker so using pmd_none is better.

However (unless we delete pmd_present) we should provide an accurate
pmd_present too.  This will avoid the risk of code thinking the pmd is non
present because it's under __split_huge_page_map, see the pmd_mknotpresent
there and the comment above it.

If the page has been mprotected as PROT_NONE, it would also lead to a
pmd_present false negative in the same way as the race with
split_huge_page.

Because the PSE bit stays on at all times (both during split_huge_page and
when the _PAGE_PROTNONE bit get set), we could only check for the PSE bit,
but checking the PROTNONE bit too is still good to remember pmd_present
must always keep PROT_NONE into account.

This explains a not reproducible BUG_ON that was seldom reported on the
lists.

The same issue is in pmd_large, it would go wrong with both PROT_NONE and
if it races with split_huge_page.

Signed-off-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Johannes Weiner &lt;jweiner@redhat.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARM: OMAP: counter: add locking to read_persistent_clock</title>
<updated>2012-10-17T02:49:44+00:00</updated>
<author>
<name>Colin Cross</name>
<email>ccross@android.com</email>
</author>
<published>2012-10-08T21:01:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=eb04142b260198116be79acafc28aca04606ff51'/>
<id>eb04142b260198116be79acafc28aca04606ff51</id>
<content type='text'>
commit 9d7d6e363b06934221b81a859d509844c97380df upstream.

read_persistent_clock uses a global variable, use a spinlock to
ensure non-atomic updates to the variable don't overlap and cause
time to move backwards.

Signed-off-by: Colin Cross &lt;ccross@android.com&gt;
Signed-off-by: R Sricharan &lt;r.sricharan@ti.com&gt;
Signed-off-by: Tony Lindgren &lt;tony@atomide.com&gt;
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 9d7d6e363b06934221b81a859d509844c97380df upstream.

read_persistent_clock uses a global variable, use a spinlock to
ensure non-atomic updates to the variable don't overlap and cause
time to move backwards.

Signed-off-by: Colin Cross &lt;ccross@android.com&gt;
Signed-off-by: R Sricharan &lt;r.sricharan@ti.com&gt;
Signed-off-by: Tony Lindgren &lt;tony@atomide.com&gt;
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mn10300: only add -mmem-funcs to KBUILD_CFLAGS if gcc supports it</title>
<updated>2012-10-17T02:49:33+00:00</updated>
<author>
<name>Geert Uytterhoeven</name>
<email>geert@linux-m68k.org</email>
</author>
<published>2012-10-05T00:11:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=904e3e1af742e8893f257991aff8eafcfb48e73f'/>
<id>904e3e1af742e8893f257991aff8eafcfb48e73f</id>
<content type='text'>
commit 9957423f035c2071f6d1c5d2f095cdafbeb25ad7 upstream.

It seems the current (gcc 4.6.3) no longer provides this so make it
conditional.

As reported by Tony before, the mn10300 architecture cross-compiles with
gcc-4.6.3 if -mmem-funcs is not added to KBUILD_CFLAGS.

Reported-by: Tony Breeds &lt;tony@bakeyournoodle.com&gt;
Signed-off-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: Koichi Yasutake &lt;yasutake.koichi@jp.panasonic.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 9957423f035c2071f6d1c5d2f095cdafbeb25ad7 upstream.

It seems the current (gcc 4.6.3) no longer provides this so make it
conditional.

As reported by Tony before, the mn10300 architecture cross-compiles with
gcc-4.6.3 if -mmem-funcs is not added to KBUILD_CFLAGS.

Reported-by: Tony Breeds &lt;tony@bakeyournoodle.com&gt;
Signed-off-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: Koichi Yasutake &lt;yasutake.koichi@jp.panasonic.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kbuild: Fix gcc -x syntax</title>
<updated>2012-10-17T02:49:27+00:00</updated>
<author>
<name>Jean Delvare</name>
<email>jdelvare@suse.de</email>
</author>
<published>2012-10-02T14:42:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6454738d4395c8a79843f5c2e3eeb687f9b7e7a9'/>
<id>6454738d4395c8a79843f5c2e3eeb687f9b7e7a9</id>
<content type='text'>
commit b1e0d8b70fa31821ebca3965f2ef8619d7c5e316 upstream.

The correct syntax for gcc -x is "gcc -x assembler", not
"gcc -xassembler". Even though the latter happens to work, the former
is what is documented in the manual page and thus what gcc wrappers
such as icecream do expect.

This isn't a cosmetic change. The missing space prevents icecream from
recognizing compilation tasks it can't handle, leading to silent kernel
miscompilations.

Besides me, credits go to Michael Matz and Dirk Mueller for
investigating the miscompilation issue and tracking it down to this
incorrect -x parameter syntax.

Signed-off-by: Jean Delvare &lt;jdelvare@suse.de&gt;
Acked-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Bernhard Walle &lt;bernhard@bwalle.de&gt;
Cc: Michal Marek &lt;mmarek@suse.cz&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Signed-off-by: Michal Marek &lt;mmarek@suse.cz&gt;
[bwh: Backported to 3.2: drop unneeded change to arch/x86/Makefile]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit b1e0d8b70fa31821ebca3965f2ef8619d7c5e316 upstream.

The correct syntax for gcc -x is "gcc -x assembler", not
"gcc -xassembler". Even though the latter happens to work, the former
is what is documented in the manual page and thus what gcc wrappers
such as icecream do expect.

This isn't a cosmetic change. The missing space prevents icecream from
recognizing compilation tasks it can't handle, leading to silent kernel
miscompilations.

Besides me, credits go to Michael Matz and Dirk Mueller for
investigating the miscompilation issue and tracking it down to this
incorrect -x parameter syntax.

Signed-off-by: Jean Delvare &lt;jdelvare@suse.de&gt;
Acked-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Bernhard Walle &lt;bernhard@bwalle.de&gt;
Cc: Michal Marek &lt;mmarek@suse.cz&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Signed-off-by: Michal Marek &lt;mmarek@suse.cz&gt;
[bwh: Backported to 3.2: drop unneeded change to arch/x86/Makefile]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/eeh: Lock module while handling EEH event</title>
<updated>2012-10-17T02:48:32+00:00</updated>
<author>
<name>Gavin Shan</name>
<email>shangw@linux.vnet.ibm.com</email>
</author>
<published>2012-09-17T04:34:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7ced5a34e7b6f387c1b50115874fc78c4e4ff84e'/>
<id>7ced5a34e7b6f387c1b50115874fc78c4e4ff84e</id>
<content type='text'>
commit feadf7c0a1a7c08c74bebb4a13b755f8c40e3bbc upstream.

The EEH core is talking with the PCI device driver to determine the
action (purely reset, or PCI device removal). During the period, the
driver might be unloaded and in turn causes kernel crash as follows:

EEH: Detected PCI bus error on PHB#4-PE#10000
EEH: This PCI device has failed 3 times in the last hour
lpfc 0004:01:00.0: 0:2710 PCI channel disable preparing for reset
Unable to handle kernel paging request for data at address 0x00000490
Faulting instruction address: 0xd00000000e682c90
cpu 0x1: Vector: 300 (Data Access) at [c000000fc75ffa20]
    pc: d00000000e682c90: .lpfc_io_error_detected+0x30/0x240 [lpfc]
    lr: d00000000e682c8c: .lpfc_io_error_detected+0x2c/0x240 [lpfc]
    sp: c000000fc75ffca0
   msr: 8000000000009032
   dar: 490
 dsisr: 40000000
  current = 0xc000000fc79b88b0
  paca    = 0xc00000000edb0380	 softe: 0	 irq_happened: 0x00
    pid   = 3386, comm = eehd
enter ? for help
[c000000fc75ffca0] c000000fc75ffd30 (unreliable)
[c000000fc75ffd30] c00000000004fd3c .eeh_report_error+0x7c/0xf0
[c000000fc75ffdc0] c00000000004ee00 .eeh_pe_dev_traverse+0xa0/0x180
[c000000fc75ffe70] c00000000004ffd8 .eeh_handle_event+0x68/0x300
[c000000fc75fff00] c0000000000503a0 .eeh_event_handler+0x130/0x1a0
[c000000fc75fff90] c000000000020138 .kernel_thread+0x54/0x70
1:mon&gt;

The patch increases the reference of the corresponding driver modules
while EEH core does the negotiation with PCI device driver so that the
corresponding driver modules can't be unloaded during the period and
we're safe to refer the callbacks.

Reported-by: Alexey Kardashevskiy &lt;aik@ozlabs.ru&gt;
Signed-off-by: Gavin Shan &lt;shangw@linux.vnet.ibm.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
[bwh: Backported to 3.2:
 - Adjust context
 - Reporting functions return int (success = 0), not void * (success = NULL)
 - Assume that the 'dev' arguments are non-null]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit feadf7c0a1a7c08c74bebb4a13b755f8c40e3bbc upstream.

The EEH core is talking with the PCI device driver to determine the
action (purely reset, or PCI device removal). During the period, the
driver might be unloaded and in turn causes kernel crash as follows:

EEH: Detected PCI bus error on PHB#4-PE#10000
EEH: This PCI device has failed 3 times in the last hour
lpfc 0004:01:00.0: 0:2710 PCI channel disable preparing for reset
Unable to handle kernel paging request for data at address 0x00000490
Faulting instruction address: 0xd00000000e682c90
cpu 0x1: Vector: 300 (Data Access) at [c000000fc75ffa20]
    pc: d00000000e682c90: .lpfc_io_error_detected+0x30/0x240 [lpfc]
    lr: d00000000e682c8c: .lpfc_io_error_detected+0x2c/0x240 [lpfc]
    sp: c000000fc75ffca0
   msr: 8000000000009032
   dar: 490
 dsisr: 40000000
  current = 0xc000000fc79b88b0
  paca    = 0xc00000000edb0380	 softe: 0	 irq_happened: 0x00
    pid   = 3386, comm = eehd
enter ? for help
[c000000fc75ffca0] c000000fc75ffd30 (unreliable)
[c000000fc75ffd30] c00000000004fd3c .eeh_report_error+0x7c/0xf0
[c000000fc75ffdc0] c00000000004ee00 .eeh_pe_dev_traverse+0xa0/0x180
[c000000fc75ffe70] c00000000004ffd8 .eeh_handle_event+0x68/0x300
[c000000fc75fff00] c0000000000503a0 .eeh_event_handler+0x130/0x1a0
[c000000fc75fff90] c000000000020138 .kernel_thread+0x54/0x70
1:mon&gt;

The patch increases the reference of the corresponding driver modules
while EEH core does the negotiation with PCI device driver so that the
corresponding driver modules can't be unloaded during the period and
we're safe to refer the callbacks.

Reported-by: Alexey Kardashevskiy &lt;aik@ozlabs.ru&gt;
Signed-off-by: Gavin Shan &lt;shangw@linux.vnet.ibm.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
[bwh: Backported to 3.2:
 - Adjust context
 - Reporting functions return int (success = 0), not void * (success = NULL)
 - Assume that the 'dev' arguments are non-null]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/alternatives: Fix p6 nops on non-modular kernels</title>
<updated>2012-10-10T02:31:15+00:00</updated>
<author>
<name>Avi Kivity</name>
<email>avi@redhat.com</email>
</author>
<published>2012-08-22T10:03:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=943df77ba7b5f75e16bf47678146fb0c2af5ef70'/>
<id>943df77ba7b5f75e16bf47678146fb0c2af5ef70</id>
<content type='text'>
commit cb09cad44f07044d9810f18f6f9a6a6f3771f979 upstream.

Probably a leftover from the early days of self-patching, p6nops
are marked __initconst_or_module, which causes them to be
discarded in a non-modular kernel.  If something later triggers
patching, it will overwrite kernel code with garbage.

Reported-by: Tomas Racek &lt;tracek@redhat.com&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;
Cc: Michael Tokarev &lt;mjt@tls.msk.ru&gt;
Cc: Borislav Petkov &lt;borislav.petkov@amd.com&gt;
Cc: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Cc: qemu-devel@nongnu.org
Cc: Anthony Liguori &lt;anthony@codemonkey.ws&gt;
Cc: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
Cc: Alan Cox &lt;alan@lxorguk.ukuu.org.uk&gt;
Cc: Alan Cox &lt;alan@linux.intel.com&gt;
Link: http://lkml.kernel.org/r/5034AE84.90708@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit cb09cad44f07044d9810f18f6f9a6a6f3771f979 upstream.

Probably a leftover from the early days of self-patching, p6nops
are marked __initconst_or_module, which causes them to be
discarded in a non-modular kernel.  If something later triggers
patching, it will overwrite kernel code with garbage.

Reported-by: Tomas Racek &lt;tracek@redhat.com&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;
Cc: Michael Tokarev &lt;mjt@tls.msk.ru&gt;
Cc: Borislav Petkov &lt;borislav.petkov@amd.com&gt;
Cc: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Cc: qemu-devel@nongnu.org
Cc: Anthony Liguori &lt;anthony@codemonkey.ws&gt;
Cc: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
Cc: Alan Cox &lt;alan@lxorguk.ukuu.org.uk&gt;
Cc: Alan Cox &lt;alan@linux.intel.com&gt;
Link: http://lkml.kernel.org/r/5034AE84.90708@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>xen/boot: Disable NUMA for PV guests.</title>
<updated>2012-10-10T02:31:04+00:00</updated>
<author>
<name>Konrad Rzeszutek Wilk</name>
<email>konrad.wilk@oracle.com</email>
</author>
<published>2012-08-17T14:22:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9109ae151b6dd96238b595240ac72e537a84d59d'/>
<id>9109ae151b6dd96238b595240ac72e537a84d59d</id>
<content type='text'>
commit 8d54db795dfb1049d45dc34f0dddbc5347ec5642 upstream.

The hypervisor is in charge of allocating the proper "NUMA" memory
and dealing with the CPU scheduler to keep them bound to the proper
NUMA node. The PV guests (and PVHVM) have no inkling of where they
run and do not need to know that right now. In the future we will
need to inject NUMA configuration data (if a guest spans two or more
NUMA nodes) so that the kernel can make the right choices. But those
patches are not yet present.

In the meantime, disable the NUMA capability in the PV guest, which
also fixes a bootup issue. Andre says:

"we see Dom0 crashes due to the kernel detecting the NUMA topology not
by ACPI, but directly from the northbridge (CONFIG_AMD_NUMA).

This will detect the actual NUMA config of the physical machine, but
will crash about the mismatch with Dom0's virtual memory. Variation of
the theme: Dom0 sees what it's not supposed to see.

This happens with the said config option enabled and on a machine where
this scanning is still enabled (K8 and Fam10h, not Bulldozer class)

We have this dump then:
NUMA: Warning: node ids are out of bound, from=-1 to=-1 distance=10
Scanning NUMA topology in Northbridge 24
Number of physical nodes 4
Node 0 MemBase 0000000000000000 Limit 0000000040000000
Node 1 MemBase 0000000040000000 Limit 0000000138000000
Node 2 MemBase 0000000138000000 Limit 00000001f8000000
Node 3 MemBase 00000001f8000000 Limit 0000000238000000
Initmem setup node 0 0000000000000000-0000000040000000
  NODE_DATA [000000003ffd9000 - 000000003fffffff]
Initmem setup node 1 0000000040000000-0000000138000000
  NODE_DATA [0000000137fd9000 - 0000000137ffffff]
Initmem setup node 2 0000000138000000-00000001f8000000
  NODE_DATA [00000001f095e000 - 00000001f0984fff]
Initmem setup node 3 00000001f8000000-0000000238000000
Cannot find 159744 bytes in node 3
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [&lt;ffffffff81d220e6&gt;] __alloc_bootmem_node+0x43/0x96
Pid: 0, comm: swapper Not tainted 3.3.6 #1 AMD Dinar/Dinar
RIP: e030:[&lt;ffffffff81d220e6&gt;]  [&lt;ffffffff81d220e6&gt;] __alloc_bootmem_node+0x43/0x96
.. snip..
  [&lt;ffffffff81d23024&gt;] sparse_early_usemaps_alloc_node+0x64/0x178
  [&lt;ffffffff81d23348&gt;] sparse_init+0xe4/0x25a
  [&lt;ffffffff81d16840&gt;] paging_init+0x13/0x22
  [&lt;ffffffff81d07fbb&gt;] setup_arch+0x9c6/0xa9b
  [&lt;ffffffff81683954&gt;] ? printk+0x3c/0x3e
  [&lt;ffffffff81d01a38&gt;] start_kernel+0xe5/0x468
  [&lt;ffffffff81d012cf&gt;] x86_64_start_reservations+0xba/0xc1
  [&lt;ffffffff81007153&gt;] ? xen_setup_runstate_info+0x2c/0x36
  [&lt;ffffffff81d050ee&gt;] xen_start_kernel+0x565/0x56c
"

so we just disable NUMA scanning by setting numa_off=1.

Reported-and-Tested-by: Andre Przywara &lt;andre.przywara@amd.com&gt;
Acked-by: Andre Przywara &lt;andre.przywara@amd.com&gt;
Signed-off-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 8d54db795dfb1049d45dc34f0dddbc5347ec5642 upstream.

The hypervisor is in charge of allocating the proper "NUMA" memory
and dealing with the CPU scheduler to keep them bound to the proper
NUMA node. The PV guests (and PVHVM) have no inkling of where they
run and do not need to know that right now. In the future we will
need to inject NUMA configuration data (if a guest spans two or more
NUMA nodes) so that the kernel can make the right choices. But those
patches are not yet present.

In the meantime, disable the NUMA capability in the PV guest, which
also fixes a bootup issue. Andre says:

"we see Dom0 crashes due to the kernel detecting the NUMA topology not
by ACPI, but directly from the northbridge (CONFIG_AMD_NUMA).

This will detect the actual NUMA config of the physical machine, but
will crash about the mismatch with Dom0's virtual memory. Variation of
the theme: Dom0 sees what it's not supposed to see.

This happens with the said config option enabled and on a machine where
this scanning is still enabled (K8 and Fam10h, not Bulldozer class)

We have this dump then:
NUMA: Warning: node ids are out of bound, from=-1 to=-1 distance=10
Scanning NUMA topology in Northbridge 24
Number of physical nodes 4
Node 0 MemBase 0000000000000000 Limit 0000000040000000
Node 1 MemBase 0000000040000000 Limit 0000000138000000
Node 2 MemBase 0000000138000000 Limit 00000001f8000000
Node 3 MemBase 00000001f8000000 Limit 0000000238000000
Initmem setup node 0 0000000000000000-0000000040000000
  NODE_DATA [000000003ffd9000 - 000000003fffffff]
Initmem setup node 1 0000000040000000-0000000138000000
  NODE_DATA [0000000137fd9000 - 0000000137ffffff]
Initmem setup node 2 0000000138000000-00000001f8000000
  NODE_DATA [00000001f095e000 - 00000001f0984fff]
Initmem setup node 3 00000001f8000000-0000000238000000
Cannot find 159744 bytes in node 3
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [&lt;ffffffff81d220e6&gt;] __alloc_bootmem_node+0x43/0x96
Pid: 0, comm: swapper Not tainted 3.3.6 #1 AMD Dinar/Dinar
RIP: e030:[&lt;ffffffff81d220e6&gt;]  [&lt;ffffffff81d220e6&gt;] __alloc_bootmem_node+0x43/0x96
.. snip..
  [&lt;ffffffff81d23024&gt;] sparse_early_usemaps_alloc_node+0x64/0x178
  [&lt;ffffffff81d23348&gt;] sparse_init+0xe4/0x25a
  [&lt;ffffffff81d16840&gt;] paging_init+0x13/0x22
  [&lt;ffffffff81d07fbb&gt;] setup_arch+0x9c6/0xa9b
  [&lt;ffffffff81683954&gt;] ? printk+0x3c/0x3e
  [&lt;ffffffff81d01a38&gt;] start_kernel+0xe5/0x468
  [&lt;ffffffff81d012cf&gt;] x86_64_start_reservations+0xba/0xc1
  [&lt;ffffffff81007153&gt;] ? xen_setup_runstate_info+0x2c/0x36
  [&lt;ffffffff81d050ee&gt;] xen_start_kernel+0x565/0x56c
"

so we just disable NUMA scanning by setting numa_off=1.

Reported-and-Tested-by: Andre Przywara &lt;andre.przywara@amd.com&gt;
Acked-by: Andre Przywara &lt;andre.przywara@amd.com&gt;
Signed-off-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>xen/boot: Disable BIOS SMP MP table search.</title>
<updated>2012-10-10T02:30:59+00:00</updated>
<author>
<name>Konrad Rzeszutek Wilk</name>
<email>konrad.wilk@oracle.com</email>
</author>
<published>2012-09-19T12:30:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0a193b148d6d0ed05ea82f466b6d7eac50b87ac5'/>
<id>0a193b148d6d0ed05ea82f466b6d7eac50b87ac5</id>
<content type='text'>
commit bd49940a35ec7d488ae63bd625639893b3385b97 upstream.

As the initial domain we are able to search/map certain regions
of memory to harvest configuration data. For all low-level we
use ACPI tables - for interrupts we use exclusively ACPI _PRT
(so DSDT) and MADT for INT_SRC_OVR.

The SMP MP table is not used at all. As a matter of fact we do
not even support machines that only have SMP MP but no ACPI tables.

Lets follow how Moorestown does it and just disable searching
for BIOS SMP tables.

This also fixes an issue on HP Proliant BL680c G5 and DL380 G6:

9f-&gt;100 for 1:1 PTE
Freeing 9f-100 pfn range: 97 pages freed
1-1 mapping on 9f-&gt;100
.. snip..
e820: BIOS-provided physical RAM map:
Xen: [mem 0x0000000000000000-0x000000000009efff] usable
Xen: [mem 0x000000000009f400-0x00000000000fffff] reserved
Xen: [mem 0x0000000000100000-0x00000000cfd1dfff] usable
.. snip..
Scan for SMP in [mem 0x00000000-0x000003ff]
Scan for SMP in [mem 0x0009fc00-0x0009ffff]
Scan for SMP in [mem 0x000f0000-0x000fffff]
found SMP MP-table at [mem 0x000f4fa0-0x000f4faf] mapped at [ffff8800000f4fa0]
(XEN) mm.c:908:d0 Error getting mfn 100 (pfn 5555555555555555) from L1 entry 0000000000100461 for l1e_owner=0, pg_owner=0
(XEN) mm.c:4995:d0 ptwr_emulate: could not get_page_from_l1e()
BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: [&lt;ffffffff81ac07e2&gt;] xen_set_pte_init+0x66/0x71
. snip..
Pid: 0, comm: swapper Not tainted 3.6.0-rc6upstream-00188-gb6fb969-dirty #2 HP ProLiant BL680c G5
.. snip..
Call Trace:
 [&lt;ffffffff81ad31c6&gt;] __early_ioremap+0x18a/0x248
 [&lt;ffffffff81624731&gt;] ? printk+0x48/0x4a
 [&lt;ffffffff81ad32ac&gt;] early_ioremap+0x13/0x15
 [&lt;ffffffff81acc140&gt;] get_mpc_size+0x2f/0x67
 [&lt;ffffffff81acc284&gt;] smp_scan_config+0x10c/0x136
 [&lt;ffffffff81acc2e4&gt;] default_find_smp_config+0x36/0x5a
 [&lt;ffffffff81ac3085&gt;] setup_arch+0x5b3/0xb5b
 [&lt;ffffffff81624731&gt;] ? printk+0x48/0x4a
 [&lt;ffffffff81abca7f&gt;] start_kernel+0x90/0x390
 [&lt;ffffffff81abc356&gt;] x86_64_start_reservations+0x131/0x136
 [&lt;ffffffff81abfa83&gt;] xen_start_kernel+0x65f/0x661
(XEN) Domain 0 crashed: 'noreboot' set - not rebooting.

which is that ioremap would end up mapping 0xff using _PAGE_IOMAP
(which is what early_ioremap sticks as a flag) - which meant
we would get MFN 0xFF (pte ff461, which is OK), and then it would
also map 0x100 (b/c ioremap tries to get page aligned request, and
it was trying to map 0xf4fa0 + PAGE_SIZE - so it mapped the next page)
as _PAGE_IOMAP. Since 0x100 is actually a RAM page, and the _PAGE_IOMAP
bypasses the P2M lookup we would happily set the PTE to 1000461.
Xen would deny the request since we do not have access to the
Machine Frame Number (MFN) of 0x100. The P2M[0x100] is for example
0x80140.

Fixes-Oracle-Bugzilla: https://bugzilla.oracle.com/bugzilla/show_bug.cgi?id=13665
Acked-by: Jan Beulich &lt;jbeulich@suse.com&gt;
Signed-off-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit bd49940a35ec7d488ae63bd625639893b3385b97 upstream.

As the initial domain we are able to search/map certain regions
of memory to harvest configuration data. For all low-level we
use ACPI tables - for interrupts we use exclusively ACPI _PRT
(so DSDT) and MADT for INT_SRC_OVR.

The SMP MP table is not used at all. As a matter of fact we do
not even support machines that only have SMP MP but no ACPI tables.

Lets follow how Moorestown does it and just disable searching
for BIOS SMP tables.

This also fixes an issue on HP Proliant BL680c G5 and DL380 G6:

9f-&gt;100 for 1:1 PTE
Freeing 9f-100 pfn range: 97 pages freed
1-1 mapping on 9f-&gt;100
.. snip..
e820: BIOS-provided physical RAM map:
Xen: [mem 0x0000000000000000-0x000000000009efff] usable
Xen: [mem 0x000000000009f400-0x00000000000fffff] reserved
Xen: [mem 0x0000000000100000-0x00000000cfd1dfff] usable
.. snip..
Scan for SMP in [mem 0x00000000-0x000003ff]
Scan for SMP in [mem 0x0009fc00-0x0009ffff]
Scan for SMP in [mem 0x000f0000-0x000fffff]
found SMP MP-table at [mem 0x000f4fa0-0x000f4faf] mapped at [ffff8800000f4fa0]
(XEN) mm.c:908:d0 Error getting mfn 100 (pfn 5555555555555555) from L1 entry 0000000000100461 for l1e_owner=0, pg_owner=0
(XEN) mm.c:4995:d0 ptwr_emulate: could not get_page_from_l1e()
BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: [&lt;ffffffff81ac07e2&gt;] xen_set_pte_init+0x66/0x71
. snip..
Pid: 0, comm: swapper Not tainted 3.6.0-rc6upstream-00188-gb6fb969-dirty #2 HP ProLiant BL680c G5
.. snip..
Call Trace:
 [&lt;ffffffff81ad31c6&gt;] __early_ioremap+0x18a/0x248
 [&lt;ffffffff81624731&gt;] ? printk+0x48/0x4a
 [&lt;ffffffff81ad32ac&gt;] early_ioremap+0x13/0x15
 [&lt;ffffffff81acc140&gt;] get_mpc_size+0x2f/0x67
 [&lt;ffffffff81acc284&gt;] smp_scan_config+0x10c/0x136
 [&lt;ffffffff81acc2e4&gt;] default_find_smp_config+0x36/0x5a
 [&lt;ffffffff81ac3085&gt;] setup_arch+0x5b3/0xb5b
 [&lt;ffffffff81624731&gt;] ? printk+0x48/0x4a
 [&lt;ffffffff81abca7f&gt;] start_kernel+0x90/0x390
 [&lt;ffffffff81abc356&gt;] x86_64_start_reservations+0x131/0x136
 [&lt;ffffffff81abfa83&gt;] xen_start_kernel+0x65f/0x661
(XEN) Domain 0 crashed: 'noreboot' set - not rebooting.

which is that ioremap would end up mapping 0xff using _PAGE_IOMAP
(which is what early_ioremap sticks as a flag) - which meant
we would get MFN 0xFF (pte ff461, which is OK), and then it would
also map 0x100 (b/c ioremap tries to get page aligned request, and
it was trying to map 0xf4fa0 + PAGE_SIZE - so it mapped the next page)
as _PAGE_IOMAP. Since 0x100 is actually a RAM page, and the _PAGE_IOMAP
bypasses the P2M lookup we would happily set the PTE to 1000461.
Xen would deny the request since we do not have access to the
Machine Frame Number (MFN) of 0x100. The P2M[0x100] is for example
0x80140.

Fixes-Oracle-Bugzilla: https://bugzilla.oracle.com/bugzilla/show_bug.cgi?id=13665
Acked-by: Jan Beulich &lt;jbeulich@suse.com&gt;
Signed-off-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARM: 7532/1: decompressor: reset SCTLR.TRE for VMSA ARMv7 cores</title>
<updated>2012-10-10T02:30:53+00:00</updated>
<author>
<name>Matthew Leach</name>
<email>matthew.leach@arm.com</email>
</author>
<published>2012-09-11T16:56:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=622bba6d5a6210572325c89a7deb61175a2330a5'/>
<id>622bba6d5a6210572325c89a7deb61175a2330a5</id>
<content type='text'>
commit e1e5b7e4251c7538ca08c2c5545b0c2fbd8a6635 upstream.

This patch zeroes the SCTLR.TRE bit prior to setting the mapping as
cacheable for ARMv7 cores in the decompressor, ensuring that the
memory region attributes are obtained from the C and B bits, not from
the page tables.

Cc: Nicolas Pitre &lt;nico@fluxnic.net&gt;
Reviewed-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Matthew Leach &lt;matthew.leach@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit e1e5b7e4251c7538ca08c2c5545b0c2fbd8a6635 upstream.

This patch zeroes the SCTLR.TRE bit prior to setting the mapping as
cacheable for ARMv7 cores in the decompressor, ensuring that the
memory region attributes are obtained from the C and B bits, not from
the page tables.

Cc: Nicolas Pitre &lt;nico@fluxnic.net&gt;
Reviewed-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Matthew Leach &lt;matthew.leach@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
</feed>
