linux-toradex.git/arch/x86/kernel/apic, branch v4.2.7

x86/irq: Probe for PIC presence before allocating descs for legacy IRQs

2015-12-09T19:31:13+00:00

commit 8c058b0b9c34d8c8d7912880956543769323e2d8 upstream.

Commit d32932d02e18 ("x86/irq: Convert IOAPIC to use hierarchical irqdomain
interfaces") brought a regression for Hyper-V Gen2 instances. These
instances don't have i8259 legacy PIC but they use legacy IRQs for serial
port, rtc, and acpi. With this commit included we end up with these IRQs
not initialized. Earlier, there was a special workaround for legacy IRQs
in mp_map_pin_to_irq() doing mp_irqdomain_map() without looking at
nr_legacy_irqs() and now we fail in __irq_domain_alloc_irqs() when
irq_domain_alloc_descs() returns -EEXIST.

The essence of the issue seems to be that early_irq_init() calls
arch_probe_nr_irqs() to figure out the number of legacy IRQs before
we probe for i8259 and gets 16. Later when init_8259A() is called we switch
to NULL legacy PIC and nr_legacy_irqs() starts to return 0 but we already
have 16 descs allocated.

Solve the issue by separating i8259 probe from init and calling it in
arch_probe_nr_irqs() before we actually use nr_legacy_irqs() information.

Fixes: d32932d02e18 ("x86/irq: Convert IOAPIC to use hierarchical irqdomain interfaces")
Signed-off-by: Vitaly Kuznetsov 
Cc: Jiang Liu 
Cc: K. Y. Srinivasan 
Link: http://lkml.kernel.org/r/1446543614-3621-1-git-send-email-vkuznets@redhat.com
Signed-off-by: Thomas Gleixner 
Signed-off-by: Greg Kroah-Hartman

x86/ioapic: Prevent NULL pointer dereference in setup_ioapic_dest()

2015-11-09T22:37:32+00:00

commit ababae44108b0e94b58eef6cb5bd830bd040a47f upstream.

Commit 4857c91f0d19 changed the way how irq affinity is setup in
setup_ioapic_dest() from using the core helper function to
unconditionally calling the irq_set_affinity() callback of the
underlying irq chip.

That results in a NULL pointer dereference for the rare case where the
underlying irq chip is lapic_chip which has no irq_set_affinity()
callback. lapic_chip is occasionally used for the timer interrupt (irq
0).

The fix is simple: Check the availability of the callback instead of
calling it unconditionally.

Fixes: 4857c91f0d19 "x86/ioapic: Force affinity setting in setup_ioapic_dest()"
Signed-off-by: Thomas Gleixner 
Signed-off-by: Greg Kroah-Hartman

x86/ioapic: Force affinity setting in setup_ioapic_dest()

2015-10-22T21:49:18+00:00

commit 4857c91f0d195f05908fff296ba1ec5fca87066c upstream.

The recent ioapic cleanups changed the affinity setting in
setup_ioapic_dest() from a direct write to the hardware to the delayed
affinity setup via irq_set_affinity().

That results in a warning from chained_irq_exit():
WARNING: CPU: 0 PID: 5 at kernel/irq/migration.c:32 irq_move_masked_irq
[] irq_move_masked_irq+0xb8/0xc0
[] ioapic_ack_level+0x111/0x130
[] intel_gpio_irq_handler+0x148/0x1c0

The reason is that irq_set_affinity() does not write directly to the
hardware. It marks the affinity setting as pending and executes it
from the next interrupt. The chained handler infrastructure does not
take the irq descriptor lock for performance reasons because such a
chained interrupt is not visible to any interfaces. So the delayed
affinity setting triggers the warning in irq_move_masked_irq().

Restore the old behaviour by calling the set_affinity function of the
ioapic chip in setup_ioapic_dest(). This is safe as none of the
interrupts can be on the fly at this point.

Fixes: aa5cb97f14a2 'x86/irq: Remove x86_io_apic_ops.set_affinity and related interfaces'
Reported-and-tested-by: Mika Westerberg 
Signed-off-by: Thomas Gleixner 
Cc: Jiang Liu 
Cc: jarkko.nikula@linux.intel.com
Signed-off-by: Greg Kroah-Hartman

x86/apic: Serialize LVTT and TSC_DEADLINE writes

2015-10-22T21:49:18+00:00

commit 5d7c631d926b59aa16f3c56eaeb83f1036c81dc7 upstream.

The APIC LVTT register is MMIO mapped but the TSC_DEADLINE register is an
MSR. The write to the TSC_DEADLINE MSR is not serializing, so it's not
guaranteed that the write to LVTT has reached the APIC before the
TSC_DEADLINE MSR is written. In such a case the write to the MSR is
ignored and as a consequence the local timer interrupt never fires.

The SDM decribes this issue for xAPIC and x2APIC modes. The
serialization methods recommended by the SDM differ.

xAPIC:
 "1. Memory-mapped write to LVT Timer Register, setting bits 18:17 to 10b.
  2. WRMSR to the IA32_TSC_DEADLINE MSR a value much larger than current time-stamp counter.
  3. If RDMSR of the IA32_TSC_DEADLINE MSR returns zero, go to step 2.
  4. WRMSR to the IA32_TSC_DEADLINE MSR the desired deadline."

x2APIC:
 "To allow for efficient access to the APIC registers in x2APIC mode,
  the serializing semantics of WRMSR are relaxed when writing to the
  APIC registers. Thus, system software should not use 'WRMSR to APIC
  registers in x2APIC mode' as a serializing instruction. Read and write
  accesses to the APIC registers will occur in program order. A WRMSR to
  an APIC register may complete before all preceding stores are globally
  visible; software can prevent this by inserting a serializing
  instruction, an SFENCE, or an MFENCE before the WRMSR."

The xAPIC method is to just wait for the memory mapped write to hit
the LVTT by checking whether the MSR write has reached the hardware.
There is no reason why a proper MFENCE after the memory mapped write would
not do the same. Andi Kleen confirmed that MFENCE is sufficient for the
xAPIC case as well.

Issue MFENCE before writing to the TSC_DEADLINE MSR. This can be done
unconditionally as all CPUs which have TSC_DEADLINE also have MFENCE
support.

[ tglx: Massaged the changelog ]

Signed-off-by: Shaohua Li 
Reviewed-by: Ingo Molnar 
Cc: 
Cc: 
Cc: 
Cc: Andi Kleen 
Cc: H. Peter Anvin 
Link: http://lkml.kernel.org/r/20150909041352.GA2059853@devbig257.prn2.facebook.com
Signed-off-by: Thomas Gleixner 
Signed-off-by: Greg Kroah-Hartman

x86/apic: Fix fallout from x2apic cleanup

2015-08-22T15:01:48+00:00

In the recent x2apic cleanup I got two things really wrong:
1) The safety check in __disable_x2apic which allows the function to
   be called unconditionally is backwards. The check is there to
   prevent access to the apic MSR in case that the machine has no
   apic. Though right now it returns if the machine has an apic and
   therefor the disabling of x2apic is never invoked.

2) x2apic_disable() sets x2apic_mode to 0 after registering the local
   apic. That's wrong, because register_lapic_address() checks x2apic
   mode and therefor takes the wrong code path.

This results in boot failures on machines with x2apic preenabled by
BIOS and can also lead to an fatal MSR access on machines without
apic.

The solutions are simple:
1) Correct the sanity check for apic availability
2) Clear x2apic_mode _before_ calling register_lapic_address()

Fixes: 659006bf3ae3 'x86/x2apic: Split enable and setup function'
Reported-and-tested-by: Javier Monteagudo 
Signed-off-by: Thomas Gleixner 
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1224764
Cc: stable@vger.kernel.org # 4.0+
Cc: Laura Abbott 
Cc: Jiang Liu 
Cc: Joerg Roedel 
Cc: Tony Luck 
Cc: Borislav Petkov

x86/irq: Build correct vector mapping for multiple MSI interrupts

2015-08-18T16:18:55+00:00

Alex Deucher, Mark Rustad and Alexander Holler reported a regression
with the latest v4.2-rc4 kernel, which breaks some SATA controllers.
With multi-MSI capable SATA controllers, only the first port works,
all other ports time out when executing SATA commands.

This happens because the first argument to assign_irq_vector_policy()
is always the base linux irq number of the multi MSI interrupt block,
so all subsequent vector assignments operate on the base linux irq
number, so all MSI irqs are handled as the first irq number. Therefor
the other MSI irqs of a device are never set up correctly and never
fire.

Add the loop iterator to the base irq number so all vectors are
assigned correctly.

Fixes: b5dc8e6c21e7 "x86/irq: Use hierarchical irqdomain to manage CPU interrupt vectors"
Reported-and-tested-by: Alex Deucher 
Reported-and-tested-by: Mark Rustad 
Reported-and-tested-by: Alexander Holler 
Signed-off-by: Jiang Liu 
Cc: Tony Luck 
Link: http://lkml.kernel.org/r/1439911228-9880-1-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner

x86/irq: Use the caller provided polarity setting in mp_check_pin_attr()

2015-07-30T19:15:29+00:00

Commit d32932d02e18 ("x86/irq: Convert IOAPIC to use hierarchical
irqdomain interfaces") introduced a regression which causes
malfunction of interrupt lines.

The reason is that the conversion of mp_check_pin_attr() missed to
update the polarity selection of the interrupt pin with the caller
provided setting and instead uses a stale attribute value. That in
turn results in chosing the wrong interrupt flow handler.

Use the caller supplied setting to configure the pin correctly which
also choses the correct interrupt flow handler.

This restores the original behaviour and on the affected
machine/driver (Surface Pro 3, i2c controller) all IOAPIC IRQ
configuration are identical to v4.1.

Fixes: d32932d02e18 ("x86/irq: Convert IOAPIC to use hierarchical irqdomain interfaces")
Reported-and-tested-by: Matt Fleming 
Reported-and-tested-by: Chen Yu 
Signed-off-by: Jiang Liu 
Cc: Tony Luck 
Cc: Chen Yu 
Cc: Yinghai Lu 
Link: http://lkml.kernel.org/r/1438242695-23531-1-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner

x86/irq: Plug irq vector hotplug race

2015-07-07T09:54:04+00:00

Jin debugged a nasty cpu hotplug race which results in leaking a irq
vector on the newly hotplugged cpu.

cpu N				cpu M
native_cpu_up                   device_shutdown
  do_boot_cpu			  free_msi_irqs
  start_secondary                   arch_teardown_msi_irqs
    smp_callin                        default_teardown_msi_irqs
       setup_vector_irq                  arch_teardown_msi_irq
        __setup_vector_irq		   native_teardown_msi_irq
          lock(vector_lock)		     destroy_irq 
          install vectors
          unlock(vector_lock)
					       lock(vector_lock)
--->                                  	       __clear_irq_vector
                                    	       unlock(vector_lock)
    lock(vector_lock)
    set_cpu_online
    unlock(vector_lock)

This leaves the irq vector(s) which are torn down on CPU M stale in
the vector array of CPU N, because CPU M does not see CPU N online
yet. There is a similar issue with concurrent newly setup interrupts.

The alloc/free protection of irq descriptors does not prevent the
above race, because it merily prevents interrupt descriptors from
going away or changing concurrently.

Prevent this by moving the call to setup_vector_irq() into the
vector_lock held region which protects set_cpu_online():

cpu N				cpu M
native_cpu_up                   device_shutdown
  do_boot_cpu			  free_msi_irqs
  start_secondary                   arch_teardown_msi_irqs
    smp_callin                        default_teardown_msi_irqs
       lock(vector_lock)                arch_teardown_msi_irq
       setup_vector_irq()
        __setup_vector_irq		   native_teardown_msi_irq
          install vectors		     destroy_irq 
       set_cpu_online
       unlock(vector_lock)
					       lock(vector_lock)
                                  	       __clear_irq_vector
                                    	       unlock(vector_lock)

So cpu M either sees the cpu N online before clearing the vector or
cpu N installs the vectors after cpu M has cleared it.

Reported-by: xiao jin 
Signed-off-by: Thomas Gleixner 
Cc: Peter Zijlstra 
Cc: Joerg Roedel 
Cc: Borislav Petkov 
Cc: Yanmin Zhang 
Link: http://lkml.kernel.org/r/20150705171102.141898931@linutronix.de

x86/irq/msi: Implement irq_set_vcpu_affinity for remapped MSI irqs

2015-05-19T13:51:17+00:00

Implement irq_set_vcpu_affinity for pci_msi_ir_controller.

Signed-off-by: Feng Wu 
Reviewed-by: Jiang Liu 
Link: http://lkml.kernel.org/r/1432026437-16560-3-git-send-email-feng.wu@intel.com
Signed-off-by: Thomas Gleixner

x86: Use entering[_ack]_irq() instead of open coding it

2015-05-15T14:03:18+00:00

Signed-off-by: Thomas Gleixner