<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/kernel/irq/affinity.c, branch v5.12-rc5</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>genirq/affinity: Remove const qualifier from node_to_cpumask argument</title>
<updated>2019-08-28T10:20:43+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2019-08-28T08:58:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=101f85b56d03b36418bbf867f67d81710839b0ec'/>
<id>101f85b56d03b36418bbf867f67d81710839b0ec</id>
<content type='text'>
When CONFIG_CPUMASK_OFFSTACK isn't enabled, 'cpumask_var_t' is as

'typedef struct cpumask cpumask_var_t[1]',

so the argument 'node_to_cpumask' alloc_nodes_vectors() can't be declared
as 'const cpumask_var_t *'

Fixes the following warning:

   kernel/irq/affinity.c: In function '__irq_build_affinity_masks':
     alloc_nodes_vectors(numvecs, node_to_cpumask, cpu_mask,
                                  ^
   kernel/irq/affinity.c:128:13: note: expected 'const struct cpumask (*)[1]' but argument is of type 'struct cpumask (*)[1]'
    static void alloc_nodes_vectors(unsigned int numvecs,
                ^
Fixes: b1a5a73e64e9 ("genirq/affinity: Spread vectors on node according to nr_cpu ratio")
Reported-by: kbuild test robot &lt;lkp@intel.com&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lkml.kernel.org/r/20190828085815.19931-1-ming.lei@redhat.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When CONFIG_CPUMASK_OFFSTACK isn't enabled, 'cpumask_var_t' is as

'typedef struct cpumask cpumask_var_t[1]',

so the argument 'node_to_cpumask' alloc_nodes_vectors() can't be declared
as 'const cpumask_var_t *'

Fixes the following warning:

   kernel/irq/affinity.c: In function '__irq_build_affinity_masks':
     alloc_nodes_vectors(numvecs, node_to_cpumask, cpu_mask,
                                  ^
   kernel/irq/affinity.c:128:13: note: expected 'const struct cpumask (*)[1]' but argument is of type 'struct cpumask (*)[1]'
    static void alloc_nodes_vectors(unsigned int numvecs,
                ^
Fixes: b1a5a73e64e9 ("genirq/affinity: Spread vectors on node according to nr_cpu ratio")
Reported-by: kbuild test robot &lt;lkp@intel.com&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lkml.kernel.org/r/20190828085815.19931-1-ming.lei@redhat.com

</pre>
</div>
</content>
</entry>
<entry>
<title>genirq/affinity: Spread vectors on node according to nr_cpu ratio</title>
<updated>2019-08-27T14:31:17+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2019-08-16T02:28:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b1a5a73e64e99faa5f4deef2ae96d7371a0fb5d0'/>
<id>b1a5a73e64e99faa5f4deef2ae96d7371a0fb5d0</id>
<content type='text'>
Now __irq_build_affinity_masks() spreads vectors evenly per node, but there
is a case that not all vectors have been spread when each numa node has a
different number of CPUs which triggers the warning in the spreading code.

Improve the spreading algorithm by

 - assigning vectors according to the ratio of the number of CPUs on a node
   to the number of remaining CPUs.

 - running the assignment from smaller nodes to bigger nodes to guarantee
   that every active node gets allocated at least one vector.

This ensures that all vectors are spread out. Asided of that the spread
becomes more fair if the nodes have different number of CPUs.

For example, on the following machine:
	CPU(s):              16
	On-line CPU(s) list: 0-15
	Thread(s) per core:  1
	Core(s) per socket:  8
	Socket(s):           2
	NUMA node(s):        2
	...
	NUMA node0 CPU(s):   0,1,3,5-9,11,13-15
	NUMA node1 CPU(s):   2,4,10,12

When a driver requests to allocate 8 vectors, the following spread results:

	irq 31, cpu list 2,4
	irq 32, cpu list 10,12
	irq 33, cpu list 0-1
	irq 34, cpu list 3,5
	irq 35, cpu list 6-7
	irq 36, cpu list 8-9
	irq 37, cpu list 11,13
	irq 38, cpu list 14-15

So Node 0 has now 6 and Node 1 has 2 vectors assigned. The original
algorithm assigned 4 vectors on each node which was unfair versus Node 0.

[ tglx: Massaged changelog ]

Reported-by: Jon Derrick &lt;jonathan.derrick@intel.com&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Keith Busch &lt;kbusch@kernel.org&gt;
Reviewed-by: Jon Derrick &lt;jonathan.derrick@intel.com&gt;
Link: https://lkml.kernel.org/r/20190816022849.14075-3-ming.lei@redhat.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now __irq_build_affinity_masks() spreads vectors evenly per node, but there
is a case that not all vectors have been spread when each numa node has a
different number of CPUs which triggers the warning in the spreading code.

Improve the spreading algorithm by

 - assigning vectors according to the ratio of the number of CPUs on a node
   to the number of remaining CPUs.

 - running the assignment from smaller nodes to bigger nodes to guarantee
   that every active node gets allocated at least one vector.

This ensures that all vectors are spread out. Asided of that the spread
becomes more fair if the nodes have different number of CPUs.

For example, on the following machine:
	CPU(s):              16
	On-line CPU(s) list: 0-15
	Thread(s) per core:  1
	Core(s) per socket:  8
	Socket(s):           2
	NUMA node(s):        2
	...
	NUMA node0 CPU(s):   0,1,3,5-9,11,13-15
	NUMA node1 CPU(s):   2,4,10,12

When a driver requests to allocate 8 vectors, the following spread results:

	irq 31, cpu list 2,4
	irq 32, cpu list 10,12
	irq 33, cpu list 0-1
	irq 34, cpu list 3,5
	irq 35, cpu list 6-7
	irq 36, cpu list 8-9
	irq 37, cpu list 11,13
	irq 38, cpu list 14-15

So Node 0 has now 6 and Node 1 has 2 vectors assigned. The original
algorithm assigned 4 vectors on each node which was unfair versus Node 0.

[ tglx: Massaged changelog ]

Reported-by: Jon Derrick &lt;jonathan.derrick@intel.com&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Keith Busch &lt;kbusch@kernel.org&gt;
Reviewed-by: Jon Derrick &lt;jonathan.derrick@intel.com&gt;
Link: https://lkml.kernel.org/r/20190816022849.14075-3-ming.lei@redhat.com

</pre>
</div>
</content>
</entry>
<entry>
<title>genirq/affinity: Improve __irq_build_affinity_masks()</title>
<updated>2019-08-27T14:31:17+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2019-08-16T02:28:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=53c1788b7d7720565214a466afffdc818d8c6e5f'/>
<id>53c1788b7d7720565214a466afffdc818d8c6e5f</id>
<content type='text'>
One invariant of __irq_build_affinity_masks() is that all CPUs in the
specified masks (cpu_mask AND node_to_cpumask for each node) should be
covered during the spread. Even though all requested vectors have been
reached, it's still required to spread vectors among remained CPUs. A
similar policy has been taken in case of 'numvecs &lt;= nodes' already.

So remove the following check inside the loop:

	if (done &gt;= numvecs)
		break;

Meantime assign at least 1 vector for remaining nodes if 'numvecs' vectors
have been handled already.

Also, if the specified cpumask for one numa node is empty, simply do not
spread vectors on this node.

Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lkml.kernel.org/r/20190816022849.14075-2-ming.lei@redhat.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
One invariant of __irq_build_affinity_masks() is that all CPUs in the
specified masks (cpu_mask AND node_to_cpumask for each node) should be
covered during the spread. Even though all requested vectors have been
reached, it's still required to spread vectors among remained CPUs. A
similar policy has been taken in case of 'numvecs &lt;= nodes' already.

So remove the following check inside the loop:

	if (done &gt;= numvecs)
		break;

Meantime assign at least 1 vector for remaining nodes if 'numvecs' vectors
have been handled already.

Also, if the specified cpumask for one numa node is empty, simply do not
spread vectors on this node.

Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lkml.kernel.org/r/20190816022849.14075-2-ming.lei@redhat.com

</pre>
</div>
</content>
</entry>
<entry>
<title>genirq/affinity: Create affinity mask for single vector</title>
<updated>2019-08-08T06:47:55+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2019-08-05T01:19:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=491beed3b102b6e6c0e7734200661242226e3933'/>
<id>491beed3b102b6e6c0e7734200661242226e3933</id>
<content type='text'>
Since commit c66d4bd110a1f8 ("genirq/affinity: Add new callback for
(re)calculating interrupt sets"), irq_create_affinity_masks() returns
NULL in case of single vector. This change has caused regression on some
drivers, such as lpfc.

The problem is that single vector requests can happen in some generic cases:

  1) kdump kernel

  2) irq vectors resource is close to exhaustion.

If in that situation the affinity mask for a single vector is not created,
every caller has to handle the special case.

There is no reason why the mask cannot be created, so remove the check for
a single vector and create the mask.

Fixes: c66d4bd110a1f8 ("genirq/affinity: Add new callback for (re)calculating interrupt sets")
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190805011906.5020-1-ming.lei@redhat.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since commit c66d4bd110a1f8 ("genirq/affinity: Add new callback for
(re)calculating interrupt sets"), irq_create_affinity_masks() returns
NULL in case of single vector. This change has caused regression on some
drivers, such as lpfc.

The problem is that single vector requests can happen in some generic cases:

  1) kdump kernel

  2) irq vectors resource is close to exhaustion.

If in that situation the affinity mask for a single vector is not created,
every caller has to handle the special case.

There is no reason why the mask cannot be created, so remove the check for
a single vector and create the mask.

Fixes: c66d4bd110a1f8 ("genirq/affinity: Add new callback for (re)calculating interrupt sets")
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190805011906.5020-1-ming.lei@redhat.com

</pre>
</div>
</content>
</entry>
<entry>
<title>genirq/affinity: Remove unused argument from [__]irq_build_affinity_masks()</title>
<updated>2019-06-12T08:52:45+00:00</updated>
<author>
<name>Minwoo Im</name>
<email>minwoo.im.dev@gmail.com</email>
</author>
<published>2019-06-02T11:21:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0e51833042fccfe882ef3e85a346252550d26c22'/>
<id>0e51833042fccfe882ef3e85a346252550d26c22</id>
<content type='text'>
The *affd argument is neither used in irq_build_affinity_masks() nor
__irq_build_affinity_masks(). Remove it.

Signed-off-by: Minwoo Im &lt;minwoo.im.dev@gmail.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Cc: Minwoo Im &lt;minwoo.im@samsung.com&gt;
Cc: linux-block@vger.kernel.org
Link: https://lkml.kernel.org/r/20190602112117.31839-1-minwoo.im.dev@gmail.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The *affd argument is neither used in irq_build_affinity_masks() nor
__irq_build_affinity_masks(). Remove it.

Signed-off-by: Minwoo Im &lt;minwoo.im.dev@gmail.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Cc: Minwoo Im &lt;minwoo.im@samsung.com&gt;
Cc: linux-block@vger.kernel.org
Link: https://lkml.kernel.org/r/20190602112117.31839-1-minwoo.im.dev@gmail.com

</pre>
</div>
</content>
</entry>
<entry>
<title>genirq/affinity: Remove the leftovers of the original set support</title>
<updated>2019-02-18T10:21:29+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2019-02-16T17:13:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a6a309edba13866b31dc4d8aebad3864a6d56ade'/>
<id>a6a309edba13866b31dc4d8aebad3864a6d56ade</id>
<content type='text'>
Now that the NVME driver is converted over to the calc_set() callback, the
workarounds of the original set support can be removed.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Acked-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Bjorn Helgaas &lt;helgaas@kernel.org&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Cc: linux-nvme@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: Keith Busch &lt;keith.busch@intel.com&gt;
Cc: Sumit Saxena &lt;sumit.saxena@broadcom.com&gt;
Cc: Kashyap Desai &lt;kashyap.desai@broadcom.com&gt;
Cc: Shivasharan Srikanteshwara &lt;shivasharan.srikanteshwara@broadcom.com&gt;
Link: https://lkml.kernel.org/r/20190216172228.689834224@linutronix.de

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that the NVME driver is converted over to the calc_set() callback, the
workarounds of the original set support can be removed.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Acked-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Bjorn Helgaas &lt;helgaas@kernel.org&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Cc: linux-nvme@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: Keith Busch &lt;keith.busch@intel.com&gt;
Cc: Sumit Saxena &lt;sumit.saxena@broadcom.com&gt;
Cc: Kashyap Desai &lt;kashyap.desai@broadcom.com&gt;
Cc: Shivasharan Srikanteshwara &lt;shivasharan.srikanteshwara@broadcom.com&gt;
Link: https://lkml.kernel.org/r/20190216172228.689834224@linutronix.de

</pre>
</div>
</content>
</entry>
<entry>
<title>genirq/affinity: Add new callback for (re)calculating interrupt sets</title>
<updated>2019-02-18T10:21:28+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2019-02-16T17:13:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c66d4bd110a1f8a68c1a88bfbf866eb50c6464b7'/>
<id>c66d4bd110a1f8a68c1a88bfbf866eb50c6464b7</id>
<content type='text'>
The interrupt affinity spreading mechanism supports to spread out
affinities for one or more interrupt sets. A interrupt set contains one or
more interrupts. Each set is mapped to a specific functionality of a
device, e.g. general I/O queues and read I/O queus of multiqueue block
devices.

The number of interrupts per set is defined by the driver. It depends on
the total number of available interrupts for the device, which is
determined by the PCI capabilites and the availability of underlying CPU
resources, and the number of queues which the device provides and the
driver wants to instantiate.

The driver passes initial configuration for the interrupt allocation via a
pointer to struct irq_affinity.

Right now the allocation mechanism is complex as it requires to have a loop
in the driver to determine the maximum number of interrupts which are
provided by the PCI capabilities and the underlying CPU resources.  This
loop would have to be replicated in every driver which wants to utilize
this mechanism. That's unwanted code duplication and error prone.

In order to move this into generic facilities it is required to have a
mechanism, which allows the recalculation of the interrupt sets and their
size, in the core code. As the core code does not have any knowledge about the
underlying device, a driver specific callback is required in struct
irq_affinity, which can be invoked by the core code. The callback gets the
number of available interupts as an argument, so the driver can calculate the
corresponding number and size of interrupt sets.

At the moment the struct irq_affinity pointer which is handed in from the
driver and passed through to several core functions is marked 'const', but for
the callback to be able to modify the data in the struct it's required to
remove the 'const' qualifier.

Add the optional callback to struct irq_affinity, which allows drivers to
recalculate the number and size of interrupt sets and remove the 'const'
qualifier.

For simple invocations, which do not supply a callback, a default callback
is installed, which just sets nr_sets to 1 and transfers the number of
spreadable vectors to the set_size array at index 0.

This is for now guarded by a check for nr_sets != 0 to keep the NVME driver
working until it is converted to the callback mechanism.

To make sure that the driver configuration is correct under all circumstances
the callback is invoked even when there are no interrupts for queues left,
i.e. the pre/post requirements already exhaust the numner of available
interrupts.

At the PCI layer irq_create_affinity_masks() has to be invoked even for the
case where the legacy interrupt is used. That ensures that the callback is
invoked and the device driver can adjust to that situation.

[ tglx: Fixed the simple case (no sets required). Moved the sanity check
  	for nr_sets after the invocation of the callback so it catches
  	broken drivers. Fixed the kernel doc comments for struct
  	irq_affinity and de-'This patch'-ed the changelog ]

Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Bjorn Helgaas &lt;helgaas@kernel.org&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Cc: linux-nvme@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: Keith Busch &lt;keith.busch@intel.com&gt;
Cc: Sumit Saxena &lt;sumit.saxena@broadcom.com&gt;
Cc: Kashyap Desai &lt;kashyap.desai@broadcom.com&gt;
Cc: Shivasharan Srikanteshwara &lt;shivasharan.srikanteshwara@broadcom.com&gt;
Link: https://lkml.kernel.org/r/20190216172228.512444498@linutronix.de


</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The interrupt affinity spreading mechanism supports to spread out
affinities for one or more interrupt sets. A interrupt set contains one or
more interrupts. Each set is mapped to a specific functionality of a
device, e.g. general I/O queues and read I/O queus of multiqueue block
devices.

The number of interrupts per set is defined by the driver. It depends on
the total number of available interrupts for the device, which is
determined by the PCI capabilites and the availability of underlying CPU
resources, and the number of queues which the device provides and the
driver wants to instantiate.

The driver passes initial configuration for the interrupt allocation via a
pointer to struct irq_affinity.

Right now the allocation mechanism is complex as it requires to have a loop
in the driver to determine the maximum number of interrupts which are
provided by the PCI capabilities and the underlying CPU resources.  This
loop would have to be replicated in every driver which wants to utilize
this mechanism. That's unwanted code duplication and error prone.

In order to move this into generic facilities it is required to have a
mechanism, which allows the recalculation of the interrupt sets and their
size, in the core code. As the core code does not have any knowledge about the
underlying device, a driver specific callback is required in struct
irq_affinity, which can be invoked by the core code. The callback gets the
number of available interupts as an argument, so the driver can calculate the
corresponding number and size of interrupt sets.

At the moment the struct irq_affinity pointer which is handed in from the
driver and passed through to several core functions is marked 'const', but for
the callback to be able to modify the data in the struct it's required to
remove the 'const' qualifier.

Add the optional callback to struct irq_affinity, which allows drivers to
recalculate the number and size of interrupt sets and remove the 'const'
qualifier.

For simple invocations, which do not supply a callback, a default callback
is installed, which just sets nr_sets to 1 and transfers the number of
spreadable vectors to the set_size array at index 0.

This is for now guarded by a check for nr_sets != 0 to keep the NVME driver
working until it is converted to the callback mechanism.

To make sure that the driver configuration is correct under all circumstances
the callback is invoked even when there are no interrupts for queues left,
i.e. the pre/post requirements already exhaust the numner of available
interrupts.

At the PCI layer irq_create_affinity_masks() has to be invoked even for the
case where the legacy interrupt is used. That ensures that the callback is
invoked and the device driver can adjust to that situation.

[ tglx: Fixed the simple case (no sets required). Moved the sanity check
  	for nr_sets after the invocation of the callback so it catches
  	broken drivers. Fixed the kernel doc comments for struct
  	irq_affinity and de-'This patch'-ed the changelog ]

Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Bjorn Helgaas &lt;helgaas@kernel.org&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Cc: linux-nvme@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: Keith Busch &lt;keith.busch@intel.com&gt;
Cc: Sumit Saxena &lt;sumit.saxena@broadcom.com&gt;
Cc: Kashyap Desai &lt;kashyap.desai@broadcom.com&gt;
Cc: Shivasharan Srikanteshwara &lt;shivasharan.srikanteshwara@broadcom.com&gt;
Link: https://lkml.kernel.org/r/20190216172228.512444498@linutronix.de


</pre>
</div>
</content>
</entry>
<entry>
<title>genirq/affinity: Store interrupt sets size in struct irq_affinity</title>
<updated>2019-02-18T10:21:27+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2019-02-16T17:13:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9cfef55bb57e7620c63087be18a76351628f8d0f'/>
<id>9cfef55bb57e7620c63087be18a76351628f8d0f</id>
<content type='text'>
The interrupt affinity spreading mechanism supports to spread out
affinities for one or more interrupt sets. A interrupt set contains one
or more interrupts. Each set is mapped to a specific functionality of a
device, e.g. general I/O queues and read I/O queus of multiqueue block
devices.

The number of interrupts per set is defined by the driver. It depends on
the total number of available interrupts for the device, which is
determined by the PCI capabilites and the availability of underlying CPU
resources, and the number of queues which the device provides and the
driver wants to instantiate.

The driver passes initial configuration for the interrupt allocation via
a pointer to struct irq_affinity.

Right now the allocation mechanism is complex as it requires to have a
loop in the driver to determine the maximum number of interrupts which
are provided by the PCI capabilities and the underlying CPU resources.
This loop would have to be replicated in every driver which wants to
utilize this mechanism. That's unwanted code duplication and error
prone.

In order to move this into generic facilities it is required to have a
mechanism, which allows the recalculation of the interrupt sets and
their size, in the core code. As the core code does not have any
knowledge about the underlying device, a driver specific callback will
be added to struct affinity_desc, which will be invoked by the core
code. The callback will get the number of available interupts as an
argument, so the driver can calculate the corresponding number and size
of interrupt sets.

To support this, two modifications for the handling of struct irq_affinity
are required:

1) The (optional) interrupt sets size information is contained in a
   separate array of integers and struct irq_affinity contains a
   pointer to it.

   This is cumbersome and as the maximum number of interrupt sets is small,
   there is no reason to have separate storage. Moving the size array into
   struct affinity_desc avoids indirections and makes the code simpler.

2) At the moment the struct irq_affinity pointer which is handed in from
   the driver and passed through to several core functions is marked
   'const'.

   With the upcoming callback to recalculate the number and size of
   interrupt sets, it's necessary to remove the 'const'
   qualifier. Otherwise the callback would not be able to update the data.

Implement #1 and store the interrupt sets size in 'struct irq_affinity'.

No functional change.

[ tglx: Fixed the memcpy() size so it won't copy beyond the size of the
  	source. Fixed the kernel doc comments for struct irq_affinity and
  	de-'This patch'-ed the changelog ]

Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Bjorn Helgaas &lt;helgaas@kernel.org&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Cc: linux-nvme@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: Keith Busch &lt;keith.busch@intel.com&gt;
Cc: Sumit Saxena &lt;sumit.saxena@broadcom.com&gt;
Cc: Kashyap Desai &lt;kashyap.desai@broadcom.com&gt;
Cc: Shivasharan Srikanteshwara &lt;shivasharan.srikanteshwara@broadcom.com&gt;
Link: https://lkml.kernel.org/r/20190216172228.423723127@linutronix.de


</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The interrupt affinity spreading mechanism supports to spread out
affinities for one or more interrupt sets. A interrupt set contains one
or more interrupts. Each set is mapped to a specific functionality of a
device, e.g. general I/O queues and read I/O queus of multiqueue block
devices.

The number of interrupts per set is defined by the driver. It depends on
the total number of available interrupts for the device, which is
determined by the PCI capabilites and the availability of underlying CPU
resources, and the number of queues which the device provides and the
driver wants to instantiate.

The driver passes initial configuration for the interrupt allocation via
a pointer to struct irq_affinity.

Right now the allocation mechanism is complex as it requires to have a
loop in the driver to determine the maximum number of interrupts which
are provided by the PCI capabilities and the underlying CPU resources.
This loop would have to be replicated in every driver which wants to
utilize this mechanism. That's unwanted code duplication and error
prone.

In order to move this into generic facilities it is required to have a
mechanism, which allows the recalculation of the interrupt sets and
their size, in the core code. As the core code does not have any
knowledge about the underlying device, a driver specific callback will
be added to struct affinity_desc, which will be invoked by the core
code. The callback will get the number of available interupts as an
argument, so the driver can calculate the corresponding number and size
of interrupt sets.

To support this, two modifications for the handling of struct irq_affinity
are required:

1) The (optional) interrupt sets size information is contained in a
   separate array of integers and struct irq_affinity contains a
   pointer to it.

   This is cumbersome and as the maximum number of interrupt sets is small,
   there is no reason to have separate storage. Moving the size array into
   struct affinity_desc avoids indirections and makes the code simpler.

2) At the moment the struct irq_affinity pointer which is handed in from
   the driver and passed through to several core functions is marked
   'const'.

   With the upcoming callback to recalculate the number and size of
   interrupt sets, it's necessary to remove the 'const'
   qualifier. Otherwise the callback would not be able to update the data.

Implement #1 and store the interrupt sets size in 'struct irq_affinity'.

No functional change.

[ tglx: Fixed the memcpy() size so it won't copy beyond the size of the
  	source. Fixed the kernel doc comments for struct irq_affinity and
  	de-'This patch'-ed the changelog ]

Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Bjorn Helgaas &lt;helgaas@kernel.org&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Cc: linux-nvme@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: Keith Busch &lt;keith.busch@intel.com&gt;
Cc: Sumit Saxena &lt;sumit.saxena@broadcom.com&gt;
Cc: Kashyap Desai &lt;kashyap.desai@broadcom.com&gt;
Cc: Shivasharan Srikanteshwara &lt;shivasharan.srikanteshwara@broadcom.com&gt;
Link: https://lkml.kernel.org/r/20190216172228.423723127@linutronix.de


</pre>
</div>
</content>
</entry>
<entry>
<title>genirq/affinity: Code consolidation</title>
<updated>2019-02-18T10:21:27+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2019-02-16T17:13:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0145c30e896d26e638d27c957d9eed72893c1c92'/>
<id>0145c30e896d26e638d27c957d9eed72893c1c92</id>
<content type='text'>
All information and calculations in the interrupt affinity spreading code
is strictly unsigned int. Though the code uses int all over the place.

Convert it over to unsigned int.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Acked-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Bjorn Helgaas &lt;helgaas@kernel.org&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Cc: linux-nvme@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: Keith Busch &lt;keith.busch@intel.com&gt;
Cc: Sumit Saxena &lt;sumit.saxena@broadcom.com&gt;
Cc: Kashyap Desai &lt;kashyap.desai@broadcom.com&gt;
Cc: Shivasharan Srikanteshwara &lt;shivasharan.srikanteshwara@broadcom.com&gt;
Link: https://lkml.kernel.org/r/20190216172228.336424556@linutronix.de

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
All information and calculations in the interrupt affinity spreading code
is strictly unsigned int. Though the code uses int all over the place.

Convert it over to unsigned int.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Acked-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Bjorn Helgaas &lt;helgaas@kernel.org&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Cc: linux-nvme@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: Keith Busch &lt;keith.busch@intel.com&gt;
Cc: Sumit Saxena &lt;sumit.saxena@broadcom.com&gt;
Cc: Kashyap Desai &lt;kashyap.desai@broadcom.com&gt;
Cc: Shivasharan Srikanteshwara &lt;shivasharan.srikanteshwara@broadcom.com&gt;
Link: https://lkml.kernel.org/r/20190216172228.336424556@linutronix.de

</pre>
</div>
</content>
</entry>
<entry>
<title>genirq/affinity: Move allocation of 'node_to_cpumask' to irq_build_affinity_masks()</title>
<updated>2019-02-10T18:53:55+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2019-01-25T09:53:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=347253c42d7c673aa2a659d756bc7ff893459247'/>
<id>347253c42d7c673aa2a659d756bc7ff893459247</id>
<content type='text'>
'node_to_cpumask' is just one temparay variable for irq_build_affinity_masks(),
so move it into irq_build_affinity_masks().

No functioanl change.

Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Cc: linux-nvme@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Link: https://lkml.kernel.org/r/20190125095347.17950-2-ming.lei@redhat.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
'node_to_cpumask' is just one temparay variable for irq_build_affinity_masks(),
so move it into irq_build_affinity_masks().

No functioanl change.

Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Cc: linux-nvme@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Link: https://lkml.kernel.org/r/20190125095347.17950-2-ming.lei@redhat.com

</pre>
</div>
</content>
</entry>
</feed>
