<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/block, branch v3.18.13</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>Fix bug in blk_rq_merge_ok</title>
<updated>2015-04-23T03:32:31+00:00</updated>
<author>
<name>Wenbo Wang</name>
<email>wenbo.wang@memblaze.com</email>
</author>
<published>2015-03-20T05:04:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ff06e6e533db84a5cbce79d1b4866e8258b563c9'/>
<id>ff06e6e533db84a5cbce79d1b4866e8258b563c9</id>
<content type='text'>
[ Upstream commit 7ee8e4f3983c4ff700958a6099c8fd212ea67b94 ]

Use the right array index to reference the last
element of rq-&gt;biotail-&gt;bi_io_vec[]
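<![CDATA[
As a rough illustration only (a user-space sketch, not the kernel code):
the last valid element of an array is found from that same array's own
element count, at index count - 1. The names struct vec, struct fake_bio
and last_vec() are hypothetical stand-ins invented for this example.

```c
#include <stddef.h>

/* Hypothetical stand-in for one vector entry; not the kernel's bio_vec. */
struct vec {
	unsigned int len;
};

/* Hypothetical stand-in for a bio; not the kernel's struct bio. */
struct fake_bio {
	struct vec *io_vec;	/* array of vcnt entries */
	unsigned short vcnt;	/* number of valid entries in io_vec */
};

/* Return the last valid entry of bio's own array, or NULL if it is empty. */
static struct vec *last_vec(const struct fake_bio *bio)
{
	if (bio->vcnt == 0)
		return NULL;
	/* Index vcnt would be one past the end; the last element is vcnt - 1. */
	return &bio->io_vec[bio->vcnt - 1];
}
```

The count has to come from the structure that owns the array being
indexed; a count taken from a different object can point past the end.
]]>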

Signed-off-by: Wenbo Wang &lt;wenbo.wang@memblaze.com&gt;
Reviewed-by: Chong Yuan &lt;chong.yuan@memblaze.com&gt;
Fixes: 66cb45aa41315 ("block: add support for limiting gaps in SG lists")
Cc: stable@kernel.org
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 7ee8e4f3983c4ff700958a6099c8fd212ea67b94 ]

Use the right array index to reference the last
element of rq-&gt;biotail-&gt;bi_io_vec[]

Signed-off-by: Wenbo Wang &lt;wenbo.wang@memblaze.com&gt;
Reviewed-by: Chong Yuan &lt;chong.yuan@memblaze.com&gt;
Fixes: 66cb45aa41315 ("block: add support for limiting gaps in SG lists")
Cc: stable@kernel.org
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blk-mq: fix use of incorrect goto label in blk_mq_init_queue error path</title>
<updated>2015-04-23T03:32:10+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2015-03-13T03:53:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=06767df2716dd228bc94696862f165d14e5cc756'/>
<id>06767df2716dd228bc94696862f165d14e5cc756</id>
<content type='text'>
[ Upstream commit 9a30b096b543932de218dd3501b5562e00a8792d ]

If percpu_ref_init() fails the allocated q and hctxs must get cleaned
up; using 'err_map' doesn't allow that to happen.

Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Reviewed-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Cc: stable@kernel.org
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 9a30b096b543932de218dd3501b5562e00a8792d ]

If percpu_ref_init() fails the allocated q and hctxs must get cleaned
up; using 'err_map' doesn't allow that to happen.

Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Reviewed-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Cc: stable@kernel.org
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blk-throttle: check stats_cpu before reading it from sysfs</title>
<updated>2015-03-06T22:53:05+00:00</updated>
<author>
<name>Thadeu Lima de Souza Cascardo</name>
<email>cascardo@linux.vnet.ibm.com</email>
</author>
<published>2015-02-16T19:16:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b7159073488c359d00dc7ef319dfc1d559ade6fa'/>
<id>b7159073488c359d00dc7ef319dfc1d559ade6fa</id>
<content type='text'>
commit 045c47ca306acf30c740c285a77a4b4bda6be7c5 upstream.

When reading blkio.throttle.io_serviced in a recently created blkio
cgroup, it's possible to race against the creation of a throttle policy,
which delays the allocation of stats_cpu.

As in other functions in the throttle code, checking for a NULL
stats_cpu prevents the following oops caused by that race.

[ 1117.285199] Unable to handle kernel paging request for data at address 0x7fb4d0020
[ 1117.285252] Faulting instruction address: 0xc0000000003efa2c
[ 1137.733921] Oops: Kernel access of bad area, sig: 11 [#1]
[ 1137.733945] SMP NR_CPUS=2048 NUMA PowerNV
[ 1137.734025] Modules linked in: bridge stp llc kvm_hv kvm binfmt_misc autofs4
[ 1137.734102] CPU: 3 PID: 5302 Comm: blkcgroup Not tainted 3.19.0 #5
[ 1137.734132] task: c000000f1d188b00 ti: c000000f1d210000 task.ti: c000000f1d210000
[ 1137.734167] NIP: c0000000003efa2c LR: c0000000003ef9f0 CTR: c0000000003ef980
[ 1137.734202] REGS: c000000f1d213500 TRAP: 0300   Not tainted  (3.19.0)
[ 1137.734230] MSR: 9000000000009032 &lt;SF,HV,EE,ME,IR,DR,RI&gt;  CR: 42008884  XER: 20000000
[ 1137.734325] CFAR: 0000000000008458 DAR: 00000007fb4d0020 DSISR: 40000000 SOFTE: 0
GPR00: c0000000003ed3a0 c000000f1d213780 c000000000c59538 0000000000000000
GPR04: 0000000000000800 0000000000000000 0000000000000000 0000000000000000
GPR08: ffffffffffffffff 00000007fb4d0020 00000007fb4d0000 c000000000780808
GPR12: 0000000022000888 c00000000fdc0d80 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 000001003e120200 c000000f1d5b0cc0 0000000000000200 0000000000000000
GPR24: 0000000000000001 c000000000c269e0 0000000000000020 c000000f1d5b0c80
GPR28: c000000000ca3a08 c000000000ca3dec c000000f1c667e00 c000000f1d213850
[ 1137.734886] NIP [c0000000003efa2c] .tg_prfill_cpu_rwstat+0xac/0x180
[ 1137.734915] LR [c0000000003ef9f0] .tg_prfill_cpu_rwstat+0x70/0x180
[ 1137.734943] Call Trace:
[ 1137.734952] [c000000f1d213780] [d000000005560520] 0xd000000005560520 (unreliable)
[ 1137.734996] [c000000f1d2138a0] [c0000000003ed3a0] .blkcg_print_blkgs+0xe0/0x1a0
[ 1137.735039] [c000000f1d213960] [c0000000003efb50] .tg_print_cpu_rwstat+0x50/0x70
[ 1137.735082] [c000000f1d2139e0] [c000000000104b48] .cgroup_seqfile_show+0x58/0x150
[ 1137.735125] [c000000f1d213a70] [c0000000002749dc] .kernfs_seq_show+0x3c/0x50
[ 1137.735161] [c000000f1d213ae0] [c000000000218630] .seq_read+0xe0/0x510
[ 1137.735197] [c000000f1d213bd0] [c000000000275b04] .kernfs_fop_read+0x164/0x200
[ 1137.735240] [c000000f1d213c80] [c0000000001eb8e0] .__vfs_read+0x30/0x80
[ 1137.735276] [c000000f1d213cf0] [c0000000001eb9c4] .vfs_read+0x94/0x1b0
[ 1137.735312] [c000000f1d213d90] [c0000000001ebb38] .SyS_read+0x58/0x100
[ 1137.735349] [c000000f1d213e30] [c000000000009218] syscall_exit+0x0/0x98
[ 1137.735383] Instruction dump:
[ 1137.735405] 7c6307b4 7f891800 409d00b8 60000000 60420000 3d420004 392a63b0 786a1f24
[ 1137.735471] 7d49502a e93e01c8 7d495214 7d2ad214 &lt;7cead02a&gt; e9090008 e9490010 e9290018

The following code reproduces this easily, although the problem was
first found by running docker.

void run(pid_t pid)
{
	int n;
	int status;
	int fd;
	char *buffer;
	buffer = memalign(BUFFER_ALIGN, BUFFER_SIZE);
	n = snprintf(buffer, BUFFER_SIZE, "%d\n", pid);
	fd = open(CGPATH "/test/tasks", O_WRONLY);
	write(fd, buffer, n);
	close(fd);
	if (fork() &gt; 0) {
		fd = open("/dev/sda", O_RDONLY | O_DIRECT);
		read(fd, buffer, 512);
		close(fd);
		wait(&amp;status);
	} else {
		fd = open(CGPATH "/test/blkio.throttle.io_serviced", O_RDONLY);
		n = read(fd, buffer, BUFFER_SIZE);
		close(fd);
	}
	free(buffer);
	exit(0);
}

void test(void)
{
	int status;
	mkdir(CGPATH "/test", 0666);
	if (fork() &gt; 0)
		wait(&amp;status);
	else
		run(getpid());
	rmdir(CGPATH "/test");
}

int main(int argc, char **argv)
{
	int i;
	for (i = 0; i &lt; NR_TESTS; i++)
		test();
	return 0;
}

Reported-by: Ricardo Marin Matinata &lt;rmm@br.ibm.com&gt;
Signed-off-by: Thadeu Lima de Souza Cascardo &lt;cascardo@linux.vnet.ibm.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 045c47ca306acf30c740c285a77a4b4bda6be7c5 upstream.

When reading blkio.throttle.io_serviced in a recently created blkio
cgroup, it's possible to race against the creation of a throttle policy,
which delays the allocation of stats_cpu.

As in other functions in the throttle code, checking for a NULL
stats_cpu prevents the following oops caused by that race.

[ 1117.285199] Unable to handle kernel paging request for data at address 0x7fb4d0020
[ 1117.285252] Faulting instruction address: 0xc0000000003efa2c
[ 1137.733921] Oops: Kernel access of bad area, sig: 11 [#1]
[ 1137.733945] SMP NR_CPUS=2048 NUMA PowerNV
[ 1137.734025] Modules linked in: bridge stp llc kvm_hv kvm binfmt_misc autofs4
[ 1137.734102] CPU: 3 PID: 5302 Comm: blkcgroup Not tainted 3.19.0 #5
[ 1137.734132] task: c000000f1d188b00 ti: c000000f1d210000 task.ti: c000000f1d210000
[ 1137.734167] NIP: c0000000003efa2c LR: c0000000003ef9f0 CTR: c0000000003ef980
[ 1137.734202] REGS: c000000f1d213500 TRAP: 0300   Not tainted  (3.19.0)
[ 1137.734230] MSR: 9000000000009032 &lt;SF,HV,EE,ME,IR,DR,RI&gt;  CR: 42008884  XER: 20000000
[ 1137.734325] CFAR: 0000000000008458 DAR: 00000007fb4d0020 DSISR: 40000000 SOFTE: 0
GPR00: c0000000003ed3a0 c000000f1d213780 c000000000c59538 0000000000000000
GPR04: 0000000000000800 0000000000000000 0000000000000000 0000000000000000
GPR08: ffffffffffffffff 00000007fb4d0020 00000007fb4d0000 c000000000780808
GPR12: 0000000022000888 c00000000fdc0d80 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 000001003e120200 c000000f1d5b0cc0 0000000000000200 0000000000000000
GPR24: 0000000000000001 c000000000c269e0 0000000000000020 c000000f1d5b0c80
GPR28: c000000000ca3a08 c000000000ca3dec c000000f1c667e00 c000000f1d213850
[ 1137.734886] NIP [c0000000003efa2c] .tg_prfill_cpu_rwstat+0xac/0x180
[ 1137.734915] LR [c0000000003ef9f0] .tg_prfill_cpu_rwstat+0x70/0x180
[ 1137.734943] Call Trace:
[ 1137.734952] [c000000f1d213780] [d000000005560520] 0xd000000005560520 (unreliable)
[ 1137.734996] [c000000f1d2138a0] [c0000000003ed3a0] .blkcg_print_blkgs+0xe0/0x1a0
[ 1137.735039] [c000000f1d213960] [c0000000003efb50] .tg_print_cpu_rwstat+0x50/0x70
[ 1137.735082] [c000000f1d2139e0] [c000000000104b48] .cgroup_seqfile_show+0x58/0x150
[ 1137.735125] [c000000f1d213a70] [c0000000002749dc] .kernfs_seq_show+0x3c/0x50
[ 1137.735161] [c000000f1d213ae0] [c000000000218630] .seq_read+0xe0/0x510
[ 1137.735197] [c000000f1d213bd0] [c000000000275b04] .kernfs_fop_read+0x164/0x200
[ 1137.735240] [c000000f1d213c80] [c0000000001eb8e0] .__vfs_read+0x30/0x80
[ 1137.735276] [c000000f1d213cf0] [c0000000001eb9c4] .vfs_read+0x94/0x1b0
[ 1137.735312] [c000000f1d213d90] [c0000000001ebb38] .SyS_read+0x58/0x100
[ 1137.735349] [c000000f1d213e30] [c000000000009218] syscall_exit+0x0/0x98
[ 1137.735383] Instruction dump:
[ 1137.735405] 7c6307b4 7f891800 409d00b8 60000000 60420000 3d420004 392a63b0 786a1f24
[ 1137.735471] 7d49502a e93e01c8 7d495214 7d2ad214 &lt;7cead02a&gt; e9090008 e9490010 e9290018

The following code reproduces this easily, although the problem was
first found by running docker.

void run(pid_t pid)
{
	int n;
	int status;
	int fd;
	char *buffer;
	buffer = memalign(BUFFER_ALIGN, BUFFER_SIZE);
	n = snprintf(buffer, BUFFER_SIZE, "%d\n", pid);
	fd = open(CGPATH "/test/tasks", O_WRONLY);
	write(fd, buffer, n);
	close(fd);
	if (fork() &gt; 0) {
		fd = open("/dev/sda", O_RDONLY | O_DIRECT);
		read(fd, buffer, 512);
		close(fd);
		wait(&amp;status);
	} else {
		fd = open(CGPATH "/test/blkio.throttle.io_serviced", O_RDONLY);
		n = read(fd, buffer, BUFFER_SIZE);
		close(fd);
	}
	free(buffer);
	exit(0);
}

void test(void)
{
	int status;
	mkdir(CGPATH "/test", 0666);
	if (fork() &gt; 0)
		wait(&amp;status);
	else
		run(getpid());
	rmdir(CGPATH "/test");
}

int main(int argc, char **argv)
{
	int i;
	for (i = 0; i &lt; NR_TESTS; i++)
		test();
	return 0;
}

Reported-by: Ricardo Marin Matinata &lt;rmm@br.ibm.com&gt;
Signed-off-by: Thadeu Lima de Souza Cascardo &lt;cascardo@linux.vnet.ibm.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>cfq-iosched: fix incorrect filing of rt async cfqq</title>
<updated>2015-03-06T22:52:59+00:00</updated>
<author>
<name>Jeff Moyer</name>
<email>jmoyer@redhat.com</email>
</author>
<published>2015-01-12T20:21:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4a07d7db39ae546d590ab9eff4749a455dc52fef'/>
<id>4a07d7db39ae546d590ab9eff4749a455dc52fef</id>
<content type='text'>
commit c6ce194325cef342313e3d27620411ce90a89c50 upstream.

Hi,

If you can manage to submit an async write as the first async I/O from
the context of a process with realtime scheduling priority, then a
cfq_queue is allocated, but filed into the wrong async_cfqq bucket.  It
ends up in the best effort array, but actually has realtime I/O
scheduling priority set in cfqq-&gt;ioprio.

The reason is that cfq_get_queue assumes the default scheduling class and
priority when there is no information present (i.e. when the async cfqq
is created):

static struct cfq_queue *
cfq_get_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
	      struct bio *bio, gfp_t gfp_mask)
{
	const int ioprio_class = IOPRIO_PRIO_CLASS(cic-&gt;ioprio);
	const int ioprio = IOPRIO_PRIO_DATA(cic-&gt;ioprio);

cic-&gt;ioprio starts out as 0, which is "invalid".  So, class of 0
(IOPRIO_CLASS_NONE) is passed to cfq_async_queue_prio like so:

		async_cfqq = cfq_async_queue_prio(cfqd, ioprio_class, ioprio);

static struct cfq_queue **
cfq_async_queue_prio(struct cfq_data *cfqd, int ioprio_class, int ioprio)
{
        switch (ioprio_class) {
        case IOPRIO_CLASS_RT:
                return &amp;cfqd-&gt;async_cfqq[0][ioprio];
        case IOPRIO_CLASS_NONE:
                ioprio = IOPRIO_NORM;
                /* fall through */
        case IOPRIO_CLASS_BE:
                return &amp;cfqd-&gt;async_cfqq[1][ioprio];
        case IOPRIO_CLASS_IDLE:
                return &amp;cfqd-&gt;async_idle_cfqq;
        default:
                BUG();
        }
}

Here, instead of returning a class mapped from the process' scheduling
priority, we get back the bucket associated with IOPRIO_CLASS_BE.

Now, there is no queue allocated there yet, so we create it:

		cfqq = cfq_find_alloc_queue(cfqd, is_sync, cic, bio, gfp_mask);

That function ends up doing this:

			cfq_init_cfqq(cfqd, cfqq, current-&gt;pid, is_sync);
			cfq_init_prio_data(cfqq, cic);

cfq_init_cfqq marks the priority as having changed.  Then,
cfq_init_prio_data does this:

	ioprio_class = IOPRIO_PRIO_CLASS(cic-&gt;ioprio);
	switch (ioprio_class) {
	default:
		printk(KERN_ERR "cfq: bad prio %x\n", ioprio_class);
	case IOPRIO_CLASS_NONE:
		/*
		 * no prio set, inherit CPU scheduling settings
		 */
		cfqq-&gt;ioprio = task_nice_ioprio(tsk);
		cfqq-&gt;ioprio_class = task_nice_ioclass(tsk);
		break;

So we basically have two code paths that treat IOPRIO_CLASS_NONE
differently, which results in an RT async cfqq filed into a best effort
bucket.
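<![CDATA[
One hypothetical way to keep the two paths consistent is to resolve
"no class set" in a single helper that both the bucket selection and the
priority initialisation would use. The sketch below is a user-space
illustration with made-up names (io_class, effective_class, bucket_row)
and an assumed bucket layout; it is not the patch referred to in this
message.

```c
/* Hypothetical classes mirroring the four cases discussed above. */
enum io_class { CLASS_NONE, CLASS_RT, CLASS_BE, CLASS_IDLE };

/*
 * Resolve CLASS_NONE in exactly one place: when no I/O class was set,
 * inherit the class derived from the task's CPU scheduling settings.
 */
static enum io_class effective_class(enum io_class requested,
				     enum io_class task_class)
{
	return requested == CLASS_NONE ? task_class : requested;
}

/* Assumed bucket layout for this sketch: 0 = RT, 1 = BE, 2 = idle. */
static int bucket_row(enum io_class requested, enum io_class task_class)
{
	switch (effective_class(requested, task_class)) {
	case CLASS_RT:
		return 0;
	case CLASS_IDLE:
		return 2;
	case CLASS_BE:
	default:
		return 1;
	}
}
```

With this shape, a realtime task that has no explicit I/O class lands in
the RT row rather than the best effort row.
]]>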

Attached is a patch which fixes the problem.  I'm not sure how to make
it cleaner.  Suggestions would be welcome.

Signed-off-by: Jeff Moyer &lt;jmoyer@redhat.com&gt;
Tested-by: Hidehiro Kawai &lt;hidehiro.kawai.ez@hitachi.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit c6ce194325cef342313e3d27620411ce90a89c50 upstream.

Hi,

If you can manage to submit an async write as the first async I/O from
the context of a process with realtime scheduling priority, then a
cfq_queue is allocated, but filed into the wrong async_cfqq bucket.  It
ends up in the best effort array, but actually has realtime I/O
scheduling priority set in cfqq-&gt;ioprio.

The reason is that cfq_get_queue assumes the default scheduling class and
priority when there is no information present (i.e. when the async cfqq
is created):

static struct cfq_queue *
cfq_get_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
	      struct bio *bio, gfp_t gfp_mask)
{
	const int ioprio_class = IOPRIO_PRIO_CLASS(cic-&gt;ioprio);
	const int ioprio = IOPRIO_PRIO_DATA(cic-&gt;ioprio);

cic-&gt;ioprio starts out as 0, which is "invalid".  So, class of 0
(IOPRIO_CLASS_NONE) is passed to cfq_async_queue_prio like so:

		async_cfqq = cfq_async_queue_prio(cfqd, ioprio_class, ioprio);

static struct cfq_queue **
cfq_async_queue_prio(struct cfq_data *cfqd, int ioprio_class, int ioprio)
{
        switch (ioprio_class) {
        case IOPRIO_CLASS_RT:
                return &amp;cfqd-&gt;async_cfqq[0][ioprio];
        case IOPRIO_CLASS_NONE:
                ioprio = IOPRIO_NORM;
                /* fall through */
        case IOPRIO_CLASS_BE:
                return &amp;cfqd-&gt;async_cfqq[1][ioprio];
        case IOPRIO_CLASS_IDLE:
                return &amp;cfqd-&gt;async_idle_cfqq;
        default:
                BUG();
        }
}

Here, instead of returning a class mapped from the process' scheduling
priority, we get back the bucket associated with IOPRIO_CLASS_BE.

Now, there is no queue allocated there yet, so we create it:

		cfqq = cfq_find_alloc_queue(cfqd, is_sync, cic, bio, gfp_mask);

That function ends up doing this:

			cfq_init_cfqq(cfqd, cfqq, current-&gt;pid, is_sync);
			cfq_init_prio_data(cfqq, cic);

cfq_init_cfqq marks the priority as having changed.  Then,
cfq_init_prio_data does this:

	ioprio_class = IOPRIO_PRIO_CLASS(cic-&gt;ioprio);
	switch (ioprio_class) {
	default:
		printk(KERN_ERR "cfq: bad prio %x\n", ioprio_class);
	case IOPRIO_CLASS_NONE:
		/*
		 * no prio set, inherit CPU scheduling settings
		 */
		cfqq-&gt;ioprio = task_nice_ioprio(tsk);
		cfqq-&gt;ioprio_class = task_nice_ioclass(tsk);
		break;

So we basically have two code paths that treat IOPRIO_CLASS_NONE
differently, which results in an RT async cfqq filed into a best effort
bucket.

Attached is a patch which fixes the problem.  I'm not sure how to make
it cleaner.  Suggestions would be welcome.

Signed-off-by: Jeff Moyer &lt;jmoyer@redhat.com&gt;
Tested-by: Hidehiro Kawai &lt;hidehiro.kawai.ez@hitachi.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>cfq-iosched: handle failure of cfq group allocation</title>
<updated>2015-03-06T22:52:59+00:00</updated>
<author>
<name>Konstantin Khlebnikov</name>
<email>khlebnikov@yandex-team.ru</email>
</author>
<published>2015-02-09T13:42:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e83be4d104e90d3f4cee593b6a8c5d1b5743751f'/>
<id>e83be4d104e90d3f4cee593b6a8c5d1b5743751f</id>
<content type='text'>
commit 69abaffec7d47a083739b79e3066cb3730eba72e upstream.

cfq_lookup_create_cfqg() allocates struct blkcg_gq using GFP_ATOMIC.
In cfq_find_alloc_queue() a possible allocation failure is not handled.
As a result the kernel oopses on a NULL pointer dereference when
cfq_link_cfqq_cfqg() calls cfqg_get() on a NULL pointer.

The bug was introduced in v3.5 in commit cd1604fab4f9 ("blkcg: factor
out blkio_group creation"). Prior to that commit, the cfq group lookup
returned a pointer to the root group as a fallback.

This patch handles the error using the existing fallback oom_cfqq.

Signed-off-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Fixes: cd1604fab4f9 ("blkcg: factor out blkio_group creation")
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 69abaffec7d47a083739b79e3066cb3730eba72e upstream.

cfq_lookup_create_cfqg() allocates struct blkcg_gq using GFP_ATOMIC.
In cfq_find_alloc_queue() a possible allocation failure is not handled.
As a result the kernel oopses on a NULL pointer dereference when
cfq_link_cfqq_cfqg() calls cfqg_get() on a NULL pointer.

The bug was introduced in v3.5 in commit cd1604fab4f9 ("blkcg: factor
out blkio_group creation"). Prior to that commit, the cfq group lookup
returned a pointer to the root group as a fallback.

This patch handles the error using the existing fallback oom_cfqq.

Signed-off-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Fixes: cd1604fab4f9 ("blkcg: factor out blkio_group creation")
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>blk-mq: fix double-free in error path</title>
<updated>2015-03-06T22:52:56+00:00</updated>
<author>
<name>Tony Battersby</name>
<email>tonyb@cybernetics.com</email>
</author>
<published>2015-02-11T16:32:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7da36aa36365915732662a45f808c221d1a6c111'/>
<id>7da36aa36365915732662a45f808c221d1a6c111</id>
<content type='text'>
commit 564e559f2baf6a868768d0cac286980b3cfd6e30 upstream.

If the allocation of bt-&gt;bs fails, then bt-&gt;map can be freed twice, once
in blk_mq_init_bitmap_tags() -&gt; bt_alloc(), and once in
blk_mq_init_bitmap_tags() -&gt; bt_free().  Fix by setting the pointer to
NULL after the first free.
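<![CDATA[
The pattern can be sketched in user space as follows. This is only an
illustration under assumptions: struct tags, tags_alloc(), tags_free()
and the fail_second switch are hypothetical names made up for the
example, and malloc()/free() stand in for the kernel allocators.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the structure holding the two allocations. */
struct tags {
	void *map;
	void *bs;
};

/*
 * Allocate both members. If the second allocation fails, release the
 * first and clear the pointer, so that a later tags_free() call from
 * the caller's error path does not free it a second time.
 * fail_second is a test switch that simulates the second failure.
 */
static int tags_alloc(struct tags *t, size_t n, int fail_second)
{
	t->map = malloc(n);
	if (!t->map)
		return -1;

	t->bs = fail_second ? NULL : malloc(n);
	if (!t->bs) {
		free(t->map);
		t->map = NULL;	/* the fix: no dangling pointer left behind */
		return -1;
	}
	return 0;
}

/* Common cleanup; safe after a failed tags_alloc() since free(NULL) is a no-op. */
static void tags_free(struct tags *t)
{
	free(t->map);
	free(t->bs);
	t->map = NULL;
	t->bs = NULL;
}
```

Without the assignment to NULL, calling tags_free() after a failed
tags_alloc() would pass the already freed map pointer to free() again.
]]>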

Signed-off-by: Tony Battersby &lt;tonyb@cybernetics.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 564e559f2baf6a868768d0cac286980b3cfd6e30 upstream.

If the allocation of bt-&gt;bs fails, then bt-&gt;map can be freed twice, once
in blk_mq_init_bitmap_tags() -&gt; bt_alloc(), and once in
blk_mq_init_bitmap_tags() -&gt; bt_free().  Fix by setting the pointer to
NULL after the first free.

Signed-off-by: Tony Battersby &lt;tonyb@cybernetics.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>genhd: check for int overflow in disk_expand_part_tbl()</title>
<updated>2015-01-16T14:59:52+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2014-11-19T20:06:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ee9b142838a989550587a27fb3bb8ebbe8ab6fba'/>
<id>ee9b142838a989550587a27fb3bb8ebbe8ab6fba</id>
<content type='text'>
commit 5fabcb4c33fe11c7e3afdf805fde26c1a54d0953 upstream.

We can get here from blkdev_ioctl() -&gt; blkpg_ioctl() -&gt; add_partition()
with a user-supplied partno value. If we pass in 0x7fffffff, the
new target in disk_expand_part_tbl() overflows the 'int' and we
access beyond the end of ptbl-&gt;part[] and even write to it when we
do the rcu_assign_pointer() to assign the new partition.
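<![CDATA[
A minimal user-space sketch of guarding such a computation, assuming the
target is derived as partno + 1. The helper name next_table_size() is
hypothetical, and checking before the addition is one possible approach
that may differ from what the actual patch does.

```c
#include <limits.h>

/*
 * Compute partno + 1 into *target without signed overflow.
 * Returns 0 on success, or -1 if partno is negative or if the addition
 * would exceed INT_MAX (as it would for 0x7fffffff).
 */
static int next_table_size(int partno, int *target)
{
	if (partno < 0 || partno == INT_MAX)
		return -1;
	*target = partno + 1;
	return 0;
}
```

Rejecting the value up front means the caller never indexes the table
with a wrapped, negative size.
]]>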

Reported-by: David Ramos &lt;daramos@stanford.edu&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 5fabcb4c33fe11c7e3afdf805fde26c1a54d0953 upstream.

We can get here from blkdev_ioctl() -&gt; blkpg_ioctl() -&gt; add_partition()
with a user-supplied partno value. If we pass in 0x7fffffff, the
new target in disk_expand_part_tbl() overflows the 'int' and we
access beyond the end of ptbl-&gt;part[] and even write to it when we
do the rcu_assign_pointer() to assign the new partition.

Reported-by: David Ramos &lt;daramos@stanford.edu&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>blk-mq: Fix uninitialized kobject at CPU hotplugging</title>
<updated>2015-01-16T14:59:48+00:00</updated>
<author>
<name>Takashi Iwai</name>
<email>tiwai@suse.de</email>
</author>
<published>2014-12-10T15:38:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fa5e4747af360dc65fdded3cc0ff1a6b8c227a71'/>
<id>fa5e4747af360dc65fdded3cc0ff1a6b8c227a71</id>
<content type='text'>
commit 06a41a99d13d8e919e9a00a4849e6b85ae492592 upstream.

When a CPU is hotplugged, the current blk-mq spews a warning like:

  kobject '(null)' (ffffe8ffffc8b5d8): tried to add an uninitialized object, something is seriously wrong.
  CPU: 1 PID: 1386 Comm: systemd-udevd Not tainted 3.18.0-rc7-2.g088d59b-default #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_171129-lamiak 04/01/2014
   0000000000000000 0000000000000002 ffffffff81605f07 ffffe8ffffc8b5d8
   ffffffff8132c7a0 ffff88023341d370 0000000000000020 ffff8800bb05bd58
   ffff8800bb05bd08 000000000000a0a0 000000003f441940 0000000000000007
  Call Trace:
   [&lt;ffffffff81005306&gt;] dump_trace+0x86/0x330
   [&lt;ffffffff81005644&gt;] show_stack_log_lvl+0x94/0x170
   [&lt;ffffffff81006d21&gt;] show_stack+0x21/0x50
   [&lt;ffffffff81605f07&gt;] dump_stack+0x41/0x51
   [&lt;ffffffff8132c7a0&gt;] kobject_add+0xa0/0xb0
   [&lt;ffffffff8130aee1&gt;] blk_mq_register_hctx+0x91/0xb0
   [&lt;ffffffff8130b82e&gt;] blk_mq_sysfs_register+0x3e/0x60
   [&lt;ffffffff81309298&gt;] blk_mq_queue_reinit_notify+0xf8/0x190
   [&lt;ffffffff8107cfdc&gt;] notifier_call_chain+0x4c/0x70
   [&lt;ffffffff8105fd23&gt;] cpu_notify+0x23/0x50
   [&lt;ffffffff81060037&gt;] _cpu_up+0x157/0x170
   [&lt;ffffffff810600d9&gt;] cpu_up+0x89/0xb0
   [&lt;ffffffff815fa5b5&gt;] cpu_subsys_online+0x35/0x80
   [&lt;ffffffff814323cd&gt;] device_online+0x5d/0xa0
   [&lt;ffffffff81432485&gt;] online_store+0x75/0x80
   [&lt;ffffffff81236a5a&gt;] kernfs_fop_write+0xda/0x150
   [&lt;ffffffff811c5532&gt;] vfs_write+0xb2/0x1f0
   [&lt;ffffffff811c5f42&gt;] SyS_write+0x42/0xb0
   [&lt;ffffffff8160c4ed&gt;] system_call_fastpath+0x16/0x1b
   [&lt;00007f0132fb24e0&gt;] 0x7f0132fb24e0

This is indeed because of an uninitialized kobject for blk_mq_ctx.
The blk_mq_ctx kobjects are initialized in blk_mq_sysfs_init(), but it
loops over hctx_for_each_ctx(), i.e. it initializes them only for
online CPUs.  Thus, when a CPU is hotplugged, the ctx for the newly
onlined CPU is registered without initialization.

This patch fixes the issue by initializing all the ctx kobjects
belonging to each queue.

Bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=908794
Signed-off-by: Takashi Iwai &lt;tiwai@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 06a41a99d13d8e919e9a00a4849e6b85ae492592 upstream.

When a CPU is hotplugged, the current blk-mq spews a warning like:

  kobject '(null)' (ffffe8ffffc8b5d8): tried to add an uninitialized object, something is seriously wrong.
  CPU: 1 PID: 1386 Comm: systemd-udevd Not tainted 3.18.0-rc7-2.g088d59b-default #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_171129-lamiak 04/01/2014
   0000000000000000 0000000000000002 ffffffff81605f07 ffffe8ffffc8b5d8
   ffffffff8132c7a0 ffff88023341d370 0000000000000020 ffff8800bb05bd58
   ffff8800bb05bd08 000000000000a0a0 000000003f441940 0000000000000007
  Call Trace:
   [&lt;ffffffff81005306&gt;] dump_trace+0x86/0x330
   [&lt;ffffffff81005644&gt;] show_stack_log_lvl+0x94/0x170
   [&lt;ffffffff81006d21&gt;] show_stack+0x21/0x50
   [&lt;ffffffff81605f07&gt;] dump_stack+0x41/0x51
   [&lt;ffffffff8132c7a0&gt;] kobject_add+0xa0/0xb0
   [&lt;ffffffff8130aee1&gt;] blk_mq_register_hctx+0x91/0xb0
   [&lt;ffffffff8130b82e&gt;] blk_mq_sysfs_register+0x3e/0x60
   [&lt;ffffffff81309298&gt;] blk_mq_queue_reinit_notify+0xf8/0x190
   [&lt;ffffffff8107cfdc&gt;] notifier_call_chain+0x4c/0x70
   [&lt;ffffffff8105fd23&gt;] cpu_notify+0x23/0x50
   [&lt;ffffffff81060037&gt;] _cpu_up+0x157/0x170
   [&lt;ffffffff810600d9&gt;] cpu_up+0x89/0xb0
   [&lt;ffffffff815fa5b5&gt;] cpu_subsys_online+0x35/0x80
   [&lt;ffffffff814323cd&gt;] device_online+0x5d/0xa0
   [&lt;ffffffff81432485&gt;] online_store+0x75/0x80
   [&lt;ffffffff81236a5a&gt;] kernfs_fop_write+0xda/0x150
   [&lt;ffffffff811c5532&gt;] vfs_write+0xb2/0x1f0
   [&lt;ffffffff811c5f42&gt;] SyS_write+0x42/0xb0
   [&lt;ffffffff8160c4ed&gt;] system_call_fastpath+0x16/0x1b
   [&lt;00007f0132fb24e0&gt;] 0x7f0132fb24e0

This is indeed because of an uninitialized kobject for blk_mq_ctx.
The blk_mq_ctx kobjects are initialized in blk_mq_sysfs_init(), but it
loops over hctx_for_each_ctx(), i.e. it initializes them only for
online CPUs.  Thus, when a CPU is hotplugged, the ctx for the newly
onlined CPU is registered without initialization.

This patch fixes the issue by initializing all the ctx kobjects
belonging to each queue.

Bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=908794
Signed-off-by: Takashi Iwai &lt;tiwai@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>blk-mq: Fix a race between bt_clear_tag() and bt_get()</title>
<updated>2015-01-16T14:59:48+00:00</updated>
<author>
<name>Bart Van Assche</name>
<email>bvanassche@acm.org</email>
</author>
<published>2014-12-09T15:58:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3a6d400572ee7c6f6a82d1c385fcaefebd6062fc'/>
<id>3a6d400572ee7c6f6a82d1c385fcaefebd6062fc</id>
<content type='text'>
commit c38d185d4af12e8be63ca4b6745d99449c450f12 upstream.

What we need are the following two guarantees:
* Any thread that observes the effect of the test_and_set_bit() by
  __bt_get_word() also observes the preceding addition of 'current'
  to the appropriate wait list. This is guaranteed by the semantics
  of the spin_unlock() operation performed by prepare_to_wait().
  Hence the conversion of test_and_set_bit_lock() into
  test_and_set_bit().
* The wait lists are examined by bt_clear_tag() after the tag bit has
  been cleared. clear_bit_unlock() guarantees that any thread that
  observes that the bit has been cleared also observes the store
  operations preceding clear_bit_unlock(). However,
  clear_bit_unlock() does not prevent the wait lists from being
  examined before the tag bit is cleared. Hence the addition of a
  memory barrier between clear_bit() and the wait list examination.

Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Robert Elliott &lt;elliott@hp.com&gt;
Cc: Ming Lei &lt;ming.lei@canonical.com&gt;
Cc: Alexander Gordeev &lt;agordeev@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit c38d185d4af12e8be63ca4b6745d99449c450f12 upstream.

What we need are the following two guarantees:
* Any thread that observes the effect of the test_and_set_bit() by
  __bt_get_word() also observes the preceding addition of 'current'
  to the appropriate wait list. This is guaranteed by the semantics
  of the spin_unlock() operation performed by prepare_to_wait().
  Hence the conversion of test_and_set_bit_lock() into
  test_and_set_bit().
* The wait lists are examined by bt_clear_tag() after the tag bit has
  been cleared. clear_bit_unlock() guarantees that any thread that
  observes that the bit has been cleared also observes the store
  operations preceding clear_bit_unlock(). However,
  clear_bit_unlock() does not prevent the wait lists from being
  examined before the tag bit is cleared. Hence the addition of a
  memory barrier between clear_bit() and the wait list examination.

Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Robert Elliott &lt;elliott@hp.com&gt;
Cc: Ming Lei &lt;ming.lei@canonical.com&gt;
Cc: Alexander Gordeev &lt;agordeev@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>blk-mq: Avoid that __bt_get_word() wraps multiple times</title>
<updated>2015-01-16T14:59:48+00:00</updated>
<author>
<name>Bart Van Assche</name>
<email>bvanassche@acm.org</email>
</author>
<published>2014-12-09T15:58:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d04e14ab4713a186700ef70b2e4d994618cbc64a'/>
<id>d04e14ab4713a186700ef70b2e4d994618cbc64a</id>
<content type='text'>
commit 9e98e9d7cf6e9d2ec1cce45e8d5ccaf3f9b386f3 upstream.

If __bt_get_word() is called with last_tag != 0, if the first
find_next_zero_bit() fails, if after wrap-around the
test_and_set_bit() call fails and find_next_zero_bit() succeeds,
if the next test_and_set_bit() call fails and subsequently
find_next_zero_bit() does not find a zero bit, then another
wrap-around will occur. Avoid this by introducing an additional
local variable.
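<![CDATA[
The idea of remembering the wrap-around in a separate local variable can
be sketched as below. This is a simplified, single-threaded user-space
illustration: find_free_bit() is a hypothetical helper, the bitmap is a
single unsigned long, and the concurrent test_and_set_bit() failures
described above are not modelled.

```c
#include <stdbool.h>

/*
 * Search 'word' for a clear bit among the low 'depth' bits, starting at
 * 'start' and wrapping around to bit 0 at most once. 'depth' must not
 * exceed the number of bits in an unsigned long.
 * Returns the bit index, or -1 if no clear bit was found.
 */
static int find_free_bit(unsigned long word, int depth, int start)
{
	/* Starting at 0 already covers everything, so no wrap is needed. */
	bool wrapped = (start == 0);
	int i = start;

	for (;;) {
		while (i < depth && ((word >> i) & 1UL))
			i++;
		if (i < depth)
			return i;
		if (wrapped)
			return -1;	/* already wrapped once: give up */
		wrapped = true;
		i = 0;
	}
}
```

Because the flag is set on the first wrap and never cleared, the search
terminates after at most two passes regardless of the starting offset.
]]>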

Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Robert Elliott &lt;elliott@hp.com&gt;
Cc: Ming Lei &lt;ming.lei@canonical.com&gt;
Cc: Alexander Gordeev &lt;agordeev@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 9e98e9d7cf6e9d2ec1cce45e8d5ccaf3f9b386f3 upstream.

If __bt_get_word() is called with last_tag != 0, if the first
find_next_zero_bit() fails, if after wrap-around the
test_and_set_bit() call fails and find_next_zero_bit() succeeds,
if the next test_and_set_bit() call fails and subsequently
find_next_zero_bit() does not find a zero bit, then another
wrap-around will occur. Avoid this by introducing an additional
local variable.

Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Robert Elliott &lt;elliott@hp.com&gt;
Cc: Ming Lei &lt;ming.lei@canonical.com&gt;
Cc: Alexander Gordeev &lt;agordeev@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
</feed>
