linux-toradex.git/include/linux/elevator.h, branch v4.11-rc3

blk-mq: pass bio to blk_mq_sched_get_rq_priv

2017-02-10T16:09:59+00:00

bio is used in bfq-mq's get_rq_priv, to get the request group. We could
pass directly the group here, but I thought that passing the bio was
more general, giving the possibility to get other pieces of information
if needed.

Signed-off-by: Paolo Valente 
Signed-off-by: Jens Axboe

block: optionally merge discontiguous discard bios into a single request

2017-02-08T20:43:08+00:00

Add a new merge strategy that merges discard bios into a request until the
maximum number of discard ranges (or the maximum discard size) is reached
from the plug merging code.  I/O scheduler merging is not wired up yet
but might also be useful, although not for fast devices like NVMe which
are the only user for now.

Note that for now we don't support limiting the size of each discard range,
but if needed that can be added later.

Signed-off-by: Christoph Hellwig 
Signed-off-by: Jens Axboe

block: enumify ELEVATOR_*_MERGE

2017-02-08T20:43:06+00:00

Switch these constants to an enum, and make let the compiler ensure that
all callers of blk_try_merge and elv_merge handle all potential values.

Signed-off-by: Christoph Hellwig 
Signed-off-by: Jens Axboe

blk-mq-sched: change ->dispatch_requests() to ->dispatch_request()

2017-01-27T15:20:35+00:00

When we invoke dispatch_requests(), the scheduler empties everything
into the passed in list. This isn't always a good thing, since it
means that we remove items that we could have potentially merged
with.

Change the function to dispatch single requests at the time. If
we do that, we can backoff exactly at the point where the device
can't consume more IO, and leave the rest with the scheduler for
better merging and future dispatch decision making.

Signed-off-by: Jens Axboe 
Reviewed-by: Omar Sandoval 
Tested-by: Hannes Reinecke

blk-mq-sched: add framework for MQ capable IO schedulers

2017-01-17T17:04:20+00:00

This adds a set of hooks that intercepts the blk-mq path of
allocating/inserting/issuing/completing requests, allowing
us to develop a scheduler within that framework.

We reuse the existing elevator scheduler API on the registration
side, but augment that with the scheduler flagging support for
the blk-mq interfce, and with a separate set of ops hooks for MQ
devices.

We split driver and scheduler tags, so we can run the scheduling
independently of device queue depth.

Signed-off-by: Jens Axboe 
Reviewed-by: Bart Van Assche 
Reviewed-by: Omar Sandoval

block: move existing elevator ops to union

2017-01-17T17:03:33+00:00

Prep patch for adding MQ ops as well, since doing anon unions with
named initializers doesn't work on older compilers.

Signed-off-by: Jens Axboe 
Reviewed-by: Johannes Thumshirn 
Reviewed-by: Bart Van Assche 
Reviewed-by: Omar Sandoval

elevator: make the rqhash helpers exported

2016-12-09T16:03:02+00:00

Signed-off-by: Jens Axboe 
Reviewed-by: Hannes Reinecke

block: better op and flags encoding

2016-10-28T14:48:16+00:00

Now that we don't need the common flags to overflow outside the range
of a 32-bit type we can encode them the same way for both the bio and
request fields.  This in addition allows us to place the operation
first (and make some room for more ops while we're at it) and to
stop having to shift around the operation values.

In addition this allows passing around only one value in the block layer
instead of two (and eventuall also in the file systems, but we can do
that later) and thus clean up a lot of code.

Last but not least this allows decreasing the size of the cmd_flags
field in struct request to 32-bits.  Various functions passing this
value could also be updated, but I'd like to avoid the churn for now.

Signed-off-by: Christoph Hellwig 
Signed-off-by: Jens Axboe

block: do not merge requests without consulting with io scheduler

2016-07-21T03:35:12+00:00

Before merging a bio into an existing request, io scheduler is called to
get its approval first. However, the requests that come from a plug
flush may get merged by block layer without consulting with io
scheduler.

In case of CFQ, this can cause fairness problems. For instance, if a
request gets merged into a low weight cgroup's request, high weight cgroup
now will depend on low weight cgroup to get scheduled. If high weigt cgroup
needs that io request to complete before submitting more requests, then it
will also lose its timeslice.

Following script demonstrates the problem. Group g1 has a low weight, g2
and g3 have equal high weights but g2's requests are adjacent to g1's
requests so they are subject to merging. Due to these merges, g2 gets
poor disk time allocation.

cat > cfq-merge-repro.sh << "EOF"
#!/bin/bash
set -e

IO_ROOT=/mnt-cgroup/io

mkdir -p $IO_ROOT

if ! mount | grep -qw $IO_ROOT; then
  mount -t cgroup none -oblkio $IO_ROOT
fi

cd $IO_ROOT

for i in g1 g2 g3; do
  if [ -d $i ]; then
    rmdir $i
  fi
done

mkdir g1 && echo 10 > g1/blkio.weight
mkdir g2 && echo 495 > g2/blkio.weight
mkdir g3 && echo 495 > g3/blkio.weight

RUNTIME=10

(echo $BASHPID > g1/cgroup.procs &&
 fio --readonly --name name1 --filename /dev/sdb \
     --rw read --size 64k --bs 64k --time_based \
     --runtime=$RUNTIME --offset=0k &> /dev/null)&

(echo $BASHPID > g2/cgroup.procs &&
 fio --readonly --name name1 --filename /dev/sdb \
     --rw read --size 64k --bs 64k --time_based \
     --runtime=$RUNTIME --offset=64k &> /dev/null)&

(echo $BASHPID > g3/cgroup.procs &&
 fio --readonly --name name1 --filename /dev/sdb \
     --rw read --size 64k --bs 64k --time_based \
     --runtime=$RUNTIME --offset=256k &> /dev/null)&

sleep $((RUNTIME+1))

for i in g1 g2 g3; do
  echo ---- $i ----
  cat $i/blkio.time
done

EOF
# ./cfq-merge-repro.sh
---- g1 ----
8:16 162
---- g2 ----
8:16 165
---- g3 ----
8:16 686

After applying the patch:

# ./cfq-merge-repro.sh
---- g1 ----
8:16 90
---- g2 ----
8:16 445
---- g3 ----
8:16 471

Signed-off-by: Tahsin Erdogan 
Signed-off-by: Jens Axboe

block: prepare elevator to use REQ_OPs.

2016-06-07T19:41:38+00:00

This patch converts the elevator code to use separate variables
for the operation and flags, and to check req_op for the REQ_OP.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
Signed-off-by: Jens Axboe