linux-toradex.git/net, branch v4.15-rc3

tcp: evaluate packet losses upon RTT change

2017-12-08T19:14:11+00:00

RACK skips an ACK unless it advances the most recently delivered
TX timestamp (rack.mstamp). Since RACK also uses the most recent
RTT to decide if a packet is lost, RACK should still run the
loss detection whenever the most recent RTT changes. For example,
an ACK that does not advance the timestamp but triggers a cwnd
undo due to reordering would then use the most recent (higher)
RTT measurement to detect further losses.

Signed-off-by: Yuchung Cheng 
Reviewed-by: Neal Cardwell 
Reviewed-by: Priyaranjan Jha 
Reviewed-by: Eric Dumazet 
Signed-off-by: David S. Miller
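
As a rough illustration of the rule described above, the following C
sketch flags an ACK for loss detection when either the delivered
timestamp advances or the RTT sample changes. The struct and function
names are hypothetical stand-ins chosen for this example; this is not
the kernel's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified RACK state; not the kernel's struct. */
struct rack_model {
    uint64_t mstamp_us;  /* send time of the most recently delivered packet */
    uint32_t rtt_us;     /* most recent RTT sample */
    bool     advanced;   /* set when loss detection should run for this ACK */
};

/* Record one newly acknowledged packet, sent at xmit_us and
 * acknowledged at now_us. */
static void rack_model_on_delivered(struct rack_model *rs,
                                    uint64_t xmit_us, uint64_t now_us)
{
    uint32_t rtt_us = (uint32_t)(now_us - xmit_us);

    /* A changed RTT alone is enough to re-run loss detection. */
    if (rtt_us != rs->rtt_us)
        rs->advanced = true;
    rs->rtt_us = rtt_us;

    /* The pre-existing trigger: the delivered timestamp moved forward. */
    if (xmit_us > rs->mstamp_us) {
        rs->mstamp_us = xmit_us;
        rs->advanced = true;
    }
}
```

In this model, an ACK for an older packet that yields a different RTT
sets `advanced` even though `mstamp_us` stays where it was.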

tcp: fix off-by-one bug in RACK

2017-12-08T19:14:11+00:00

RACK should mark a packet lost when remaining wait time is zero.

Signed-off-by: Yuchung Cheng 
Reviewed-by: Neal Cardwell 
Reviewed-by: Priyaranjan Jha 
Reviewed-by: Eric Dumazet 
Signed-off-by: David S. Miller
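
The boundary condition can be sketched as follows. This is a minimal
model, assuming the remaining wait is computed as RTT plus reordering
window minus the time elapsed since the packet was sent; the helper
names are invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: time (in us) a packet may still wait before it
 * is deemed lost. Zero or negative means the wait has expired. */
static int32_t rack_model_remaining_us(uint32_t rtt_us, uint32_t reo_wnd_us,
                                       uint32_t elapsed_us)
{
    return (int32_t)(rtt_us + reo_wnd_us) - (int32_t)elapsed_us;
}

/* Mark lost when nothing remains; testing "< 0" instead would miss the
 * case where the remaining wait time is exactly zero. */
static bool rack_model_is_lost(uint32_t rtt_us, uint32_t reo_wnd_us,
                               uint32_t elapsed_us)
{
    return rack_model_remaining_us(rtt_us, reo_wnd_us, elapsed_us) <= 0;
}
```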

tcp: always evaluate losses in RACK upon undo

2017-12-08T19:14:11+00:00

When the sender detects a spurious retransmission, all packets
marked lost are re-marked as in-flight. However, some may still
be considered lost based on their timestamps in RACK. This patch
forces RACK to re-evaluate, which might previously have been
skipped if the ACK did not advance the RACK timestamp.

Signed-off-by: Yuchung Cheng 
Reviewed-by: Neal Cardwell 
Reviewed-by: Priyaranjan Jha 
Reviewed-by: Eric Dumazet 
Signed-off-by: David S. Miller

tcp: correctly test congestion state in RACK

2017-12-08T19:14:11+00:00

RACK does not test the loss recovery state correctly to compute
the reordering window. It assumes if lost_out is zero then TCP is
not in loss recovery. But it can be zero during recovery before
calling tcp_rack_detect_loss(): for example, when an ACK acknowledges
all packets previously marked lost, but tcp_rack_detect_loss() has
not yet run to discover new ones. The fix is simply to test the
congestion state directly.

Signed-off-by: Yuchung Cheng 
Reviewed-by: Neal Cardwell 
Reviewed-by: Priyaranjan Jha 
Reviewed-by: Eric Dumazet 
Signed-off-by: David S. Miller
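
The difference between the two tests can be sketched in C as below.
The enum is a local stand-in that mirrors the order of the kernel's
TCP_CA_* states. The window rule (widen to a quarter of the minimum
RTT outside recovery, with a 1000 us floor) is an assumption made for
the example and is not taken from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel's TCP_CA_* states, in the same order. */
enum ca_state { CA_OPEN, CA_DISORDER, CA_CWR, CA_RECOVERY, CA_LOSS };

/* Old inference: "no packets marked lost" means "not recovering". */
static bool recovering_by_lost_out(uint32_t lost_out)
{
    return lost_out != 0;
}

/* Fixed test: ask the congestion state machine directly. */
static bool recovering_by_state(enum ca_state state)
{
    return state >= CA_RECOVERY;
}

/* Reordering window, widened to min_rtt/4 outside of recovery or once
 * reordering has been seen. The 1000 us floor is an assumed default. */
static uint32_t reo_wnd_us(bool reord_seen, bool recovering,
                           uint32_t min_rtt_us)
{
    uint32_t wnd = 1000;

    if ((reord_seen || !recovering) && min_rtt_us / 4 > wnd)
        wnd = min_rtt_us / 4;
    return wnd;
}
```

With lost_out at zero in the middle of recovery, the old inference
yields the wide window while the state-based test keeps the narrow one.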

tcp_bbr: reset long-term bandwidth sampling on loss recovery undo

2017-12-08T18:27:43+00:00

Fix BBR so that upon notification of a loss recovery undo BBR resets
long-term bandwidth sampling.

Under high reordering, reordering events can be interpreted as loss.
If the reordering and spurious loss estimates are high enough, this
can cause BBR to spuriously estimate that we are seeing loss rates
high enough to trigger long-term bandwidth estimation. To avoid that
problem, this commit resets long-term bandwidth sampling on loss
recovery undo events.

Signed-off-by: Neal Cardwell 
Reviewed-by: Yuchung Cheng 
Acked-by: Soheil Hassas Yeganeh 
Signed-off-by: David S. Miller

tcp_bbr: reset full pipe detection on loss recovery undo

2017-12-08T18:27:43+00:00

Fix BBR so that upon notification of a loss recovery undo BBR resets
the full pipe detection (STARTUP exit) state machine.

Under high reordering, reordering events can be interpreted as loss.
If the reordering and spurious loss estimates are high enough, this
could previously cause BBR to spuriously estimate that the pipe is
full.

Since spurious loss recovery means that our overall sending will have
slowed down spuriously, this commit gives a flow more time to probe
robustly for bandwidth and decide the pipe is really full.

Signed-off-by: Neal Cardwell 
Reviewed-by: Yuchung Cheng 
Acked-by: Soheil Hassas Yeganeh 
Signed-off-by: David S. Miller
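
A minimal sketch of what an undo notification might do in light of the
two undo-related BBR fixes above. The struct, its fields, and the
function name are hypothetical and greatly simplified; the real module
tracks more state than is shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down congestion control state. */
struct bbr_model {
    uint32_t cwnd;             /* current congestion window */
    uint32_t full_bw;          /* bandwidth baseline for full-pipe detection */
    uint8_t  full_bw_cnt;      /* rounds without significant growth */
    bool     full_bw_reached;  /* sticky "pipe is full" decision */
    bool     lt_sampling;      /* long-term sampling interval in progress */
    uint32_t lt_lost;          /* losses counted in that interval */
};

/* Called when a loss recovery episode is undone as spurious. */
static uint32_t bbr_model_undo_cwnd(struct bbr_model *b)
{
    /* Restart full-pipe detection; an earlier decision is kept. */
    b->full_bw = 0;
    b->full_bw_cnt = 0;

    /* Discard long-term samples polluted by the spurious losses. */
    b->lt_sampling = false;
    b->lt_lost = 0;

    return b->cwnd;  /* this model leaves the window unchanged */
}
```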

tcp_bbr: record "full bw reached" decision in new full_bw_reached bit

2017-12-08T18:27:43+00:00

This commit records the "full bw reached" decision in a new
full_bw_reached bit. This is a pure refactor that does not change the
current behavior, but enables subsequent fixes and improvements.

In particular, this enables simple and clean fixes because the full_bw
and full_bw_cnt can be unconditionally zeroed without worrying about
forgetting that we estimated we filled the pipe in Startup. And it
enables future improvements because multiple code paths can be used
for estimating that we filled the pipe in Startup; any new code paths
only need to set this bit when they think the pipe is full.

Note that this commit intentionally reduces the width of the full_bw_cnt
counter, since we have never used the most significant bit.

Signed-off-by: Neal Cardwell 
Reviewed-by: Yuchung Cheng 
Acked-by: Soheil Hassas Yeganeh 
Signed-off-by: David S. Miller
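
To illustrate why a separate bit helps, here is a small C model of a
full-pipe estimator. The 25% growth threshold and the limit of three
rounds are assumptions for the example, as are all names; only the idea
of a sticky full_bw_reached bit next to a narrow counter comes from the
message above.

```c
#include <stdint.h>

/* Hypothetical estimator state with a narrow counter and a sticky bit. */
struct pipe_model {
    uint32_t full_bw;              /* recent bandwidth baseline */
    unsigned full_bw_cnt : 2;      /* rounds without significant growth */
    unsigned full_bw_reached : 1;  /* set once, never cleared here */
};

/* Feed one per-round maximum bandwidth sample into the estimator. */
static void pipe_model_update(struct pipe_model *pm, uint32_t max_bw)
{
    if (pm->full_bw_reached)
        return;

    /* Still growing by at least 25%: take a new baseline. */
    if ((uint64_t)max_bw * 4 >= (uint64_t)pm->full_bw * 5) {
        pm->full_bw = max_bw;
        pm->full_bw_cnt = 0;
        return;
    }

    pm->full_bw_cnt++;
    pm->full_bw_reached = pm->full_bw_cnt >= 3;
}
```

Because the decision lives in its own bit, a caller can zero full_bw
and full_bw_cnt at any time without losing the fact that the pipe was
once judged full.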

tcp: invalidate rate samples during SACK reneging

2017-12-08T15:07:02+00:00

Mark tcp_sock during a SACK reneging event and invalidate rate samples
while marked. Such rate samples may overestimate bw by including packets
that were SACKed before reneging.

< ack 6001 win 10000 sack 7001:38001
< ack 7001 win 0 sack 8001:38001 // Reneg detected
> seq 7001:8001 // RTO, SACK cleared.
< ack 38001 win 10000

In the above example, the rate sample taken after the last ACK will
count 7001-38001 as delivered, while the actual delivery is likely to
be much lower, i.e. 7001-8001.

This patch adds a new field tcp_sock.is_sack_reneg, marks it when we
declare SACK reneging and enter TCP_CA_Loss, and unmarks it after
the last rate sample is taken before moving back to TCP_CA_Open. This
patch also invalidates rate samples taken while tcp_sock.is_sack_reneg
is set.

Fixes: b9f64820fb22 ("tcp: track data delivery rate for a TCP connection")
Signed-off-by: Yousuk Seung 
Signed-off-by: Neal Cardwell 
Signed-off-by: Yuchung Cheng 
Acked-by: Soheil Hassas Yeganeh 
Acked-by: Eric Dumazet 
Acked-by: Priyaranjan Jha 
Signed-off-by: David S. Miller
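
The invalidation can be sketched as follows, reusing the sequence
numbers from the trace above. This is a simplified model with
hypothetical names; the use of negative values to mean "invalid" is an
assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

struct rate_sample_model {
    int32_t delivered_bytes;  /* negative means "invalid" */
    int64_t interval_us;      /* negative means "invalid" */
};

/* Build a sample from the cumulative ACK advance over an interval. */
static struct rate_sample_model
rate_sample_gen(uint32_t prior_ack, uint32_t ack, int64_t interval_us,
                bool is_sack_reneg)
{
    struct rate_sample_model rs;

    if (is_sack_reneg) {
        /* The ACK advance may include data SACKed before reneging. */
        rs.delivered_bytes = -1;
        rs.interval_us = -1;
        return rs;
    }
    rs.delivered_bytes = (int32_t)(ack - prior_ack);
    rs.interval_us = interval_us;
    return rs;
}

static bool rate_sample_valid(const struct rate_sample_model *rs)
{
    return rs->delivered_bytes >= 0 && rs->interval_us > 0;
}
```

Without the flag, the jump from 7001 to 38001 would be counted as
31000 bytes delivered in a single interval; with the flag set, the
sample is discarded.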

tcp: use current time in tcp_rcv_space_adjust()

2017-12-07T19:31:03+00:00

When I switched rcv_rtt_est to high resolution timestamps, I forgot
that tp->tcp_mstamp needed to be refreshed in tcp_rcv_space_adjust().

Using an old timestamp leads to autotuning lags.

Fixes: 645f4c6f2ebd ("tcp: switch rcv_rtt_est and rcvq_space to high resolution timestamps")
Signed-off-by: Eric Dumazet 
Cc: Wei Wang 
Cc: Neal Cardwell 
Cc: Yuchung Cheng 
Acked-by: Neal Cardwell 
Signed-off-by: David S. Miller
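
A small model of why the refresh matters. It assumes the adjustment is
gated on at least one RTT having passed since the previous adjustment;
the struct, field, and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

struct rcv_space_model {
    uint64_t cached_now_us;   /* last clock value stored on the socket */
    uint64_t last_adjust_us;  /* when the receive space was last adjusted */
    uint32_t rtt_us;          /* receiver-side RTT estimate, 0 if unknown */
};

/* Returns true when at least one RTT has passed since the last
 * adjustment, measured against a freshly read clock. */
static bool rcv_space_adjust_due(struct rcv_space_model *rs,
                                 uint64_t clock_now_us)
{
    rs->cached_now_us = clock_now_us;  /* the refresh that was missing */

    if (rs->rtt_us == 0)
        return false;
    return rs->cached_now_us - rs->last_adjust_us >= rs->rtt_us;
}
```

If the cached value were used without the refresh, the elapsed time
would be underestimated and the adjustment would fire late.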

adding missing rcu_read_unlock in ipxip6_rcv

2017-12-07T18:59:37+00:00

Commit 8d79266bc48c ("ip6_tunnel: add collect_md mode to IPv6 tunnels")
introduced a new exit point in ipxip6_rcv(), but rcu_read_unlock() is
missing there. This patch adds it.

v1->v2:
 Instead of calling rcu_read_unlock() in place, jump to the "drop"
 section (to prevent skb leakage).

Fixes: 8d79266bc48c ("ip6_tunnel: add collect_md mode to IPv6 tunnels")
Signed-off-by: Nikita V. Shirokov 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller
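
The single-exit cleanup pattern described in the v2 note can be
sketched with mock functions. Everything here is a stand-in: the
counters, the mock_* helpers, and the two failure flags are invented so
that the example runs in user space.

```c
#include <stdbool.h>

static int rcu_depth;   /* mock: nesting depth of the read-side section */
static int skbs_freed;  /* mock: number of buffers released */

static void mock_rcu_read_lock(void)   { rcu_depth++; }
static void mock_rcu_read_unlock(void) { rcu_depth--; }
static void mock_kfree_skb(void)       { skbs_freed++; }

/* Returns 1 if the packet was handed on, 0 if it was dropped. */
static int mock_tunnel_rcv(bool policy_ok, bool metadata_alloc_ok)
{
    mock_rcu_read_lock();

    if (!policy_ok)
        goto drop;
    if (!metadata_alloc_ok)
        goto drop;  /* a bare "return 0" here would skip the unlock */

    mock_rcu_read_unlock();
    return 1;

drop:
    /* One place that both leaves the read-side section and frees the
     * buffer, so no failure path can forget either step. */
    mock_rcu_read_unlock();
    mock_kfree_skb();
    return 0;
}
```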