From 29f834aa326e659ed354c406056e94ea3d29706a Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Mon, 2 Oct 2023 13:17:37 +0000 Subject: net_sched: sch_fq: add 3 bands and WRR scheduling MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Before Google adopted FQ for its production servers, we had to ensure AF4 packets would get a higher share than BE1 ones. As discussed this week in Netconf 2023 in Paris, it is time to upstream this for public use. After this patch FQ can replace pfifo_fast, with the following differences : - FQ uses WRR instead of strict prio, to avoid starvation of low priority packets. - We make sure each band/prio tracks its own usage against sch->limit. This was done to make sure flood of low priority packets would not prevent AF4 packets to be queued. Contributed by Willem. - priomap can be changed, if needed (default value are the ones coming from pfifo_fast). In this patch, we set default band weights so that : - high prio (band=0) packets get 90% of the bandwidth if they compete with low prio (band=2) packets. - high prio packets get 75% of the bandwidth if they compete with medium prio (band=1) packets. Following patch in this series adds the possibility to tune the per-band weights. As we added many fields in 'struct fq_sched_data', we had to make sure to have the first cache line read-mostly, and avoid wasting precious cache lines. More optimizations are possible but will be sent separately. Signed-off-by: Eric Dumazet Acked-by: Dave Taht Reviewed-by: Willem de Bruijn Acked-by: Soheil Hassas Yeganeh Reviewed-by: Toke Høiland-Jørgensen Signed-off-by: Paolo Abeni --- include/uapi/linux/pkt_sched.h | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h index 579f641846b8..ec5ab44d41a2 100644 --- a/include/uapi/linux/pkt_sched.h +++ b/include/uapi/linux/pkt_sched.h @@ -941,15 +941,19 @@ enum { TCA_FQ_HORIZON_DROP, /* drop packets beyond horizon, or cap their EDT */ + TCA_FQ_PRIOMAP, /* prio2band */ + __TCA_FQ_MAX }; #define TCA_FQ_MAX (__TCA_FQ_MAX - 1) +#define FQ_BANDS 3 + struct tc_fq_qd_stats { __u64 gc_flows; - __u64 highprio_packets; - __u64 tcp_retrans; + __u64 highprio_packets; /* obsolete */ + __u64 tcp_retrans; /* obsolete */ __u64 throttled; __u64 flows_plimit; __u64 pkts_too_long; @@ -963,6 +967,9 @@ struct tc_fq_qd_stats { __u64 horizon_drops; __u64 horizon_caps; __u64 fastpath_packets; + __u64 band_drops[FQ_BANDS]; + __u32 band_pkt_count[FQ_BANDS]; + __u32 pad; }; /* Heavy-Hitter Filter */ -- cgit v1.2.3