linux-toradex.git - Linux kernel for Apalis and Colibri modules

Age	Commit message (Collapse)	Author
2022-06-04	Merge tag 'v5.10.115' of ↵	Vignesh Raghavendra
	https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux into ti-linux-5.10.y This is the 5.10.115 stable release * tag 'v5.10.115' of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux: (1162 commits) Linux 5.10.115 mmc: rtsx: add 74 Clocks in power on flow PCI: aardvark: Fix reading MSI interrupt number PCI: aardvark: Clear all MSIs at setup dm: interlock pending dm_io and dm_wait_for_bios_completion block-map: add __GFP_ZERO flag for alloc_page in function bio_copy_kern rcu: Apply callbacks processing time limit only on softirq rcu: Fix callbacks processing time limit retaining cond_resched() KVM: LAPIC: Enable timer posted-interrupt only when mwait/hlt is advertised KVM: x86/mmu: avoid NULL-pointer dereference on page freeing bugs KVM: x86: Do not change ICR on write to APIC_SELF_IPI x86/kvm: Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume net/mlx5: Fix slab-out-of-bounds while reading resource dump menu kvm: x86/cpuid: Only provide CPUID leaf 0xA if host has architectural PMU net: igmp: respect RCU rules in ip_mc_source() and ip_mc_msfilter() btrfs: always log symlinks in full mode smsc911x: allow using IRQ0 selftests: ocelot: tc_flower_chains: specify conform-exceed action for policer bnxt_en: Fix unnecessary dropping of RX packets bnxt_en: Fix possible bnxt_open() failure caused by wrong RFS flag ... Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	i2c: add I2C Address Translator (ATR) support	Luca Ceresoli
	An ATR is a device that looks similar to an i2c-mux: it has an I2C slave "upstream" port and N master "downstream" ports, and forwards transactions from upstream to the appropriate downstream port. But is is different in that the forwarded transaction has a different slave address. The address used on the upstream bus is called the "alias" and is (potentially) different from the physical slave address of the downstream chip. Add a helper file (just like i2c-mux.c for a mux or switch) to allow implementing ATR features in a device driver. The helper takes care or adapter creation/destruction and translates addresses at each transaction. Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	i2c: core: let adapters be notified of client attach/detach	Luca Ceresoli
	An adapter might need to know when a new device is about to be added. This will soon bee needed to implement an "I2C address translator" (ATR for short), a device that propagates I2C transactions with a different slave address (an "alias" address). An ATR driver needs to know when a slave is being added to find a suitable alias and program the device translation map. Add an attach/detach callback pair to allow adapter drivers to be notified of clients being added and removed. Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	media: add V4L2_SUBDEV_CAP_MPLEXED	Tomi Valkeinen
	Add a subdev capability flag to expose to userspace if a subdev supports multiplexed streams. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	media: subdev: add v4l2_routing_simple_verify() helper	Tomi Valkeinen
	Add a helper for verifying routing for the common case of non-overlapping 1-to-1 streams. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	media: subdev: add v4l2_subdev_set_routing_with_fmt() helper	Tomi Valkeinen
	v4l2_subdev_set_routing_with_fmt() is the same as v4l2_subdev_set_routing(), but additionally initializes all the streams with the given format. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	media: subdev: add v4l2_subdev_get_fmt() helper function	Tomi Valkeinen
	Add v4l2_subdev_get_fmt() helper function which implements v4l2_subdev_pad_ops.get_fmt using streams. Subdev drivers that do not need to do anything special in their get_fmt op can use this helper directly for v4l2_subdev_pad_ops.get_fmt. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	media: subdev: add "opposite" stream helper funcs	Tomi Valkeinen
	Add two helper functions to make dealing with streams easier: v4l2_state_find_opposite_end - given a routing table and a pad + stream, return the pad + stream on the opposite side of the subdev. v4l2_state_get_opposite_stream_format - return a pointer to the format on the pad + stream on the opposite side from the given pad + stream. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	media: subdev: add stream based configuration	Tomi Valkeinen
	Add support to manage configurations (format, crop, compose) per stream, instead of per pad. This is accomplished with data structures that hold an array of all subdev's stream configurations. The number of streams can vary at runtime based on routing. Every time the routing is changed, the stream configurations need to be re-initialized. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	media: subdev: add v4l2_subdev_set_routing helper()	Tomi Valkeinen
	Add a helper function to set the subdev routing. The helper can be used from subdev driver's set_routing op to store the routing table. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	media: subdev: add v4l2_subdev_has_route()	Tomi Valkeinen
	Add a v4l2_subdev_has_route() helper function which can be used for media_entity_operations.has_route op. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	media: subdev: Add [GS]_ROUTING subdev ioctls and operations	Laurent Pinchart
	Add support for subdev internal routing. A route is defined as a single stream from a sink pad to a source pad. The userspace can configure the routing via two new ioctls, VIDIOC_SUBDEV_G_ROUTING and VIDIOC_SUBDEV_S_ROUTING, and subdevs can implement the functionality with v4l2_subdev_pad_ops.set_routing(). Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Michal Simek <michal.simek@xilinx.com> - Add sink and source streams for multiplexed links - Copy the argument back in case of an error. This is needed to let the caller know the number of routes. Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> - Expand and refine documentation. - Make the 'routes' pointer a __u64 __user pointer so that a compat32 version of the ioctl is not required. - Add struct v4l2_subdev_krouting to be used for subdevice operations. Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org> - Fix typecasing warnings - Check sink & source pad types - Add 'which' field - Add V4L2_SUBDEV_ROUTE_FL_SOURCE - Routing to subdev state - Dropped get_routing subdev op Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	media: add V4L2_SUBDEV_FL_MULTIPLEXED	Tomi Valkeinen
	Add subdev flag V4L2_SUBDEV_FL_MULTIPLEXED. It is used to indicate that the subdev supports the new API with multiplexed streams (routing, stream configs). Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	media: subdev: Add v4l2_subdev_validate_and_lock_state()	Tomi Valkeinen
	All suitable subdev ops are now passed either the TRY or the ACTIVE state by the v4l2 core. However, other subdev drivers can still call the ops passing NULL as the state, implying the active case. For all current upstream drivers this doesn't matter, as they do not expect to get a valid state for ACTIVE case. But future drivers which support multiplexed streaming and routing will depend on getting a state for both active and try cases. For new drivers we can mandate that the pipelines where the drivers are used need to pass the state properly, or preferably, not call such subdev ops at all. However, if an existing subdev driver is changed to support multiplexed streams, the driver has to consider cases where its ops will be called with NULL state. The problem can easily be solved by using the v4l2_subdev_validate_and_lock_state() helper, introduced here. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	media: subdev: add subdev state locking	Tomi Valkeinen
	The V4L2 subdevs have managed without centralized locking for the state (previously pad_config), as the TRY state is supposedly safe (although I believe two TRY ioctls for the same fd would race), and the ACTIVE state, and its locking, is managed by the drivers internally. We now have ACTIVE state in a centralized position, and need locking. Strictly speaking the locking is only needed for new drivers that use the new state, as the current drivers continue behaving as they used to. Add a mutex to the struct v4l2_subdev_state, along with a few helper functions for locking/unlocking. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	media: subdev: add active state to struct v4l2_subdev	Tomi Valkeinen
	Add a new 'state' field to struct v4l2_subdev to which we can store the active state of a subdev. This will place the subdev configuration into a known place, allowing us to use the state directly from the v4l2 framework, thus simplifying the drivers. Also add functions v4l2_subdev_init_finalize() and v4l2_subdev_cleanup(), which will allocate and free the active state. The functions are named in a generic way so that they can be also used for other subdev initialization work. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	media: subdev: rename subdev-state alloc & free	Tomi Valkeinen
	v4l2_subdev_alloc_state() and v4l2_subdev_free_state() are not supposed to be used by the drivers. However, we do have a few drivers that use those at the moment, so we need to expose these functions for the time being. Prefix the functions with __ to mark the functions as internal. At the same time, rename them to v4l2_subdev_state_alloc and v4l2_subdev_state_free to match the style used for other functions like video_device_alloc() and media_request_alloc(). Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	media: sync v4l2 subdev state to upstream	Tomi Valkeinen
	The v4l2 subdev state version that went into upstream was slighly different than the version applied to TI kernel. Sync to upstream version. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	media: core headers: fix kernel-doc warnings	Hans Verkuil
	commit f12b81e47f48940a6ec82ff308a7d97cd2307442 upstream. This patch fixes the following kernel-doc warnings: include/uapi/linux/videodev2.h:996: warning: Function parameter or member 'm' not described in 'v4l2_plane' include/uapi/linux/videodev2.h:996: warning: Function parameter or member 'reserved' not described in 'v4l2_plane' include/uapi/linux/videodev2.h:1057: warning: Function parameter or member 'm' not described in 'v4l2_buffer' include/uapi/linux/videodev2.h:1057: warning: Function parameter or member 'reserved2' not described in 'v4l2_buffer' include/uapi/linux/videodev2.h:1057: warning: Function parameter or member 'reserved' not described in 'v4l2_buffer' include/uapi/linux/videodev2.h:1068: warning: Function parameter or member 'tv' not described in 'v4l2_timeval_to_ns' include/uapi/linux/videodev2.h:1068: warning: Excess function parameter 'ts' description in 'v4l2_timeval_to_ns' include/uapi/linux/videodev2.h:1138: warning: Function parameter or member 'reserved' not described in 'v4l2_exportbuffer' include/uapi/linux/videodev2.h:2237: warning: Function parameter or member 'reserved' not described in 'v4l2_plane_pix_format' include/uapi/linux/videodev2.h:2270: warning: Function parameter or member 'hsv_enc' not described in 'v4l2_pix_format_mplane' include/uapi/linux/videodev2.h:2270: warning: Function parameter or member 'reserved' not described in 'v4l2_pix_format_mplane' include/uapi/linux/videodev2.h:2281: warning: Function parameter or member 'reserved' not described in 'v4l2_sdr_format' include/uapi/linux/videodev2.h:2315: warning: Function parameter or member 'fmt' not described in 'v4l2_format' include/uapi/linux/v4l2-subdev.h:53: warning: Function parameter or member 'reserved' not described in 'v4l2_subdev_format' include/uapi/linux/v4l2-subdev.h:66: warning: Function parameter or member 'reserved' not described in 'v4l2_subdev_crop' include/uapi/linux/v4l2-subdev.h:89: warning: Function parameter or member 'reserved' not described in 'v4l2_subdev_mbus_code_enum' include/uapi/linux/v4l2-subdev.h:108: warning: Function parameter or member 'min_width' not described in 'v4l2_subdev_frame_size_enum' include/uapi/linux/v4l2-subdev.h:108: warning: Function parameter or member 'max_width' not described in 'v4l2_subdev_frame_size_enum' include/uapi/linux/v4l2-subdev.h:108: warning: Function parameter or member 'min_height' not described in 'v4l2_subdev_frame_size_enum' include/uapi/linux/v4l2-subdev.h:108: warning: Function parameter or member 'max_height' not described in 'v4l2_subdev_frame_size_enum' include/uapi/linux/v4l2-subdev.h:108: warning: Function parameter or member 'reserved' not described in 'v4l2_subdev_frame_size_enum' include/uapi/linux/v4l2-subdev.h:119: warning: Function parameter or member 'reserved' not described in 'v4l2_subdev_frame_interval' include/uapi/linux/v4l2-subdev.h:140: warning: Function parameter or member 'reserved' not described in 'v4l2_subdev_frame_interval_enum' include/uapi/linux/cec.h:406: warning: Function parameter or member 'raw' not described in 'cec_connector_info' include/uapi/linux/cec.h:470: warning: Function parameter or member 'flags' not described in 'cec_event' include/media/v4l2-h264.h:82: warning: Function parameter or member 'reflist' not described in 'v4l2_h264_build_p_ref_list' include/media/v4l2-h264.h:82: warning: expecting prototype for v4l2_h264_build_b_ref_lists(). Prototype was for v4l2_h264_build_p_ref_list() instead include/media/cec.h:50: warning: Function parameter or member 'lock' not described in 'cec_devnode' include/media/v4l2-jpeg.h:122: warning: Function parameter or member 'num_dht' not described in 'v4l2_jpeg_header' include/media/v4l2-jpeg.h:122: warning: Function parameter or member 'num_dqt' not described in 'v4l2_jpeg_header' Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	media: fix kernel-doc markups	Mauro Carvalho Chehab
	commit b064945517ee368bfb6343bf3fb4613d537c4bbb upstream. Some identifiers have different names between their prototypes and the kernel-doc markup. Seome seems to be due to cut-and-paste related issues. Others need to be fixed, as kernel-doc markups should use this format: identifier - description Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> # IPU3 and V4L2 Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	Revert "v4l: subdev: Add [GS]_ROUTING subdev ioctls and operations"	Tomi Valkeinen
	This reverts commit 592346612eb821c7b340166b9668fb543611610d. This is reverted so that we can apply a more recent version submitted to upstream. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	Revert "v4l: subdev: add V4L2_SUBDEV_ROUTE_FL_SOURCE"	Tomi Valkeinen
	This reverts commit 62e03e2f9c43d801af1d904e6bfd268131755038. This is reverted so that we can apply a more recent version submitted to upstream. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	Revert "v4l: subdev: routing kernel helper functions"	Tomi Valkeinen
	This reverts commit 2f17f25e047b3372b8fb2aab3fe4a3bd372a5fea. This is reverted so that we can apply a more recent version submitted to upstream. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	Revert "v4l: subdev: add stream based configuration"	Tomi Valkeinen
	This reverts commit 88e59be9ebcda011cfc03fa00698d81471e6ad75. This is reverted so that we can apply a more recent version submitted to upstream. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	Revert "v4l: subdev: add 'stream' to subdev ioctls"	Tomi Valkeinen
	This reverts commit b6655203ef9b7590e022140da947418c6ce69b57. This is reverted so that we can apply a more recent version submitted to upstream. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	Revert "v4l: subdev: add routing & stream config to v4l2_subdev_state"	Tomi Valkeinen
	This reverts commit ef60a6b719203969891792874f24a84b8394fb60. This is reverted so that we can apply a more recent version submitted to upstream. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-31	Revert "v4l: subdev: add V4L2_SUBDEV_FL_MULTIPLEXED"	Tomi Valkeinen
	This reverts commit 0b7c9af563e9fc3ae92ed25eadd4224f53ccbbe2. This is reverted so that we can apply a more recent version submitted to upstream. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
2022-05-12	net: stmmac: disable Split Header (SPH) for Intel platforms	Tan Tee Min
	commit 47f753c1108e287edb3e27fad8a7511a9d55578e upstream. Based on DesignWare Ethernet QoS datasheet, we are seeing the limitation of Split Header (SPH) feature is not supported for Ipv4 fragmented packet. This SPH limitation will cause ping failure when the packets size exceed the MTU size. For example, the issue happens once the basic ping packet size is larger than the configured MTU size and the data is lost inside the fragmented packet, replaced by zeros/corrupted values, and leads to ping fail. So, disable the Split Header for Intel platforms. v2: Add fixes tag in commit message. Fixes: 67afd6d1cfdf("net: stmmac: Add Split Header support and enable it in XGMAC cores") Cc: <stable@vger.kernel.org> # 5.10.x Suggested-by: Ong, Boon Leong <boon.leong.ong@intel.com> Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com> Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Signed-off-by: Tan Tee Min <tee.min.tan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-09	tcp: make sure treq->af_specific is initialized	Eric Dumazet
	[ Upstream commit ba5a4fdd63ae0c575707030db0b634b160baddd7 ] syzbot complained about a recent change in TCP stack, hitting a NULL pointer [1] tcp request sockets have an af_specific pointer, which was used before the blamed change only for SYNACK generation in non SYNCOOKIE mode. tcp requests sockets momentarily created when third packet coming from client in SYNCOOKIE mode were not using treq->af_specific. Make sure this field is populated, in the same way normal TCP requests sockets do in tcp_conn_request(). [1] TCP: request_sock_TCPv6: Possible SYN flooding on port 20002. Sending cookies. Check SNMP counters. general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] CPU: 1 PID: 3695 Comm: syz-executor864 Not tainted 5.18.0-rc3-syzkaller-00224-g5fd1fe4807f9 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:tcp_create_openreq_child+0xe16/0x16b0 net/ipv4/tcp_minisocks.c:534 Code: 48 c1 ea 03 80 3c 02 00 0f 85 e5 07 00 00 4c 8b b3 28 01 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d 7e 08 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 c9 07 00 00 48 8b 3c 24 48 89 de 41 ff 56 08 48 RSP: 0018:ffffc90000de0588 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff888076490330 RCX: 0000000000000100 RDX: 0000000000000001 RSI: ffffffff87d67ff0 RDI: 0000000000000008 RBP: ffff88806ee1c7f8 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff87d67f00 R11: 0000000000000000 R12: ffff88806ee1bfc0 R13: ffff88801b0e0368 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f517fe58700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffcead76960 CR3: 000000006f97b000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> tcp_v6_syn_recv_sock+0x199/0x23b0 net/ipv6/tcp_ipv6.c:1267 tcp_get_cookie_sock+0xc9/0x850 net/ipv4/syncookies.c:207 cookie_v6_check+0x15c3/0x2340 net/ipv6/syncookies.c:258 tcp_v6_cookie_check net/ipv6/tcp_ipv6.c:1131 [inline] tcp_v6_do_rcv+0x1148/0x13b0 net/ipv6/tcp_ipv6.c:1486 tcp_v6_rcv+0x3305/0x3840 net/ipv6/tcp_ipv6.c:1725 ip6_protocol_deliver_rcu+0x2e9/0x1900 net/ipv6/ip6_input.c:422 ip6_input_finish+0x14c/0x2c0 net/ipv6/ip6_input.c:464 NF_HOOK include/linux/netfilter.h:307 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] ip6_input+0x9c/0xd0 net/ipv6/ip6_input.c:473 dst_input include/net/dst.h:461 [inline] ip6_rcv_finish net/ipv6/ip6_input.c:76 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] ipv6_rcv+0x27f/0x3b0 net/ipv6/ip6_input.c:297 __netif_receive_skb_one_core+0x114/0x180 net/core/dev.c:5405 __netif_receive_skb+0x24/0x1b0 net/core/dev.c:5519 process_backlog+0x3a0/0x7c0 net/core/dev.c:5847 __napi_poll+0xb3/0x6e0 net/core/dev.c:6413 napi_poll net/core/dev.c:6480 [inline] net_rx_action+0x8ec/0xc60 net/core/dev.c:6567 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558 invoke_softirq kernel/softirq.c:432 [inline] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:637 irq_exit_rcu+0x5/0x20 kernel/softirq.c:649 sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1097 Fixes: 5b0b9e4c2c89 ("tcp: md5: incorrect tcp_header_len for incoming connections") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-05-09	tcp: fix potential xmit stalls caused by TCP_NOTSENT_LOWAT	Eric Dumazet
	[ Upstream commit 4bfe744ff1644fbc0a991a2677dc874475dd6776 ] I had this bug sitting for too long in my pile, it is time to fix it. Thanks to Doug Porter for reminding me of it! We had various attempts in the past, including commit 0cbe6a8f089e ("tcp: remove SOCK_QUEUE_SHRUNK"), but the issue is that TCP stack currently only generates EPOLLOUT from input path, when tp->snd_una has advanced and skb(s) cleaned from rtx queue. If a flow has a big RTT, and/or receives SACKs, it is possible that the notsent part (tp->write_seq - tp->snd_nxt) reaches 0 and no more data can be sent until tp->snd_una finally advances. What is needed is to also check if POLLOUT needs to be generated whenever tp->snd_nxt is advanced, from output path. This bug triggers more often after an idle period, as we do not receive ACK for at least one RTT. tcp_notsent_lowat could be a fraction of what CWND and pacing rate would allow to send during this RTT. In a followup patch, I will remove the bogus call to tcp_chrono_stop(sk, TCP_CHRONO_SNDBUF_LIMITED) from tcp_check_space(). Fact that we have decided to generate an EPOLLOUT does not mean the application has immediately refilled the transmit queue. This optimistic call might have been the reason the bug seemed not too serious. Tested: 200 ms rtt, 1% packet loss, 32 MB tcp_rmem[2] and tcp_wmem[2] $ echo 500000 >/proc/sys/net/ipv4/tcp_notsent_lowat $ cat bench_rr.sh SUM=0 for i in {1..10} do V=`netperf -H remote_host -l30 -t TCP_RR -- -r 10000000,10000 -o LOCAL_BYTES_SENT \| egrep -v "MIGRATED\|Bytes"` echo $V SUM=$(($SUM + $V)) done echo SUM=$SUM Before patch: $ bench_rr.sh 130000000 80000000 140000000 140000000 140000000 140000000 130000000 40000000 90000000 110000000 SUM=1140000000 After patch: $ bench_rr.sh 430000000 590000000 530000000 450000000 450000000 350000000 450000000 490000000 480000000 460000000 SUM=4680000000 # This is 410 % of the value before patch. Fixes: c9bee3b7fdec ("tcp: TCP_NOTSENT_LOWAT socket option") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Doug Porter <dsp@fb.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-05-09	ip_gre, ip6_gre: Fix race condition on o_seqno in collect_md mode	Peilin Ye
	[ Upstream commit 31c417c948d7f6909cb63f0ac3298f3c38f8ce20 ] As pointed out by Jakub Kicinski, currently using TUNNEL_SEQ in collect_md mode is racy for [IP6]GRE[TAP] devices. Consider the following sequence of events: 1. An [IP6]GRE[TAP] device is created in collect_md mode using "ip link add ... external". "ip" ignores "[o]seq" if "external" is specified, so TUNNEL_SEQ is off, and the device is marked as NETIF_F_LLTX (i.e. it uses lockless TX); 2. Someone sets TUNNEL_SEQ on outgoing skb's, using e.g. bpf_skb_set_tunnel_key() in an eBPF program attached to this device; 3. gre_fb_xmit() or __gre6_xmit() processes these skb's: gre_build_header(skb, tun_hlen, flags, protocol, tunnel_id_to_key32(tun_info->key.tun_id), (flags & TUNNEL_SEQ) ? htonl(tunnel->o_seqno++) : 0); ^^^^^^^^^^^^^^^^^ Since we are not using the TX lock (&txq->_xmit_lock), multiple CPUs may try to do this tunnel->o_seqno++ in parallel, which is racy. Fix it by making o_seqno atomic_t. As mentioned by Eric Dumazet in commit b790e01aee74 ("ip_gre: lockless xmit"), making o_seqno atomic_t increases "chance for packets being out of order at receiver" when NETIF_F_LLTX is on. Maybe a better fix would be: 1. Do not ignore "oseq" in external mode. Users MUST specify "oseq" if they want the kernel to allow sequencing of outgoing packets; 2. Reject all outgoing TUNNEL_SEQ packets if the device was not created with "oseq". Unfortunately, that would break userspace. We could now make [IP6]GRE[TAP] devices always NETIF_F_LLTX, but let us do it in separate patches to keep this fix minimal. Suggested-by: Jakub Kicinski <kuba@kernel.org> Fixes: 77a5196a804e ("gre: add sequence number for collect md mode.") Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-05-09	tcp: ensure to use the most recently sent skb when filling the rate sample	Pengcheng Yang
	[ Upstream commit b253a0680ceadc5d7b4acca7aa2d870326cad8ad ] If an ACK (s)acks multiple skbs, we favor the information from the most recently sent skb by choosing the skb with the highest prior_delivered count. But in the interval between receiving ACKs, we send multiple skbs with the same prior_delivered, because the tp->delivered only changes when we receive an ACK. We used RACK's solution, copying tcp_rack_sent_after() as tcp_skb_sent_after() helper to determine "which packet was sent last?". Later, we will use tcp_skb_sent_after() instead in RACK. Fixes: b9f64820fb22 ("tcp: track data delivery rate for a TCP connection") Signed-off-by: Pengcheng Yang <yangpc@wangsu.com> Cc: Paolo Abeni <pabeni@redhat.com> Acked-by: Neal Cardwell <ncardwell@google.com> Tested-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/1650422081-22153-1-git-send-email-yangpc@wangsu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-05-09	memory: renesas-rpc-if: Fix HF/OSPI data transfer in Manual Mode	Geert Uytterhoeven
	[ Upstream commit 7e842d70fe599bc13594b650b2144c4b6e6d6bf1 ] HyperFlash devices fail to probe: rpc-if-hyperflash rpc-if-hyperflash: probing of hyperbus device failed In HyperFlash or Octal-SPI Flash mode, the Transfer Data Enable bits (SPIDE) in the Manual Mode Enable Setting Register (SMENR) are derived from half of the transfer size, cfr. the rpcif_bits_set() helper function. However, rpcif_reg_{read,write}() does not take the bus size into account, and does not double all Manual Mode Data Register access sizes when communicating with a HyperFlash or Octal-SPI Flash device. Fix this, and avoid the back-and-forth conversion between transfer size and Transfer Data Enable bits, by explicitly storing the transfer size in struct rpcif, and using that value to determine access size in rpcif_reg_{read,write}(). Enforce that the "high" Manual Mode Read/Write Data Registers (SM[RW]DR1) are only used for 8-byte data accesses. While at it, forbid writing to the Manual Mode Read Data Registers, as they are read-only. Fixes: fff53a551db50f5e ("memory: renesas-rpc-if: Correct QSPI data transfer in Manual mode") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Link: https://lore.kernel.org/r/cde9bfacf704c81865f57b15d1b48a4793da4286.1649681476.git.geert+renesas@glider.be Link: https://lore.kernel.org/r/20220420070526.9367-1-krzysztof.kozlowski@linaro.org' Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-05-09	mtd: fix 'part' field data corruption in mtd_info	Oleksandr Ocheretnyi
	[ Upstream commit 37c5f9e80e015d0df17d0c377c18523002986851 ] Commit 46b5889cc2c5 ("mtd: implement proper partition handling") started using "mtd_get_master_ofs()" in mtd callbacks to determine memory offsets by means of 'part' field from mtd_info, what previously was smashed accessing 'master' field in the mtd_set_dev_defaults() method. That provides wrong offset what causes hardware access errors. Just make 'part', 'master' as separate fields, rather than using union type to avoid 'part' data corruption when mtd_set_dev_defaults() is called. Fixes: 46b5889cc2c5 ("mtd: implement proper partition handling") Signed-off-by: Oleksandr Ocheretnyi <oocheret@cisco.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20220417184649.449289-1-oocheret@cisco.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-05-09	hex2bin: make the function hex_to_bin constant-time	Mikulas Patocka
	commit e5be15767e7e284351853cbaba80cde8620341fb upstream. The function hex2bin is used to load cryptographic keys into device mapper targets dm-crypt and dm-integrity. It should take constant time independent on the processed data, so that concurrently running unprivileged code can't infer any information about the keys via microarchitectural convert channels. This patch changes the function hex_to_bin so that it contains no branches and no memory accesses. Note that this shouldn't cause performance degradation because the size of the new function is the same as the size of the old function (on x86-64) - and the new function causes no branch misprediction penalties. I compile-tested this function with gcc on aarch64 alpha arm hppa hppa64 i386 ia64 m68k mips32 mips64 powerpc powerpc64 riscv sh4 s390x sparc32 sparc64 x86_64 and with clang on aarch64 arm hexagon i386 mips32 mips64 powerpc powerpc64 s390x sparc32 sparc64 x86_64 to verify that there are no branches in the generated code. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-04	firmware: ti_sci: Introduce Power Management Ops	Dave Gerlach
	Introduce power management ops to the ti_sci driver and add a single prepare_sleep op that is used by the kernel suspend layer to provide details to firmware about the state being entered. Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
2022-05-04	mailbox: ti-msgmgr: Operate mailbox in polled mode during system suspend	Dave Gerlach
	commit df227dc8a68d81b7fe3416eca16d1f21d0b537f0 upstream. During the system suspend path we must set all queues to operate in polled mode as it is possible for any protocol built using this mailbox, such as TISCI, to require communication during the no irq phase of suspend, and we cannot rely on interrupts there. Polled mode is implemented by allowing the mailbox user to define an RX channel as part of the message that is sent which is what gets polled for a response. If polled mode is enabled, this will immediately be polled for a response at the end of the mailbox send_data op before returning success for the data send or timing out if no response is received. Finally, to ensure polled mode is always enabled during system suspend, iterate through all queues to set RX queues to polled mode during system suspend and disable polled mode for all in the resume handler. Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org> [dgerlach@ti.com: Switch DEFINE_SIMPLE_DEV_PM_OPS to SIMPLE_DEV_PM_OPS] Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
2022-04-27	oom_kill.c: futex: delay the OOM reaper to allow time for proper futex cleanup	Nico Pache
	commit e4a38402c36e42df28eb1a5394be87e6571fb48a upstream. The pthread struct is allocated on PRIVATE\|ANONYMOUS memory [1] which can be targeted by the oom reaper. This mapping is used to store the futex robust list head; the kernel does not keep a copy of the robust list and instead references a userspace address to maintain the robustness during a process death. A race can occur between exit_mm and the oom reaper that allows the oom reaper to free the memory of the futex robust list before the exit path has handled the futex death: CPU1 CPU2 -------------------------------------------------------------------- page_fault do_exit "signal" wake_oom_reaper oom_reaper oom_reap_task_mm (invalidates mm) exit_mm exit_mm_release futex_exit_release futex_cleanup exit_robust_list get_user (EFAULT- can't access memory) If the get_user EFAULT's, the kernel will be unable to recover the waiters on the robust_list, leaving userspace mutexes hung indefinitely. Delay the OOM reaper, allowing more time for the exit path to perform the futex cleanup. Reproducer: https://gitlab.com/jsavitz/oom_futex_reproducer Based on a patch by Michal Hocko. Link: https://elixir.bootlin.com/glibc/glibc-2.35/source/nptl/allocatestack.c#L370 [1] Link: https://lkml.kernel.org/r/20220414144042.677008-1-npache@redhat.com Fixes: 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently") Signed-off-by: Joel Savitz <jsavitz@redhat.com> Signed-off-by: Nico Pache <npache@redhat.com> Co-developed-by: Joel Savitz <jsavitz@redhat.com> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Rafael Aquini <aquini@redhat.com> Cc: Waiman Long <longman@redhat.com> Cc: Herton R. Krzesinski <herton@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ben Segall <bsegall@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Daniel Bristot de Oliveira <bristot@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joel Savitz <jsavitz@redhat.com> Cc: Darren Hart <dvhart@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-27	mm, hugetlb: allow for "high" userspace addresses	Christophe Leroy
	commit 5f24d5a579d1eace79d505b148808a850b417d4c upstream. This is a fix for commit f6795053dac8 ("mm: mmap: Allow for "high" userspace addresses") for hugetlb. This patch adds support for "high" userspace addresses that are optionally supported on the system and have to be requested via a hint mechanism ("high" addr parameter to mmap). Architectures such as powerpc and x86 achieve this by making changes to their architectural versions of hugetlb_get_unmapped_area() function. However, arm64 uses the generic version of that function. So take into account arch_get_mmap_base() and arch_get_mmap_end() in hugetlb_get_unmapped_area(). To allow that, move those two macros out of mm/mmap.c into include/linux/sched/mm.h If these macros are not defined in architectural code then they default to (TASK_SIZE) and (base) so should not introduce any behavioural changes to architectures that do not define them. For the time being, only ARM64 is affected by this change. Catalin (ARM64) said "We should have fixed hugetlb_get_unmapped_area() as well when we added support for 52-bit VA. The reason for commit f6795053dac8 was to prevent normal mmap() from returning addresses above 48-bit by default as some user-space had hard assumptions about this. It's a slight ABI change if you do this for hugetlb_get_unmapped_area() but I doubt anyone would notice. It's more likely that the current behaviour would cause issues, so I'd rather have them consistent. Basically when arm64 gained support for 52-bit addresses we did not want user-space calling mmap() to suddenly get such high addresses, otherwise we could have inadvertently broken some programs (similar behaviour to x86 here). Hence we added commit f6795053dac8. But we missed hugetlbfs which could still get such high mmap() addresses. So in theory that's a potential regression that should have bee addressed at the same time as commit f6795053dac8 (and before arm64 enabled 52-bit addresses)" Link: https://lkml.kernel.org/r/ab847b6edb197bffdfe189e70fb4ac76bfe79e0d.1650033747.git.christophe.leroy@csgroup.eu Fixes: f6795053dac8 ("mm: mmap: Allow for "high" userspace addresses") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Steve Capper <steve.capper@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> [5.0.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-27	ipv6: make ip6_rt_gc_expire an atomic_t	Eric Dumazet
	[ Upstream commit 9cb7c013420f98fa6fd12fc6a5dc055170c108db ] Reads and Writes to ip6_rt_gc_expire always have been racy, as syzbot reported lately [1] There is a possible risk of under-flow, leading to unexpected high value passed to fib6_run_gc(), although I have not observed this in the field. Hosts hitting ip6_dst_gc() very hard are under pretty bad state anyway. [1] BUG: KCSAN: data-race in ip6_dst_gc / ip6_dst_gc read-write to 0xffff888102110744 of 4 bytes by task 13165 on cpu 1: ip6_dst_gc+0x1f3/0x220 net/ipv6/route.c:3311 dst_alloc+0x9b/0x160 net/core/dst.c:86 ip6_dst_alloc net/ipv6/route.c:344 [inline] icmp6_dst_alloc+0xb2/0x360 net/ipv6/route.c:3261 mld_sendpack+0x2b9/0x580 net/ipv6/mcast.c:1807 mld_send_cr net/ipv6/mcast.c:2119 [inline] mld_ifc_work+0x576/0x800 net/ipv6/mcast.c:2651 process_one_work+0x3d3/0x720 kernel/workqueue.c:2289 worker_thread+0x618/0xa70 kernel/workqueue.c:2436 kthread+0x1a9/0x1e0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 read-write to 0xffff888102110744 of 4 bytes by task 11607 on cpu 0: ip6_dst_gc+0x1f3/0x220 net/ipv6/route.c:3311 dst_alloc+0x9b/0x160 net/core/dst.c:86 ip6_dst_alloc net/ipv6/route.c:344 [inline] icmp6_dst_alloc+0xb2/0x360 net/ipv6/route.c:3261 mld_sendpack+0x2b9/0x580 net/ipv6/mcast.c:1807 mld_send_cr net/ipv6/mcast.c:2119 [inline] mld_ifc_work+0x576/0x800 net/ipv6/mcast.c:2651 process_one_work+0x3d3/0x720 kernel/workqueue.c:2289 worker_thread+0x618/0xa70 kernel/workqueue.c:2436 kthread+0x1a9/0x1e0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 value changed: 0x00000bb3 -> 0x00000ba9 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 11607 Comm: kworker/0:21 Not tainted 5.18.0-rc1-syzkaller-00037-g42e7a03d3bad-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: mld mld_ifc_work Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220413181333.649424-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-27	esp: limit skb_page_frag_refill use to a single page	Sabrina Dubroca
	[ Upstream commit 5bd8baab087dff657e05387aee802e70304cc813 ] Commit ebe48d368e97 ("esp: Fix possible buffer overflow in ESP transformation") tried to fix skb_page_frag_refill usage in ESP by capping allocsize to 32k, but that doesn't completely solve the issue, as skb_page_frag_refill may return a single page. If that happens, we will write out of bounds, despite the check introduced in the previous patch. This patch forces COW in cases where we would end up calling skb_page_frag_refill with a size larger than a page (first in esp_output_head with tailen, then in esp_output_tail with skb->data_len). Fixes: cac2661c53f3 ("esp4: Avoid skb_cow_data whenever possible") Fixes: 03e2a30f6a27 ("esp6: Avoid skb_cow_data whenever possible") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-27	etherdevice: Adjust ether_addr* prototypes to silence -Wstringop-overead	Kees Cook
	commit 2618a0dae09ef37728dab89ff60418cbe25ae6bd upstream. With GCC 12, -Wstringop-overread was warning about an implicit cast from char[6] to char[8]. However, the extra 2 bytes are always thrown away, alignment doesn't matter, and the risk of hitting the edge of unallocated memory has been accepted, so this prototype can just be converted to a regular char *. Silences: net/core/dev.c: In function ‘bpf_prog_run_generic_xdp’: net/core/dev.c:4618:21: warning: ‘ether_addr_equal_64bits’ reading 8 bytes from a region of size 6 [-Wstringop-overread] 4618 \| orig_host = ether_addr_equal_64bits(eth->h_dest, > skb->dev->dev_addr); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ net/core/dev.c:4618:21: note: referencing argument 1 of type ‘const u8[8]’ {aka ‘const unsigned char[8]’} net/core/dev.c:4618:21: note: referencing argument 2 of type ‘const u8[8]’ {aka ‘const unsigned char[8]’} In file included from net/core/dev.c:91: include/linux/etherdevice.h:375:20: note: in a call to function ‘ether_addr_equal_64bits’ 375 \| static inline bool ether_addr_equal_64bits(const u8 addr1[6+2], \| ^~~~~~~~~~~~~~~~~~~~~~~ Reported-by: Marc Kleine-Budde <mkl@pengutronix.de> Tested-by: Marc Kleine-Budde <mkl@pengutronix.de> Link: https://lore.kernel.org/netdev/20220212090811.uuzk6d76agw2vv73@pengutronix.de Cc: Jakub Kicinski <kuba@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Khem Raj <raj.khem@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20	ax25: fix reference count leaks of ax25_dev	Duoming Zhou
	commit 87563a043cef044fed5db7967a75741cc16ad2b1 upstream. The previous commit d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs") introduces refcount into ax25_dev, but there are reference leak paths in ax25_ctl_ioctl(), ax25_fwd_ioctl(), ax25_rt_add(), ax25_rt_del() and ax25_rt_opt(). This patch uses ax25_dev_put() and adjusts the position of ax25_addr_ax25dev() to fix reference cout leaks of ax25_dev. Fixes: d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20220203150811.42256-1-duoming@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> [OP: backport to 5.10: adjust context] Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20	ax25: add refcount in ax25_dev to avoid UAF bugs	Duoming Zhou
	commit d01ffb9eee4af165d83b08dd73ebdf9fe94a519b upstream. If we dereference ax25_dev after we call kfree(ax25_dev) in ax25_dev_device_down(), it will lead to concurrency UAF bugs. There are eight syscall functions suffer from UAF bugs, include ax25_bind(), ax25_release(), ax25_connect(), ax25_ioctl(), ax25_getname(), ax25_sendmsg(), ax25_getsockopt() and ax25_info_show(). One of the concurrency UAF can be shown as below: (USE) \| (FREE) \| ax25_device_event \| ax25_dev_device_down ax25_bind \| ... ... \| kfree(ax25_dev) ax25_fillin_cb() \| ... ax25_fillin_cb_from_dev() \| ... \| The root cause of UAF bugs is that kfree(ax25_dev) in ax25_dev_device_down() is not protected by any locks. When ax25_dev, which there are still pointers point to, is released, the concurrency UAF bug will happen. This patch introduces refcount into ax25_dev in order to guarantee that there are no pointers point to it when ax25_dev is released. Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net> [OP: backport to 5.10: adjusted context] Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20	tlb: hugetlb: Add more sizes to tlb_remove_huge_tlb_entry	Steve Capper
	[ Upstream commit 697a1d44af8ba0477ee729e632f4ade37999249a ] tlb_remove_huge_tlb_entry only considers PMD_SIZE and PUD_SIZE when updating the mmu_gather structure. Unfortunately on arm64 there are two additional huge page sizes that need to be covered: CONT_PTE_SIZE and CONT_PMD_SIZE. Where an end-user attempts to employ contiguous huge pages, a VM_BUG_ON can be experienced due to the fact that the tlb structure hasn't been correctly updated by the relevant tlb_flush_p.._range() call from tlb_remove_huge_tlb_entry. This patch adds inequality logic to the generic implementation of tlb_remove_huge_tlb_entry s.t. CONT_PTE_SIZE and CONT_PMD_SIZE are effectively covered on arm64. Also, as well as ptes, pmds and puds; p4ds are now considered too. Reported-by: David Hildenbrand <david@redhat.com> Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/linux-mm/811c5c8e-b3a2-85d2-049c-717f17c3a03a@redhat.com/ Signed-off-by: Steve Capper <steve.capper@arm.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220330112543.863-1-steve.capper@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-20	scsi: iscsi: Fix conn cleanup and stop race during iscsid restart	Mike Christie
	[ Upstream commit 7c6e99c18167ed89729bf167ccb4a7e3ab3115ba ] If iscsid is doing a stop_conn at the same time the kernel is starting error recovery we can hit a race that allows the cleanup work to run on a valid connection. In the race, iscsi_if_stop_conn sees the cleanup bit set, but it calls flush_work on the clean_work before iscsi_conn_error_event has queued it. The flush then returns before the queueing and so the cleanup_work can run later and disconnect/stop a conn while it's in a connected state. The patch: Commit 0ab710458da1 ("scsi: iscsi: Perform connection failure entirely in kernel space") added the late stop_conn call bug originally, and the patch: Commit 23d6fefbb3f6 ("scsi: iscsi: Fix in-kernel conn failure handling") attempted to fix it but only fixed the normal EH case and left the above race for the iscsid restart case. For the normal EH case we don't hit the race because we only signal userspace to start recovery after we have done the queueing, so the flush will always catch the queued work or see it completed. For iscsid restart cases like boot, we can hit the race because iscsid will call down to the kernel before the kernel has signaled any error, so both code paths can be running at the same time. This adds a lock around the setting of the cleanup bit and queueing so they happen together. Link: https://lore.kernel.org/r/20220408001314.5014-6-michael.christie@oracle.com Fixes: 0ab710458da1 ("scsi: iscsi: Perform connection failure entirely in kernel space") Tested-by: Manish Rangankar <mrangankar@marvell.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-20	scsi: iscsi: Fix in-kernel conn failure handling	Mike Christie
	[ Upstream commit 23d6fefbb3f6b1cc29794427588b470ed06ff64e ] Commit 0ab710458da1 ("scsi: iscsi: Perform connection failure entirely in kernel space") has the following regressions/bugs that this patch fixes: 1. It can return cmds to upper layers like dm-multipath where that can retry them. After they are successful the fs/app can send new I/O to the same sectors, but we've left the cmds running in FW or in the net layer. We need to be calling ep_disconnect if userspace is not up. This patch only fixes the issue for offload drivers. iscsi_tcp will be fixed in separate commit because it doesn't have a ep_disconnect call. 2. The drivers that implement ep_disconnect expect that it's called before conn_stop. Besides crashes, if the cleanup_task callout is called before ep_disconnect it might free up driver/card resources for session1 then they could be allocated for session2. But because the driver's ep_disconnect is not called it has not cleaned up the firmware so the card is still using the resources for the original cmd. 3. The stop_conn_work_fn can run after userspace has done its recovery and we are happily using the session. We will then end up with various bugs depending on what is going on at the time. We may also run stop_conn_work_fn late after userspace has called stop_conn and ep_disconnect and is now going to call start/bind conn. If stop_conn_work_fn runs after bind but before start, we would leave the conn in a unbound but sort of started state where IO might be allowed even though the drivers have been set in a state where they no longer expect I/O. 4. Returning -EAGAIN in iscsi_if_destroy_conn if we haven't yet run the in kernel stop_conn function is breaking userspace. We should have been doing this for the caller. Link: https://lore.kernel.org/r/20210525181821.7617-8-michael.christie@oracle.com Fixes: 0ab710458da1 ("scsi: iscsi: Perform connection failure entirely in kernel space") Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-20	scsi: iscsi: Rel ref after iscsi_lookup_endpoint()	Mike Christie
	[ Upstream commit 9e5fe1700896c85040943fdc0d3fee0dd3e0d36f ] Subsequent commits allow the kernel to do ep_disconnect. In that case we will have to get a proper refcount on the ep so one thread does not delete it from under another. Link: https://lore.kernel.org/r/20210525181821.7617-7-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-20	scsi: iscsi: Stop queueing during ep_disconnect	Mike Christie
	[ Upstream commit 891e2639deae721dc43764a44fa255890dc34313 ] During ep_disconnect we have been doing iscsi_suspend_tx/queue to block new I/O but every driver except cxgbi and iscsi_tcp can still get I/O from __iscsi_conn_send_pdu() if we haven't called iscsi_conn_failure() before ep_disconnect. This could happen if we were terminating the session, and the logout timed out before it was even sent to libiscsi. Fix the issue by adding a helper which reverses the bind_conn call that allows new I/O to be queued. Drivers implementing ep_disconnect can use this to make sure new I/O is not queued to them when handling the disconnect. Link: https://lore.kernel.org/r/20210525181821.7617-3-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-20	net/sched: flower: fix parsing of ethertype following VLAN header	Vlad Buslov
	[ Upstream commit 2105f700b53c24aa48b65c15652acc386044d26a ] A tc flower filter matching TCA_FLOWER_KEY_VLAN_ETH_TYPE is expected to match the L2 ethertype following the first VLAN header, as confirmed by linked discussion with the maintainer. However, such rule also matches packets that have additional second VLAN header, even though filter has both eth_type and vlan_ethtype set to "ipv4". Looking at the code this seems to be mostly an artifact of the way flower uses flow dissector. First, even though looking at the uAPI eth_type and vlan_ethtype appear like a distinct fields, in flower they are all mapped to the same key->basic.n_proto. Second, flow dissector skips following VLAN header as no keys for FLOW_DISSECTOR_KEY_CVLAN are set and eventually assigns the value of n_proto to last parsed header. With these, such filters ignore any headers present between first VLAN header and first "non magic" header (ipv4 in this case) that doesn't result FLOW_DISSECT_RET_PROTO_AGAIN. Fix the issue by extending flow dissector VLAN key structure with new 'vlan_eth_type' field that matches first ethertype following previously parsed VLAN header. Modify flower classifier to set the new flow_dissector_key_vlan->vlan_eth_type with value obtained from TCA_FLOWER_KEY_VLAN_ETH_TYPE/TCA_FLOWER_KEY_CVLAN_ETH_TYPE uAPIs. Link: https://lore.kernel.org/all/Yjhgi48BpTGh6dig@nanopsycho/ Fixes: 9399ae9a6cb2 ("net_sched: flower: Add vlan support") Fixes: d64efd0926ba ("net/sched: flower: Add supprt for matching on QinQ vlan headers") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>