611-01-v6.11-udp-Allow-GSO-transmit-from-devices-with-no-checksum.patch 3.4 KB

From: Jakub Sitnicki <[email protected]>
Date: Wed, 26 Jun 2024 19:51:26 +0200
Subject: [PATCH] udp: Allow GSO transmit from devices with no checksum offload

Today sending a UDP GSO packet from a TUN device results in an EIO error:

  import fcntl, os, struct
  from socket import *
  TUNSETIFF = 0x400454CA
  IFF_TUN = 0x0001
  IFF_NO_PI = 0x1000
  UDP_SEGMENT = 103
  tun_fd = os.open("/dev/net/tun", os.O_RDWR)
  ifr = struct.pack("16sH", b"tun0", IFF_TUN | IFF_NO_PI)
  fcntl.ioctl(tun_fd, TUNSETIFF, ifr)
  os.system("ip addr add 192.0.2.1/24 dev tun0")
  os.system("ip link set dev tun0 up")
  s = socket(AF_INET, SOCK_DGRAM)
  s.setsockopt(SOL_UDP, UDP_SEGMENT, 1200)
  s.sendto(b"x" * 3000, ("192.0.2.2", 9)) # EIO

This is due to a check in the UDP stack that requires the egress device to
offer checksum offload. TUN/TAP devices, by default, don't advertise this
capability because it requires support from the TUN/TAP reader.

However, the GSO stack has a software fallback for checksum calculation,
which we can use. This way we don't force UDP_SEGMENT users to handle the
EIO error and implement a segmentation fallback.
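
For context, the kind of userspace fallback that UDP_SEGMENT users would otherwise have to carry might look like the sketch below. It is illustrative only and not part of this patch: `split_segments` and `send_gso_with_fallback` are hypothetical helper names, and the sketch assumes the segment size is passed per call as a `UDP_SEGMENT` control message (a 16-bit value) rather than as a socket option, so that the plain per-segment sends in the fallback path are not treated as GSO sends.

```python
import errno
import socket
import struct

UDP_SEGMENT = 103  # from linux/udp.h; same value as in the reproducer above


def split_segments(payload: bytes, gso_size: int) -> list:
    """Split payload into gso_size-byte chunks; the last one may be shorter."""
    if gso_size <= 0:
        raise ValueError("gso_size must be positive")
    return [payload[i:i + gso_size] for i in range(0, len(payload), gso_size)]


def send_gso_with_fallback(sock, payload: bytes, addr, gso_size: int) -> int:
    """Try one batched GSO send; on EIO, send each segment separately.

    Returns the number of send calls that were needed.
    """
    # IPPROTO_UDP has the same numeric value as SOL_UDP (17).
    cmsg = [(socket.IPPROTO_UDP, UDP_SEGMENT, struct.pack("H", gso_size))]
    try:
        sock.sendmsg([payload], cmsg, 0, addr)
        return 1
    except OSError as exc:
        if exc.errno != errno.EIO:
            raise
    # Hypothetical fallback: segment in userspace, one datagram per segment.
    segments = split_segments(payload, gso_size)
    for seg in segments:
        sock.sendto(seg, addr)
    return len(segments)
```

With the 3000-byte payload and 1200-byte segment size from the reproducer, the fallback path sends three datagrams of 1200, 1200 and 600 bytes.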

Lift the restriction so that UDP_SEGMENT can be used with any egress
device. We also need to adjust the UDP GSO code to match the GSO stack
expectation about ip_summed field, as set in commit 8d63bee643f1 ("net:
avoid skb_warn_bad_offload false positives on UFO"). Otherwise we will hit
the bad offload check.

Users should, however, expect a potential performance impact when
batch-sending packets with UDP_SEGMENT without checksum offload on the
egress device. In such a case the packet payload is read twice: first
during the sendmsg syscall when copying data from user memory, and then in
the GSO stack for checksum computation. This double memory read can be less
efficient than a regular sendmsg where the checksum is calculated during
the initial data copy from user memory.
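
To make the second pass concrete, the sketch below shows the kind of 16-bit ones' complement sum (RFC 1071 style) that a software checksum has to compute by reading every payload byte again. It is a simplified illustration, not the kernel's implementation: `internet_checksum` and `udp_wire_checksum` are hypothetical names, and the UDP pseudo-header is left out. The zero-to-0xFFFF substitution mirrors the role of CSUM_MANGLED_0 in the udp_offload.c hunk of this patch.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit ones' complement checksum of data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    # This loop is the extra read of the payload described above.
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def udp_wire_checksum(csum: int) -> int:
    """UDP sends a computed checksum of 0 as 0xFFFF; 0 means 'no checksum'."""
    return csum or 0xFFFF
```

For the example bytes `00 01 f2 03 f4 f5 f6 f7` from RFC 1071, `internet_checksum` returns `0x220d`.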

Signed-off-by: Jakub Sitnicki <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
---
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -942,8 +942,7 @@ static int udp_send_skb(struct sk_buff *
 			kfree_skb(skb);
 			return -EINVAL;
 		}
-		if (skb->ip_summed != CHECKSUM_PARTIAL || is_udplite ||
-		    dst_xfrm(skb_dst(skb))) {
+		if (is_udplite || dst_xfrm(skb_dst(skb))) {
 			kfree_skb(skb);
 			return -EIO;
 		}
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -361,6 +361,14 @@ struct sk_buff *__udp_gso_segment(struct
 	else
 		uh->check = gso_make_checksum(seg, ~check) ? : CSUM_MANGLED_0;
 
+	/* On the TX path, CHECKSUM_NONE and CHECKSUM_UNNECESSARY have the same
+	 * meaning. However, check for bad offloads in the GSO stack expects the
+	 * latter, if the checksum was calculated in software. To vouch for the
+	 * segment skbs we actually need to set it on the gso_skb.
+	 */
+	if (gso_skb->ip_summed == CHECKSUM_NONE)
+		gso_skb->ip_summed = CHECKSUM_UNNECESSARY;
+
 	/* update refcount for the packet */
 	if (copy_dtor) {
 		int delta = sum_truesize - gso_skb->truesize;
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1261,8 +1261,7 @@ static int udp_v6_send_skb(struct sk_buf
 			kfree_skb(skb);
 			return -EINVAL;
 		}
-		if (skb->ip_summed != CHECKSUM_PARTIAL || is_udplite ||
-		    dst_xfrm(skb_dst(skb))) {
+		if (is_udplite || dst_xfrm(skb_dst(skb))) {
 			kfree_skb(skb);
 			return -EIO;
 		}