701-net-0367-net-mscc-ocelot-Workaround-to-allow-traffic-to-CPU-i.patch 6.6 KB

From 937bf9496489cb4b491e75fe4436348bf3454dcd Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <[email protected]>
Date: Sat, 21 Dec 2019 23:19:20 +0200
Subject: [PATCH] net: mscc: ocelot: Workaround to allow traffic to CPU in
 standalone mode

The Ocelot switches have what is, in my opinion, a design flaw: their
DSA header is in front of the Ethernet header, which means that they
subvert the DSA master's RX filter. For all practical purposes, either
the DSA master needs to be in promiscuous mode, or the
OCELOT_TAG_PREFIX_LONG needs to be used for extraction, which makes the
switch add a fake DMAC of ff:ff:ff:ff:ff:ff so that the DSA master
accepts the frame.

The issue with this design, of course, is that the CPU will be spammed
with frames that it doesn't want to respond to, and there isn't any
hardware offload in place by default to drop them.

What is being done in the VSC7514 Ocelot driver is a process of
selective whitelisting. The "MAC address" of each Ocelot switch net
device, with all VLANs installed on that port, is being added as an FDB
entry towards PGID_CPU.

Some background first: Port Group IDs (PGIDs) are masks of destination
ports. The switch performs 3 lookups in the PGID table for each frame,
and forwards the frame to the ports that are present in the logical AND
of all 3 PGIDs (for the most part, see below).

The first PGID lookup is for the destination masks and the PGID table is
indexed by the DEST_IDX field from the MAC table (FDB).
The PGID can be a unicast set: PGIDs 0-11 are the per-port PGIDs, and
by convention PGID i has only BIT(i) set, aka only this port is set in
the destination mask.
Or the PGID can be a multicast set: PGIDs 12-63 can (again, still by
convention) hold a richer destination mask composed of multiple ports.

[ Ignoring the second PGID lookup, for aggregation, since it doesn't
  interfere. ]

The third PGID lookup is for source masks: PGID entries 80-91 answer the
question: is port i allowed to forward traffic to port j? If yes, then
BIT(j) of PGID 80+i will be found set.

What is interesting about the CPU port in this whole story is that, in
the way the driver sets up the PGIDs, its bit isn't set in any source
mask PGID of any other port (therefore, the third lookup would always
decide to exclude the CPU port from this list). So frames are never
_forwarded_ to the CPU.
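
To make the lookups above concrete, here is a minimal software model of
the first and third PGID lookups. It is only a sketch: the sketch_*
names and the table size are hypothetical, the aggregation lookup is
left out (as in the text above), and only the PGID ranges 0-11 and
80-91 are taken from this description.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model constants, chosen to match the PGID ranges quoted
 * in the commit message (0-11 per-port, 80-91 source masks).
 */
#define SKETCH_NUM_PORTS	12
#define SKETCH_PGID_SRC		80
#define SKETCH_NUM_PGIDS	(SKETCH_PGID_SRC + SKETCH_NUM_PORTS)

struct sketch_switch {
	uint32_t pgid[SKETCH_NUM_PGIDS];	/* one port mask per PGID */
};

/* Clear the table, then apply the unicast convention: PGID i = BIT(i). */
static void sketch_init(struct sketch_switch *sw)
{
	for (int i = 0; i < SKETCH_NUM_PGIDS; i++)
		sw->pgid[i] = 0;
	for (int i = 0; i < SKETCH_NUM_PORTS; i++)
		sw->pgid[i] = 1u << i;
}

/* Logical AND of the destination mask (first lookup, indexed by
 * DEST_IDX) and the source mask of the ingress port (third lookup).
 * The aggregation lookup is deliberately not modelled.
 */
static uint32_t sketch_forward_mask(const struct sketch_switch *sw,
				    unsigned int dest_idx,
				    unsigned int src_port)
{
	assert(dest_idx < SKETCH_NUM_PGIDS);
	assert(src_port < SKETCH_NUM_PORTS);

	return sw->pgid[dest_idx] & sw->pgid[SKETCH_PGID_SRC + src_port];
}
```

In this model, a port whose bit is absent from every source mask (which
is how the text above describes the CPU port) can never show up in the
value returned by sketch_forward_mask().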

There is a loophole in this PGID mechanism which is described in the
VSC7514 manual:

    If an entry is found in the MAC table entry of ENTRY_TYPE 0 or 1
    and the CPU port is set in the PGID pointed to by the MAC table
    entry, CPU extraction queue PGID.DST_PGID is added to the CPUQ.

In other words, the CPU port is special, and frames are "copied" to the
CPU, disregarding the source masks (third PGID lookup), if BIT(cpu) is
found to be set in the destination masks (first PGID lookup).

Now back to the story: what is PGID_CPU? It is a multicast set
containing only BIT(cpu). I don't know why it was chosen to be a
multicast PGID (59) and not simply the unicast one of this port, but it
doesn't matter.

The point is that frames that match the FDB will go to PGID_CPU by
virtue of the DEST_IDX from the respective MAC table entry, and frames
that don't will go to PGID_UC or PGID_MC, by virtue of the FLD_UNICAST,
FLD_BROADCAST etc. settings for flooding. And that is where the
distinction is made: flooded frames will be subject to the third PGID
lookup, while frames that are whitelisted to the PGID_CPU by the MAC
table aren't.
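
The distinction between whitelisted and flooded frames can be sketched
as follows. This is a simplified model of the behaviour described
above, not the hardware algorithm: the cpuq_* names are hypothetical,
the CPU port index is an assumption passed in by the caller, and the
flooding mask is treated as an opaque destination mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: does a frame reach the CPU?
 *
 * fdb_hit:   the DMAC matched a MAC table entry
 * dest_mask: PGID from the first lookup (DEST_IDX on a hit, the
 *            flooding PGID otherwise)
 * src_mask:  source mask PGID of the ingress port (third lookup)
 * cpu_port:  index of the CPU port (assumption, board specific)
 */
static bool cpuq_frame_reaches_cpu(bool fdb_hit, uint32_t dest_mask,
				   uint32_t src_mask, unsigned int cpu_port)
{
	uint32_t cpu_bit;

	assert(cpu_port < 32);
	cpu_bit = 1u << cpu_port;

	/* Loophole: a MAC table hit pointing to a PGID that contains the
	 * CPU port copies the frame to the CPU, source mask not consulted.
	 */
	if (fdb_hit && (dest_mask & cpu_bit))
		return true;

	/* Otherwise (e.g. flooding) the source mask still applies. */
	return (dest_mask & src_mask & cpu_bit) != 0;
}
```

With a source mask that lacks BIT(cpu), only the FDB-hit path returns
true in this model, which is the RX-filter-like behaviour described
above.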

So we can use this mechanism to simulate an RX filter, given that we are
subverting the DSA master's implicit one, as mentioned in the first
paragraph. But this has some limitations:

- In Ocelot each net device has its own MAC address. When simulating
  this with MAC table entries, it will practically result in having N
  MAC addresses for each of the N front-panel ports (because FDB entries
  are not per source port). A bit strange, I think.

- In DSA we don't have the infrastructure in place to support this
  whitelisting mechanism. Calling .port_fdb_add on the CPU port for each
  slave net device dev_addr isn't, in itself, hard. The problem is with
  the VLANs that this port is part of. We would need to keep a duplicate
  list of the VLANs from the bridge, plus the ones added from 8021q, for
  each port. And we would need reference counting on each MAC address,
  such that when a front-panel port changes its MAC address and we need
  to delete the old FDB entry, we don't actually delete it if the other
  front-panel ports are still using it. Not to mention that this FDB
  entry would have to be added on the whole net of upstream DSA switches.

So... it's complicated. What this patch does is to simply allow frames
to be flooded to the CPU, which is anyway what the Ocelot driver is
doing after removing the bridge from the net devices, see this snippet
from ocelot_bridge_stp_state_set:

	/* Apply FWD mask. The loop is needed to add/remove the current port as
	 * a source for the other ports.
	 */
	for (p = 0; p < ocelot->num_phys_ports; p++) {
		if (p == ocelot->cpu || (ocelot->bridge_fwd_mask & BIT(p))) {
			(...)
		} else {
			/* Only the CPU port, this is compatible with link
			 * aggregation.
			 */
			ocelot_write_rix(ocelot,
					 BIT(ocelot->cpu),
					 ANA_PGID_PGID, PGID_SRC + p);
		}
	}

In other words, the ocelot driver itself is already not self-coherent,
since immediately after probe time, and immediately after removal from a
bridge, it behaves in different ways, although the front panel ports are
standalone in both cases.

While standalone traffic _does_ work for the Felix DSA wrapper after
enslaving and removing the ports from a bridge, this patch makes
standalone traffic work at probe time too, with the caveat that even
irrelevant frames will get processed by software, making it more
susceptible to denial of service.
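
As an illustration of the resulting source masks, the loop added by
this patch can be modelled on a plain array instead of the
ANA_PGID_PGID registers. This is a sketch only: the srcmask_* helpers
are hypothetical stand-ins, and the read-modify-write semantics
(clear the masked bits, then set the masked bits of the value) are an
assumption about how ocelot_rmw_rix behaves.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a register read-modify-write:
 * bits in 'mask' are replaced by the corresponding bits of 'val'.
 */
static void srcmask_rmw(uint32_t *reg, uint32_t val, uint32_t mask)
{
	*reg = (*reg & ~mask) | (val & mask);
}

/* Model of the added loop: drop the old CPU port from every source
 * mask, then let every port except the new CPU port itself forward
 * to the new CPU port.
 */
static void srcmask_set_cpu_port(uint32_t *src_mask, unsigned int num_ports,
				 unsigned int old_cpu, unsigned int new_cpu)
{
	assert(old_cpu < 32 && new_cpu < 32);

	for (unsigned int port = 0; port < num_ports; port++) {
		/* Disable old CPU port and enable new one */
		srcmask_rmw(&src_mask[port], 0, 1u << old_cpu);
		if (port == new_cpu)
			continue;
		srcmask_rmw(&src_mask[port], 1u << new_cpu, 1u << new_cpu);
	}
}
```

After the call, BIT(new_cpu) is set in the source mask of every other
port, so flooded frames pass the third PGID lookup towards the CPU.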

Signed-off-by: Vladimir Oltean <[email protected]>
---
 drivers/net/ethernet/mscc/ocelot.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

--- a/drivers/net/ethernet/mscc/ocelot.c
+++ b/drivers/net/ethernet/mscc/ocelot.c
@@ -2294,6 +2294,18 @@ void ocelot_set_cpu_port(struct ocelot *
 			 enum ocelot_tag_prefix injection,
 			 enum ocelot_tag_prefix extraction)
 {
+	int port;
+
+	for (port = 0; port < ocelot->num_phys_ports; port++) {
+		/* Disable old CPU port and enable new one */
+		ocelot_rmw_rix(ocelot, 0, BIT(ocelot->cpu),
+			       ANA_PGID_PGID, PGID_SRC + port);
+		if (port == cpu)
+			continue;
+		ocelot_rmw_rix(ocelot, BIT(cpu), BIT(cpu),
+			       ANA_PGID_PGID, PGID_SRC + port);
+	}
+
 	/* Configure and enable the CPU port. */
 	ocelot_write_rix(ocelot, 0, ANA_PGID_PGID, cpu);
 	ocelot_write_rix(ocelot, BIT(cpu), ANA_PGID_PGID, PGID_CPU);