Hi all and hi Nemael!
I think we need to better define the GSoC project on cable purpose
auto-detection. In my opinion, during the meeting the goal was
broadened and became confusing (at least for me).
During the meeting Gio mentioned that faulty commercial
routers could be detected and a notification in the lime-web interface
could appear.
But to implement this we need more information:
* what exactly should be detected?
* why?
Can anyone share some more situations where any kind of detection
could be useful, according to their experience/opinion?
I would propose splitting the project into small, well-defined tasks,
as independent as possible.
I would suggest starting from:
0) gathering needs from the communities (share your thoughts please!!!);
1) small lua or bash scripts for detecting specific things about an
ethernet port, things that can be useful for the identified
needs;
then, if there is enough time:
2) a small lua script that creates an interface-specific configuration
for a few identified cases, allowing the user to modify or disable the
resulting automatic configuration (e.g. like what the
lime-hwd-openwrt-wan package does)
3) integrate each detection script with the corresponding
configuration script and create a new "lime-hwd-" package like
lime-hwd-ground-routing.
https://github.com/libremesh/lime-packages/blob/master/packages/lime-hwd-ground-routing/files/usr/lib/lua/lime/hwd/ground_routing.lua
It could be something like lime-hwd-autodetect-ethernet-mesh or
lime-hwd-autodetect-ethernet-client
so that they will run and configure stuff every time lime-config is
run, but only if this module is explicitly activated in
/etc/config/lime-* files (e.g. lime-hwd-ground-routing does not do
anything unless a hwd_gr section is found).
Instead of making packages, we can also have the files as lime-assets.
lime-assets can be run at the first boot or every time lime-config is
executed. Some documentation here:
https://github.com/libremesh/lime-packages/issues/719
and implemented here:
https://github.com/libremesh/lime-packages/blob/master/packages/lime-system/files/usr/lib/lua/lime/generic_config.lua
4) for the modules where it makes sense, have them run when an
interface goes up or down (this was suggested by Javier). OpenWrt has a
very useful mechanism for this (hotplug):
https://openwrt.org/docs/guide-user/base-system/hotplug#iface
5) for the modules where it makes sense, have a system for running
them continuously (e.g. we already have a script for checking whether
the internet connection is working: babeld-auto-gw-mode, which is just
a rule for watchping. Watchping continuously pings an IP on the
internet and executes the rule when the ping fails/succeeds:
https://github.com/libremesh/lime-packages/tree/master/packages/babeld-auto-gw-mode/files/etc/watchping
)
6) share the information found via shared-state (this was proposed
during the meeting but I am not sure why we should do this; maybe it is
required for lime-app integration?)
7) Integrate the configuration modules from point 2 with the lime-app. Similarly to what LuCI (the web interface from OpenWrt) does, lime-app could allow the user to manually configure some interfaces for some common uses.
8) Integrate the auto-detection modules from point 1 with lime-app notifications. From Gio during the meeting:
"shared-state has a reference state, so lime-app can show what changed
and show a notification saying it is broken now. The goal is to make
troubleshooting easier for the user. The configuration of batman-adv
vs client connected to ethernet is useful but could be suggested to
the user from the lime-app, not necessarily applied automatically.
Troubleshooting is very important, and this could help."
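For point 4, a minimal sketch of what a hotplug hook could look like,
assuming it is installed as a file under /etc/hotplug.d/iface/. The
file name 90-lime-autodetect and the helper /usr/bin/lime-port-detect
are hypothetical (they do not exist in lime-packages); ACTION,
INTERFACE and DEVICE are the variables OpenWrt passes to iface hotplug
scripts.

```shell
# Hypothetical /etc/hotplug.d/iface/90-lime-autodetect
# OpenWrt sets ACTION (ifup, ifdown, ...), INTERFACE (logical name,
# e.g. "lan") and DEVICE (Linux device, e.g. "eth0.1") for this script.

# Hypothetical detection helper, to be replaced by the real script.
DETECT_CMD="${DETECT_CMD:-/usr/bin/lime-port-detect}"

handle_iface_event() {
    # $1 = action, $2 = logical interface, $3 = Linux device
    case "$1" in
        ifup)
            # Without a device name there is nothing to probe.
            [ -n "$3" ] || return 0
            "$DETECT_CMD" "$2" "$3"
            ;;
        ifdown)
            logger -t lime-autodetect "interface $2 went down"
            ;;
    esac
}

handle_iface_event "$ACTION" "$INTERFACE" "$DEVICE"
```

Keeping the logic in a function with the helper behind DETECT_CMD makes
it possible to try the hook by hand, e.g. with DETECT_CMD=echo.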
About point 0 we need your opinion!
i) Gio suggested detecting misconfigured commercial routers to help
debug network problems in large networks. I am not sure which kind of
fault should be detected...? We already have the
watchping+babeld-auto-gw-mode combo that detects if the advertised
internet connection does not work. Should we reimplement this inside
this GSoC's framework? Is there anything else we could look for in
commercial routers?
ii) an ethernet port used for meshing (connected to other LibreMesh
routers) should not be included in the br-lan bridge, to avoid
possible loops (unsure if Batman-adv can really manage all the loops,
but it throws a creepy error message). There is a bit of old
discussion from this comment on:
https://github.com/libremesh/lime-packages/issues/56#issuecomment-637598835
and during BattleMesh v15 in 2023 we configured that manually.
iii) from Gio during the meeting: "we can also inform the user
about what changed on their network"
iv) cut the broadcast traffic between two connected clouds with
different ap_name (the parameter that defines which nodes route
together with batman-adv: when the ap_name is different, batman-adv is
separated between the two networks and Babeld does the layer 3
routing). Currently some people (from what Nico Pace wrote years ago)
are using cabled WAN-WAN connections for this. Clearly, it would be
much more elegant to do it LAN-LAN (so that you can use the WAN port
for its internet-gateway function) and still prevent all the broadcast
traffic from going to the other cloud.
v) ???
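For case ii), a minimal sketch of how a script could check whether a
device is currently a port of br-lan. It relies on the
/sys/class/net/<device>/master symlink that Linux creates for enslaved
devices; the function names and the SYSFS_NET override (only there to
allow testing against a fake directory) are my own assumptions.

```shell
# Root of the sysfs network tree; overridable for testing.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Print the name of the master device (e.g. a bridge) that device $1
# is enslaved to. Returns 1 and prints nothing if it has no master.
bridge_of() {
    [ -L "$SYSFS_NET/$1/master" ] || return 1
    basename "$(readlink "$SYSFS_NET/$1/master")"
}

# Return 0 if device $1 is a port of bridge $2 (default: br-lan).
in_bridge() {
    [ "$(bridge_of "$1")" = "${2:-br-lan}" ]
}
```

Usage would be something like:
in_bridge eth0.1 && echo "eth0.1 is still in br-lan"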
Starting to share ideas about the point 1 (please share your ideas!):
a) As Gio suggested during the meeting, we could use a small piece of
shared-state which is the tool for detecting the neighbors. If any
neighbor is found, we can assume there is a LibreMesh device there. We
can be sure about that until shared-state is moved to the OpenWrt
repositories (devs, are there any plans for this migration?).
The script is this one:
https://github.com/libremesh/lime-packages/blob/master/packages/shared-state/files/usr/bin/shared-state-get_candidates_neigh
and it works with ping6 to the IPv6 link-local all-nodes multicast
address.
a2) we could do the same using ping6 link local but without using the
shared-state thing
a3) we should start detecting LibreMesh nodes using the output from
the routing protocols, as they already run their detection
doubt: in the routers with swconfig (like all the ath79 ones, which
still have to be supported by DSA), usually all the LAN ports are
inside the same interface, right? Like the eth0.1 on TP-Link WDR3600.
In this case, can we detect and configure a single port (not all of
them at the same time)? Is it something like what
lime-hwd-ground-routing does (it requires the user to specify the CPU
port, which allows it to identify a specific ethernet port inside the
LAN interface)?
b) We could detect if the WAN ethernet port has a DHCP server connected
b2) if this DHCP server offers a default route
b3) one way to do this would be what Gio suggested in the meeting:
modify the DHCP client of OpenWrt so that it shows the received
information without applying it (like the -T option of dhcpcd, see an
example of output in the meeting minutes of the 5th of June 2024). Like
everything that could be accepted in the OpenWrt repositories, this
should go to OpenWrt as soon as it is good enough.
b4) if this default route is a working internet connection (but
Watchping already works for this...)
c) when another LibreMesh router is detected, check if it is part of
the same cloud (for example if babeld can see the other node but
batman-adv cannot see it)
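For idea a2), a minimal sketch of neighbor detection with plain ping6,
without shared-state. The function names are hypothetical, and it is an
assumption that the installed ping6 accepts the "ff02::1%device" form
and prints reply lines containing "bytes from <address>:". The node's
own address may be among the replies and would need to be filtered out.

```shell
# Read ping6 output on stdin and print each link-local source address
# once, in order of first appearance.
parse_ping6_neighbors() {
    awk '/bytes from fe80:/ {
        addr = $4
        sub(/:$/, "", addr)      # drop the colon that follows the address
        sub(/%.*$/, "", addr)    # drop a "%device" zone suffix if present
        if (!(addr in seen)) { seen[addr] = 1; print addr }
    }'
}

# List the link-local addresses answering on device $1.
# Assumption: ping6 supports -c COUNT and the "%device" zone syntax.
list_neighbors() {
    ping6 -c 2 "ff02::1%$1" 2>/dev/null | parse_ping6_neighbors
}
```

Usage would be something like: list_neighbors eth0.1
Separating the parsing from the ping6 call makes it possible to test
the parser against saved output from different ping6 implementations.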
Ciao!
Ilario