Author: Ilario
Date:  
To: LibreMesh, gothos
Subject: Re: [lime] Results on distance setting influence on performances
TL;DR:
The links that we analysed don't represent poor quality links, where the
distance parameter is expected to have a stronger impact.
Yet, I would say that shipping LibreMesh releases with a default
distance set to 10 km for 5 GHz radios is OK.
Plenty of other weird and interesting-but-unrelated things can be
observed from the data.



Thanks Sam for the data!
As requested by SAn, here is a first attempt at an analysis...
But I have no idea what I am writing about; someone more experienced
should have a look at this!
My comments are prefixed with "***".

Some background info is available here:
https://github.com/libremesh/lime-packages/issues/201

1) summary of the data from Sam's last email:

1a) link with real distance 1200 m:

signal -63 dBm in one direction and -64 dBm in the other

*** the connection has a moderately good signal, maybe the link is not
bad enough to see the effects of the distance setting?


judging from the metric in the BMX6 output, it seems that both nodes
reach each other directly (without intermediate hops), at least at the
time the lime-report command was issued

*** good


the packet loss can be estimated from these lines, in one direction:
    tx packets:    400408
    tx retries:    18313 (4.6 %)
    tx failed:    20
and in the other direction:
    tx packets:    150349
    tx retries:    21877 (14.6 %)
    tx failed:    0


*** this link is a bit lossy, hopefully the effects of different
distance settings will be visible...?


1b) link with real distance 2100 m:

signal -51 dBm in one direction and -55 dBm in the other

*** the connection has an extremely good signal, maybe the link is not
bad enough to see the effects of the distance setting?


judging from the metric in the BMX6 output, it seems that both nodes
reach each other directly (without intermediate hops), at least at the
time the lime-report command was issued

*** good


the packet loss can be estimated from these lines, in one direction:
    tx packets:    1920948
    tx retries:    58679 (3 %)
    tx failed:    7
and in the other direction:
    tx packets:    2864851
    tx retries:    139427 (4.9 %)
    tx failed:    8


*** maybe this link is not failing enough to see the effects of
different distance settings?
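
The retry percentages quoted for both links can be reproduced from the
counters with a few lines of Python. This is only a sketch: the function
name and the labels are made up for illustration, and the retry ratio is
treated as a rough proxy for link quality rather than as a true packet
loss figure (the "tx failed" counter is, as far as I understand, the one
that counts frames that were never acknowledged).

```python
def retry_percent(tx_packets: int, tx_retries: int) -> float:
    """Return tx retries as a percentage of tx packets."""
    if tx_packets <= 0:
        raise ValueError("tx_packets must be positive")
    return 100.0 * tx_retries / tx_packets


# Counters copied from the excerpts above.
# The labels are illustrative; the tuple layout is
# (label, tx packets, tx retries, tx failed).
links = [
    ("1200 m, one direction", 400408, 18313, 20),
    ("1200 m, other direction", 150349, 21877, 0),
    ("2100 m, one direction", 1920948, 58679, 7),
    ("2100 m, other direction", 2864851, 139427, 8),
]

for label, packets, retries, failed in links:
    failed_percent = 100.0 * failed / packets
    print(f"{label}: retries {retry_percent(packets, retries):.2f} %, "
          f"failed {failed_percent:.4f} %")
```

In all four cases the "tx failed" share is far below 0.01 %, so almost
every retried frame seems to get through eventually.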


2) analysis of the measurement data

The data is here:
https://github.com/ilario/wifi-distance-setting-exploration

*** I will process together all the data from each link, even if taken
over different days.

2a) link with real distance of 1200 m

You can create the aggregated plots yourself with:
python iperf3_log_analysis.py distance <(cat *realdistance_1200*/iperf3-client.log)
Or check the day-by-day plots available in the repository:
https://github.com/ilario/wifi-distance-setting-exploration

Two attached files are commented on here:
* realdistance_1200-violin.png
which shows the distribution of the amount of data transferred in
one second. The distributions are split by the distance parameter
that was in place when the measurement was taken and by whether the
data was being transmitted or received.

*** the average speeds are not much influenced by the distance
parameter. There is a large spread in speeds. As the distance setting
increases, there are more seconds in which zero data is transferred,
which could be a problem. If we were really interested in studying this,
we could fit it with a logistic regression (zero speed or non-zero speed
vs time). For now, this is just quantified as a percentage in the
-describe.txt files, for example here:
https://github.com/ilario/wifi-distance-setting-exploration/blob/567be4d7333f4bd8bea1a947f4de4090e4cad04b/20230704-realdistance_1200-cetonia_server-ieee80211s-tarlo_client/iperf3-client.log-describe.txt#L11-L16

* realdistance_1200-vs_single_meas_time_scatter.jpg
here all the measured points are shown, overlapped. Each point
indicates how much data was transferred in one second. The colors
indicate the distance parameter that was set at that moment; there are
two plots, one for receiving speed and one for transmitting speed.
Each distance was measured for 8 minutes, so the horizontal time axis
goes from zero to 8 minutes. All the measurements are overlapped so
that you can see whether weird things always happen at the same time.

*** 5 minutes after the beginning of the test, we expect a cronjob to
interrupt it to perform a scan of the wifi channels, as indicated in
Gothos' email. There are other events that always happen 2 and 6
minutes after the start of the measurement. This looks like some other
automated process running every 4 minutes, no idea which one. Anyway,
this affects all the distances in the same way, so it is irrelevant for
our study.
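
The idea of overlapping all the runs to spot events that recur at the
same moment can also be expressed numerically. The following is a
minimal sketch, not the code used in iperf3_log_analysis.py: the
function names, the 50 % threshold and the data are all made up for
illustration. It groups per-second samples by their offset from the
start of the run and flags the offsets where the median over all runs
drops well below the overall level.

```python
from collections import defaultdict
from statistics import median


def median_by_offset(samples):
    """Group (offset_seconds, bitrate) samples taken from many runs by
    offset and return {offset: median bitrate at that offset}."""
    grouped = defaultdict(list)
    for offset, bitrate in samples:
        grouped[offset].append(bitrate)
    return {offset: median(values) for offset, values in sorted(grouped.items())}


def recurring_dips(samples, fraction=0.5):
    """Return the offsets whose median bitrate is below `fraction` times
    the median of all per-offset medians (threshold is an assumption)."""
    per_offset = median_by_offset(samples)
    overall = median(per_offset.values())
    return [offset for offset, value in per_offset.items()
            if value < fraction * overall]


# Made-up data, NOT the real measurements: three 8-minute runs at a flat
# 20 Mbit/s, each with a 5-second dip starting at minute 5.
runs = []
for _ in range(3):
    for second in range(480):
        bitrate = 1.0 if 300 <= second < 305 else 20.0
        runs.append((second, bitrate))

print(recurring_dips(runs))  # offsets, in seconds, where all runs dip together
```

A dip that shows up in a single run only would not move the per-offset
median, so only events that recur at the same offset are flagged.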

*** Other observations we can make by looking at the
iperf3-client.log-vs_single_meas_time_scatter.png files:
there is no performance degradation as the router warms up during the
8-minute measurement (after the measurement, it cools down for
2+10 minutes). This is also confirmed by the small correlation value
between "Single Meas. Time" and Bitrate here:
https://github.com/ilario/wifi-distance-setting-exploration/blob/567be4d7333f4bd8bea1a947f4de4090e4cad04b/20230627-realdistance_2100-ninux-59a9ea_server-ieee80211s-tarlo_client/iperf3-client.log-describe.txt#L70

*** Other observations we can make by looking at the
iperf3-client.log-vs_total_meas_time_scatter.png files:
during the night the performance is stable; there is no drift
depending on the hour of the day (measurements started at 1 AM and ended
at 8 AM). This is also confirmed by the small correlation value between
"Total Meas. Time" and Bitrate here:
https://github.com/ilario/wifi-distance-setting-exploration/blob/567be4d7333f4bd8bea1a947f4de4090e4cad04b/20230627-realdistance_2100-ninux-59a9ea_server-ieee80211s-tarlo_client/iperf3-client.log-describe.txt#L71
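
To illustrate why a small correlation between measurement time and
bitrate is read as "no drift", here is a sketch that assumes a plain
Pearson coefficient (the -describe.txt files may be produced in a
different way). The data is made up: one series fluctuates around a
constant level, the other slowly degrades over an 8-minute run.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up data, NOT the real measurements.
seconds = list(range(480))                       # one 8-minute run
stable = [20.0 + (s % 2) for s in seconds]       # jitter, no trend
drifting = [20.0 - 0.01 * s for s in seconds]    # slow degradation

print(f"stable:   {pearson(seconds, stable):+.3f}")
print(f"drifting: {pearson(seconds, drifting):+.3f}")
```

The stable series gives a coefficient close to zero, while a steady
degradation would push it towards -1.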


2b) link with real distance of 2100 m

You can create the aggregated plots yourself with:
python iperf3_log_analysis.py distance <(cat *realdistance_2100*/iperf3-client.log)
Or check the day-by-day plots available in the repository:
https://github.com/ilario/wifi-distance-setting-exploration

Three attached files are commented on here:

* realdistance_2100-violin.png

*** The distributions are strongly bimodal, with a low-speed and a
high-speed regime. When the distance parameter is increased to 10 km,
some more intermediate speeds seem to appear. Increasing the distance
setting seems to improve the average speed, but I would not state it
for sure. What is sure is that it does not make it worse (as we expected
it to). What we observed on the 1200 m link, the link being at zero
speed more often when the distance parameter increases, cannot be seen
here.

* 20230627-realdistance_2100-iperf3-client.log-vs_time_line.jpg

*** The bimodal distribution is caused by the change from a "slow RX
fast TX" regime to a "fast RX slow TX" one. So, when the speed is high
in one direction, it is low in the other. As if some driver were making
a decision there... We can also see that these regimes alternate
every few minutes when the distance setting is 2100 m, but alternate
much faster when the distance setting is 10 km. This is likely the
origin of the intermediate speeds observed in the violin plots for the
10 km distance setting. These two alternating regimes can be confirmed
by looking at the correlation values here:
https://github.com/ilario/wifi-distance-setting-exploration/blob/567be4d7333f4bd8bea1a947f4de4090e4cad04b/20230627-realdistance_2100-ninux-59a9ea_server-ieee80211s-tarlo_client/iperf3-client.log-describe.txt#L18-L23
Interestingly, this does not happen on the 1200 m link.
Anyway, this does not clearly favour either a small or a large default
distance setting.

* realdistance_2100-vs_single_meas_time_scatter.jpg

*** The cronjob-related holes in the speed plot are the same as on the
1200 m link. Additionally, here we can clearly see that there are
different regimes: in the first minutes, in addition to the low and
high speeds, there are also intermediate ones. From minute 2 to minute
5, the intermediate speeds disappear and all the points are either high
or low. No idea why. This happens with all distance settings, so we
don't really care.
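
The alternation between the "slow RX fast TX" and "fast RX slow TX"
regimes described for the vs_time_line plot could be quantified by
counting how often the faster direction changes. This is only a sketch
under simple assumptions: each second is assigned to a regime by
comparing RX and TX directly, the function names are hypothetical, and
the data is synthetic rather than taken from the logs.

```python
def regime(rx, tx):
    """Label one sample by its faster direction (ties count as rx_fast)."""
    return "rx_fast" if rx >= tx else "tx_fast"


def count_switches(rx_series, tx_series):
    """Count how many times the regime changes between consecutive samples."""
    labels = [regime(rx, tx) for rx, tx in zip(rx_series, tx_series)]
    return sum(1 for prev, cur in zip(labels, labels[1:]) if prev != cur)


def synthetic(period, length=480, fast=30.0, slow=5.0):
    """Build made-up RX/TX series that swap regime every `period` seconds."""
    rx, tx = [], []
    for second in range(length):
        rx_is_fast = (second // period) % 2 == 0
        rx.append(fast if rx_is_fast else slow)
        tx.append(slow if rx_is_fast else fast)
    return rx, tx


# Synthetic stand-ins for "alternating every few minutes" and
# "alternating much faster"; the periods are arbitrary.
print(count_switches(*synthetic(period=120)))
print(count_switches(*synthetic(period=10)))
```

On real data a hysteresis or smoothing step would probably be needed,
since noisy samples close to the RX = TX line would otherwise be counted
as spurious switches.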


CONCLUDING:

With the available data, obtained from good quality links (where the
distance parameter should not matter much), it seems that setting 10 km
as the default distance for the 5 GHz radios in the LibreMesh releases
is OK.

Above, I noted some minor issues with a large distance setting, but it
is much better to have a too-large value with these minor issues than a
too-small value with a non-working link.

Obviously we have to recommend that users adjust this parameter, but
if it is too large it seems that nothing catastrophic happens.
It would be great to have this data for bad quality, lossy links too,
but for now these are my conclusions.

So I would recommend increasing the default value from 1 km to 10 km.

For the 2.4 GHz radios I suppose that the conclusions would not be too
different, and the current value of 100 m is excessively small, so I
would increase it to 1 km.

Ciao!
Ilario

On 8/23/23 13:14, gothos wrote:
> On 8/15/23 16:26, Ilario wrote:
>> Dear all,
>> with Gothos (a.k.a. Samlo, our GSoC student for this year), we
>> performed some measurements of the influence of a too-large distance
>> setting on two mesh (IEEE 802.11s on 5 GHz) wifi links, one of 1200 m
>> and another of 2100 m. Actually, the real-world measurements were
>> taken by Gothos on some real nodes of this community:
>> https://antennine.noblogs.org/
>>
>> We checked what happens, on both links, when changing the distance
>> setting between 2100 m, 5 km and 10 km, without changing anything
>> else, just one parameter in OpenWrt, and measuring the real bandwidth
>> with bidirectional iperf3.
>>
>> This was done to help decide on this topic:
>> https://github.com/libremesh/lime-packages/issues/201
>> (what is a good default value for the distance setting)
>> and we could (should) implement the outcome of this discussion (i.e. a
>> better default value) in the upcoming releases.
>>
>> This email is for sharing with you the results and the code, in case
>> anyone is willing to measure this on their routers, or to perform more
>> optimizations of other parameters!
>> In a following email, Gothos will tell us more about the physical
>> setup at the testing site, router models, signal level etc.
>> Also, anyone is welcome to share their opinion on the data!
>>
>
> Hi all!
>
> I'm just adding the info about the physical setup
>
> and attaching the lime-reports of each device, which contain, among
> other things, the output of these commands:
> iwinfo
> iw dev wlan0-mesh station dump
>
> To have an idea about the current* ratio `tx retries / tx packets` and
> of noise/interference
>
> *all the nodes currently have the distance set to 10000 m because I
> have not changed them since running the tests :)
>
>
> ## the setup
>
> The 3 devices are ar71xx/generic Ubiquiti Litebeam M5 with different
> versions of libremesh based on openwrt 19.07
>
> LBE-M5-23 https://dl.ubnt.com/datasheets/LiteBeam/LiteBeam_DS.pdf
>
>
> ## Tests on days from 2023-06-26 and 2023-06-27
> real distance: 2100m
> nodes:
>   ninux-59a9ea: iperf3-server (C)
>   tarlo: iperf3-client (B)
>
>
> ## Tests on days from 2023-07-04 and 2023-07-06
> real distance: 1200m
> nodes:
>   cetonia: iperf3-server (A)
>   tarlo: iperf3-client (B)
>
>
> ## nodes list:
> ninux-59a9ea (C on the attached map)
>   ipv4: 10.170.169.234
>   Wireless MAC: f0:9f:c2:58:a9:ea
>   libremesh: libremesh-2020.1
>   openwrt: OpenWrt 19.07.7 r11306-c4a6851c72
>   signal: -51 [-51] dBm (44:d9:e7:da:9a:67)
>
> tarlo (B on the attached map)
>   ipv4: 10.170.154.103
>   Wireless MAC: 44:d9:e7:da:9a:67
>   libremesh: libremesh-2020.1
>   openwrt: OpenWrt 19.07.7 r11306-c4a6851c72
>   signal: -55 [-55] dBm (f0:9f:c2:58:a9:ea)
>   signal: -64 [-64] dBm (f0:9f:c2:58:aa:c4)
>
> cetonia (A on the attached map)
>   ipv4: 10.170.170.196
>   wlan0-mesh MAC: f0:9f:c2:58:aa:c4
>   libremesh: libremesh-2020.3
>   openwrt: OpenWrt 19.07.10 r11427-9ce6aa9d8d
>   signal: -63 [-63] dBm (44:d9:e7:da:9a:67)
>
>
>> Here you are:
>> https://github.com/ilario/wifi-distance-setting-exploration
>>
>> I wrote the Python script so that anyone can plot data taken while
>> modifying any parameter, not only the distance one. But the data has
>> to follow the format generated by the bash commands reported in the
>> README file that you can find in the repository.
>> These commands will also have to be slightly adapted for exploring
>> other parameters, obviously.
>>
>> One thing I want to say already: it is possible that at minute 0 and
>> minute 5 of the measurements (Gothos, is this right?) there is an
>> effect of a cron job that was running on the nodes, performing a wifi
>> scan every 5 minutes (equivalent to the job of the wifi-unstuck-wa
>> package that you can find in the lime-packages repository).
>>
> Yes, unfortunately we have plenty of devices with this ath9k bug
>
> since they have only one radio we are using this simple cronjob to
> unstick them
>
> */5 * * * * /usr/bin/iwinfo phy0 scan
>
>
>> Ciao!
>> Ilario & Gothos
>>
>>
>>


--
Ilario
iochesonome@???
ilario@???