I just checked the diffs between the two versions, and I cannot imagine how
your problem could be caused by the changes. I suspect a coincidence with
another problem.
I would assume that the restart of fhem somehow triggered some
initialization sequence that made the FHTs more verbose again.
Regards,
Boris
There are only cosmetic changes between 5.0 and 5.1 for the FHZ+FHT
combination. You could try to remove the fhtsoftbuffer attribute, and check if
it makes a difference. But scheduling refreshvalues for 10 FHTs at once is
asking, or rather begging, for trouble.
Any ideas on how to diagnose further?
thanks,
Al
You are calling for ideas, here they are :)
- Trace what fhem writes to the FHZ, and compare the two versions.
- Monitor the air traffic between FHZ and FHT with a CUL in X61 mode.
Why are you sticking to the FHTsoftbuffer? It is only needed if you are sending
a lot of commands, which is a problem in itself because of the shaky
communication between the FHZ and the FHT. Not that the CUL<->FHT would be any
better.
I sometimes send several commands to the FHTs. For example I have a
spreadsheet that presents the from & to times for each FHT for each
day in an easy to read and update format. This is then easily
condensed into a list of "set" commands. I can then "include" a file
containing this list of commands into fhem to upload them. Lazy Mode
prevents most of them actually being transmitted.
As an example, changing the From 1, From 2, To 1 and To 2 times for 10
FHTs for 7 days is 280 commands (4 x 10 x 7). Yes, I am aware I can combine
multiple commands on one "set" line, and I do this as much as possible.
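A minimal sketch of how such a spreadsheet export could be condensed into combined "set" lines, one per FHT and weekday. The device name, the times, and the helper `set_lines` are hypothetical, and the `mon-from1` style argument names are assumed to match what the FHT module accepts; check them against your own setup before including the output.

```python
# Hypothetical schedule: device -> weekday -> (from1, to1, from2, to2).
schedule = {
    "FHT_living": {
        "mon": ("06:00", "08:00", "17:00", "22:00"),
        "tue": ("06:00", "08:00", "17:00", "22:00"),
    },
}

# Assumed argument suffixes; each is prefixed with the weekday, e.g. "mon-from1".
SLOTS = ("from1", "to1", "from2", "to2")


def set_lines(schedule):
    """Build one combined "set" line per device and weekday."""
    lines = []
    for device, days in schedule.items():
        for day, times in days.items():
            args = " ".join(f"{day}-{slot} {t}" for slot, t in zip(SLOTS, times))
            lines.append(f"set {device} {args}")
    return lines


lines = set_lines(schedule)
for line in lines:
    print(line)
```

Combining the four times of a day on one line turns 28 single commands per FHT into 7 lines; the number of values that actually have to be transmitted stays the same.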
With fhtsoftbuffer enabled, these commands trickle down to the FHTs
over a period of time, and this seems to work fine on version 5.0.
I appear to be in the minority in having problems with dropped receive
messages with 5.1 and fhtsoftbuffer enabled.
Thanks
Al
But I'm really, really certain that the
define FHTxy at +*06:00:00 set FHTxy report1 255 report2 255
stuff you do for TEN FHTs at the same time is calling for serious
trouble.
This should block the FHT communication for hours!
report1 255 is the command causing the most traffic of all, and you
are sending that to 10 FHTs at the same time, PLUS report2 255 to 10
FHTs. There is no way that this will work without problems. (Even if
you might not be able to really see them, as you don't have a CUL.)
From my experience, I'm surprised that this ever showed good
results.
I would assume that you may even run into problems with the maximum of
1% airtime ("duty cycle"). Setting retry to 10 may even worsen the
problem. This is a very high retry count.
Note that a single command is usually sent to an FHT only
every 2 minutes. So a retry count of 10 may consume up to 20 minutes,
making it quite likely that the communications of multiple FHTs
interrupt each other.
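The arithmetic behind that, as a small sketch. It assumes the roughly 2-minute interval mentioned above and that every retry uses one full interval; the function name is made up for illustration.

```python
# From the post: a command reaches an FHT roughly once every 2 minutes.
RESEND_INTERVAL_MIN = 2


def worst_case_minutes(retries, interval_min=RESEND_INTERVAL_MIN):
    """Upper bound on how long one command can keep an FHT busy."""
    return retries * interval_min


for retries in (2, 3, 10):
    print(f"retry {retries:2d}: up to {worst_case_minutes(retries)} minutes")
```

With ten FHTs each possibly busy for up to 20 minutes, overlapping transmissions become hard to avoid.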
Your method with the spreadsheet is also somewhat daring and probably
only works due to lazy mode.
To make this more reliable you may want to consider tuning this a bit.
May I suggest that you
1. spread the reports. Do the first room at e.g. 02:00:00, the next
at 02:30:00, then the next at 03:00:00, etc.
2. consider not doing the report1 255 every day but only once or twice
a week. Best spread the FHTs over the weekdays. I mean: do the daily
programs really change that often?
I use:
define fht_reportwz_o1 at *06:20:00 {if ($wday == 3) { fhem("set hzg_wz_o report1 255") } }
3. consider doing report2 255 only every other day for groups of 5
FHTs, or better, spread them over the weekdays. Note that the FHTs
usually send most of the report2 values anyhow every 15 minutes. I
figured that I can even omit the whole report2 thing by using a
refresh-watchdog:
define wd_FHTwz_o watchdog hzg_wz_o:measured-temp.* 01:00 SAME set hzg_wz_o time
This sets the time if the FHT does not send temp infos anymore, and
setting the time causes the FHT to send most of the report2 data
again.
4. lower retry count... a lot.
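Suggestions 1 and 2 can be generated instead of typed by hand. The sketch below emits one "at" definition per FHT, 30 minutes apart starting at 02:00:00, and rotates the weekday so that each FHT does its report1 only once a week. The device names, the `rep1_` prefix and the function are hypothetical; the definition pattern is copied from the example in suggestion 2.

```python
from datetime import datetime, timedelta


def staggered_reports(devices, start="02:00:00", step_minutes=30):
    """Return one weekly, time-shifted report1 definition per device."""
    first = datetime.strptime(start, "%H:%M:%S")
    lines = []
    for i, dev in enumerate(devices):
        when = (first + timedelta(minutes=i * step_minutes)).strftime("%H:%M:%S")
        wday = i % 7  # rotate through the weekday numbers 0..6
        lines.append(
            f"define rep1_{dev} at *{when} "
            f'{{if ($wday == {wday}) {{ fhem("set {dev} report1 255") }} }}'
        )
    return lines


for line in staggered_reports(["fht_a", "fht_b", "fht_c"]):
    print(line)
```

Each generated definition has to stay on a single line in the config file, just like the hand-written one.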
Let me explore retry count a bit further, if I may.
Why do you need that at all? Because some FHTs seem to not get the
commands, right?
This may have 2 reasons:
a) the radio situation is not optimal, the coverage is at its
borderline to some FHTs.
b) there is so much traffic in the air that packages cannot be
delivered undisturbed.
I do not know if a) is true with your installation or not. If that's
the cause, you should usually be able to isolate the 2-3 FHTs which are
not covered well and then set the retry count for these up to...
say... 4. Set the rest to 2.
Since ALL of your FHTs have a 10, I assume that's not your problem.
Then it is b).
If you set the retry count from (default) 3 to 10, you potentially more
than triple the radio traffic. Having MORE of what causes the problem
in the first place might not be such a good idea. It not only
worsens the traffic problem, but also most likely leads to problems
with the 1% "duty cycle". If that is reached, then NO communication
will take place at all for quite some time.
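To put rough numbers on this: the sketch below compares the two retry settings and works out what a 1% limit means if it is counted per hour. The one-hour window is an assumption for illustration; how the FHZ actually accounts for the limit is not known here.

```python
DEFAULT_RETRY = 3
CONFIGURED_RETRY = 10
DUTY_CYCLE_PERCENT = 1
WINDOW_SECONDS = 3600  # assumption: the limit is counted per hour

# Worst case: every command uses all of its retries.
traffic_factor = CONFIGURED_RETRY / DEFAULT_RETRY

# Airtime available inside one window under the 1% limit.
budget_seconds = WINDOW_SECONDS * DUTY_CYCLE_PERCENT / 100

print(f"worst-case traffic factor: {traffic_factor:.2f}")
print(f"airtime budget per window: {budget_seconds:.0f} s")
```

So in the worst case the higher retry count draws on that small budget a bit more than three times as fast.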
Using softbuffer guarantees that once the duty cycle waiting time is
over, so many commands are in the queue that the next duty cycle
problem is only seconds away. I do not know how the duty cycle is
implemented in the FHZ1X00 (which you probably use), but if it is done
similarly to the CUL, then ONE "report1 255" will already be sufficient
to get problems again.
So basically I guess that the real reason for your problem lies in the
huge traffic you might have. Your large retry count makes me think
that you already encounter the problems that may be caused by this.
So whatever the difference between the 2 versions is: In your current
setup, only minimal changes in the timing or behavior of the
softbuffer can make a huge difference, as I assume that your setup
runs at the very verge of not working.
Thank you very much for such a comprehensive explanation.
I like the watchdog idea, and have put that in. I have staggered the
report1s as you suggest, and reduced the retries to 2.
I am still using v5.0 of the 11_FHT.pm
thanks again!
Al
I got that idea from somebody else here on the group and it works very
well for me.
> I am still using v5.0 of the 11_FHT.pm
There still might be another problem, but I have no idea what. I only
saw after reading your conf. that you might be in trouble anyhow. Let
us know if the changes work for you.
> thanks again!
NP