Hi Merlin,
I think the AAR paper mostly deals with non-determinism that is introduced by input alphabet abstractions. Let's say you have an abstract input symbol "numeric value" and your SUL driver decides to always map this abstract value to the number "0". Now, you receive a counterexample that tells you "if you input 42, you observe a different behavior". Mapping the 42 back to "numeric value" would then yield a non-deterministic behavior on the abstract level.
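To make that concrete, here is a tiny self-contained sketch of abstraction-induced non-determinism (hypothetical names, not LearnLib's actual mapper or alphabet API): two concrete inputs collapse to one abstract symbol, yet the system answers them differently, so the abstract view observes two outputs for the same input.

```java
/**
 * Sketch of abstraction-induced non-determinism. Hypothetical names;
 * LearnLib's real mapper/alphabet abstractions differ.
 */
public class AbstractionExample {

    // abstraction: every integer collapses to the abstract symbol "NUMERIC"
    static String abstractInput(int concrete) {
        return "NUMERIC";
    }

    // a system that actually distinguishes concrete values
    static String sul(int input) {
        return input == 42 ? "special" : "normal";
    }
}
```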
The AAR approach would be able to automatically detect the difference and handle the concrete inputs 0 and 42 separately. From your description it sounds like your SUL already answers the concrete symbols non-deterministically. So here, the AAR approach would probably not help you (and it is not currently implemented in LearnLib anyway). I think implementing your own heuristic (e.g. via a "probabilistic oracle") is probably the easiest solution to begin with.
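Such a heuristic could look roughly like the following sketch (plain Java, not a LearnLib class; the SUL is modeled as a simple function, and majority voting is just one possible policy): repeat each query a fixed number of times and return the most frequent answer.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Sketch of a "probabilistic oracle" heuristic: pose the same query to
 * the (possibly non-deterministic) system several times and return the
 * most frequent answer. Illustration only; not a LearnLib interface.
 */
public class MajorityVoteOracle<I, O> {

    private final Function<I, O> sul;  // stand-in for the real system under learning
    private final int repetitions;

    public MajorityVoteOracle(Function<I, O> sul, int repetitions) {
        this.sul = sul;
        this.repetitions = repetitions;
    }

    public O answer(I query) {
        // count how often each answer is observed
        Map<O, Integer> votes = new HashMap<>();
        for (int i = 0; i < repetitions; i++) {
            votes.merge(sul.apply(query), 1, Integer::sum);
        }
        // return the most frequent answer
        return votes.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .get().getKey();
    }
}
```

With ~50% non-determinism on some paths you may need many repetitions before the vote stabilizes, which is exactly where caching the concrete observations pays off.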
Regarding your second question: Reconstructing a learner's state from a cache is often very complicated, because the learner's internal structure is involved and often depends on the order in which it received information such as counterexamples. Given only cache entries, it may not even be possible at all.
However, you can use your cache data to "cheat" a little bit on the performance of membership and equivalence queries in a new learning setup. Using a simple map from input to output data, you can write your own membership oracle instance that first checks whether your cache contains an answer for the posed query and only delegates to the actual SUL on a cache miss (similar to the DFAHashCacheOracle [0]). Similarly, you can use the SampleSetEQOracle [1] (and its subtypes) to check whether the current hypothesis is consistent with your previously observed cache data and automatically return a counterexample if necessary.
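As a rough illustration of the membership-oracle side (plain Java with simplified stand-in types; the real LearnLib interfaces such as MembershipOracle and Query carry more structure), the idea is just: look up the recorded data first, query the slow SUL only on a miss.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/**
 * Sketch of a cache-backed membership oracle: answers a query from a
 * pre-populated map when possible and only falls back to the (slow)
 * SUL on a cache miss. Simplified stand-in types, not LearnLib's API.
 */
public class CachingOracle<O> {

    private final Map<List<String>, O> cache;     // previously recorded input -> output data
    private final Function<List<String>, O> sul;  // the actual system under learning

    public CachingOracle(Map<List<String>, O> cache, Function<List<String>, O> sul) {
        this.cache = cache;
        this.sul = sul;
    }

    public O answer(List<String> inputWord) {
        // serve from the cache; on a miss, query the SUL and remember the result
        return cache.computeIfAbsent(inputWord, sul);
    }
}
```

With a few queries per second on the SUL side, replaying a large cache this way should let you reproduce most of a previous run without touching the system at all.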
Integrating information via these external interfaces (MQs, EQs) allows you to abstract from all the internal clutter of the learners and lets you set up some interesting mixed approaches (e.g. begin learning with L* and finish with TTT) which may lead to interesting effects.
Kind regards,
Markus
[0] https://github.com/LearnLib/learnlib/blob/develop/oracles/filters/cache/src/main/java/de/learnlib/filter/cache/dfa/DFAHashCacheOracle.java
[1] https://github.com/LearnLib/learnlib/blob/develop/oracles/equivalence-oracles/src/main/java/de/learnlib/oracle/equivalence/SampleSetEQOracle.java
On Mon, Sep 14, 2020 at 07:04:05AM -0700, Merlin Chlosta wrote:
> Hi Tiago,
> Hi Markus,
>
> I got a similar setup and some follow-up questions. My caching detects
> non-determinism, usually the 'probabilistic oracle' works around that.
> However, one of my SULs has 50%ish non-determinism in some paths.
> How would you deal with that? I found a paper "Automata Learning with
> Automated Alphabet Abstraction Refinement" [1] which seems to implement
> just that -- but is that available in LearnLib, and if so, could you point
> me in the right direction?
>
> It'd be handy to know LearnLib's progress finer than the Learning-Refining
> iterations. For example, I have already a very big cache file that did
> produce a learned model before. After increasing the input alphabet and
> adding new queries to the cache, I can't reproduce the learned model for
> the initial (reduced) input alphabet. I'd like to figure out why. My SUL is
> rather slow with just a few queries per second, so running on the cache and
> re-producing previous results is pretty valuable.
>
> Thanks,
> Merlin
>
> [1]
>
https://learnlib.de/wp-content/uploads/2017/10/automated-alphabet-abstraction-refinement-vmcai2011.pdf
> [...]