Hi all,
I would like to use Unitex to build grammars that detect certain sentences or expressions of interest (e.g. cardinal numbers) for a chatbot that provides services. The construction of cardinal numbers is quite regular, with a few exceptions (e.g. in Spanish 100 is written "cien", while the numbers from 101 to 199 are written "ciento uno" through "ciento noventa y nueve"). Is there a way to develop grammars with the help of an annotated corpus that serves as a set of unit test cases? For instance, I would write an annotated corpus with a few examples of each case, such as:
uno/1
dos/2
diez/10
cien/100
ciento uno/101
mil/1000
etc.
It would then be useful to have a button in Unitex that applies the grammar to all the unit test cases and lists those where the grammar fails to translate the input into the expected output.
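To make the idea concrete, here is a minimal Python sketch of the check I have in mind. It assumes the corpus is stored one case per line as "input/expected". The translate function is a hypothetical placeholder for whatever applies the grammar to one input; the toy lookup table below is not a Unitex API, only a stand-in so that the sketch runs on its own.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Case = Tuple[str, str]
Failure = Tuple[str, str, Optional[str]]


def parse_cases(lines: Iterable[str]) -> List[Case]:
    """Parse 'input/expected' lines; lines without a '/' are skipped."""
    cases = []
    for line in lines:
        line = line.strip()
        if not line or "/" not in line:
            continue
        # Split on the last '/' so the input text may itself contain slashes.
        text, expected = line.rsplit("/", 1)
        cases.append((text.strip(), expected.strip()))
    return cases


def find_failures(
    cases: Iterable[Case],
    translate: Callable[[str], Optional[str]],
) -> List[Failure]:
    """Return (input, expected, actual) for every case the grammar gets wrong."""
    failures = []
    for text, expected in cases:
        actual = translate(text)
        if actual != expected:
            failures.append((text, expected, actual))
    return failures


# Hypothetical stand-in for the grammar: a toy table that deliberately
# lacks "ciento uno", so that one failure is reported.
TOY_TABLE = {"uno": "1", "dos": "2", "diez": "10", "cien": "100", "mil": "1000"}


def toy_translate(text: str) -> Optional[str]:
    return TOY_TABLE.get(text)


CORPUS = """\
uno/1
dos/2
diez/10
cien/100
ciento uno/101
mil/1000
"""

if __name__ == "__main__":
    cases = parse_cases(CORPUS.splitlines())
    for text, expected, actual in find_failures(cases, toy_translate):
        print(f"FAIL: {text!r} expected {expected!r}, got {actual!r}")
```

In a real setup, toy_translate would be replaced by a function that runs the grammar on the input and returns its output, and the corpus would be read from a file that grows along with the grammar.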
Now extend this from cardinal numbers to a large grammar that detects the different kinds of requests a user may make to the chatbot. As the number of services the chatbot provides grows and the grammar is adapted to cover the new cases, the risk of breaking something that previously worked increases, and the only way to keep that under control is to validate the grammar against an annotated corpus. Is there something like this in Unitex? Thank you.
Regards,
Javier