New user question - Trying to resolve - Discarded instruction(didn't match expected format)

Len DiMaggio

Jun 2, 2024, 5:05:21 PM
to us...@instructlab.ai
Hey everyone - first email/question from a new user...many thanks!

Not sure if this is a user error - or a lack of error information in the generate log - hoping that someone can help with this.

In attempting to add knowledge to the merlinite-7b-lab-Q4_K_M.gguf model:

With this wikipedia entry:
https://en.wikipedia.org/wiki/Norumbega_Tower

And this github repo:
https://github.com/ldimaggi/norumbega-tower-knowledge.git

And ilab generate output fragment:
`INFO 2024-05-30 10:12:16,709 generate_data.py:468 Selected taxonomy path knowledge->textbook->history->norumbega-tower
INFO 2024-05-30 10:12:52,881 generate_data.py:468 Selected taxonomy path knowledge->textbook->history->norumbega-tower
Q> What is the significance of the Norumbega Tower in terms of historical interpretation?
I>
A> The tower has played a significant role in shaping Americans’ understanding and appreciation of Norse history and its impact on North America. Its construction and design reflect Horsford’s conviction that Norse explorers had settled in New England, although there is no physical evidence to support this claim. The addition of the tower to Norumbega Park, a recreational complex located across the river in Newton, Massachusetts, also influenced how Americans perceived Norse history and its influence on North America.`

I am seeing this error in the generate log - can anyone explain what the expected format should be?

`2024-05-30T09:45:19 - Discarded instruction(didn't match expected format): "\nInstruction\nWhat evidence exists to support or refute Horsford's claim that there were Norse settlements anywhere in New England?\nOutput\nThere is no archaeological or other evidence to support Horsford's claim that there were Norse settlements anywhere in New England. This highlights the importance of critical thinking and skepticism when considering historical claims, especially those made without physical evidence.\n" `

--

Len DiMaggio (ldim...@redhat.com)
Red Hat
314 Littleton Road
Westford, MA 01886  USA


Ben Browning

Jun 4, 2024, 7:26:55 AM
to Len DiMaggio, us...@instructlab.ai
On Sun, Jun 2, 2024 at 5:05 PM Len DiMaggio <ldim...@redhat.com> wrote:
> Hey everyone - first email/question from a new user...many thanks!

Hi Len - welcome to the project!
The error above means that the synthetic question-and-answer pair the model generated was not in the format our parser expects. We asked the model for synthetic examples and it produced them, but not in that format, so we could not parse the response. It's nothing you did wrong; the model just didn't format its output the way we wanted. This happens quite regularly, and it's an area I'm actively interested in improving and have some ideas about. It mostly comes down to the prompt we pass to the model and to our output parsing code, which needs to become more robust; improving both should reduce the number of instructions discarded for format parsing issues like this.
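
To make the parsing failure concrete, here is a minimal Python sketch. It is not the project's actual parsing code: the `Instruction:`/`Output:` labels with colons, the `EXPECTED` pattern, and the `parse_examples` helper are all hypothetical, chosen only to show how a strict format check ends up discarding a response whose content is otherwise fine.

```python
import re

# Hypothetical expected layout: "Instruction:" and "Output:" labels, each
# followed by text. This is an assumption for illustration, not the format
# the real generator uses.
EXPECTED = re.compile(
    r"^\s*Instruction:\s*(?P<instruction>.+?)\s*Output:\s*(?P<output>.+?)\s*$",
    re.DOTALL,
)


def parse_examples(raw_responses):
    """Return (kept, discarded): parsed pairs and the raw strings that failed."""
    kept, discarded = [], []
    for raw in raw_responses:
        match = EXPECTED.match(raw)
        if match:
            kept.append((match["instruction"], match["output"]))
        else:
            # Mirrors the idea behind the "Discarded instruction" log line.
            discarded.append(raw)
    return kept, discarded


responses = [
    # Matches the assumed layout.
    "Instruction: What evidence supports Horsford's claim?\n"
    "Output: There is no archaeological or other evidence.",
    # Same content, but bare labels on their own lines, as in the logged text.
    "\nInstruction\nWhat evidence supports Horsford's claim?\n"
    "Output\nThere is no archaeological or other evidence.\n",
]

kept, discarded = parse_examples(responses)
print(len(kept), "kept;", len(discarded), "discarded")
```

The second response carries the same question and answer as the first, but because its labels do not match the assumed pattern, a strict parser of this kind throws it away instead of recovering it.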

In my anecdotal experience, this seems to happen more often with knowledge than with skills. I suspect it's due to the larger prompts we supply (we embed chunks of the knowledge docs in the prompt): the contents of those larger prompts may be sending the model mixed signals about the format we want for the response.

Ben

Len DiMaggio

Jun 4, 2024, 3:02:10 PM
to Ben Browning, us...@instructlab.ai
Thanks for the detailed explanation, Ben! One question - is there any workaround I can try?
-- Len D.