The general answer is that there are no definitive "best" training examples to use; rather, you should focus on examples that are representative of what you want the model to be good at recognising.
e.g. If you want the model to be good at recognising longer sentences with more filler words, then you need examples of longer sentences with filler words. If you want it to be good at recognising very short sentence fragments, you need examples of short sentence fragments, and so on.
If there are training examples that have words in common across multiple labels (e.g. "I want"), then the model is likely to learn that those words carry little significance.
When you test with something that you expect not to match any of the labels, I would hope that the model reflects this with a low confidence score. The label that is chosen will reflect something the example has in common with a pattern identified during training. Hundreds of features go into the classification, so even for an example that intuitively matches no label, some feature scorer is likely to trigger in a way that gives one label a slightly higher score than the others.
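To make this concrete, here is a minimal sketch using scikit-learn (an assumption on my part; your classifier service may work differently internally, but the behaviour is analogous). The training texts, labels, and out-of-domain test sentence are all hypothetical:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data: two labels sharing the filler phrase "I want",
# which the model should learn to treat as low-significance
texts = [
    "I want to book a flight",
    "I want to reserve a plane ticket",
    "I want to cancel my booking",
    "I want to cancel the reservation",
]
labels = ["book", "book", "cancel", "cancel"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)

# An out-of-domain input is still assigned the label with the highest score,
# even though neither label is really a good match
probs = clf.predict_proba(["the weather is nice today"])[0]
print(dict(zip(clf.classes_, probs.round(2))))
```

The probabilities always sum to 1 across the labels, so one label will always come out on top; it is the low, near-uniform spread of the scores (rather than the winning label itself) that signals "no real match".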
Kind regards
D