Clarification on designing features in lecture 5


Cameron Braunstein

Jun 26, 2022, 8:06:38 AM
to AKBC2022

I have some clarifying questions on Lecture 5/6, slide 23. Step 4 of supervised ML relation extraction is to design a set of features. Do we tell our model to observe features of our choosing (such as bag-of-words, entity-level, or parse features) during training and evaluation? If so, are there any guidelines on which features to choose? And would a sufficiently complex model be able to learn the important features without our instruction?

Simon Razniewski

Jun 27, 2022, 4:05:19 PM
to AKBC2022

Good question!

Slide 23 ("design a set of features"). presents the pre-neural state of the art. And if you aim for interpretable systems, the message remains valid - features must be chosen manually. And no, there is no easy way to do this, doing this well requires domain knowledge (about the relation of interest, and about natural language in general). An non-domain specific example can be seen in this paper: https://aclanthology.org/P04-3022.pdf (page 2 lower right column, and Tables 2 and 3 on their individual significance). In general, a common technique is to over-propose, i.e., design as many features as you can think of, without too much concern about their helpfulness, then apply a feature selection step that determines which of the many ones are likely helpful or not, then for the final model, only use a smaller relevant subset of features.

One of the promises of deep learning is indeed that it does away with feature engineering. And while that is not a 100% clean story (there are many cases where domain knowledge/feature engineering on top of neural models helps), the trend is real: with neural networks trained on large amounts of training data, much can be achieved without manual feature engineering, i.e., the networks learn features by themselves (which are, however, not easy to extract from the models afterwards). Pre-trained language models are a special and very prominent case of this line of work, but the principle also applies to neural networks without general pre-training.
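
As a contrast to the pipeline above, here is a minimal sketch of the neural route: a pre-trained language model consumes the raw (entity-marked) text directly, and the "features" are learned internally during fine-tuning. The model name, the entity markers, and the label set are illustrative assumptions, not a recipe from the lecture.

```python
# Sketch: relation classification with a pre-trained language model and no
# hand-designed features. Before fine-tuning on labeled entity pairs, the
# classification head is randomly initialized, so the prediction is meaningless.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

labels = ["no_relation", "born_in", "works_for"]  # assumed label set, for illustration
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(labels)
)

# The two entities are marked inline; the network learns its own representation
# of the surrounding context instead of relying on manually designed features.
sentence = "[E1] Marie Curie [/E1] was born in [E2] Warsaw [/E2] ."
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(labels[int(logits.argmax(dim=-1))])
```

After fine-tuning on a labeled relation dataset, the same forward pass yields a real prediction; the learned features, however, stay hidden inside the network weights, which is the interpretability trade-off mentioned above.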

Hope this helps!
Simon

