Impact of Increasing Data Volume on Classification Performance
Wided MEFLAH
Feb 7, 2024, 10:46:46 AM
to Dataverse Big Data
Hi, is it possible that increasing the amount of training data could negatively affect classification performance?
In a multilabel classification problem on articles with specialized vocabulary, a classifier is initially trained on 100k rows (summary + label). However, after adding data from theses, precision and recall decline relative to the initial training. Note that the same vocabulary is used; with the thesis data included, the training dataset grows to 600k rows.
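To make the comparison above concrete, here is a minimal sketch of how the two models could be scored on the same held-out set of article summaries, so that any drop in precision and recall reflects the training data and not a change in the test data. The label names, the tiny example sets, and the helper `micro_precision_recall` are all hypothetical placeholders, not the actual data or pipeline.

```python
from typing import List, Set, Tuple


def micro_precision_recall(
    y_true: List[Set[str]], y_pred: List[Set[str]]
) -> Tuple[float, float]:
    """Micro-averaged precision and recall for multilabel predictions.

    Each element is the set of labels assigned to one document.
    """
    tp = fp = fn = 0
    for true_labels, pred_labels in zip(y_true, y_pred):
        tp += len(true_labels & pred_labels)   # labels predicted and correct
        fp += len(pred_labels - true_labels)   # labels predicted but wrong
        fn += len(true_labels - pred_labels)   # labels missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical gold labels for a fixed held-out set of article summaries.
gold = [{"physics"}, {"biology", "chemistry"}, {"chemistry"}]

# Hypothetical predictions from the model trained on the 100k article rows.
pred_articles_only = [{"physics"}, {"biology", "chemistry"}, {"chemistry", "physics"}]

# Hypothetical predictions from the model trained on the 600k rows with theses.
pred_with_theses = [{"physics", "biology"}, {"biology"}, {"physics"}]

for name, pred in [
    ("articles only", pred_articles_only),
    ("articles + theses", pred_with_theses),
]:
    p, r = micro_precision_recall(gold, pred)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

If the second model was instead evaluated on a test split that also contains thesis rows, the two sets of numbers would not be directly comparable.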
If you have any insights, I'd appreciate it.
qqqm...@gmail.com
Feb 7, 2024, 11:23:18 AM
to Dataverse Big Data
FWIW: This list relates to https://dataverse.org/ and it sounds like your question is for Microsoft's product.