Impact of Increasing Data Volume on Classification Performance

Wided MEFLAH

Feb 7, 2024, 10:46:46 AM
to Dataverse Big Data
Hi, is it possible that increasing the amount of training data could negatively impact the performance of a classifier?

In a multilabel classification problem on articles with specialized vocabulary, a classifier was initially trained on 100k rows (summary + label). After data from theses was added, precision and recall declined relative to the initial training. Note that the same vocabulary is used; the only change is that, with the thesis data included, the training dataset grows to 600k rows.
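
For reference, here is a minimal sketch of how the two runs could be compared on equal terms: both models are scored on the same held-out set of articles using micro-averaged precision and recall. The function name, the label names, and the toy predictions are hypothetical and only illustrate the calculation; this is not the actual evaluation code or data.

```python
def micro_precision_recall(true_labels, predicted_labels):
    """Micro-averaged precision and recall for multilabel predictions.

    Each argument is a sequence with one collection of labels per document.
    """
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        truth, pred = set(truth), set(pred)
        tp += len(truth & pred)   # labels predicted and correct
        fp += len(pred - truth)   # labels predicted but wrong
        fn += len(truth - pred)   # labels missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy data: the same held-out articles scored for both models.
held_out_truth = [{"physics", "optics"}, {"biology"}]
preds_100k = [{"physics", "optics"}, {"biology"}]       # articles-only model
preds_600k = [{"physics"}, {"biology", "chemistry"}]    # articles + theses model

print("100k:", micro_precision_recall(held_out_truth, preds_100k))
print("600k:", micro_precision_recall(held_out_truth, preds_600k))
```

If the two scores were computed on different test sets (for example, a test split that now also contains theses), the numbers are not directly comparable, so the drop may partly reflect the evaluation data rather than the model.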

If you have any insights, I'd appreciate it.

qqqm...@gmail.com

Feb 7, 2024, 11:23:18 AM
to Dataverse Big Data
FWIW: This list relates to https://dataverse.org/, and it sounds like your question is about Microsoft's product.