[Lesson 5.2 Q2] remove all attributes with 33% or more missing values.


ju...@elsotanillo.net

Apr 8, 2014, 2:19:23 PM
to wekamooc...@googlegroups.com
Hi,

The question is asking:
Return to the Preprocess tab and remove all attributes with 33% or more missing values.

Doing this manually is time-consuming and error-prone, so I suppose there must be a filter for it. I have been looking for a filter that removes such attributes automatically, but I haven't found the right one.

Any clue?

Best regards

--------------------------------------------------------------------------------------
Juan Sierra Pons                                 ju...@elsotanillo.net
Linux User Registered: #257202      
Web: http://www.elsotanillo.net Git: http://www.github.com/juasiepo
GPG key = 0xA110F4FE
Key Fingerprint = DF53 7415 0936 244E 9B00  6E66 E934 3406 A110 F4FE
--------------------------------------------------------------------------------------

ju...@elsotanillo.net

Apr 8, 2014, 2:27:15 PM
to wekamooc...@googlegroups.com

I think I have committed a pratfall (a new English word I learned today in the lesson) :(

I was counting the missing values using the editor. Then I realized that the missing values are shown in the Preprocess tab. :)

Best regards

Eduard

Apr 8, 2014, 11:55:15 PM
to wekamooc...@googlegroups.com
This is how I did this task. In the Preprocess tab, I pressed the "Edit" button. Weka opens the Viewer with a table. If you click on a column, Weka sorts all instances by that column's values, so all instances with missing values (grey cells) in this column end up at the top of the table. Now scroll down the table until you see the last cell with a missing value in the column, then check the position of the vertical scroll bar. If the scroll bar is more than 33% of the way down from the top, then the column has more than 33% missing values and should be deleted. For some columns, I counted the grey cells "by hand" (again, after clicking on that column). If the number of grey cells was >20 (the total number of instances was 57(?)), I deleted this column. I hope it helps.
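The counting rule above can also be checked with a few lines of code. The following is only a rough sketch in plain Java and does not use the Weka API: the class name `MissingValueCheck`, both method names, and the toy table are made up for illustration, and missing cells are assumed to be written as `?`, as in ARFF files. With the 57 instances mentioned above (a figure the poster was not sure of), 33% is about 18.8, so a column with 19 or more missing values would meet the "33% or more" threshold.

```java
import java.util.ArrayList;
import java.util.List;

public class MissingValueCheck {

    // Smallest missing-value count that reaches the given fraction of the instances.
    static int minMissingToRemove(int numInstances, double threshold) {
        return (int) Math.ceil(numInstances * threshold);
    }

    // Indices of columns whose share of "?" cells is at or above the threshold.
    static List<Integer> columnsToRemove(String[][] rows, double threshold) {
        List<Integer> result = new ArrayList<>();
        if (rows.length == 0) {
            return result;
        }
        int numCols = rows[0].length;
        for (int c = 0; c < numCols; c++) {
            int missing = 0;
            for (String[] row : rows) {
                if ("?".equals(row[c])) {
                    missing++;
                }
            }
            if ((double) missing / rows.length >= threshold) {
                result.add(c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical toy data: column 1 has 2 of 6 values missing, column 2 has 1 of 6.
        String[][] rows = {
            {"1", "?", "a"},
            {"2", "?", "?"},
            {"3", "x", "b"},
            {"4", "y", "c"},
            {"5", "z", "d"},
            {"6", "w", "e"},
        };
        System.out.println(minMissingToRemove(57, 0.33)); // prints 19
        System.out.println(columnsToRemove(rows, 0.33));  // prints [1]
    }
}
```

In the toy table only column 1 is flagged, since 2 of 6 missing values is just over 33%, while column 2 with 1 of 6 stays.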

Eduard

Apr 8, 2014, 11:58:01 PM
to wekamooc...@googlegroups.com
I forgot to say that you can delete a column directly in the Viewer: right-click on the column -> "Delete attribute".
