[group] Homework 5 PB2

Molly White

Nov 16, 2012, 10:27:48 PM

to Teaching Assistant

I'm a bit confused about how we're supposed to write the code for Problem 2. Do we just write a function that, when given a node, splits it according to the entered threshold? Does it have to repeat? Are we supposed to include cases such as splitting a node that already has child nodes?

Virgil Pavlu

Nov 17, 2012, 5:00:47 PM

to cs1500...@googlegroups.com, Teaching Assistant

Hi Molly,

Yes, you would write a function like

split(the_node_address, the_feature, the_threshold)

that takes a node (which must not already be split) and performs the split from there, on that feature and threshold. This function should not be recursive; rather, an outside wrapper, main, or other function calls the splits.
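To make the shape of that function concrete, here is a minimal sketch in Python (the assignment's actual language may differ). The Node layout, the (features, label) row format, and the "less than goes left" convention are all assumptions made for illustration, not part of the assignment.

```python
class Node:
    """A tree node holding the data rows that reached it (hypothetical layout)."""
    def __init__(self, rows):
        self.rows = rows        # list of (features, label) pairs
        self.feature = None     # index of the split feature, set once split
        self.threshold = None   # split threshold, set once split
        self.left = None
        self.right = None


def split(node, feature, threshold):
    """Split one unsplit node on the given feature and threshold. Not recursive."""
    if node.left is not None or node.right is not None:
        raise ValueError("node is already split")
    # Assumed convention: values below the threshold go left, the rest go right.
    left_rows = [r for r in node.rows if r[0][feature] < threshold]
    right_rows = [r for r in node.rows if r[0][feature] >= threshold]
    node.feature = feature
    node.threshold = threshold
    node.left = Node(left_rows)
    node.right = Node(right_rows)


# The caller (a wrapper or main) decides which node to split and when.
root = Node([([1.0], 0), ([2.0], 0), ([3.0], 1), ([4.0], 1)])
split(root, 0, 2.5)
```

Rejecting an already-split node keeps the function simple; any repetition lives in the caller.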

For part C (extra credit) the biggest challenge is to try all the splits without actually modifying the tree until you know which split is the best. So here's how I would do this:

First, modify the split function above so that it can "simulate" the split rather than create it, like below:

split(the_node_address, the_feature, the_threshold, do_or_simulate)

- create the children, filter the data to left or right

- if really doing the split: write the feature and the threshold at the node, then make the linkage (children to the node, and the node's left and right to the children)

- if only simulating: calculate the value of the split, then delete the children, and do not modify the node.
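The three steps above could be sketched as follows, again in Python. The post does not say how the "value of the split" is measured, so the weighted Gini impurity used here (lower is better) is purely an assumption, as are the Node layout and the do_split flag name. For simplicity this sketch returns the score in both modes.

```python
class Node:
    """Hypothetical node layout: rows is a list of (features, label) pairs."""
    def __init__(self, rows):
        self.rows = rows
        self.feature = None
        self.threshold = None
        self.left = None
        self.right = None


def gini(rows):
    """Gini impurity of the labels in rows; 0.0 for an empty or pure set."""
    if not rows:
        return 0.0
    counts = {}
    for _, label in rows:
        counts[label] = counts.get(label, 0) + 1
    n = len(rows)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split(node, feature, threshold, do_split):
    """Return the split's score; modify the node only when do_split is True."""
    # Create the children and filter the data to left or right.
    left = Node([r for r in node.rows if r[0][feature] < threshold])
    right = Node([r for r in node.rows if r[0][feature] >= threshold])
    n = len(node.rows)
    # Assumed measure: size-weighted Gini impurity of the two children.
    weighted = len(left.rows) * gini(left.rows) + len(right.rows) * gini(right.rows)
    score = weighted / n if n else 0.0
    if do_split:
        node.feature, node.threshold = feature, threshold
        node.left, node.right = left, right
    # When simulating, the children are simply dropped here; in a language
    # with manual memory management they would be deleted explicitly.
    return score


root = Node([([1.0], 0), ([2.0], 0), ([3.0], 1), ([4.0], 1)])
simulated = split(root, 0, 1.5, False)   # the tree is left untouched
```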

This way you can call this function in "simulate" mode for all possible features and thresholds just to figure out the best split, then do it for real for that split. You would also need a queue of the current terminal nodes' addresses to handle the nodes that are yet to be split: once a split is done, the new children (now terminal nodes) are added to that queue/list.
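Putting the pieces together, a driver along these lines is one way to read the description above. It is a sketch under several assumptions that the post does not specify: candidate thresholds are midpoints between consecutive distinct feature values, the score is weighted Gini impurity (lower is better), and growth stops at pure nodes or after a hypothetical max_splits limit.

```python
from collections import deque


class Node:
    """Hypothetical node layout: rows is a list of (features, label) pairs."""
    def __init__(self, rows):
        self.rows = rows
        self.feature = self.threshold = self.left = self.right = None


def gini(rows):
    """Gini impurity of the labels in rows; 0.0 for an empty or pure set."""
    if not rows:
        return 0.0
    counts = {}
    for _, label in rows:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / len(rows)) ** 2 for c in counts.values())


def split(node, feature, threshold, do_split):
    """Return the split's score; modify the node only when do_split is True."""
    left = Node([r for r in node.rows if r[0][feature] < threshold])
    right = Node([r for r in node.rows if r[0][feature] >= threshold])
    n = len(node.rows)
    weighted = len(left.rows) * gini(left.rows) + len(right.rows) * gini(right.rows)
    if do_split:
        node.feature, node.threshold = feature, threshold
        node.left, node.right = left, right
    return weighted / n if n else 0.0


def best_split(node):
    """Simulate every candidate split; return (score, feature, threshold) or None."""
    if not node.rows:
        return None
    best = None
    for feature in range(len(node.rows[0][0])):
        values = sorted({r[0][feature] for r in node.rows})
        # Assumed candidates: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            score = split(node, feature, threshold, False)
            if best is None or score < best[0]:
                best = (score, feature, threshold)
    return best


def grow(root, max_splits):
    """Split terminal nodes from a queue; return the number of splits made."""
    queue = deque([root])
    done = 0
    while queue and done < max_splits:
        node = queue.popleft()
        if gini(node.rows) == 0.0:
            continue                      # pure or empty: nothing to gain
        found = best_split(node)
        if found is None:
            continue                      # no feature has two distinct values
        _, feature, threshold = found
        split(node, feature, threshold, True)     # now do it for real
        queue.extend([node.left, node.right])     # children are the new terminals
        done += 1
    return done


rows = [([1.0, 5.0], 0), ([2.0, 3.0], 0), ([3.0, 6.0], 1), ([4.0, 4.0], 1)]
root = Node(rows)
chosen = best_split(root)
splits_made = grow(root, 10)
```

Because every trial goes through the simulate path, the tree only changes once per dequeued node, at the split that scored best.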

--virgil
