I made some code changes, and the bug disappeared. After a series of experiments, each node of the classifier was able to achieve the desired true positive rate. There were still a few instances where the classifier ran amok. The graph below shows the performance of the classifier in one of these bad instances.
Investigating the sudden increase in the false positive rate, I observed the following in this type of failure. The histogram of the boosted classifier's output (at a particular node) is shown below. The three histograms show the following:
- Blue: Entire distribution of PredictedY [760 0 .. 0 99 14 0 .. 0 727]
- Green: Distribution of PredictedY at Y = 1 [ .. 0 0 0 26 5 0 .. 0 727]
- Red: Distribution of PredictedY at Y = 0 [760 0 .. 0 73 9 0 .. 0 0]
The fix for the above bug was done as follows. (Note: the values differ from those above because they come from a different experiment.) For uniqueSortedPredictedY = 0, 3.1102, 3.6728, 6.7830, the intermediate values (the midpoints between neighbouring entries) were also computed: intermediateValuesOfPredY = 1.5551, 3.3915, 5.2279. These were also included in the list of possible thresholds, and the classifier then achieved the required true positive rate. I believe this closes the bugs in the Threshold Variation algorithm. Next we will move ahead with the Linear classifier implementation. All the experiments were run on the 10D Gaussian dataset.
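The midpoint idea above can be sketched in a few lines. This is only an illustration in Python, not the code in Boost.m: the function names are made up for this post, and I am assuming that a sample counts as positive when its PredictedY is greater than or equal to the threshold, and that the chosen threshold is the largest candidate that still meets the target.

```python
def candidate_thresholds(predicted_y):
    """Unique sorted predictions plus the midpoints between neighbours."""
    unique_sorted = sorted(set(predicted_y))
    midpoints = [(a + b) / 2 for a, b in zip(unique_sorted, unique_sorted[1:])]
    return sorted(unique_sorted + midpoints)


def true_positive_rate(predicted_y, y, threshold):
    """Fraction of Y = 1 samples with PredictedY >= threshold (assumed rule)."""
    positives = [p for p, label in zip(predicted_y, y) if label == 1]
    if not positives:
        return 0.0
    return sum(p >= threshold for p in positives) / len(positives)


def pick_threshold(predicted_y, y, target_tpr):
    """Largest candidate threshold whose TPR meets the target, else None."""
    feasible = [
        t
        for t in candidate_thresholds(predicted_y)
        if true_positive_rate(predicted_y, y, t) >= target_tpr
    ]
    return max(feasible) if feasible else None
```

Calling `candidate_thresholds([0, 3.1102, 3.6728, 6.7830])` gives the four original values interleaved with the three midpoints quoted above (up to floating-point rounding).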
Brodatz dataset: After this bug fix, the experiment was run on the tough Brodatz pattern-distinguishing dataset. Because the feature values could not separate the classes at such a high true positive rate, training failed at the first node itself. The distribution of PredictedY at a node is shown below. A lower target was then set, and the classifier started to build but failed to move beyond the 2nd node.
Synapse dataset: Running the code on the Synapse dataset gave the following training and test results. The true positive target was set at 0.9.
The bug-fixed code can be found in the /usr/sci/crcnsdata/CRCNS/Synapses/Code/Matlab/ML_Boosting4_DT3 directory. Boost.m was modified.