Wednesday, October 1, 2008

New kNN setup

In the new kNN experiment setup we will use the old dataset, since it has many more examples than the set marked by Dr. Marc. The kNN classifier will be evaluated with 5-fold validation. The disk size used for this classifier is 55, rather than the 35 used in the previous experiment.
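As a reminder of what the classifier does, here is a minimal kNN sketch in Python. It is not the MATLAB code used in the experiment: the function name `knn_predict`, the Euclidean distance, the value of `k` and the toy feature vectors are all assumptions made for illustration.

```python
import math
from collections import Counter

def knn_predict(train_features, train_labels, query, k=3):
    """Return the majority label among the k training vectors nearest to query."""
    # Rank training examples by Euclidean distance to the query vector
    # (the distance measure is an assumption; the post does not name one).
    order = sorted(range(len(train_features)),
                   key=lambda i: math.dist(train_features[i], query))
    # Majority vote over the labels of the k nearest examples.
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Toy 2-D feature vectors, invented for this example only.
train_features = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
                  (5.0, 5.0), (5.1, 4.9), (4.8, 5.2)]
train_labels = ["negative", "negative", "negative",
                "positive", "positive", "positive"]

print(knn_predict(train_features, train_labels, (4.9, 5.0)))  # positive
```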

Step 1: Generating SIFT key points
Number of SIFT key points = 447947

Step 2: Converging the SIFT key points
After this filtering the number of points = 24944
After converging the points the number of unique points = 21844

Step 3: Clustering the points
Separation distance = 27.5 pixels
Number of clusters identified = 8395
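The post does not say which clustering algorithm is used in Step 3. One simple possibility is a greedy scheme in which each point joins the first existing cluster center within the separation distance and otherwise starts a new cluster. The Python sketch below assumes that scheme (the name `greedy_cluster` and the toy points are hypothetical). It also shows that the number of clusters from such a scheme can change with the order in which points are visited.

```python
import math

def greedy_cluster(points, separation):
    """Assign each point to the first center within `separation`, else open a new cluster."""
    centers = []   # the first point of each cluster acts as its center
    members = []   # members[i] holds the points assigned to centers[i]
    for p in points:
        for idx, c in enumerate(centers):
            if math.dist(p, c) <= separation:
                members[idx].append(p)
                break
        else:
            # No existing center is close enough: p starts a new cluster.
            centers.append(p)
            members.append([p])
    return centers, members

# Toy points, invented for illustration; 27.5 is the separation distance from Step 3.
points = [(0, 0), (20, 0), (40, 0), (200, 200)]
reordered = [(20, 0), (0, 0), (40, 0), (200, 200)]

print(len(greedy_cluster(points, 27.5)[0]))     # 3
print(len(greedy_cluster(reordered, 27.5)[0]))  # 2
```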

Step 4: Positive & Negative examples : Separate Clustering
Separate clustering of points was done for positive and negative examples
Positive SIFT points are ones less than 10 pixels away from the Converged Synapses. (Positives = 608, Clusters = 336, ClustersFromConvergedSynapses = 377)
Negative SIFT points are ones greater than 55 pixels away from the Converged Synapses. (Negatives = 17688, Clusters = 6198)
One observation is that the number of clusters identified depends on the first point chosen.
For the negative data set, the points are clustered and representative points are taken. For the positive data set we will use the entire set, because clustering halves the number of positive examples and would worsen the skew of the data set from about 1:10 to about 1:20.
The data points are stored in /usr/sci/crcnsdata/CRCNS/Synapses/Code/Matlab/kmeans/PositiveNegSIFTPoints.mat
To add a further twist: the clusterToPoint2 reduction of the negative examples ends up with 5784 points. On repeating the same procedure 3 times, the reduction stops and the number of points is 4625.
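The positive/negative rule from Step 4 can be sketched as follows. This is an illustrative Python version, not the MATLAB code behind PositiveNegSIFTPoints.mat. It assumes the distance is measured to the nearest converged synapse and that points between 10 and 55 pixels are left out of both sets. The name `label_points` and the coordinates are hypothetical.

```python
import math

POSITIVE_MAX_DIST = 10.0   # positives are closer than this to a converged synapse
NEGATIVE_MIN_DIST = 55.0   # negatives are farther than this from every converged synapse

def label_points(sift_points, synapses):
    """Split SIFT points into positives and negatives by distance to the nearest synapse."""
    positives, negatives = [], []
    for p in sift_points:
        nearest = min(math.dist(p, s) for s in synapses)
        if nearest < POSITIVE_MAX_DIST:
            positives.append(p)
        elif nearest > NEGATIVE_MIN_DIST:
            negatives.append(p)
        # Points in the 10-55 pixel band are assumed to be discarded.
    return positives, negatives

# Toy coordinates, invented for illustration.
synapses = [(100.0, 100.0)]
sift_points = [(103.0, 104.0), (100.0, 130.0), (200.0, 100.0)]

positives, negatives = label_points(sift_points, synapses)
print(positives)  # [(103.0, 104.0)]
print(negatives)  # [(200.0, 100.0)]
```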

Step 5: Set up for 5-fold validation
Positive points (uniqueConvergedSIFTpointsLT10) = 608 and negative points (NegativesClusterCenters4) = 4625, both taken from the PositiveNegSIFTPoints.mat file.

The function /usr/sci/crcnsdata/CRCNS/Synapses/Code/Matlab/kmeans/createFolds.m creates the folds for the data.
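How createFolds.m splits the data is not described here. A common approach, sketched below in Python as an assumption, is to shuffle the positive and negative indices separately and deal them into folds, so that every fold keeps roughly the same class ratio. The function `create_folds` and its `seed` parameter are hypothetical.

```python
import random

def create_folds(n_items, n_folds, seed=0):
    """Shuffle indices 0..n_items-1 and deal them round-robin into n_folds lists."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]

# Counts from Step 5: 608 positive and 4625 negative points.
pos_folds = create_folds(608, 5)
neg_folds = create_folds(4625, 5)

for k in range(5):
    # Fold k is the test set; the other four folds form the training set.
    print(k + 1, len(pos_folds[k]), len(neg_folds[k]))
```

With these counts each fold holds 121 or 122 positives and 925 negatives.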

Step 6: Circular Region Extraction
The entire setup for the experiment is done by the function "generateFeatures('synapse1-5fold')". The testing for the individual folds can be done by running the scripts foldXXRun (XX = 1,2,3,4,5). The results of the individual fold runs are found in foldXXResults.

Test Results:
A quantitative examination of the results is shown in the bar graph of confusion matrix entries for all folds.
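For reference, the four confusion matrix entries for one fold can be tallied as in the Python sketch below. The label strings and the function name `confusion_entries` are assumptions, and the labels in the example are toy values, not results from the experiment.

```python
def confusion_entries(true_labels, predicted_labels, positive="positive"):
    """Count true/false positives and negatives for a two-class problem."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for truth, pred in zip(true_labels, predicted_labels):
        if pred == positive:
            counts["TP" if truth == positive else "FP"] += 1
        else:
            counts["FN" if truth == positive else "TN"] += 1
    return counts

# Toy labels for illustration only.
truth = ["positive", "positive", "negative", "negative", "negative"]
pred = ["positive", "negative", "negative", "positive", "negative"]

print(confusion_entries(truth, pred))
# {'TP': 1, 'FP': 1, 'TN': 2, 'FN': 1}
```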
Qualitative analysis:
For the qualitative analysis the test patches for the 5 folds have been extracted and stored in an image.
