Monday, September 1, 2008

Clustering algorithm

To reduce the number of data points, we will cluster the SIFT points so that the kNN classifier runs faster. We will use agglomerative clustering with complete linkage. The clustering will be done so that the distance between any two key points within a cluster is less than the diameter of the disk that is going to be used to generate the patch.
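The complete-linkage idea above can be sketched in a few lines of Python. This is only an illustration under assumptions, not the Matlab code used for the results below: the function name `complete_linkage`, the parameter `max_diameter` and the toy points are all made up for the example, and the naive search over all cluster pairs would be far too slow for the full set of SIFT points.

```python
import math
from itertools import combinations


def complete_linkage(points, max_diameter):
    """Naive agglomerative clustering with complete linkage (hypothetical helper).

    Repeatedly merges the two clusters whose farthest pair of members is
    closest, and stops once that distance reaches max_diameter.
    """
    clusters = [[p] for p in points]

    def linkage_distance(a, b):
        # Complete linkage: distance between the farthest pair of members.
        return max(math.dist(p, q) for p in a for q in b)

    while len(clusters) > 1:
        d, i, j = min(
            (linkage_distance(clusters[i], clusters[j]), i, j)
            for i, j in combinations(range(len(clusters)), 2)
        )
        if d >= max_diameter:
            break
        # i < j always holds, so deleting index j leaves index i valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy key points (assumed data): two well-separated groups.
toy_points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
toy_clusters = complete_linkage(toy_points, max_diameter=3.0)
```

Because a merge only happens when the farthest pair across the two clusters is closer than `max_diameter`, every pair of points inside a resulting cluster is closer than that value, which is the property asked for above.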

Clustering Results:
Type 1: In this method the full distance matrix is calculated and the two points with the smallest distance are merged into a single point at their mean. Nearest points are merged in this way until the nearest neighbor of every point is at least one disk diameter away, where the disk is the one that is going to be used to generate the patch. After this reduction, 10114 unique converged SIFT points were reduced to 2676 points. The figure below is the histogram of the distances from each synapse point to the nearest such SIFT point. The set of reduced points is stored in /usr/sci/crcnsdata/CRCNS/Synapses/Code/Matlab/clustering/meanClustering.mat
Type 2: The clustering mechanism is the same as in the method above, but a weighted mean is used instead of a simple mean. Initially all points start with an equal weight of one. Once a pair of points is merged, the weight of the merged point is set to the sum of the weights of the two points. This keeps merged points from drifting too far from the original points. This method resulted in 1465 points. The points are stored in /usr/sci/crcnsdata/CRCNS/Synapses/Code/Matlab/clustering/weightedmeanClustering.mat
Type 3: In this method the linkage is complete and the points are not merged. A new point is added to a cluster if and only if its distance from every point already in the cluster is not larger than the disk size.
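The difference between the Type 1 and Type 2 merges can be sketched as follows. This is a minimal Python illustration under assumptions, not the Matlab code that produced the .mat files: `merge_nearest`, `disk_diameter` and the `weighted` flag are hypothetical names, and the three toy points are invented to make the drift visible.

```python
import math
from itertools import combinations


def merge_nearest(points, disk_diameter, weighted=False):
    """Greedily merge the closest pair of points (hypothetical helper).

    Stops when every remaining point is at least disk_diameter away from
    its nearest neighbor. With weighted=False the merged point is the
    simple mean (Type 1); with weighted=True it is the mean weighted by
    how many original points each side already represents (Type 2).
    """
    pts = [tuple(float(x) for x in p) for p in points]
    weights = [1.0] * len(pts)  # every original point starts with weight one

    while len(pts) > 1:
        d, i, j = min(
            (math.dist(pts[i], pts[j]), i, j)
            for i, j in combinations(range(len(pts)), 2)
        )
        if d >= disk_diameter:
            break
        wi, wj = (weights[i], weights[j]) if weighted else (1.0, 1.0)
        pts[i] = tuple(
            (wi * a + wj * b) / (wi + wj) for a, b in zip(pts[i], pts[j])
        )
        weights[i] = weights[i] + weights[j]  # merged weight is the sum
        del pts[j]
        del weights[j]
    return pts, weights


# Toy data (assumed): three collinear points that all end up merged.
toy = [(0, 0), (2, 0), (3, 0)]
simple_pts, _ = merge_nearest(toy, disk_diameter=5.0)
weighted_pts, weighted_w = merge_nearest(toy, disk_diameter=5.0, weighted=True)
```

On this toy input the simple mean ends at x = 1.25, while the weighted mean ends at the centroid of the three original points, x = 5/3. That is the drift the weighting is meant to limit.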
