The code was profiled with the NVIDIA CUDA profiler, which reports the execution time of the individual kernel calls.
The cross-correlations appear to take the most time. The parallelization is done over the filter pixels, and the filter is the smallest array compared with the sensitivities and the feature maps, so it exposes the least parallelism. cuFFT (the CUDA FFT library) might be a good option for computing the cross-correlations.
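To make the FFT idea concrete, here is a minimal sketch in plain Python of the correlation theorem that an FFT-based approach relies on. It is not the CUDA code from this project: it is 1D for brevity (the real cross-correlations are over images or volumes), it uses a naive O(n^2) DFT in place of a real FFT library, and the function names `dft`, `xcorr_direct` and `xcorr_fourier` are made up for illustration.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for a real FFT library)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def xcorr_direct(signal, kernel):
    """Valid-mode cross-correlation: out[i] = sum_j signal[i + j] * kernel[j]."""
    n_out = len(signal) - len(kernel) + 1
    return [
        sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(n_out)
    ]


def xcorr_fourier(signal, kernel):
    """Same result via the correlation theorem: IDFT(DFT(signal) * conj(DFT(kernel)))."""
    n = len(signal)
    # Zero-pad the (real-valued) kernel to the signal length.
    padded = list(kernel) + [0.0] * (n - len(kernel))
    spectrum = [s * k.conjugate() for s, k in zip(dft(signal), dft(padded))]
    circular = dft(spectrum, inverse=True)
    # The first n - len(kernel) + 1 outputs involve no wrap-around,
    # so they match the valid-mode result.
    n_out = n - len(kernel) + 1
    return [circular[i].real for i in range(n_out)]
```

Calling `xcorr_direct` and `xcorr_fourier` on the same inputs should agree up to floating-point rounding. The point of the Fourier route is that the cost is dominated by the transforms, which a library such as cuFFT parallelizes over the whole padded array rather than over the small filter.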
Tuesday, September 29, 2009
Monday, September 28, 2009
GPU speedup results for a Convolutional Neural Network
The graph below shows the speedup results for a convolutional neural network. The network has 3 hidden layers with 4 nodes per hidden layer, and the output layer is 1 pixel/voxel in size.
This implementation performs spatial convolutions. It does not use shared memory or texture memory, so large memory latencies are involved. The speedup is not significant for the smaller kernels because, given the high modularity of the CUDA code, there are not enough operations to parallelize.
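As a rough illustration of why small kernels leave little to parallelize, here is a plain-Python sketch of a direct (spatial) 2D convolution. It is not the CUDA implementation measured above. The names `conv2d_valid` and `multiply_adds` are hypothetical, and the mapping of one output pixel to one GPU thread is an assumption about how such a kernel is typically organized.

```python
def conv2d_valid(image, kernel):
    """Direct (spatial) valid-mode 2D convolution on nested lists."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    # Every (y, x) output is independent of the others, so each one
    # could be assigned to its own GPU thread (assumed mapping).
    for y in range(oh):
        for x in range(ow):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    # True convolution flips the kernel in both axes.
                    acc += image[y + i][x + j] * kernel[kh - 1 - i][kw - 1 - j]
            out[y][x] = acc
    return out


def multiply_adds(ih, iw, kh, kw):
    """Total multiply-add count of one valid-mode convolution."""
    return (ih - kh + 1) * (iw - kw + 1) * kh * kw
```

The arithmetic per output pixel is only `kh * kw` multiply-adds, while every one of them reads `image` and `kernel` from memory. With a small kernel and no shared or texture memory, memory latency and launch overhead can easily outweigh that small amount of arithmetic.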
Tuesday, April 28, 2009
Neuron segmentation based on segmentation of same neuron from another slice
Motivation:
TBD
Dataset:
The following is the neuron image.
The following image is the segmentation of the neuron image.
The following binary mask is extracted from the hue component of the labeled image above, and the holes in the label image are then closed. Eroding this mask gives the following binary image, which we can treat as a segmentation mask for the neuron from another slice.
The 'g' image is shown below.
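The mask-extraction steps in this post (hue selection, hole closing, erosion) can be sketched in plain Python as follows. This is only an illustration under assumptions, not the code used to produce the images. The hue range and the `min_sat` threshold are hypothetical parameters, "closing the holes" is interpreted as filling background regions not connected to the image border, and the erosion uses an assumed 3x3 structuring element.

```python
import colorsys
from collections import deque


def hue_mask(rgb, hue_lo, hue_hi, min_sat=0.2):
    """True where a pixel's hue lies in [hue_lo, hue_hi] and it is saturated enough."""
    mask = []
    for row in rgb:
        out_row = []
        for r, g, b in row:
            h, s, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            out_row.append(hue_lo <= h <= hue_hi and s >= min_sat)
        mask.append(out_row)
    return mask


def fill_holes(mask):
    """Set background pixels that are not 4-connected to the border to True."""
    h, w = len(mask), len(mask[0])
    outside = [[False] * w for _ in range(h)]
    queue = deque()
    # Seed the flood fill with every background pixel on the image border.
    for y in range(h):
        for x in range(w):
            on_border = y in (0, h - 1) or x in (0, w - 1)
            if on_border and not mask[y][x]:
                outside[y][x] = True
                queue.append((y, x))
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not mask[ny][nx] and not outside[ny][nx]:
                outside[ny][nx] = True
                queue.append((ny, nx))
    # Anything not reachable from the border is foreground or an enclosed hole.
    return [[not outside[y][x] for x in range(w)] for y in range(h)]


def erode(mask):
    """3x3 binary erosion; pixels outside the image count as background."""
    h, w = len(mask), len(mask[0])
    out = [[False] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = all(
                mask[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            )
    return out
```

A call such as `erode(fill_holes(hue_mask(labeled_rgb, hue_lo, hue_hi)))` chains the three steps, where `labeled_rgb` is a nested list of `(r, g, b)` tuples. Filling before eroding matters: a single-pixel hole would otherwise wipe out its whole 3x3 neighborhood during erosion.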