Deep Learning to Unlock Mysteries of Parkinson’s Disease

Researchers at The Australian National University are using deep learning and NVIDIA technologies to better understand the progression of Parkinson’s disease.

Currently, it is difficult to determine which type of Parkinson’s someone has or how quickly the condition will progress.
The study will be conducted over the next five years at the Canberra Hospital in Australia and will involve 120 people suffering from the disease and an equal number of non-sufferers as a control group.

“There are different types of Parkinson’s that can look similar at the point of onset, but they progress very differently,” says Dr Deborah Apthorp of the ANU Research School of Psychology. “We are hoping the information we collect will differentiate between these different conditions.”

Researchers Alex Smith (L) and Dr Deborah Apthorp (R) work with Parkinson’s disease sufferer Ken Hood (middle).

Dr Apthorp said the research will involve brain imaging, eye tracking, visual perception and postural sway measurements.

From the data collected during the study, the researchers will use a GeForce GTX 1070 GPU and cuDNN to train their deep learning models to find patterns that indicate the degradation of motor function associated with Parkinson’s.

The researchers plan to incorporate virtual reality into their work by having sufferers wear head-mounted displays (HMDs), which will help them better understand how self-motion perception is altered in Parkinson’s disease, and by using stimuli that mimic the visual scene during self-motion.

“Additionally, we would like to explore the use of eye tracking built into HMDs, which is a much lower cost alternative to a full research eye tracking system and reduces the amount of equipment into a highly portable and versatile single piece of equipment,” says researcher Alex Smith.

Teaching an AI to Detect Key Actors in Multi-person Videos

Researchers from Google and Stanford have taught their computer vision model to detect the most important person in a multi-person video scene – for example, the shooter in a basketball game, where a scene typically contains dozens or even hundreds of people.

The researchers used 20 Tesla K40 GPUs and the cuDNN-accelerated TensorFlow deep learning framework to train their recurrent neural network on 257 NCAA basketball games from YouTube. An attention mask selects which of the several people in the scene are most relevant to the action being performed, then tracks the relevance of each person as time proceeds. The team published a paper detailing more of their work.

The distribution of attention for the model with tracking, at the beginning of “free-throw success”. The attention is concentrated at a specific defender’s position. Free-throws have a distinctive defense formation, and observing the defenders can be helpful as shown in the sample images in the top row.

Over time, the system can identify not only the most important actor but also potentially important actors and the events with which they are associated – for example, it can understand that the player going up for a layup could be important, but that the most important player is the one who then blocks the shot.

New Deep Learning Method Enhances Your Selfies

Researchers from Adobe Research and The Chinese University of Hong Kong created an algorithm that automatically separates subjects from their backgrounds so you can easily replace the background and apply filters to the subject.

Their research paper mentions there are good user-guided tools that support manually creating masks to separate subjects from the background, but the “tools are tedious and difficult to use, and remain an obstacle for casual photographers who want their portraits to look good.”

A highly accurate automatic portrait segmentation method allows many portrait processing tools to be fully automatic.

Using a TITAN X GPU and the cuDNN-accelerated Caffe deep learning framework, the researchers trained their convolutional neural network on 1,800 portrait images from Flickr. Their GPU-accelerated method was 20x faster than a CPU-only approach.

Portrait video segmentation is next on the radar for the researchers.

Facial Recognition Software Helping Caterpillar Identify Sleepy Operators

Operator fatigue can potentially be a fatal problem for Caterpillar employees driving the massive mine trucks on long, repetitive shifts throughout the night.

Caterpillar recognized this and joined forces with Seeing Machines to install their fatigue detection software in thousands of mining trucks worldwide. Using NVIDIA TITAN X and GTX 1080 GPUs along with the cuDNN-accelerated Theano, TensorFlow and Caffe deep learning frameworks, the Australian-based tech company trained their software for face tracking, gaze tracking, driver attention region estimation, facial recognition, and fatigue detection.

On board the truck, a camera, speaker and light system monitors the driver, and once a potential “fatigue event” is detected, an alarm sounds in the truck and a video clip of the driver is sent to a 24-hour “sleep fatigue center” at Caterpillar headquarters.


“This system automatically scans for the characteristics of microsleep in a driver,” Sal Angelone, a fatigue consultant at the company, told The Huffington Post, referencing the brief, involuntary pockets of unconsciousness that are highly dangerous to drivers. “But this is verified by a human working at our headquarters in Peoria.”

In the past year, Caterpillar referenced two instances – in one, a driver had three fatigue events within four hours and was contacted onsite and made to take a nap. In another, a night-shift truck driver who experienced a fatigue event realized it was a sign of a sleep disorder and asked his management for medical assistance.

It may be only a matter of time before this technology is incorporated into every car on the road.

Facebook and CUDA Accelerate Deep Learning Research

Last Thursday at the International Conference on Machine Learning (ICML) in New York, Facebook announced a new piece of open source software aimed at streamlining and accelerating deep learning research. The software, named Torchnet, provides developers with a consistent set of widely used deep learning functions and utilities. Torchnet allows developers to write code in a consistent manner, speeding development and promoting code re-use both between experiments and across multiple projects.

Torchnet sits atop the popular Torch deep learning framework and benefits from GPU acceleration using CUDA and cuDNN. Further, Torchnet has built-in support for asynchronous, parallel data loading and can make full use of multiple GPUs for vastly improved iteration times. This automatic support for multi-GPU training helps Torchnet take full advantage of powerful systems like the NVIDIA DGX-1 with its eight Tesla P100 GPUs.


According to the Torchnet research paper, its modular design makes it easy to re-use code in a series of experiments. For instance, running the same experiments on a number of different datasets is accomplished simply by plugging in different dataloaders. And the evaluation criterion can be changed easily by plugging in a different performance meter.

Torchnet adds another powerful tool to the data scientist’s toolkit and will help speed the design and training of neural networks, so researchers can focus on their next great advancement.

Artificial Intelligence System Predicts How You Will Look With Different Hair Styles

A new personalized search engine helps you explore what you would look like with brown hair, curly hair or in a different time period.

Upload a selfie to Dreambit and type in a term like “curly hair” or “1930 woman”, and the software’s algorithm searches through photo collections for similar images and seamlessly maps your face onto images matching your search criteria.

Ira Kemelmacher-Shlizerman, a computer vision researcher at the University of Washington, developed the image recognition software using a TITAN X GPU and the cuDNN-accelerated Caffe deep learning framework for both training and inference. She presented her paper at this week’s SIGGRAPH 2016, and the search engine will be publicly available later this year.

Illustration of the system. The system gets as input a photo and a text query. The text query is used to search a web image engine. The retrieved photos are processed to compute a variety of face features and skin and hair masks, and ranked based on how well they match to the input photo. Finally, the input face is blended into the highest ranked candidates.

Dreambit is also able to predict what a child might look like when they are forty years old or with red hair, black hair, or even a shaved head.


“It’s hard to recognize someone by just looking at a face, because we as humans are so biased towards hairstyles and hair colors,” said Kemelmacher-Shlizerman. “With missing children, people often dye their hair or change the style so age-progressing just their face isn’t enough. This is a first step in trying to imagine how a missing person’s appearance might change over time.”

NVIDIA Deep Learning SDK Now Available

The NVIDIA Deep Learning SDK brings high-performance GPU acceleration to widely used deep learning frameworks such as Caffe, TensorFlow, Theano, and Torch. This powerful suite of tools and libraries lets data scientists design and deploy deep learning applications.

Following the Beta release a few months ago, the production release is now available with:

  • cuDNN 4 – Accelerate training and inference with batch normalization, tiled FFT, and NVIDIA Maxwell optimizations.
  • DIGITS 3 – Get support for Torch and pre-defined AlexNet and GoogLeNet models.
[Chart: cuDNN 4 speedup vs. CPU-only for batch = 1; AlexNet + Caffe, forward-pass convolutional layers; GeForce TITAN X vs. Core i7-4930 on Ubuntu 14.04 LTS]

Download now >>

Deep Learning for Computer Vision with MATLAB and cuDNN

Deep learning is becoming ubiquitous. With recent advancements in deep learning algorithms and GPU technology, we are able to solve problems once considered impossible in fields such as computer vision, natural language processing, and robotics.

Deep learning uses deep neural networks, which have been around for a few decades; what has changed in recent years is the availability of large labeled datasets and powerful GPUs. Neural networks are inherently parallel algorithms, and GPUs with thousands of cores can exploit this parallelism to dramatically reduce the computation time needed to train deep learning networks. In this post, I will discuss how you can use MATLAB to develop an object recognition system using deep convolutional neural networks and GPUs.

Pet detection and recognition system.

Why Deep Learning for Computer Vision?

Machine learning techniques use data (images, signals, text) to train a machine (or model) to perform a task such as image classification, object detection, or language translation. Classical machine learning techniques are still being used to solve challenging image classification problems. However, they don’t work well when applied directly to images, because they ignore the structure and compositional nature of images. Until recently, state-of-the-art techniques made use of feature extraction algorithms that extract interesting parts of an image as compact low-dimensional feature vectors. These were then used along with traditional machine learning algorithms.

Enter deep learning. Deep convolutional neural networks (CNNs), a specific type of deep learning algorithm, address the gaps in traditional machine learning techniques, changing the way we solve these problems. CNNs not only perform classification, but they can also learn to extract features directly from raw images, eliminating the need for manual feature extraction. For computer vision applications you often need more than just image classification; you need state-of-the-art computer vision techniques for object detection, a bit of domain expertise, and the know-how to set up and use GPUs efficiently. Through the rest of this post, I will use an object recognition example to illustrate how easy it is to use MATLAB for deep learning, even if you don’t have extensive knowledge of computer vision or GPU programming.

Example: Object Detection and Recognition

The goal in this example is to detect a pet in a video and correctly label the pet as a cat or a dog. To run this example, you will need MATLAB®, Parallel Computing Toolbox™, Computer Vision System Toolbox™ and Statistics and Machine Learning Toolbox™. If you don’t have these tools, request a trial at www.mathworks.com/trial. For this problem I used an NVIDIA Tesla K40 GPU; you can run it on any MATLAB compatible CUDA-enabled NVIDIA GPU.

Our approach involves two steps:

  1. Object Detection: “Where is the pet in the video?”
  2. Object Recognition: “Now that I know where it is, is it a cat or a dog?”

Figure 1 shows what the final result looks like.

Using a Pretrained CNN Classifier

The first step is to train a classifier that can classify images of cats and dogs. I could either:

  1. Collect a massive amount of cropped, resized and labeled images of cats and dogs in a reasonable amount of time (good luck!), or
  2. Use a model that has already been trained on a variety of common objects and adapt it for my problem.
Figure 2: Pretrained ImageNet model classifying the image of the dog as ‘beagle’.

For this example, I’m going to go with option (2), which is common in practice. To do that, I’m going to start with a pretrained CNN classifier that has been trained on the ImageNet dataset.

I will be using MatConvNet, a CNN package for MATLAB that uses the NVIDIA cuDNN library for accelerated training and prediction. [To learn more about cuDNN, see this Parallel Forall post.] Download and installation instructions for MatConvNet are available on its home page. Once I’ve installed MatConvNet on my computer, I can use the following MATLAB code to download and make predictions using the pretrained CNN classifier. Note: I also use the cnnPredict() helper function, which I’ve made available on GitHub.

%% Download and predict using a pretrained ImageNet model

% Setup MatConvNet
run(fullfile('matconvnet-1.0-beta15','matlab','vl_setupnn.m'));

% Download ImageNet model from MatConvNet pretrained networks repository
urlwrite('http://www.vlfeat.org/matconvnet/models/imagenet-vgg-f.mat', 'imagenet-vgg-f.mat');
cnnModel.net = load('imagenet-vgg-f.mat');

% Load and display an example image
imshow('dog_example.png');
img = imread('dog_example.png');

% Predict label using ImageNet trained vgg-f CNN model
label = cnnPredict(cnnModel,img);
title(label,'FontSize',20)

The pretrained CNN classifier works great out of the box at object classification. The CNN model is able to tell me that there is a beagle in the example image (Figure 2). While this is certainly a great starting point, our problem is a little different. I want to be able to (1) put a box around where the pet is (object detection) and then (2) label it accurately as a dog or a cat (classification). Let’s start by building a dog vs cat classifier from the pretrained CNN model.

Training a Dog vs. Cat Classifier

The objective is simple. I want to solve a simple classification task: given an image I’d like to train a classifier that can accurately tell me if it’s an image of a dog or a cat. I can do that easily with this pretrained classifier and a few dog and cat images.

To get a small collection of labeled images for this project, I went around my office asking colleagues to send me pictures of their pets. I segregated the images and put them into separate ‘cat’ and ‘dog’ folders under a parent folder called ‘pet_images’. The advantage of using this folder structure is that the imageSet function can automatically manage image locations and labels. I loaded them all into MATLAB using the following code.

%% Load images from folder
% Use imageSet to load images stored in pet_images folder
imset = imageSet('pet_images','recursive');

% Preallocate arrays with fixed size for prediction
imageSize = cnnModel.net.normalization.imageSize;
trainingImages = zeros([imageSize sum([imset(:).Count])],'single');

% Load and resize images for prediction, using a running index so images
% from later sets don't overwrite those from earlier ones
imCount = 0;
for ii = 1:numel(imset)
  for jj = 1:imset(ii).Count
      imCount = imCount + 1;
      trainingImages(:,:,:,imCount) = imresize(single(read(imset(ii),jj)),imageSize(1:2));
  end
end

% Get the image labels
trainingLabels = getImageLabels(imset);
summary(trainingLabels) % Display class label distribution
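
The getImageLabels helper isn’t shown in this post. Here is a minimal sketch of what it might look like, assuming (as the folder structure above suggests) that each imageSet’s Description property, which holds the folder name ‘cat’ or ‘dog’, serves as the class label:

function labels = getImageLabels(imset)
% Minimal sketch of the getImageLabels helper (the original is not shown
% in this post). Assumes each imageSet's Description property holds the
% folder name ('cat' or 'dog'), which becomes that set's class label.
labels = {};
for ii = 1:numel(imset)
    labels = [labels; repmat({imset(ii).Description},imset(ii).Count,1)];
end
labels = categorical(labels); % categorical labels work with summary and fitcsvm
end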

Feature Extraction using a CNN

What I’d like to do next is use this new dataset along with the pretrained ImageNet to extract features. As I mentioned earlier, CNNs can learn to extract generic features from images. These features can be used to train a new classifier to solve a different problem, like classifying cats and dogs in our problem.

CNN algorithms are compute-intensive and can be slow to run. Since they are inherently parallel algorithms, I can use GPUs to speed up the computation. Here is the code that performs the feature extraction using the pretrained model, and a comparison of multithreaded CPU (Intel Core i7-3770 CPU) and GPU (NVIDIA Tesla K40 GPU) implementations.

%% Extract features using pretrained CNN

% Depending on how much memory you have on your GPU you may use a larger
% batch size. I have 400 images, so I choose 200 as my batch size
cnnModel.info.opts.batchSize = 200;

% Make prediction on a CPU
[~, cnnFeatures, timeCPU] = cnnPredict(cnnModel,trainingImages,'UseGPU',false);
% Make prediction on a GPU
[~, cnnFeatures, timeGPU] = cnnPredict(cnnModel,trainingImages,'UseGPU',true);

% Compare the performance increase
bar([sum(timeCPU),sum(timeGPU)],0.5)
title(sprintf('Approximate speedup: %2.00f x ',sum(timeCPU)/sum(timeGPU)))
set(gca,'XTickLabel',{'CPU','GPU'},'FontSize',18)
ylabel('Time(sec)'), grid on, grid minor
Figure 3: Comparison of execution times for feature extraction using a CPU (left) and an NVIDIA Tesla K40 GPU (right).
Figure 4: The CPU and GPU time required to extract features from 1,128 images.

As you can see, the performance boost you get from using a GPU is significant: about 15x for this feature extraction problem.

The function cnnPredict is a wrapper around MatConvNet’s vl_simplenn prediction function. The highlighted line of code in Figure 5 is the only modification you need to make to run the prediction on a GPU. Functions like gpuArray in the Parallel Computing Toolbox make it easy to prototype your algorithms on a CPU and quickly switch to GPUs with minimal code changes.

Figure 5: The `gpuArray` and `gather` functions allow you to transfer data from the MATLAB workspace to the GPU and back.
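
Since Figure 5’s code isn’t reproduced here, the following is a minimal sketch of the pattern it illustrates; the variable names are mine, not the figure’s:

% Minimal sketch of the gpuArray/gather pattern (illustrative names).
img    = single(imread('dog_example.png'));
imgGPU = gpuArray(img);                             % transfer data to the GPU
outGPU = imfilter(imgGPU,fspecial('gaussian',5,2)); % filtering runs on the GPU
out    = gather(outGPU);                            % bring the result back to the workspace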

Train a Classifier Using CNN Features

With the features I extracted in the previous step, I’m now ready to train a “shallow” classifier. To train and compare multiple models interactively, I can use the Classification Learner app in the Statistics and Machine Learning Toolbox. Note: for an introduction to machine learning and classification workflows in MATLAB, check out this Machine Learning Made Easy webinar.

Next, I will directly train an SVM classifier using the extracted features by calling the fitcsvm function using cnnFeatures as the input or predictors and trainingLabels as the output or response values. I will also cross-validate the classifier to test its validation accuracy. The validation accuracy is an unbiased estimate of how the classifier would perform in practice on unseen data.

%% Train a classifier using extracted features

% Here I train a linear support vector machine (SVM) classifier.
svmmdl = fitcsvm(cnnFeatures,trainingLabels);

% Perform crossvalidation and check accuracy
cvmdl = crossval(svmmdl,'KFold',10);
fprintf('kFold CV accuracy: %2.2f\n',1-cvmdl.kfoldLoss)

svmmdl is my classifier that I can now use to classify an image as a cat or a dog.
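
As a quick sanity check, the pipeline can classify a single new image. This snippet is illustrative: ‘new_pet.png’ is a placeholder file, and it reuses cnnModel, imageSize and the cnnPredict helper from above.

% Illustrative end-to-end check on one image ('new_pet.png' is a placeholder).
newImg = imresize(single(imread('new_pet.png')),imageSize(1:2));
[~, features] = cnnPredict(cnnModel,newImg,'UseGPU',true,'display',false); % CNN features
label = predict(svmmdl,features) % returns 'cat' or 'dog'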

Object Detection

Most images and video frames have a lot going on in them. In addition to a dog, there may be a tree, or a raccoon chasing the dog. Even a great image classifier, like the one I built in the previous step, will only work well if I can locate the object of interest in an image (the dog or cat), crop it, and then feed it to the classifier. The step of locating the object is called object detection.

For object detection, I will use a technique called optical flow, which uses the motion of pixels in a video from frame to frame. Figure 6 shows a single frame of video with the motion vectors overlaid.

Figure 6: A single frame of video with motion vectors overlaid (left) and magnitude of the motion vectors (right).

The next step in the detection process is to separate out pixels that are moving, and then use the Image Region Analyzer app to analyze the connected components in the binary image to filter out the noisy pixels caused by the camera motion. The output of the app is a MATLAB function (I’m going to call it findPet) that can locate where the pet is in the field of view.
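
The app-generated findPet function isn’t listed in the post. A minimal sketch of the same idea, assuming a fixed threshold on the optical flow magnitude followed by connected-component filtering, might look like this (the threshold and minimum blob size are illustrative values):

function bboxes = findPet(frameGray,opticFlow)
% Sketch of a findPet-style detector (the app-generated function is not
% shown in the post): threshold the optical flow magnitude, clean up the
% binary mask, and return bounding boxes around the moving regions.
flow = estimateFlow(opticFlow,frameGray);  % per-pixel motion vectors
moving = flow.Magnitude > 1;               % keep moving pixels (illustrative threshold)
moving = bwareaopen(moving,100);           % drop small, noisy blobs from camera motion
stats = regionprops(moving,'BoundingBox'); % analyze connected components
% Scale boxes back up, since detection runs on a 0.25x-resized frame
bboxes = round(4*vertcat(stats.BoundingBox));
end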

Tying the Workflow Together

I now have all the pieces I need to build a pet detection and recognition system.

To quickly recap, I can:

  • Detect the location of the pet in new images;
  • Crop the pet from the image and extract features using a pretrained CNN;
  • Classify the features using an SVM classifier.

Pet Detection and Recognition

Tying all these pieces together, the following code shows my complete MATLAB pet detection and recognition system.

%% Tying the workflow together
vr = VideoReader(fullfile('PetVideos','videoExample.mov'));
vw = VideoWriter('test.avi','Motion JPEG AVI');
opticFlow = opticalFlowFarneback;
frameNumber = 0; % initialize the frame counter before the loop
open(vw);

while hasFrame(vr)
    % Count frames
    frameNumber = frameNumber + 1;

    % Step 1. Read Frame
    videoFrame = readFrame(vr);

    % Step 2. Detect ROI
    vFrame = imresize(videoFrame,0.25);    % Downsize the video frame
    frameGray = rgb2gray(vFrame);          % Convert to grayscale for detection
    bboxes = findPet(frameGray,opticFlow); % Find bounding boxes
    if ~isempty(bboxes)
        img = zeros([imageSize size(bboxes,1)]);
        for ii = 1:size(bboxes,1)
            img(:,:,:,ii) = imresize(imcrop(videoFrame,bboxes(ii,:)),imageSize(1:2));
        end

        % Step 3. Recognize object
        % (a) Extract features using a CNN
        [~, scores] = cnnPredict(cnnModel,img,'UseGPU',true,'display',false);

        % (b) Predict using the trained SVM Classifier
        label = predict(svmmdl,scores);

        % Step 4. Annotate object
        videoFrame = insertObjectAnnotation(videoFrame,'Rectangle',bboxes,cellstr(label),'FontSize',40);
    end

    % Step 5. Write video to file
    writeVideo(vw,videoFrame);

    fprintf('Frames processed: %d of %d\n',frameNumber,ceil(vr.FrameRate*vr.Duration));
end
close(vw);

Conclusion

Solutions to real-world computer vision problems often require tradeoffs depending on your application: performance, accuracy, and simplicity of the solution. Advances in techniques such as deep learning have significantly raised the bar in terms of accuracy on tasks like visual recognition, but the computational cost had been too high for mainstream adoption. GPU technology has closed this gap, accelerating training and prediction speeds by orders of magnitude.

MATLAB makes computer vision with deep learning much more accessible. The combination of an easy-to-use application and programming environment, a complete library of standard computer vision and machine learning algorithms, and tightly integrated support for CUDA-enabled GPUs makes MATLAB an ideal platform for designing and prototyping computer vision solutions.


