Deep Learning For Smart Cities

July 5, 2016 ~ February 22, 2022 ~ Prateek Joshi

In recent years, technological advancements in hardware, software, and embedded systems have enabled billions of smart devices to be connected to the internet. This ecosystem is collectively referred to as the Internet of Things. A lot of people are actively migrating to cities, which means essential resources are going to get scarcer. Cities will have to manage infrastructure like water, power, and transport very effectively if they want to support everybody. But how do we do that? The data being collected varies so much in quality and format that it becomes very difficult to use effectively. How can we make good use of the data being collected by connected sensors?

Deep Learning For Sequential Data – Part V: Handling Long Term Temporal Dependencies

May 31, 2016 ~ February 22, 2022 ~ Prateek Joshi ~ 1 Comment

In the previous blog post, we learnt why we cannot use regular backpropagation to train a Recurrent Neural Network (RNN). We discussed how we can use backpropagation through time to train an RNN. The next step is to understand how exactly the RNN can be trained. Does the unrolling strategy work in practice? If we can just unroll an RNN and make it into a feedforward neural network, then what’s so special about the RNN in the first place? Let’s see how we tackle these issues.

Deep Learning For Sequential Data – Part IV: Training Recurrent Neural Networks

May 24, 2016 ~ February 22, 2022 ~ Prateek Joshi

In the previous blog post, we learnt how Recurrent Neural Networks (RNNs) can be used to build deep learning models for sequential data. Building a deep learning model involves many steps, and training is one of the most important. We should be able to train a model in a robust way in order to use it for inference. The training process needs to be trackable and it should converge in a reasonable amount of time. So how do we train RNNs? Can we just use the regular techniques that are used for feedforward neural networks?

Deep Learning For Sequential Data – Part III: What Are Recurrent Neural Networks

May 17, 2016 ~ February 22, 2022 ~ Prateek Joshi

In the previous two blog posts, we discussed why Hidden Markov Models and Feedforward Neural Networks are restrictive. If we want to build a good sequential data model, we should give more freedom to our learning model to understand the underlying patterns. This is where Recurrent Neural Networks (RNNs) come into the picture. One of the biggest restrictions of Convolutional Neural Networks is that they force us to operate on fixed-size input data. RNNs know how to operate on sequences of data, which is exciting because it opens up a lot of possibilities! What are RNNs? How do we construct them?
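To make the point about sequence length concrete, here is a minimal sketch of a single-unit recurrent network in plain Python. It is not taken from the post itself: the function name `rnn_forward` and the weight values `w_x`, `w_h`, and `b` are made up for illustration, and the weights are fixed rather than learned. The only thing it is meant to show is that the same small set of weights is reused at every time step, so the input can be of any length.

```python
import math

def rnn_forward(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Run a one-unit RNN over a sequence of any length.

    The weights are illustrative constants, not trained values.
    Returns the hidden state after each time step.
    """
    h = 0.0  # initial hidden state
    states = []
    for x in inputs:
        # The new state depends on the current input and the previous state.
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states

# The same function handles sequences of different lengths.
short_states = rnn_forward([1.0, 0.5])
long_states = rnn_forward([1.0, 0.5, -0.3, 0.2, 0.9])
print(len(short_states), len(long_states))
```

A feedforward network would need a separate input slot for every element, which is why its input size has to be fixed in advance.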

Deep Learning For Sequential Data – Part I: Why Do We Need It

May 3, 2016 ~ February 22, 2022 ~ Prateek Joshi ~ 4 Comments

Most of the current research on deep learning is focused on images. Deep learning is being actively applied to many areas, but image recognition is definitely generating a lot of buzz. Deep neural networks are being used for image classification tasks and they are able to outperform all the other approaches by a big margin. The networks that are used here are traditional feedforward neural networks that learn how to classify data by generating the optimal feature representation. These neural networks are severely limited when it comes to sequential data. Time series data is perhaps the most popular form of sequential data. Why can’t we use feedforward neural networks to analyze sequential data?

How To Extract Feature Vectors From Deep Neural Networks In Python Caffe

April 26, 2016 ~ April 20, 2016 ~ Prateek Joshi ~ 6 Comments

Convolutional Neural Networks are great at identifying all the information that makes an image distinct. When we train a deep neural network in Caffe to classify images, we specify a multilayered neural network with different types of layers like convolution, rectified linear unit, softmax loss, and so on. The last layer is the output layer that gives us the output tag with the corresponding confidence value. But sometimes it’s useful to extract the feature vectors from various layers and use them for other purposes. Let’s see how to do it in Python Caffe, shall we?

How To Programmatically Create A Deep Neural Network In Python Caffe

April 19, 2016 ~ April 4, 2016 ~ Prateek Joshi ~ 2 Comments

When you are working with Caffe, you need to define your deep neural network architecture in a ‘.prototxt’ file. These prototxt files usually consist of hundreds of lines, defining layers and corresponding parameters. Before you start training your neural network, you need to create these files and define your architecture. One way to do this is to write all these lines into a file manually. But sometimes, it’s beneficial to create this architecture dynamically depending on our needs. In such cases, creating a deep neural network programmatically can be very useful. Let’s go ahead and see how to do it in Python Caffe, shall we?
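As a rough illustration of the idea, the sketch below builds prototxt text with plain Python string formatting. It does not use Caffe's own Python interface and is not necessarily the approach the full post takes. The helper names `layer_block` and `build_net` are hypothetical, the layer sizes are arbitrary, and the output is only a fragment: it assumes a blob named "data" is defined elsewhere in the network.

```python
def layer_block(name, layer_type, bottom, top, params=None):
    """Return the prototxt text for one layer (hypothetical helper)."""
    lines = [
        "layer {",
        f'  name: "{name}"',
        f'  type: "{layer_type}"',
        f'  bottom: "{bottom}"',
        f'  top: "{top}"',
    ]
    if params:
        param_name, fields = params
        lines.append(f"  {param_name} {{")
        for key, value in fields.items():
            lines.append(f"    {key}: {value}")
        lines.append("  }")
    lines.append("}")
    return "\n".join(lines)

def build_net(num_conv_layers, num_output=16, kernel_size=3):
    """Stack the requested number of convolution + ReLU pairs."""
    blocks = []
    bottom = "data"  # assumed to be produced by an input layer defined elsewhere
    for i in range(1, num_conv_layers + 1):
        conv = f"conv{i}"
        conv_params = ("convolution_param",
                       {"num_output": num_output, "kernel_size": kernel_size})
        blocks.append(layer_block(conv, "Convolution", bottom, conv, conv_params))
        # ReLU is applied in place, so its bottom and top are the same blob.
        blocks.append(layer_block(f"relu{i}", "ReLU", conv, conv))
        bottom = conv
    return "\n".join(blocks)

print(build_net(2))
```

Changing the argument to `build_net` changes the depth of the generated architecture, which is the kind of flexibility that is tedious to get by editing the file by hand.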

What Is Local Response Normalization In Convolutional Neural Networks

April 5, 2016 ~ April 7, 2016 ~ Prateek Joshi ~ 12 Comments

Convolutional Neural Networks (CNNs) have been doing wonders in the field of image recognition in recent times. A CNN is a type of deep neural network in which the layers are connected using spatially organized patterns. This is in line with how the human visual cortex processes image data. Researchers have been working on coming up with better architectures over the last few years. In this blog post, we will discuss a particular type of layer that has been used consistently across many famous architectures. This layer is called the Local Response Normalization layer and it plays an important role. What does it do? What’s the advantage of having this in our network?
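For readers who want a concrete picture before reading the full post, here is a small sketch of one common formulation of this layer, the across-channel form described in the AlexNet paper. Each activation is divided by a term that grows with the squared activations of its neighbouring channels at the same spatial position. The function name `local_response_norm` is hypothetical, the default values for `n`, `k`, `alpha`, and `beta` are the ones reported in that paper, and other frameworks may scale `alpha` differently.

```python
def local_response_norm(activations, n=5, k=2.0, alpha=1e-4, beta=0.75):
    """Across-channel LRN at a single spatial position.

    activations: one value per channel at the same (x, y) location.
    n: number of neighbouring channels in the normalization window.
    """
    num_channels = len(activations)
    half = n // 2
    normalized = []
    for i, a in enumerate(activations):
        # Clip the window at the first and last channel.
        lo = max(0, i - half)
        hi = min(num_channels - 1, i + half)
        sq_sum = sum(activations[j] ** 2 for j in range(lo, hi + 1))
        normalized.append(a / (k + alpha * sq_sum) ** beta)
    return normalized

# One strongly active channel surrounded by weaker ones.
print(local_response_norm([0.1, 0.2, 9.0, 0.2, 0.1]))
```

Because the denominator includes the squared responses of nearby channels, a channel sitting next to a very strong neighbour is scaled down more than one with quiet neighbours.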

Understanding Xavier Initialization In Deep Neural Networks

March 29, 2016 ~ April 7, 2016 ~ Prateek Joshi ~ 6 Comments

I recently stumbled upon an interesting piece of information when I was working on deep neural networks. I started thinking about initialization of network weights and the theory behind it. Does the image to the left make sense now? The guy in that picture is lifting “weights” and we are talking about network “weights”. Anyway, when we implement convolutional neural networks, we tend to utilize all the knowledge and research available out there. A good number of things in deep learning are based on heuristics! It’s worth exploring why we do things in a certain way whenever it’s possible. This goes a long way in unlocking the hidden mysteries of deep learning and why it’s so unbelievably accurate. Let’s go ahead and understand how network weights are initialized, shall we?
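As a preview of the idea, the sketch below draws weights using the uniform variant of Xavier (Glorot) initialization from Glorot and Bengio's paper, where the variance of each weight is 2 / (fan_in + fan_out). The function name `xavier_uniform` is hypothetical, and frameworks differ in which fan count they use by default, so treat this as one version of the scheme rather than the definitive one.

```python
import math
import random

def xavier_uniform(fan_in, fan_out, rng=random):
    """Return a fan_in x fan_out weight matrix as nested lists.

    Samples from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)),
    which gives each weight a variance of 2 / (fan_in + fan_out).
    """
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]

weights = xavier_uniform(200, 100, rng=random.Random(0))
flat = [w for row in weights for w in row]
mean = sum(flat) / len(flat)
variance = sum((w - mean) ** 2 for w in flat) / len(flat)
print("target variance:", 2.0 / (200 + 100))
print("sample variance:", variance)
```

The scale shrinks as layers get wider, which keeps the size of the signal roughly stable as it passes from one layer to the next.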

Deep Learning With Caffe In Python – Part IV: Classifying An Image

February 23, 2016 ~ April 7, 2016 ~ Prateek Joshi ~ 11 Comments

In the previous blog post, we learnt how to train a convolutional neural network (CNN). One of the most popular use cases for a CNN is to classify images. Once the CNN is trained, we need to know how to use it to classify an unknown image. The trained model files will be stored as “caffemodel” files, so we need to load those files, preprocess the input images, and then extract the output tags for those images. In this post, we will see how to load those trained model files and use them to classify an image. Let’s go ahead and see how to do it, shall we?