The Internet of Things (IoT) has emerged as one of the hottest trends in the technology world. It has the potential to radically change the way we experience life. It will have a particularly large impact on the industrial world, where we have to deal with massive machines, buildings, and open fields. Industrial technologies have a direct impact on some of the most pressing problems facing humanity, like water shortage, energy consumption, infrastructure management, and so on. When we apply IoT methodologies to the industrial world, it is called Industrial IoT. There has been a lot of discussion as to what exactly it is. Is it a technology? Is it a collection of things? More importantly, there has been a lot of misinformation around it. Let’s go ahead and dissect it, shall we?
What exactly is it
If you read IoT articles online, you will come across a bunch of definitions as to what IoT is all about. Some people say it is a collection of objects connected to the internet, while some people say it is a technology. To be honest, those definitions are quite reductive! In reality, IoT is way too big to be defined in such narrow terms. IoT is an ecosystem with a large number of moving parts.
One of the main goals of IoT is to connect every device to the internet. These devices can be sensors, actuators, phones, and so on. Due to advances in hardware, we can manufacture these devices at a very low cost. For example, equipment in manufacturing plants needs to be monitored at all times. The physical conditions near these machines don’t allow humans to go near them. So how do we monitor them? A good way to solve this problem is to install a bunch of sensors that are wirelessly connected to the internet. These sensors monitor the state of the equipment and you can keep track of it from your comfortable office. Once you get the data to the cloud, you can analyze it and extract insights. This is a very simple use case of Industrial IoT.
What about the actual architecture
On the surface, it looks fairly straightforward. Right? You build a small sensor, connect it wirelessly to the internet, and you are done! If you look underneath, you will realize that it is backed by a very complex and sophisticated architecture that enables data to trickle through all the layers seamlessly. Between the “thing” (i.e. the sensor) and the nice charts that you see on your laptop, there are many layers that control the flow of data. Let’s take a look:
Why are there so many layers
This is what it takes even for a simple use case like a temperature sensor sending data to your laptop. The journey of a piece of data starts at the “thing” layer, and it goes through many layers before becoming a nice colorful chart on your laptop. Let’s see what each layer does:
Thing: A “thing” is usually a device that acts as the point of contact with the physical world. This device converts physical phenomena into numbers. This device can also be an actuator that takes automatic actions based on the data that comes back from the cloud.
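To make "converts physical phenomena into numbers" concrete, here is a minimal sketch of what the firmware on a temperature "thing" might do. The 10-bit converter, the 3.3 V reference, and the sensor response (0.5 V at 0 °C, 10 mV per degree) are assumptions chosen for illustration, and adc_to_celsius is a hypothetical helper name.

```python
ADC_MAX = 1023          # assumed 10-bit analog-to-digital converter
V_REF = 3.3             # assumed reference voltage, in volts
OFFSET_VOLTS = 0.5      # assumed sensor output at 0 degrees Celsius
VOLTS_PER_DEGREE = 0.01 # assumed sensor slope: 10 mV per degree


def adc_to_celsius(raw: int) -> float:
    """Convert a raw ADC count into a temperature in degrees Celsius."""
    if not 0 <= raw <= ADC_MAX:
        raise ValueError(f"raw reading {raw} is outside 0..{ADC_MAX}")
    volts = raw * V_REF / ADC_MAX
    return (volts - OFFSET_VOLTS) / VOLTS_PER_DEGREE


if __name__ == "__main__":
    for raw in (155, 232, 310):
        print(raw, round(adc_to_celsius(raw), 1))
```

A real device would read the raw count from its hardware; the conversion step stays the same in spirit.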
Gateway: A gateway is a point in a computer network that acts as an entrance to another network. It is required to connect devices with other systems. The gateway is one of the most crucial components in this stack because it is responsible for connectivity, manageability, and security of data. Everything that comes from and goes to a connected device has to go through this gateway. These gateways can also handle local computing and data preprocessing to control the amount of data that has to go to the cloud. This is helpful in conserving precious bandwidth.
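One common way a gateway can cut down on traffic is a deadband filter: it forwards a reading only when it differs enough from the last value it sent. The sketch below is one possible preprocessing step, not the only one, and the function name and threshold are hypothetical.

```python
def deadband_filter(readings, threshold):
    """Return only the readings that moved at least `threshold`
    away from the last forwarded value."""
    forwarded = []
    last_sent = None
    for value in readings:
        if last_sent is None or abs(value - last_sent) >= threshold:
            forwarded.append(value)
            last_sent = value
    return forwarded


if __name__ == "__main__":
    raw = [20.0, 20.1, 20.2, 21.0, 21.1, 25.0]
    # Only three of the six readings would be sent to the cloud.
    print(deadband_filter(raw, threshold=0.5))
```

The trade-off is resolution: a larger threshold saves more bandwidth but hides small fluctuations from the cloud.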
Connectivity: This layer controls how the device is physically connected (either wired or wireless). There are a few different standards that govern the connectivity. Most modern devices use wireless connectivity unless the conditions are harsh, as in an industrial manufacturing plant. Wired connectivity includes interfaces like RS-485, USB, RJ45, and so on.
Link Protocol: This layer determines which protocol is used to link the device to the network. It tends to be centered on protocols like Wi-Fi, CDMA, Bluetooth, Ethernet, and so on.
Internet Protocol: The internet layer controls how each device is identified across the network. We currently live in the world of IPv4 and we are quickly moving towards IPv6. IPv4 addresses are 32 bits long, which allows roughly 4.3 billion unique addresses, while IPv6 addresses are 128 bits long. Given the explosion in the number of devices, it is safe to say that IPv6 is the natural fit for this layer.
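The difference in address space is easy to check with Python's standard ipaddress module. The device address below comes from the 2001:db8::/32 range reserved for documentation and is only a placeholder.

```python
import ipaddress

# Total number of addresses in each protocol's full address space.
ipv4_total = ipaddress.ip_network("0.0.0.0/0").num_addresses
ipv6_total = ipaddress.ip_network("::/0").num_addresses

# A placeholder device address from the documentation-only range.
device = ipaddress.ip_address("2001:db8::1")

if __name__ == "__main__":
    print("IPv4 addresses:", ipv4_total)
    print("IPv6 addresses:", ipv6_total)
    print("Device uses IPv%d" % device.version)
```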
Communication: There are a large number of devices communicating with each other. These devices are constrained in computational resources, so we need to make sure the messaging protocol is lightweight and has a small footprint. The communication layer takes care of this using modern messaging protocols like MQTT and AMQP.
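To get a feel for how small the footprint can be, the sketch below hand-builds an MQTT 3.1.1 PUBLISH packet at QoS 0 using only the standard library. It is meant for illustration only (a real device would use an MQTT client library), and the topic name and payload are hypothetical.

```python
import struct


def encode_remaining_length(n: int) -> bytes:
    """Encode a length with MQTT's variable-length scheme (7 bits per byte)."""
    if not 0 <= n <= 268_435_455:
        raise ValueError("length out of range for MQTT")
    out = bytearray()
    while True:
        byte = n % 128
        n //= 128
        if n > 0:
            byte |= 0x80  # continuation bit: more length bytes follow
        out.append(byte)
        if n == 0:
            return bytes(out)


def build_publish(topic: str, payload: bytes) -> bytes:
    """Build a QoS 0 PUBLISH packet (no packet identifier needed)."""
    topic_bytes = topic.encode("utf-8")
    body = struct.pack("!H", len(topic_bytes)) + topic_bytes + payload
    # 0x30 = packet type 3 (PUBLISH) with DUP, QoS, and RETAIN flags all zero.
    return bytes([0x30]) + encode_remaining_length(len(body)) + body


if __name__ == "__main__":
    topic, payload = "plant/press1/temp", b"71.5"
    packet = build_publish(topic, payload)
    overhead = len(packet) - len(topic) - len(payload)
    print(len(packet), "bytes total,", overhead, "bytes of protocol overhead")
```

For a short message like this one, the protocol adds only a handful of bytes on top of the topic and the payload.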
Data Ingestion: This is where handling large amounts of data becomes really critical. Up until this layer, we were just dealing with passing the data along. The data ingestion layer usually has to deal with millions of devices communicating in parallel all the time. These devices send data continuously and that data needs to be ingested in real time. The data ingestion layer handles all this using Apache tools like Kafka, Spark Streaming, Flume, Storm, and so on. The exact tool depends on the use case, but it’s crucial to the infrastructure.
Data Storage: All the data coming from devices is really valuable, so we need to store it efficiently. One of the most important characteristics of this data is that it is temporal. Instead of using traditional databases like MySQL, PostgreSQL, or MongoDB, which don’t take advantage of the temporal nature of our data, we need to use a time series database like InfluxDB or OpenTSDB, which is designed to store and query it efficiently.
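The core idea behind exploiting temporal order can be shown with a toy in-memory store. Because points arrive sorted by time, a time-range query becomes two binary searches instead of a full scan. This is a simplified sketch and not how InfluxDB or OpenTSDB are implemented; the class name is hypothetical.

```python
import bisect


class TimeSeries:
    """Append-only series kept sorted by timestamp."""

    def __init__(self):
        self._timestamps = []
        self._values = []

    def append(self, timestamp, value):
        # Assumes points arrive in time order, which is typical for sensors.
        if self._timestamps and timestamp < self._timestamps[-1]:
            raise ValueError("out-of-order point")
        self._timestamps.append(timestamp)
        self._values.append(value)

    def range(self, start, end):
        """Return all (timestamp, value) pairs with start <= timestamp <= end."""
        lo = bisect.bisect_left(self._timestamps, start)
        hi = bisect.bisect_right(self._timestamps, end)
        return list(zip(self._timestamps[lo:hi], self._values[lo:hi]))


if __name__ == "__main__":
    series = TimeSeries()
    for ts, temp in [(0, 70.1), (10, 70.4), (20, 71.0), (30, 71.8), (40, 72.5)]:
        series.append(ts, temp)
    print(series.range(10, 30))
```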
Data Access: This layer deals with accessing the data that’s stored in the databases. This layer is important for tasks like visualizing data, extracting stats, and so on. This layer doesn’t necessarily deal with any kind of intelligence. It’s important to use databases whose query interface can take advantage of the temporal nature of the data.
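A typical data access query for charts is downsampling: grouping raw points into fixed time buckets and returning one average per bucket. The sketch below shows the idea in plain Python; time series databases usually offer this as a built-in query feature. The function name and the bucket size are hypothetical.

```python
from collections import defaultdict


def downsample_mean(points, bucket_seconds):
    """Average (timestamp, value) points into fixed-width time buckets.

    Returns a dict mapping each bucket's start time to its mean value.
    """
    buckets = defaultdict(list)
    for timestamp, value in points:
        bucket_start = timestamp - timestamp % bucket_seconds
        buckets[bucket_start].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


if __name__ == "__main__":
    points = [(0, 10.0), (30, 20.0), (60, 30.0), (90, 50.0)]
    print(downsample_mean(points, bucket_seconds=60))
```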
Intelligence layer: This is where all the magic is! The insights you derive from your data are what’s going to drive all the revenue. This layer needs to analyze data, extract patterns, and then provide insights. Instead of using traditional machine learning techniques like Support Vector Machines or Random Forests, we need to use a model that will take advantage of the temporal nature of data. This is where Recurrent Neural Networks become really relevant.
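What makes a recurrent network suited to temporal data is its hidden state, which carries information from earlier readings forward in time. The sketch below runs a single-unit recurrent cell over a sequence. The weights are hand-picked for illustration rather than trained, and a real model would use many units and a deep learning library.

```python
import math


def rnn_step(x, h_prev, w_xh, w_hh, b):
    """One recurrent step: mix the new input with the previous hidden state."""
    return math.tanh(w_xh * x + w_hh * h_prev + b)


def run_rnn(sequence, w_xh=0.5, w_hh=0.8, b=0.0):
    """Run the cell over a sequence and return the hidden state at each step."""
    h = 0.0
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_xh, w_hh, b)
        states.append(h)
    return states


if __name__ == "__main__":
    # The second input is zero, yet the state stays non-zero:
    # the cell still "remembers" the first reading.
    print(run_rnn([1.0, 0.0]))
```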
Application: This is the interface through which the end user will interact with your tech stack. To take a simple use case, let’s say the end user is a floor manager who wants to monitor the temperature of a piece of equipment in a manufacturing plant. He wants to see the graph on his smartphone and get a notification when the temperature goes above a certain threshold. In this case, the application is a mobile app that will provide the required functionality.
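The notification logic in such an app can be sketched in a few lines. This version alerts only when the temperature crosses the threshold from below, so the floor manager is not flooded with repeated messages while the machine stays hot. The function name and the threshold of 80 degrees are hypothetical.

```python
def threshold_alerts(readings, threshold):
    """Return the (timestamp, temperature) points where the temperature
    rises above `threshold` after being at or below it."""
    alerts = []
    was_above = False
    for timestamp, temperature in readings:
        is_above = temperature > threshold
        if is_above and not was_above:
            alerts.append((timestamp, temperature))
        was_above = is_above
    return alerts


if __name__ == "__main__":
    readings = [(0, 70), (1, 82), (2, 85), (3, 78), (4, 81)]
    for timestamp, temperature in threshold_alerts(readings, threshold=80):
        print(f"t={timestamp}: temperature {temperature} exceeded the limit")
```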