Vadym Fedorov is a Solutions Architect at SoftServe, a leading global software application development and consulting company, and a regular blogger on the SoftServe United blog. Vadym has 12 years' experience in enterprise application development, as well as 2 years' experience in cloud and operations optimization.
When talking about the Internet of Things (IoT), we mean thousands of different sensors deployed in different geographic locations and constantly producing streams of data. These streams need to be accepted, stored, processed, and visualized so that valuable knowledge can be mined from the input data. Based on this, we can represent a reference solution architecture as the following tiers:
Devices (sensors) produce a data stream that is accepted and stored by the Access Gateway layer. The Event Processing and Analytics layer processes the data and delivers the results to the Visualization layer, where the information is available to end users in a suitable format. The architecture seems clear; however, there are certain challenges:
- High velocity of the input data stream;
- High volume of the data;
- User expectations, for example getting reports in a timely manner.
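The tiers described above can be sketched as a small in-memory pipeline. This is only an illustration under stated assumptions: the names `Reading`, `AccessGateway`, `process`, `visualize`, and `run_pipeline` are hypothetical, the gateway stores data in a plain list, and the "analytics" is a simple per-sensor average. A real system would put a message broker, scalable storage, and a processing framework behind each of these steps.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Reading:
    """One message produced by a device (hypothetical shape)."""
    sensor_id: str
    timestamp: float  # seconds since epoch
    value: float


class AccessGateway:
    """Access Gateway tier: accepts readings and stores them (in memory here)."""

    def __init__(self):
        self._store = []

    def accept(self, reading):
        self._store.append(reading)

    def stored(self):
        return list(self._store)


def process(readings):
    """Event Processing and Analytics tier: average value per sensor."""
    by_sensor = defaultdict(list)
    for reading in readings:
        by_sensor[reading.sensor_id].append(reading.value)
    return {sensor: mean(values) for sensor, values in by_sensor.items()}


def visualize(averages):
    """Visualization tier: plain-text report, one line per sensor."""
    return "\n".join(
        f"{sensor}: {avg:.2f}" for sensor, avg in sorted(averages.items())
    )


def run_pipeline(readings):
    """Push readings through all tiers and return the final report."""
    gateway = AccessGateway()
    for reading in readings:  # Devices tier: sensors emit readings
        gateway.accept(reading)
    return visualize(process(gateway.stored()))
```

Each tier only depends on the output of the previous one, which is what makes it possible to swap the in-memory pieces for scalable components once the velocity and volume challenges listed above start to matter.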
Software providers and the open source community apply a wide range of tools to deal with these challenges:
- Servers, application frameworks, and message brokers like Apache Kafka, which help implement communication with devices and message gathering;
- Highly scalable storage such as MongoDB, Elasticsearch, HDFS, or VoltDB, which can hold high data volumes and return data to consumers quickly;
- Processing and analytics frameworks like Apache Storm, Apache Hadoop MapReduce, or Apache Spark;
- Visualization tools like D3.js, Kibana, MicroStrategy, Tableau, R, or Excel.
All these tools are effective; however, the infrastructure and software still have to be deployed and configured. The architecture drivers require compute, storage, and network resources, plus considerable effort for valid capacity planning and infrastructure configuration. This issue is part of another eternal dilemma: Capex vs. Opex, or Rent vs. Buy. When you need quick access to compute, storage, and network resources, it is smart to use cloud solutions that give customers:
- Access to scalable compute and storage resources;
- Out-of-the-box IoT and Big Data-oriented services, which eliminate the time spent on infrastructure design and implementation;
- Specialized services that can handle the high data volumes and high input velocity typical of the IoT. They help reduce time to value and shift developers' focus from the infrastructure to the software solution;
- Predictable cost of ownership through a transparent pay-per-use pricing model, and the ability to manage resource usage online, if needed.
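To make the velocity, volume, and pay-per-use points concrete, here is a back-of-the-envelope estimate in Python. Everything in it is an assumption for illustration: the function name `monthly_estimate`, the two-part pricing model (a price per million messages plus a price per stored GB-month), and all numbers in the usage note are hypothetical and do not reflect any provider's actual rates.

```python
def monthly_estimate(sensors, msgs_per_sensor_per_sec, msg_bytes,
                     price_per_million_msgs, price_per_gb_month, days=30):
    """Estimate monthly message count, stored volume, and pay-per-use cost.

    Prices are hypothetical inputs, not any provider's real rates.
    """
    seconds = days * 24 * 60 * 60
    messages = sensors * msgs_per_sensor_per_sec * seconds
    gigabytes = messages * msg_bytes / 1e9  # decimal gigabytes
    cost = (messages / 1e6) * price_per_million_msgs \
        + gigabytes * price_per_gb_month
    return {"messages": messages, "gigabytes": gigabytes, "cost": cost}


# Hypothetical workload: 5,000 sensors, 1 message per second, 200 bytes each.
estimate = monthly_estimate(
    sensors=5000,
    msgs_per_sensor_per_sec=1,
    msg_bytes=200,
    price_per_million_msgs=0.05,  # assumed price, for illustration only
    price_per_gb_month=0.02,      # assumed price, for illustration only
)
```

With these assumed inputs, the workload comes to roughly 13 billion messages and about 2.6 TB of raw data per month. Because cost scales linearly with usage in such a model, changing the number of sensors or the message rate changes the estimate proportionally, which is what makes the cost predictable.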