Introduction to Big Data
Big Data might well be the next big thing in the IT world. It emerged in the first decade of the 21st century, and the first organizations to adopt it were online and startup firms. Curiosity about what big data is has soared in the past few years.
What is Big Data?
Big data refers to the large amounts of data that are generated by various data sources and arrive in different formats. Large volumes of data were stored in databases before as well, but because this data is so varied, traditional relational database systems cannot handle it.
Big data is much more than a collection of data sets with different formats. It is an important asset that organizations can analyze to gain insights and business benefits.
Characteristics of Big Data (Five V’s of Big Data)
Volume refers to the enormous amount of information generated every second from social media, cell phones, cars, credit cards, M2M sensors, images, videos, and more.
Today, distributed systems store this information in several locations, and a software framework such as Hadoop brings it together. Facebook alone is reported to generate billions of messages, record about 4.5 billion clicks of the Like button, and receive over 350 million new posts every day. Data on this scale can be handled only by big data technologies.
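The idea of storing data in several locations and then bringing the results together can be sketched in plain Python. This is only an illustration of the split-and-combine pattern, not Hadoop's API: the node count, the helper names (`partition`, `count_actions`, `combine`), and the sample records are all hypothetical.

```python
from collections import Counter
import zlib

NUM_NODES = 3  # hypothetical number of storage locations


def partition(records, num_nodes=NUM_NODES):
    """Spread records over nodes using a stable hash of the user name."""
    nodes = [[] for _ in range(num_nodes)]
    for user, action in records:
        index = zlib.crc32(user.encode("utf-8")) % num_nodes
        nodes[index].append((user, action))
    return nodes


def count_actions(node_records):
    """Work done locally on one node: count each kind of action."""
    return Counter(action for _, action in node_records)


def combine(partial_counts):
    """Bring the per-node results together into one total."""
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


# Made-up sample activity, standing in for a much larger stream.
records = [
    ("alice", "like"), ("bob", "post"), ("carol", "like"),
    ("alice", "message"), ("dave", "like"), ("bob", "like"),
]

nodes = partition(records)
totals = combine(count_actions(node) for node in nodes)
print(dict(totals))
```

Each node only ever sees its own share of the records, so the same approach keeps working when the data no longer fits on a single machine.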
Variety refers to the many forms in which big data is generated. Compared with traditional data such as phone numbers and addresses, much of today's information takes the form of photos, videos, audio, and more, so that about 80 percent of the data is unstructured.
Veracity is the degree to which the information can be relied on. Since a major part of the data is unstructured and much of it is irrelevant, big data systems need a way to filter out the noise and extract what is useful for business decisions.
Value is the most important issue to consider. It is not just the amount of data that we store or process that matters. What matters is the amount of valuable, reliable, and trustworthy data that is stored, processed, and analyzed to find insights.
Velocity plays a major role compared to the others. There is little point in investing so much only to end up waiting for data. A major aspect of big data is therefore to provide data on demand and at a fast pace.
Types of Big Data
Big data is generally categorized into three types:
- Structured Data
- Semi-structured Data
- Unstructured Data
1. Structured Data
Structured data has a dedicated data model and a well-defined structure. It follows a consistent order and is designed so that it can be easily accessed and used by a person or a computer.
Structured data is typically stored in well-defined columns in a database. A simple example is a table in the humble database management system (DBMS).
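As a minimal sketch of what "well-defined columns" means in practice, the following uses Python's built-in `sqlite3` module. The `customers` table and its rows are made up for illustration.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, phone TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO customers (name, phone) VALUES (?, ?)",
    [("Asha", "555-0101"), ("Ben", "555-0102")],
)

# Because every row has the same columns, a query can address them by name.
rows = conn.execute(
    "SELECT name, phone FROM customers ORDER BY name"
).fetchall()
print(rows)

# A row that breaks the declared structure (missing phone) is rejected.
try:
    conn.execute(
        "INSERT INTO customers (name, phone) VALUES (?, ?)", ("Cy", None)
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)

conn.close()
```

The database enforces the structure itself, which is what makes structured data easy for both people and programs to query.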
2. Semi-Structured Data
Semi-structured data can be considered another form of structured data. It inherits a few properties from structured data, but most of it lacks a fixed structure, and it does not follow the formal structure of data models such as those of an RDBMS. A very common example of semi-structured data is a CSV (comma-separated values) file.
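The short sketch below shows why a CSV file sits between the two extremes, using Python's standard `csv` module. The sample content is hypothetical: the header supplies field names, but nothing enforces types or guarantees that every row is complete.

```python
import csv
import io

# Hypothetical CSV content; the second data row is missing its "city" field.
raw = "name,age,city\nAsha,34,Pune\nBen,unknown\n"

rows = list(csv.DictReader(io.StringIO(raw)))
for row in rows:
    print(row)

# Every value arrives as text, and missing fields are filled with None.
first_age = rows[0]["age"]
missing_city = rows[1]["city"]
```

Here `first_age` is the string `"34"` rather than a number, and `missing_city` is `None`, so any validation has to be done by the program that reads the file.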
3. Unstructured Data
Unstructured data is a completely different type of data: it has no defined structure and does not follow the formal rules of a data model. It has no uniform format and varies all the time. Occasionally it carries some information about itself, such as a date and time. The simplest examples of unstructured data are audio files, images, and videos.
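To make this concrete, the sketch below writes a blob of arbitrary bytes to a temporary file, as a stand-in for an image or audio file, and then reads back the only descriptive information available without decoding the content: its size and modification time. The payload is made up for illustration.

```python
import os
import tempfile
from datetime import datetime

# Stand-in for a media file: 64 arbitrary bytes with no rows, columns, or keys.
payload = bytes(range(64))

with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(payload)
    path = handle.name

# The content cannot be queried by field; the file system only
# supplies metadata about it, such as size and a timestamp.
info = os.stat(path)
size_bytes = info.st_size
modified = datetime.fromtimestamp(info.st_mtime)
print(size_bytes, modified.isoformat())

os.remove(path)
```

Getting anything more out of such data, for example what an image shows, requires specialized processing rather than a simple query.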
Job Roles in Big Data
- Big Data Engineer
- Big Data Architect
- Data Warehouse Manager
- Database Developer
- Database Administrator
- Database Manager
- Data Scientist
- Business Intelligence Analyst