Have you ever heard the term big data? Of course you have. Over the last four or five years, there has been talk of big data everywhere.
But what exactly is big data? In this post, we get acquainted with the concept and examine the definitions and characteristics of big data.
Big Data Story
In ancient times, people traveled from village to village in horse-drawn chariots. Over time, the villages grew into cities and the population spread out. As a result, traveling between cities with luggage and furniture became a problem.
Data has followed a similar path. In the past, we had no problem storing data on our servers, because the volume of data was relatively limited and there was enough time to process it. But today, in a world full of technology, data is growing rapidly and people depend on it more and more. As data grows, it is no longer possible to store it on just any type of server, and traditional data processing methods no longer meet our needs.
Factors affecting big data
For many reasons, the volume of data on the planet is increasing exponentially. Our numerous data sources and day-to-day activities produce vast amounts of data. With the invention of the Web, the whole world has come online, and even the smallest thing we do leaves a digital footprint.
With smart objects going online, the rate of data growth is also increasing rapidly. Major sources of big data include social networking sites, sensor networks, digital videos and images, cell phones, shopping transaction records, website logs, medical records, archives, military surveillance systems, e-commerce sites, complex scientific research, and more. All of this data may amount to a few quintillion bytes, and scientists predict that by 2021 the data volume will be about 40 zettabytes.
What is big data?
To provide a correct definition of big data, it may be best to first understand the characteristics of this environment.
Each of these features introduces one dimension of the big data environment and helps us better understand this environment. In the following, we examine the characteristics of the big data environment.
But as a brief definition, it may not be wrong to say that big data is a term for datasets that are so large, and that contain such a wide variety of data types, that storing and processing them with available database management tools or traditional data processing solutions is difficult. The challenges big data poses include recording, organizing, storing, searching, sharing, transferring, analyzing, and visualizing this data.
Features of big data
Big data is usually introduced through its properties. Researchers, organizations, and individuals working in the field have proposed different sets of characteristics.
Gartner, for example, introduces three characteristics of the big data environment: volume, production rate (velocity), and diversity (variety). In addition to these three, IBM has introduced accuracy (veracity) as a fourth characteristic.
Volume refers to the size of the data, which is growing at an increasing rate. The amount of data generated by humans, machines, and their interaction on social media alone is enormous.
Production rate refers to the speed at which different sources produce data on a daily basis. This flow of data is huge. Facebook, for example, currently has 1.03 billion mobile daily active users, an increase of 22% year over year.
Diversity refers to the different types of data involved. Because many sources can feed data into big data solutions, the data they generate usually differs in type. It can be structured, semi-structured, or unstructured. Hence, we face a wide variety of data that people generate every day.
Accuracy refers to the doubt that surrounds available data because of inconsistencies within it.
After examining volume, production rate, diversity, and accuracy, we turn to the property of value. Access to big data is very valuable, but if we cannot extract value from this data, it will remain unused.
We can divide the data available in today’s world into three categories:
- Structured data
- Semi-structured data
- Unstructured data
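To make the three categories more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not a standard method: the `ORDER_FIELDS` schema and the `classify` function are hypothetical names made up for this example, and the rule of thumb it uses (fixed fields, JSON with varying fields, free text) is a deliberate simplification.

```python
import json

# Hypothetical fixed schema, assumed for illustration only.
ORDER_FIELDS = {"order_id", "customer", "total"}


def classify(raw: str) -> str:
    """Roughly label a raw text record by how much structure it carries."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        # Not parseable at all: treat it as free text.
        return "unstructured"
    if isinstance(parsed, dict) and set(parsed) == ORDER_FIELDS:
        # Exactly the fields of the fixed schema, like a row in a table.
        return "structured"
    if isinstance(parsed, (dict, list)):
        # Self-describing, but the fields can vary from record to record.
        return "semi-structured"
    return "unstructured"


records = [
    '{"order_id": 1001, "customer": "Ada", "total": 59.9}',
    '{"user": "ada", "tags": ["sale", "mobile"], "device": {"os": "android"}}',
    "Loved the product, but delivery took two weeks.",
]

for record in records:
    print(classify(record), "<-", record)
```

In practice the boundaries are fuzzier than this: a table row in a relational database is the usual example of structured data, JSON or XML documents of semi-structured data, and text, images, and video of unstructured data.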