Do you really know all about “Data”?

Jay Patel
6 min read · Apr 8, 2021

A little bit about myself:

My name is Jay Patel and I am pursuing a Bachelor of Technology in Computer Science and Engineering with a Specialization in Big Data And Analytics.

If you Want to Connect with Me:

Linkedin: www.linkedin.com/in/jay-patel-823a08172

Twitter: https://twitter.com/JayPatel15406

Things to know before we start our journey on “Data”:-

A few terms are used often in this blog, so it helps to know what they mean before we start our journey of exploring “Data”:-

  1. Information:- Facts provided or learned about something or someone.
  2. Countable Nouns (Count Nouns):- Countable nouns are for things we can count using numerals. They have a singular and a plural form. The singular form can use the determiner “a” or “an”. e.g.:- 2 Dogs, 1 Cat, 5 Friends, etc.
  3. Uncountable Nouns (Uncount Nouns or Mass Nouns):- Uncountable nouns are for things that we cannot count using numerals. They may be physical objects that are too minute or too amorphous to be counted. Uncountable nouns are used with a singular verb and usually do not have a plural form. e.g.:- Tea, Love, Sugar, Wheat, etc.
  4. Cardinal Numbers:- Cardinal Numbers are a generalization of Natural Numbers used to measure “Cardinality (Quantity)”. e.g.:- One, Two, Three, etc.

Where did the word “Data” come from?

Binary Data Gif from (Giphy.com)

Let’s first understand the meaning of “Data”, and then we will dive deeper into it. Basically, a collection of “Information” is known as “Data”. “Data” is the plural form of “Datum”, a word of Latin origin that may refer to a “single item of data” or a “single unit of information”. In one sense, “datum” can also be a “count noun” with the plural “datums”, which can be used with “cardinal numbers” (e.g.:- 19 Datums, etc.).

Learn more: Datum

History of “Data”

The word “Data” has generated considerable controversy on whether it is an “uncountable noun” used with verbs conjugated in the singular, or should be treated as the plural of the now-rarely-used “Datum”. The debate over appropriate usage continues, but “data” as a singular form is far more common.

According to its etymology, the word “data” was first used in the 1640s (mid-17th century). From 1897 (late 19th century) it has also meant “numerical facts collected for future reference”. The word was first used to mean “transmissible and storable computer information” in 1946.

# F.Y.I. List of Some Data Terms first Recorded:-

+-----------------+-------------------------------+
| Data Terms      | Term First Recorded (in Year) |
+-----------------+-------------------------------+
| Data-processing | 1954                          |
| Database        | 1962                          |
| Data-entry      | 1970                          |
+-----------------+-------------------------------+

Flavors of Data Based on different Scenarios:-

Now that the ‘Episode: History of “Data”’ is complete, we can open the treasure of ‘data’ and explore the different types of data based on the situation. But before moving ahead, let’s take a look at the map below to find the correct path:

# Basic Map for Flavors of Data based on different Scenarios:-

Scenario for Data Exploration
|
|
+-----> In terms of Machine
|       |
|       +-----> Analog Data
|       |
|       +-----> Digital Data
|
+-----> In terms of Basic Data Analysis
|       |
|       +-----> Qualitative Data (Categorical Data)
|       |       |
|       |       +-----> Binomial Data (Binary Data)
|       |       |
|       |       +-----> Nominal Data (Unordered Data)
|       |       |
|       |       +-----> Ordinal Data (Ordered Data)
|       |
|       +-----> Quantitative Data (Numerical Data)
|               |
|               +-----> Discrete Data
|               |
|               +-----> Continuous Data
|
+-----> In terms of Data Processing
        |
        +-----> Structured Data
        |
        +-----> Unstructured Data
        |
        +-----> Semi-Structured Data

Let’s survey each node one by one.

1. In terms of Machine:-

There are two general types of data for the given scenario:

a. Analog Data (e.g.:- Photocopies, Audiotapes, etc.)

b. Digital Data (e.g.:- Digital Thermometer, Symbols, etc.)

‘Nature’ is ‘Analog’, while a ‘Computer’ is ‘Digital’. All digital data are stored as ‘Binary Digits’. One of the most common data types is Text, also referred to as Character String.
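To make the point about ‘Binary Digits’ concrete, here is a minimal sketch of how a character string can be viewed as bits. It assumes UTF-8 encoding, and the helper names `text_to_bits` and `bits_to_text` are made up for this illustration:

```python
def text_to_bits(text: str) -> str:
    """Return the UTF-8 bytes of `text` as space-separated 8-bit groups."""
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def bits_to_text(bits: str) -> str:
    """Reverse text_to_bits: turn 8-bit groups back into a string."""
    return bytes(int(group, 2) for group in bits.split()).decode("utf-8")


encoded = text_to_bits("Data")
print(encoded)                # 01000100 01100001 01110100 01100001
print(bits_to_text(encoded))  # Data
```

Each character of “Data” fits in one byte here; characters outside ASCII would take more than one byte in UTF-8.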

2. In terms of Basic Data Analysis:-

There are two main flavors of data for the given scenario:

a. Qualitative Data (Categorical Data):

Data that deals with characteristics and descriptors that can’t easily be measured but can be observed subjectively, such as Taste, Attractiveness, and Color, is known as ‘Qualitative Data’.

There are basically three types of Qualitative Data:-

(i) Binomial Data (Binary Data):-

Binary Data places things in one of two mutually exclusive categories: Right/Wrong, True/False, or Accept/Reject.

(ii) Nominal Data (Unordered Data):-

Here we assign individual items to named categories that do not have an implicit or natural value or rank. If we look through a box of colors and record each color on a worksheet, that would be ‘Nominal Data’.

It is sometimes also called ‘Named Data’, a meaning that comes from the word ‘Nominal’.

(iii) Ordinal Data (Ordered Data):-

This is data in which items are assigned to categories that do have some kind of implicit or natural order, such as ‘Short, Medium and Tall’, or a scale of 1 to 10 that defines 10 as better than 9. Categorical data of this kind is ‘Ordinal Data’.
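The three categorical types above can be sketched in a few lines of Python. This is only an illustration using the standard library; the sample values and the name `HEIGHT_ORDER` are assumptions made up for the example:

```python
from collections import Counter

# Binary: every item falls into one of two mutually exclusive categories.
inspection_results = ["Accept", "Reject", "Accept", "Accept"]

# Nominal: named categories with no natural rank, so we can count them
# but sorting them by "size" would be meaningless.
colors = ["Red", "Blue", "Green", "Blue"]
color_counts = Counter(colors)

# Ordinal: categories with a natural order, which we state explicitly.
HEIGHT_ORDER = ["Short", "Medium", "Tall"]
heights = ["Tall", "Short", "Medium", "Short"]
sorted_heights = sorted(heights, key=HEIGHT_ORDER.index)

print(Counter(inspection_results)["Accept"])  # 3
print(color_counts.most_common(1))            # [('Blue', 2)]
print(sorted_heights)                         # ['Short', 'Short', 'Medium', 'Tall']
```

Note that the order of ordinal categories has to be supplied by us; sorting the labels alphabetically would put ‘Medium’ before ‘Short’.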

b. Quantitative Data (Numerical Data):

Data that deals with numbers and things you can measure objectively is known as ‘Quantitative Data’. e.g.:- Dimensions such as Height, Prices, Areas, Volumes, etc.

There are two types of ‘Quantitative Data’ stated below:-

(i) Discrete Data:-

‘Discrete Data’ is a count that can’t be made more precise. Typically it involves integers. For instance, the number of children in your family is ‘discrete data’ because we are counting whole, individual entries; you can’t have 2.5 persons or 1.3 pets.

(ii) Continuous Data:-

‘Continuous Data’, on the other hand, can be divided and reduced to finer and finer levels. e.g.:- The height of a kid, which can be measured at progressively more precise levels.
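As a small sketch of the difference, the snippet below treats counts as integers and a height as a float that can be reported at finer and finer precision. The numbers are hypothetical sample values, not real measurements:

```python
# Discrete: whole counts only (hypothetical number of children per family).
children_per_family = [2, 0, 3, 1]
all_whole = all(isinstance(n, int) for n in children_per_family)

# Continuous: a hypothetical height in centimetres, reported at
# 0, 1, 2 and 3 decimal places.
height_cm = 127.4638
readings = [round(height_cm, digits) for digits in range(4)]

print(all_whole)  # True
print(readings)   # [127.0, 127.5, 127.46, 127.464]
```

The count list never needs a decimal point, while the height can always be refined further, limited only by the measuring instrument.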

3. In terms of Data Processing:-

There are three types of data under this scenario:

a. Structured Data:

It’s based on an RDBMS. Versioning works over tuples, rows, and tables. It is very robust, and structured queries allow complex joins. However, the DB schema is very difficult to scale.

b. Semi-Structured Data:

It’s based on XML/RDF (Resource Description Framework). Queries over anonymous nodes are possible, and versioning over tuples and graphs is possible.

c. Unstructured Data:

It’s based on character and binary data. Versioning applies to the data as a whole. Only textual queries are possible, and it is more scalable.
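The three processing flavors can be compared side by side with Python’s standard library. This is a rough sketch, not a full treatment: the table names, XML document, and sample sentence are all hypothetical, with `sqlite3` standing in for an RDBMS and `xml.etree.ElementTree` for an XML store:

```python
import sqlite3
import xml.etree.ElementTree as ET

# Structured: rows in tables with a fixed schema, queried with a join.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)"
)
conn.execute("INSERT INTO authors VALUES (1, 'Author A')")
conn.execute("INSERT INTO posts VALUES (1, 1, 'All about Data')")
joined = conn.execute(
    "SELECT authors.name, posts.title "
    "FROM posts JOIN authors ON authors.id = posts.author_id"
).fetchall()
conn.close()

# Semi-structured: tagged nodes without a rigid schema, queried by path.
xml_doc = "<post><title>All about Data</title><tag>data</tag><tag>basics</tag></post>"
root = ET.fromstring(xml_doc)
tags = [tag.text for tag in root.findall("tag")]

# Unstructured: plain characters, so only textual search is available.
raw_text = "Data is valuable once refined. Raw data is like crude oil."
matches = raw_text.lower().count("data")

print(joined)   # [('Author A', 'All about Data')]
print(tags)     # ['data', 'basics']
print(matches)  # 2
```

Notice how the query options shrink as structure disappears: a join across tables, then a lookup by tag name, then nothing more than a text search.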

Before the conclusion, many devs have queries such as:

NOTE:- The queries mentioned below are in TL;DR format.

“Why is ‘data’ also known as the ‘Crude Oil’ of today’s technological era?”

The simple answer is that ‘Data’ is valuable when it is in a ‘Refined Format’; if it is ‘Unrefined’, it cannot really be used, just like ‘Crude Oil’.

“Why is learning about ‘Data’ important?”

Following from the answer above, it is important to learn how the refining process of ‘Data’ really works. But before that, the ‘Type of Data’, the ‘Behaviour of Data’, etc. all play a crucial role in the ‘Refining Process’ of finding a feasible solution.

Summary:-

So finally, it’s time to conclude our journey on ‘Data’. I hope you have enjoyed this blog and learned a lot from it. If you have any suggestions, feel free to contact me; I am open to feedback. “Happy Learning :)”
