Statistics for Data Science — Basic Statistics

Posted Mar 15, 2026

By Md. Sawrab

Statistics is a foundational component of data science, providing powerful tools and techniques for analyzing and interpreting data. Data scientists use statistical methods to extract meaningful insights from large and complex datasets, identify patterns and trends, and support informed business decisions. With a strong statistical foundation, a data scientist can better understand the behavior of data.

In this blog series, we will cover everything from foundational theories to advanced analytical techniques and explore their real-world applications.

What Is Statistics?

Statistics is the branch of applied mathematics that deals with the collection, organization, analysis, interpretation, and presentation of data.

It is widely used in science, economics, social sciences, business, and engineering to generate insights, make predictions, and guide decision-making. In simple terms, statistics helps us discover patterns, trends, and relationships in data.

Examples

Calculating the average (mean) marks of students in an exam.
Estimating the average height of all students in a school based on a sample of 100 students.

Key Concepts

Data

Data is any recorded information or fact that can be collected and analyzed, whether it is a number, a category, or a piece of text.

Examples: age, weight, score, income.

Population

A population is the complete set of individuals or items that share a common characteristic and are the subject of study.

Example: all students in a class.

Types of Population

Finite population: Has a fixed number of members that can be counted and measured directly. Example: the people enrolled in a course.
Infinite population: Has no fixed limit, or is so large and fast-growing that it cannot be fully counted in practice. Example: all possible tosses of a coin, or the ongoing stream of Google searches.

Sample

A sample is a subset of a population used to draw conclusions about the entire population.

Example: surveying 100 students to understand the study habits of all students in a school.

Parameter

A parameter is a numerical value that describes a population.

Example: if the true average height of all students in a school is 5.5 feet, that value is a parameter.

Statistic

A statistic is a numerical value that describes a sample.

Example: if 100 students are measured and their average height is 5.4 feet, that value is a statistic.
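
The difference between a parameter and a statistic can be sketched with a small simulation. The population below is made up for illustration (2,000 simulated heights centered on 5.5 feet), and the seed is fixed only so the run is reproducible. The sample mean will usually land close to, but not exactly on, the population mean.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: heights (in feet) of 2,000 students.
population = [random.gauss(5.5, 0.3) for _ in range(2000)]

# Parameter: computed from every member of the population.
population_mean = statistics.mean(population)

# Statistic: computed from a random sample of 100 students.
sample = random.sample(population, 100)
sample_mean = statistics.mean(sample)

print(f"Parameter (population mean): {population_mean:.2f} ft")
print(f"Statistic (sample mean):     {sample_mean:.2f} ft")
```

In practice the parameter is usually unknown, because measuring the whole population is too expensive. The statistic is what we can actually compute.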

Variable

A variable is any characteristic or quantity that can take different values.

Examples: age, length, height.

Types of Variables

Qualitative (categorical) variable: Describes qualities or categories. Examples: color of a car, blood type, gender.
Quantitative (numerical) variable: Represents measurable quantities. Examples: number of children, weight, income.

Types of Quantitative Data

Discrete data: Takes specific, countable values (often integers). Example: number of students in a class (30, 31, 33).
Continuous data: Takes any value within a range and is measured. Example: height of a person.
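
As a rough illustration, the variable types above can be told apart by how the values are stored. The helper `describe_variable` and the example lists are hypothetical, and the type check is only a heuristic: real datasets often store categories as numeric codes or record continuous measurements as rounded integers.

```python
def describe_variable(values):
    """Classify a list of observations by Python type (a rough heuristic)."""
    if all(isinstance(v, str) for v in values):
        return "qualitative (categorical)"
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return "quantitative, discrete"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "quantitative, continuous"
    raise ValueError("mixed or unsupported value types")

# Made-up observations for illustration.
blood_types = ["A", "B", "AB", "O"]   # categories
children = [0, 2, 1, 3]               # counted values
weights_kg = [61.5, 72.0, 58.25]      # measured values

print(describe_variable(blood_types))  # qualitative (categorical)
print(describe_variable(children))     # quantitative, discrete
print(describe_variable(weights_kg))   # quantitative, continuous
```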

Scales of Measurement

There are four primary scales of measurement:

Nominal Scale
Ordinal Scale
Interval Scale
Ratio Scale

Nominal Scale

The nominal scale classifies data into distinct categories with no inherent order.

Examples:

Gender: Male, Female
Blood Type: A, B, AB, O
Marital Status: Single, Married, Divorced

Ordinal Scale

The ordinal scale ranks data in a meaningful order, but the differences between ranks are not necessarily equal and cannot be measured precisely.

Examples:

Education Level: High School, Bachelor’s, Master’s, PhD
Customer Satisfaction: Very Unsatisfied to Very Satisfied
Economic Status: Low, Middle, High

Interval Scale

The interval scale has ordered values with equal intervals, but no true zero point.

Examples:

Temperature: Celsius, Fahrenheit
Calendar Years
IQ Scores

With interval data, addition and subtraction are meaningful, but ratio statements are not. For example, 20 °C is not twice as warm as 10 °C, because 0 °C is an arbitrary reference point rather than the absence of temperature.
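
A short sketch shows why the ratio is not meaningful: the same two temperatures give a different ratio on each scale. Kelvin is included for comparison because it has a true zero. The helper function names are just for this example.

```python
def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32

def celsius_to_kelvin(c):
    return c + 273.15

cold_c, warm_c = 10.0, 20.0

# The ratio changes with the choice of scale, so "twice as warm" is not a fact
# about the temperatures themselves.
ratio_c = warm_c / cold_c
ratio_f = celsius_to_fahrenheit(warm_c) / celsius_to_fahrenheit(cold_c)
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cold_c)

print(f"Ratio in Celsius:    {ratio_c:.3f}")  # 2.000
print(f"Ratio in Fahrenheit: {ratio_f:.3f}")  # 1.360
print(f"Ratio in Kelvin:     {ratio_k:.3f}")  # 1.035
```

Only the Kelvin ratio has a physical interpretation, since Kelvin starts at absolute zero.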

Ratio Scale

The ratio scale has all interval scale properties plus an absolute zero point, allowing meaningful ratio comparisons.

Examples:

Height
Weight
Age

Types of Statistics

There are two major types of statistics:

Descriptive Statistics
Inferential Statistics

Descriptive Statistics

Descriptive statistics summarizes and presents data in a meaningful way so that we can understand it quickly.

Key Components

Measures of central tendency
- Mean: average value
- Median: middle value in ordered data
- Mode: most frequent value
Measures of dispersion (variability)
- Range: highest minus lowest value
- Variance: average squared deviation from the mean (for a sample, the sum of squared deviations is usually divided by n - 1 rather than n)
- Standard Deviation: square root of variance
Frequency distribution
- Shows how often each value appears (tables, histograms, pie charts)
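
The measures listed above can be computed with Python's standard `statistics` module. This is a minimal sketch on a made-up list of exam marks. It shows both the population variance (divide by n) and the sample variance (divide by n - 1), since libraries differ in which one they use by default.

```python
import statistics
from collections import Counter

marks = [72, 85, 85, 90, 64, 78, 85, 90]  # hypothetical exam marks

# Measures of central tendency
mean = statistics.mean(marks)      # 81.125
median = statistics.median(marks)  # 85.0
mode = statistics.mode(marks)      # 85

# Measures of dispersion
value_range = max(marks) - min(marks)         # 26
pop_variance = statistics.pvariance(marks)    # divides by n
pop_std = statistics.pstdev(marks)            # square root of pop_variance
sample_variance = statistics.variance(marks)  # divides by n - 1

# Frequency distribution: how often each mark appears
frequency = Counter(marks)

print(mean, median, mode, value_range)
print(round(pop_variance, 3), round(pop_std, 3), round(sample_variance, 3))
print(sorted(frequency.items()))
```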

Inferential Statistics

Inferential statistics uses sample data to draw conclusions or make predictions about a population.

Key Components

Hypothesis Testing
- Null Hypothesis (H0): no effect or no difference
- Alternative Hypothesis (H1): effect or difference exists
Confidence Intervals
- A range of values, computed from a sample, that is expected to contain the true population parameter at a stated confidence level (for example, 95%)
Regression Analysis
- Studies relationships between variables and supports prediction
Statistical Tests
- t-tests, chi-square tests, ANOVA
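
To make two of these components concrete, here is a minimal sketch of a 95% confidence interval and a two-sided test for a mean, using only the standard library. It relies on a normal (z) approximation, which is reasonable for a sample of 100. For small samples a t-based interval and a t-test would be the usual choice. The sample data and the hypothesized mean of 5.5 feet are assumptions made up for the example.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical sample: heights (in feet) of 100 students.
sample = [random.gauss(5.4, 0.3) for _ in range(100)]

n = len(sample)
sample_mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / math.sqrt(n)

# 95% confidence interval for the population mean (normal approximation).
z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

# Hypothesis test. H0: population mean is 5.5 ft. H1: it is not.
hypothesized_mean = 5.5
z_stat = (sample_mean - hypothesized_mean) / standard_error
p_value = 2 * (1 - statistics.NormalDist().cdf(abs(z_stat)))

print(f"Sample mean: {sample_mean:.3f}")
print(f"95% CI: ({ci_low:.3f}, {ci_high:.3f})")
print(f"z = {z_stat:.2f}, p-value = {p_value:.4f}")
```

A small p-value (commonly below 0.05) is taken as evidence against the null hypothesis. A large one means the sample is consistent with it, not that the null hypothesis is proven.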

Thanks for reading.

“Your network is your net worth.” - Tim Sanders

Connect on LinkedIn: md-sawrab

GitHub: md-sawrab

Data Science

This post is licensed under CC BY 4.0 by the author.
