Statistical Analysis with Excel-Chapter13

hajra Begum
3 min readApr 18, 2022

In the last chapter, I went through all the basics needed to work with Distributions.

Let us now use them to do Analysis in Excel.

  1. NORM.DIST

I always use datasets that are current and great to give insights. But this dataset is chosen because of the fact, that India is closer to my heart and we all know how large datasets can be.

If the Cumulative box is entered with TRUE, then we get the Cumulative area.

Cumulative distribution function
probability density function

If the cumulative box is entered with FALSe then we get the height of the normal distribution at that value(8000).

2. NORM.INV

It is just the opposite of the previous formula, here you supply probability and you will receive the value of the random variable.

I cannot stress enough the importance of standardizing data and Scaling the data.

When I have a large amount of data that is to be compared I need the data in the standard form to be able to compare it.

There is so much to explain how these concepts are important. But I will take a simple example to do so.

Standardizing: It is process by which we convert all values to standard scores called z-scores.It allows us to compare data sets with different means and standard deviations .

Original Normal distribution:X ~N(μ, σ)

Z-distribution/standard normal distribution:Z~N(μ, σ),μ=0 and σ=1,values of z-scores are in the range between -3 and 3.

The data points in the standard distribution are represented as x but they are called z or z-scores in the z-distribution. A z-score is also called a standard score and tells us how many standard deviations away from the mean an individual value (x) lies:

  • The z-score will be positive if the x-value is greater than the mean(μ).
  • The z-score will be negative if the x-value is less than the mean(μ).
  • The z-score will be zero if the x-value and mean(μ) are equal.
Formula to calculate z-scores

x (individual value),μ (mean),σ (standard deviation)

Why standardizing?

I was predicting the price of the house using Machine learning and obtained the prediction model as

Price=2.1*avgAreaIncome +1.6*avgAreaHouseAge +1.2 avgAreaNumOfRooms’

If predicting price using the above 3 features ,then we make sure magnitude(scale of avgAreaIncome is bigand scale of avgAreaHouseage is small)of these features should not contribute to the price.

We can do this by standardising the values of these features.

3.NORMAL.S.DIST

It is counterpart of NORM.DIST except that it is used for standard normal distribution.

original value 8380(New_cases)

z-score=8380–48318.98106/74979.84652=-0.53

the cumulative probability is 0.298

4.NORM.S.INV

Supply the cumulative probability and in return it gives the z-score.

NORM.S.INV

All these functions are very important as they are very useful in Analytics. Even in the future if you planning to study advanced Analytics you will already have conceptual knowledge.

--

--

hajra Begum

Enthusiast of Data Science, Operations Research, and Mathematics/Love to cook.