INCOME TAX NOTES

Skewness and kurtosis


Skewness is a measure of the asymmetry of a data distribution.

A distribution, or data set, is symmetric if it looks the same to the left and right of the center point.

[Figure: no skewness = symmetric curve]

Types of skewness

1) No skewness (zero skewness): If values of the variable that are equidistant from the mean have equal frequencies, the distribution is said to be symmetric, or to have zero skewness. The normal distribution, also known as the bell curve, is an example: it is a symmetrical graph with all measures of central tendency in the middle.

mean = median = mode

[Figure: symmetric distribution (zero skewness)]


2) Positive skewness (right-skewed): The values of the variable below the mean have higher frequencies than the values above the mean, so most of the data sits on the lower side and the tail extends to the right.

mean > median > mode

[Figure: positive skewness (right-skewed)]



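3) Negative skewness (left-skewed): The values of the variable above the mean have higher frequencies than the values below the mean, so most of the data sits on the higher side and the tail extends to the left.

mode > median > mean

[Figure: negative skewness (left-skewed)]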


In the left graph the tail is to the left, so it is left-skewed (negatively skewed). In the right graph the tail is to the right, so it is right-skewed (positively skewed).
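To make the three orderings concrete, here is a small sketch using Python's standard `statistics` module. The three data sets are made-up illustrations (they are not from any real source), chosen so that each one has a single clear mode.

```python
from statistics import mean, median, mode


def describe(values):
    """Return (mean, median, mode) for a list of numbers."""
    return mean(values), median(values), mode(values)


# Hypothetical samples, invented purely for illustration.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]   # long tail to the right
left_skewed = [16 - x for x in right_skewed]      # mirror image: tail to the left

for name, data in [
    ("symmetric", symmetric),
    ("right-skewed", right_skewed),
    ("left-skewed", left_skewed),
]:
    m, med, mo = describe(data)
    print(f"{name}: mean={m}, median={med}, mode={mo}")
```

For these samples the symmetric set has all three measures equal, the right-skewed set has mean > median > mode, and the mirrored set reverses that order. Real data will not always follow these orderings so neatly, so treat them as a rule of thumb.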

How about deriving a measure that captures the horizontal distance between the mode and the mean of the distribution? It’s intuitive to think that the higher the skewness, the farther apart these measures will be.
So let’s jump to the formula for skewness now:

Skewness = (Mean - Mode) / Standard Deviation

Division by the standard deviation enables relative comparison among distributions on the same standard scale. Since the mode is not recommended as a measure of central tendency for small data sets, we can arrive at a more robust formula for skewness by replacing the mode with a value derived from the median and the mean.

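The empirical relation of the mode with the mean and median is:

Mode ≈ 3 × Median - 2 × Mean

Replacing the value of mode in the formula of skewness, we get:

Skewness = 3 × (Mean - Median) / Standard Deviation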

[Figure: scale of skewness]
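The two measures discussed here are usually known as Pearson's first and second coefficients of skewness: (mean - mode) / standard deviation, and 3 × (mean - median) / standard deviation. Below is a minimal sketch of both. The function names and the sample data are hypothetical, and the choice of the population standard deviation (`pstdev`) is an assumption, since the text does not say which version to use.

```python
from statistics import mean, median, mode, pstdev


def pearson_mode_skewness(values):
    """Pearson's first coefficient: (mean - mode) / standard deviation."""
    return (mean(values) - mode(values)) / pstdev(values)


def pearson_median_skewness(values):
    """Pearson's second coefficient: 3 * (mean - median) / standard deviation."""
    return 3 * (mean(values) - median(values)) / pstdev(values)


# Hypothetical samples, invented purely for illustration.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]
left_skewed = [16 - x for x in right_skewed]

for name, data in [
    ("symmetric", symmetric),
    ("right-skewed", right_skewed),
    ("left-skewed", left_skewed),
]:
    print(
        f"{name}: mode-based={pearson_mode_skewness(data):.3f}, "
        f"median-based={pearson_median_skewness(data):.3f}"
    )
```

Both coefficients come out as zero for the symmetric sample, positive for the right-skewed one, and negative for the left-skewed one. The two versions generally give different magnitudes, because the relation between mode, median and mean is only approximate.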


Kurtosis

What is Kurtosis and how do we capture it?
Think of pushing down or pulling up the normal distribution curve from the top: what impact will it have on the shape of the distribution? Let’s visualize:

Kurtosis is a measure of whether the data are heavy-tailed or light-tailed relative to a normal distribution.

[Figure: image of kurtosis]


So there are two things to notice: the peak of the curve and the tails of the curve. The kurtosis measure is responsible for capturing this phenomenon. The formula for kurtosis is complex (it uses the 4th moment in the moment-based calculation), so we will stick to the concept and its visual clarity.

A normal distribution has a kurtosis of 3 and is called mesokurtic. Distributions with kurtosis greater than 3 are called leptokurtic, and those with kurtosis less than 3 are called platykurtic. The greater the value, the heavier the tails and, typically, the sharper the peak. Kurtosis ranges from 1 to infinity.

As the kurtosis of a normal distribution is 3, we can calculate excess kurtosis by taking the normal distribution as the zero reference. Excess kurtosis will then vary from -2 to infinity.

Excess kurtosis = kurtosis - 3

Excess kurtosis for a normal distribution = 3 - 3 = 0
The lowest value of excess kurtosis occurs when kurtosis is 1: 1 - 3 = -2
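For readers who want to see the numbers, here is a minimal sketch of the moment-based calculation, taking kurtosis as the fourth central moment divided by the squared variance. This is the population version, which is an assumption on my part (statistical libraries often apply sample corrections). The function names and data sets are hypothetical.

```python
def kurtosis(values):
    """Population kurtosis: fourth central moment / (variance squared)."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((x - mu) ** 2 for x in values) / n  # variance (second central moment)
    m4 = sum((x - mu) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2


def excess_kurtosis(values):
    """Kurtosis measured relative to the normal distribution's value of 3."""
    return kurtosis(values) - 3


# Hypothetical samples, invented purely for illustration.
two_point = [-1, 1, -1, 1]                               # all mass at two values
with_outliers = [-1, 1, -1, 1, -1, 1, -1, 1, 10, -10]    # same core plus two outliers

print(kurtosis(two_point), excess_kurtosis(two_point))
print(kurtosis(with_outliers), excess_kurtosis(with_outliers))
```

The two-point sample sits at the lower bound (kurtosis 1, excess kurtosis -2), while adding two outliers to the same core pushes the kurtosis above 3. Raising deviations to the fourth power is what makes the outliers count for so much.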


So we can conclude from the above discussion that the horizontal push or pull distortion of a normal distribution curve is captured by the skewness measure, and the vertical push or pull distortion is captured by the kurtosis measure. Also, it is the impact of outliers that dominates the kurtosis effect, which follows from the fourth-order moment-based formula. I hope this blog helped clarify the idea of skewness and kurtosis in a simplified manner. Watch out for more similar blogs in the future.