INCOME TAX NOTES

Measures of Dispersion in Statistics

1. Measures of dispersion: types and formulas
(i) Range
(ii) Quartile deviation
(iii) Mean deviation
(iv) Standard deviation
(v) Variance
Dispersion 
Dispersion is the degree of variation in data. It is the extent to which the values in a distribution differ from a central value (mean, median or mode).
Dispersion is also known as scatter, spread or variation.


Example:

Sr. No.           Sneha     Amit      Minna
1                 15000     7000      5000
2                 15000     10000     2000
3                 15000     14000     8000
4                 15000     17000     10000
5                 -----     20000     50000
6                 -----     22000     -----
Total income      60000     90000     75000
Average income    15000     15000     15000


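Arithmetic Mean

Sneha = 60000/4 = 15000

Amit = 90000/6 = 15000

Minna = 75000/5 = 15000

All three have the same average income, yet the individual incomes are spread very differently around that average. This difference is what a measure of dispersion captures.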
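The comparison above can be checked with a short Python sketch. The dictionary and function names are illustrative and not part of the original notes. The spread is summarised here simply as the largest value minus the smallest value.

```python
# Incomes from the example table; names are illustrative only.
incomes = {
    "Sneha": [15000, 15000, 15000, 15000],
    "Amit": [7000, 10000, 14000, 17000, 20000, 22000],
    "Minna": [5000, 2000, 8000, 10000, 50000],
}


def mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    return sum(values) / len(values)


def spread(values):
    """Largest value minus smallest value, a rough indicator of dispersion."""
    return max(values) - min(values)


summary = {name: (mean(v), spread(v)) for name, v in incomes.items()}
for name, (avg, rng) in summary.items():
    print(f"{name}: mean = {avg:.0f}, largest - smallest = {rng}")
```

The means come out equal while the spreads differ, which is the point of the example.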


  • Types of Measures of Dispersion




1. Range:
The range is the difference between the largest and the smallest value in the data. It is calculated by subtracting the smallest value from the largest value.

Range = L - S

Coefficient of range = (L - S)/(L + S) * 100

where L = largest value and S = smallest value.

Individual series
Calculate the range and the coeff. of range:
7, 8, 12, 15, 18, 20

Solution:
Range = L - S
= 20 - 7
= 13

Coeff. of range = (L - S)/(L + S) * 100
= (20 - 7)/(20 + 7) * 100
= 13/27 * 100
= 48.15%
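The two formulas can be sketched in Python as follows. The function name is an assumption made for this illustration.

```python
def range_and_coefficient(values):
    """Return (range, coefficient of range in percent) for a list of values."""
    largest, smallest = max(values), min(values)
    rng = largest - smallest                      # Range = L - S
    coeff = rng / (largest + smallest) * 100      # (L - S)/(L + S) * 100
    return rng, coeff


rng, coeff = range_and_coefficient([7, 8, 12, 15, 18, 20])
print(rng, round(coeff, 2))  # 13 48.15
```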

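  • Discrete series
x     f
5     3
10    7
15    5
20    12
25    9

Calculate the range and the coeff. of range.

Solution:
Range = L - S
= 25 - 5
= 20

Coeff. of range = (L - S)/(L + S) * 100
= (25 - 5)/(25 + 5) * 100
= 66.67%

The frequencies play no part here; only the largest and smallest values of x are used.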
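  • Continuous series
- Exclusive (the upper limit of one class is the lower limit of the next)
class    f
20-25    2
25-30    5
30-35    9
35-40    4
40-45    7

Here L is the upper limit of the highest class and S is the lower limit of the lowest class.

Range = L - S
= 45 - 20
= 25

Coeff. of range = (L - S)/(L + S) * 100
= (45 - 20)/(45 + 20) * 100
= 38.46%

- Inclusive (both limits belong to the class)
class    f
20-24    2
25-29    5
30-34    9
35-39    4
40-44    7

After converting it into exclusive form:
class        f
19.5-24.5    2
24.5-29.5    5
29.5-34.5    9
34.5-39.5    4
39.5-44.5    7

Calculate the range and the coeff. of range.

Solution:
Range = L - S
= 44.5 - 19.5
= 25

Coeff. of range = (L - S)/(L + S) * 100
= (44.5 - 19.5)/(44.5 + 19.5) * 100
= 39.06%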
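The adjustment of the class limits 20-24, 25-29, ... to the boundaries 19.5-24.5, 24.5-29.5, ... can be sketched in Python. This is a minimal illustration that assumes equal gaps between consecutive classes; the function name is hypothetical.

```python
def to_class_boundaries(classes):
    """Widen each (lower, upper) class by half the gap between classes."""
    gap = classes[1][0] - classes[0][1]   # e.g. 25 - 24 = 1
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in classes]


limits = [(20, 24), (25, 29), (30, 34), (35, 39), (40, 44)]
boundaries = to_class_boundaries(limits)

L = boundaries[-1][1]   # upper boundary of the highest class
S = boundaries[0][0]    # lower boundary of the lowest class
rng = L - S
coeff = rng / (L + S) * 100
print(boundaries[0], rng, round(coeff, 2))  # (19.5, 24.5) 25.0 39.06
```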

Merits:
Among all the measures of dispersion, range is the simplest to understand and the easiest to calculate.
Range is also the quickest measure of dispersion; it takes very little time to calculate.

Demerits:
Range is not based on every observation of the series; it uses only the two extreme values.
It is strongly affected by sampling fluctuations.
Range is an unreliable guide to dispersion, because a single extreme value can change it.
Range cannot be calculated for open-ended classifications.

Uses:
Quality control.
Fluctuations in share prices.
Weather forecasting.



  2. Quartile Deviation

The presence of even one extremely high or low value in a distribution can reduce the utility of range as a measure of dispersion. Thus, you may need a measure which is not unduly affected by the outliers.
 In such a situation, if the entire data is divided into four equal parts, each containing 25% of the values, we get the values of quartiles and median. The upper and lower quartiles (Q3 and Q1 , respectively) are used to calculate inter-quartile range which is Q3 – Q1 .

  Inter-quartile range = Q3 - Q1

The inter-quartile range is based on the middle 50% of the values in a distribution and is, therefore, not affected by extreme values. Half of the inter-quartile range is called the quartile deviation (Q.D.).

Q.D. = (Q3 - Q1)/2

Coeff. of Q.D. = (Q3 - Q1)/(Q3 + Q1) * 100

Q1 - First quartile (lower quartile)
Q3 - Third quartile (upper quartile)

Q.D. is therefore also called the semi-inter-quartile range.

1. Individual series

Example 1
Calculate the range and Q.D. of the following observations: 20, 25, 30, 41, 29, 35, 39, 48, 51, 60 and 70

Sol.- First arrange the values in ascending order:
20, 25, 29, 30, 35, 39, 41, 48, 51, 60 and 70

Range = 70 - 20 = 50
For Q.D., we need the values of Q1 and Q3.
Q1 is the size of the (n + 1)/4 th value.
(11 + 1)/4 = 3
With n = 11, Q1 is the size of the 3rd value. As the values are arranged in ascending order, Q1, the 3rd value, is 29.

Similarly, Q3 is the size of the 3(n + 1)/4 th value:
3(11 + 1)/4
= 36/4 = 9
i.e. the 9th value, which is 51. Hence Q3 = 51.

Q.D. = (Q3 - Q1)/2
= (51 - 29)/2
= 11

Coeff. of Q.D. = (Q3 - Q1)/(Q3 + Q1) * 100
= (51 - 29)/(51 + 29) * 100
= 27.5%

Example 2
Calculate Q.D. and Coeff. of Q.D.

Month    Income
Jan      139
Feb      140
Mar      140
Apr      141
May      141
Jun      142
Jul      142
Aug      143
Sep      143
Oct      144
Nov      144
Dec      145

Sol: Arrange the data in ascending order.
139, 140, 140, 141, 141, 142, 142, 143, 143, 144, 144, 145
Here n = 12

Q1 = size of the (n + 1)/4 th term
= (12 + 1)/4 = 13/4
= 3.25th term

= 3rd term + 0.25(4th term - 3rd term)
= 140 + 0.25(141 - 140)
= 140 + 0.25
= 140.25

Q3 = size of the 3(n + 1)/4 th term
= 9.75th term
= 9th term + 0.75(10th term - 9th term)
= 143 + 0.75(144 - 143)
= 143.75

Q.D. = (Q3 - Q1)/2
= (143.75 - 140.25)/2
= 3.5/2
= 1.75

Coeff. of Q.D. = (Q3 - Q1)/(Q3 + Q1) * 100
= (143.75 - 140.25)/(143.75 + 140.25) * 100
= 1.23%
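The procedure used in Examples 1 and 2 (locate the k(n + 1)/4 th term and interpolate when the position is fractional) can be sketched in Python. The function names are illustrative; this is one possible implementation of the rule in these notes, not a standard library routine.

```python
def quartile(values, k):
    """k-th quartile (k = 1, 2 or 3) as the k(n + 1)/4 th term of the
    sorted data, interpolating linearly for fractional positions."""
    data = sorted(values)
    n = len(data)
    pos = k * (n + 1) / 4          # 1-based position in the sorted data
    whole = int(pos)
    frac = pos - whole
    if whole < 1:
        return data[0]
    if whole >= n:
        return data[-1]
    return data[whole - 1] + frac * (data[whole] - data[whole - 1])


def quartile_deviation(values):
    """Return (Q.D., coefficient of Q.D. in percent)."""
    q1, q3 = quartile(values, 1), quartile(values, 3)
    return (q3 - q1) / 2, (q3 - q1) / (q3 + q1) * 100


example1 = [20, 25, 30, 41, 29, 35, 39, 48, 51, 60, 70]
example2 = [139, 140, 140, 141, 141, 142, 142, 143, 143, 144, 144, 145]

qd1, coeff1 = quartile_deviation(example1)
qd2, coeff2 = quartile_deviation(example2)
print(qd1, round(coeff1, 2))   # 11.0 27.5
print(qd2, round(coeff2, 2))   # 1.75 1.23
```

Python's built-in `statistics.quantiles(data, n=4)` uses the same (n + 1)-based positions with its default "exclusive" method, so it can serve as a cross-check.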

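Discrete series

Example 3: Calculate Q.D. and Coeff. of Q.D.

x     f
5     2
2     1
9     2
10    5
15    3

Sol-
First arrange the series in ascending order of x and calculate the cumulative frequency (c.f.).

x     f     c.f.
2     1     1
5     2     3
9     2     5
10    5     10
15    3     13
Here the sum of f = N = 13

Q1 = size of the (N + 1)/4 th term
= (13 + 1)/4
= 14/4 = 3.5th term

From the c.f. column, the 3rd term is 5 and the 4th term is 9.
Q1 = 3rd term + 0.5(4th term - 3rd term)
= 5 + 0.5(9 - 5)
= 5 + 0.5(4)
= 7

Q3 = size of the 3(N + 1)/4 th term
= 3(13 + 1)/4
= 10.5th term

From the c.f. column, the 10th term is 10 and the 11th term is 15.
Q3 = 10th term + 0.5(11th term - 10th term)
= 10 + 0.5(15 - 10)
= 10 + 0.5(5)
= 12.5

Q.D. = (Q3 - Q1)/2
= (12.5 - 7)/2
= 5.5/2
= 2.75

Coeff. of Q.D. = (Q3 - Q1)/(Q3 + Q1) * 100
= (12.5 - 7)/(12.5 + 7) * 100
= 28.2%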
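One way to reproduce this working in Python is to expand the discrete series into individual values and then apply the same positional rule. The sketch below takes the frequencies from the cumulative-frequency table in the solution; the helper names are illustrative.

```python
def quartile(values, k):
    """k(n + 1)/4 th term of the sorted data, with linear interpolation."""
    data = sorted(values)
    n = len(data)
    pos = k * (n + 1) / 4
    whole = int(pos)
    frac = pos - whole
    if whole < 1:
        return data[0]
    if whole >= n:
        return data[-1]
    return data[whole - 1] + frac * (data[whole] - data[whole - 1])


# x -> f, as in the cumulative-frequency table of the solution
series = {2: 1, 5: 2, 9: 2, 10: 5, 15: 3}

# Repeat each x as many times as its frequency: [2, 5, 5, 9, 9, 10, ...]
expanded = [x for x, f in sorted(series.items()) for _ in range(f)]

q1 = quartile(expanded, 1)
q3 = quartile(expanded, 3)
qd = (q3 - q1) / 2
coeff = (q3 - q1) / (q3 + q1) * 100
print(q1, q3, qd, round(coeff, 1))  # 7.0 12.5 2.75 28.2
```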
                       

3) Continuous series

Example 4
Calculate Q.D. and Coeff. of Q.D.

Class interval (C.I.)    No. of students (f)
0-10                     5
10-20                    8
20-40                    16
40-60                    7
60-90                    4
Total                    40

For Q.D., first calculate the cumulative frequencies as follows:

Class interval (C.I.)    No. of students (f)    c.f.
0-10                     5                      5
10-20                    8                      13
20-40                    16                     29
40-60                    7                      36
60-90                    4                      40
Total                    40

Q1 is the size of the N/4 th value in a continuous series.
Thus, N = 40 and 40/4 = 10, so Q1 is the size of the 10th value.

The class containing the 10th value is 10-20. Hence, Q1 lies in class 10-20. Now, to calculate the exact value of Q1, the following formula is used:

Q1 = L + ((N/4 - c.f.)/f) * i

Where L = 10 (lower limit of the relevant quartile class),
c.f. = 5 (value of c.f. for the class preceding the quartile class),
i = 10 (interval of the quartile class),
and f = 8 (frequency of the quartile class).

Thus,
Q1 = 10 + ((10 - 5)/8) * 10 = 16.25

Similarly, Q3 is the size of the 3N/4 th value, i.e. the 30th value, which lies in class 40-60. Using the same formula with 3N/4 in place of N/4, its value can be calculated as follows:

Q3 = 40 + ((30 - 29)/7) * 20 = 42.86

Q.D. = (Q3 - Q1)/2
= (42.86 - 16.25)/2
= 26.61/2
= 13.3 (approximately)

Coeff. of Q.D. = (Q3 - Q1)/(Q3 + Q1) * 100
= (42.86 - 16.25)/(42.86 + 16.25) * 100
= 26.61/59.11 * 100
= 45.02%
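The interpolation formula Q = L + ((kN/4 - c.f.)/f) * i can be sketched in Python for this example. The function name and data layout are assumptions made for illustration.

```python
def grouped_quartile(classes, freqs, k):
    """k-th quartile (k = 1, 2 or 3) of a continuous series.

    classes: list of (lower, upper) class limits in ascending order
    freqs:   frequency of each class
    """
    n = sum(freqs)
    target = k * n / 4          # position of the quartile (N/4, 2N/4, 3N/4)
    cf = 0                      # cumulative frequency before the current class
    for (lower, upper), f in zip(classes, freqs):
        if cf + f >= target:    # the quartile lies in this class
            return lower + (target - cf) / f * (upper - lower)
        cf += f
    raise ValueError("no quartile class found")


classes = [(0, 10), (10, 20), (20, 40), (40, 60), (60, 90)]
freqs = [5, 8, 16, 7, 4]

q1 = grouped_quartile(classes, freqs, 1)
q3 = grouped_quartile(classes, freqs, 3)
qd = (q3 - q1) / 2
coeff = (q3 - q1) / (q3 + q1) * 100
print(q1, round(q3, 2), round(qd, 2), round(coeff, 2))
```

With the data of Example 4 this gives Q1 = 16.25 and Q3 of about 42.86.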


Standard Deviation

          

The standard deviation is the average amount of variability in your dataset. It tells you, on average, how far each value lies from the mean.

A high standard deviation means that values are generally far from the mean, while a low standard deviation indicates that values are clustered close to the mean.

Standard deviation is a useful measure of spread for normal distributions.

In normal distributions, data is symmetrically distributed with no skew. Most values cluster around a central region, with values tapering off as they go further away from the center. The standard deviation tells you how spread out from the center of the distribution your data is on average.

Many scientific variables follow normal distributions, including height, standardized test scores, or job satisfaction ratings. When you have the standard deviations of different samples, you can compare their distributions using statistical tests to make inferences about the larger populations they came from.


Standard deviation formulas

a) Actual mean method of calculating standard deviation
σ = √( Σ(X - x̄)² / n )

Explanation of the formula (dividing by n, as these notes do throughout):
  • σ = standard deviation
  • Σ = sum of…
  • X = each value
  • x̄ = mean
  • n = number of values
b) Assumed mean method
σ = √( Σd²/n - (Σd/n)² ), where d = x - a and a is the assumed mean
c) Coeff. of S.D. = σ / x̄
d) Variance = σ² = Σ(X - x̄)² / n
e) Coeff. of Variation (C.V.) = (σ / x̄) * 100
i) Individual series:
For example, we are given the data set below:
Data set (scores)
46, 69, 32, 60, 52, 41
Sol- We use the formula σ = √( Σ(X - x̄)² / n )

1. Find the mean.

x̄ = (46 + 69 + 32 + 60 + 52 + 41) ÷ 6 = 50

2. Find each score's deviation from the mean.

Score    Deviation from the mean
46       46 - 50 = -4
69       69 - 50 = 19
32       32 - 50 = -18
60       60 - 50 = 10
52       52 - 50 = 2
41       41 - 50 = -9

3. Square each deviation from the mean and find the sum of squares.

Score    Deviation from the mean    (X - x̄)²
46       46 - 50 = -4               16
69       69 - 50 = 19               361
32       32 - 50 = -18              324
60       60 - 50 = 10               100
52       52 - 50 = 2                4
41       41 - 50 = -9               81
Total                               886

4. Put the values in the formula.

σ = √( Σ(X - x̄)² / n )
= √(886/6)
= √147.67 = 12.15
Coeff. of S.D. = σ / x̄
= 12.15/50
= 0.243
Variance = σ² = 147.67
Coeff. of Variation = (σ / x̄) * 100
= 24.3%
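The four steps of the actual mean method can be sketched in Python as below. The function name is illustrative, and the sketch divides by n, as these notes do.

```python
import math


def std_dev_actual_mean(values):
    """Return (mean, variance, standard deviation), dividing by n."""
    n = len(values)
    mean = sum(values) / n                            # step 1
    squared = [(x - mean) ** 2 for x in values]       # steps 2 and 3
    variance = sum(squared) / n                       # step 4
    return mean, variance, math.sqrt(variance)


scores = [46, 69, 32, 60, 52, 41]
mean, variance, sigma = std_dev_actual_mean(scores)
cv = sigma / mean * 100   # coefficient of variation in percent
print(mean, round(variance, 2), round(sigma, 2), round(cv, 1))
# 50.0 147.67 12.15 24.3
```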
Example 2.
Calculate the standard deviation by the assumed mean method:
10, 12, 13, 15, 20
Sol-
First we find the sum of d and the sum of squares of d, as required by the formula

σ = √( Σd²/n - (Σd/n)² ), where d = x - a

Here we assume a = 13 (the middle value).

x        d = x - 13    d²
10       -3            9
12       -1            1
13       0             0
15       2             4
20       7             49
Total    Σd = 5        Σd² = 63

Now we put the values in the formula:
σ = √( 63/5 - (5/5)² )
= √(12.6 - 1)
= √11.6
= 3.406, i.e. about 3.41

Mean = a + Σd/n = 13 + 5/5 = 14

Coeff. of S.D. = σ / x̄
= 3.406/14
= 0.243

Variance = σ²
= 11.6

Coeff. of Variation = (σ / x̄) * 100
= 0.243 * 100
= 24.3%
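A minimal Python sketch of the assumed mean method follows. The function name is hypothetical. The last lines illustrate that the choice of assumed mean does not change the result.

```python
import math


def std_dev_assumed_mean(values, a):
    """Standard deviation via deviations d = x - a from an assumed mean a.

    Returns (sum of d, sum of d squared, mean, sigma).
    """
    n = len(values)
    d = [x - a for x in values]
    sum_d = sum(d)
    sum_d2 = sum(v * v for v in d)
    sigma = math.sqrt(sum_d2 / n - (sum_d / n) ** 2)
    mean = a + sum_d / n
    return sum_d, sum_d2, mean, sigma


data = [10, 12, 13, 15, 20]
sum_d, sum_d2, mean, sigma = std_dev_assumed_mean(data, a=13)
print(sum_d, sum_d2, mean, round(sigma, 2))   # 5 63 14.0 3.41

# A different assumed mean gives the same standard deviation.
sigma_other = std_dev_assumed_mean(data, a=10)[3]
print(round(sigma_other, 2))                  # 3.41
```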

2. Continuous series

σ = √( Σfd²/N - (Σfd/N)² ) * h

d = (x - a)/h
x = mid value of the class, a = assumed mean
h = class interval (width)
Mean = a + (Σfd/Σf) * h

Example:
Calculate S.D., Coeff. of S.D., Variance and Coeff. of Variation.

Class    f
0-10     15
10-20    15
20-30    23
30-40    22
40-50    25
50-60    10
60-70    5
70-80    10

Sol-

a = 35, h = 10

Class    f      Mid value    d     d²    fd     fd²
0-10     15     5            -3    9     -45    135
10-20    15     15           -2    4     -30    60
20-30    23     25           -1    1     -23    23
30-40    22     35           0     0     0      0
40-50    25     45           1     1     25     25
50-60    10     55           2     4     20     40
60-70    5      65           3     9     15     45
70-80    10     75           4     16    40     160
Total    N = 125                         Σfd = 2    Σfd² = 488

σ = √( 488/125 - (2/125)² ) * 10
= √(3.904 - 0.000256) * 10
= 1.976 * 10
= 19.76

Mean = a + (Σfd/Σf) * h
= 35 + (2/125) * 10
= 35 + 0.16
= 35.16

Coeff. of S.D. = σ / x̄
= 19.76/35.16
= 0.5620

Variance = σ²
= (3.904 - 0.000256) * 10²
= 390.37

Coeff. of Variation = (σ / x̄) * 100
= 56.2%
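The step-deviation working above can be sketched in Python. The function name and argument layout are assumptions made for this illustration.

```python
import math


def grouped_std_dev(classes, freqs, a, h):
    """Step-deviation method for a continuous series.

    classes: list of (lower, upper) limits; freqs: class frequencies;
    a: assumed mean; h: class width used to scale the deviations.
    Returns (sum of fd, sum of fd squared, mean, sigma).
    """
    n = sum(freqs)
    mids = [(lower + upper) / 2 for lower, upper in classes]
    d = [(m - a) / h for m in mids]
    sum_fd = sum(f * di for f, di in zip(freqs, d))
    sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))
    sigma = math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * h
    mean = a + sum_fd / n * h
    return sum_fd, sum_fd2, mean, sigma


classes = [(0, 10), (10, 20), (20, 30), (30, 40),
           (40, 50), (50, 60), (60, 70), (70, 80)]
freqs = [15, 15, 23, 22, 25, 10, 5, 10]

sum_fd, sum_fd2, mean, sigma = grouped_std_dev(classes, freqs, a=35, h=10)
print(sum_fd, sum_fd2, round(mean, 2), round(sigma, 2))
print(round(sigma ** 2, 2), round(sigma / mean * 100, 1))
```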


Mean Deviation


 MEASURES OF DISPERSION FROM AVERAGE


Recall that dispersion was defined as the extent to which values differ from their average. Range and quartile deviation do not measure how far the values are from their average. Yet, by calculating the spread of values, they do give a good idea about the dispersion. Two measures which are based upon deviations of the values from their average are Mean Deviation and Standard Deviation.
Since the average is a central value, some deviations are positive and some are negative. If these are added as they are, the sum will not reveal anything. In fact, the sum of deviations from the Arithmetic Mean is always zero. This is why Mean Deviation uses absolute deviations and Standard Deviation uses squared deviations.
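A quick numeric check of this point, as a minimal Python sketch. The five values are only an illustration chosen here.

```python
values = [2, 4, 7, 8, 9]          # illustrative data
mean = sum(values) / len(values)  # 6.0

deviations = [x - mean for x in values]   # -4, -2, 1, 2, 3

total_signed = sum(deviations)                     # positives cancel negatives
total_absolute = sum(abs(d) for d in deviations)   # signs ignored

print(total_signed, total_absolute)  # 0.0 12.0
```

The signed deviations cancel out, while the absolute deviations add up to a usable total.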


Mean Deviation

Suppose a college is proposed for students of five towns A, B, C, D and E, which lie in that order along a road. The distances of the towns from town A, in kilometers, and the number of students in these towns are given below:

Town     Distance from town A (km)    Students
A        0                            90
B        2                            150
C        6                            100
D        14                           200
E        18                           80
Total                                 620

Now, if the college is situated in town A, 150 students from town B will have to travel 2 kilometers each (a total of 300 kilometers) to reach the college. The objective is to find a location so that the average distance travelled by students is minimum.
You may observe that the students will have to travel more, on average, if the college is situated at town A or E. If, on the other hand, it is somewhere in the middle, they are likely to travel less. Mean deviation is the appropriate statistical tool to estimate the average distance travelled by students. Mean deviation is the arithmetic mean of the absolute differences of the values from their average. The average used is either the arithmetic mean or the median.
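The average distance for each possible location can be worked out with a short Python sketch. It assumes, for illustration only, that the college is placed in one of the five towns; the names used are hypothetical.

```python
# town -> (distance from town A in km, number of students)
towns = {"A": (0, 90), "B": (2, 150), "C": (6, 100), "D": (14, 200), "E": (18, 80)}


def average_distance(site_km):
    """Average distance travelled per student if the college is at site_km."""
    total_students = sum(students for _, students in towns.values())
    total_km = sum(abs(dist - site_km) * students
                   for dist, students in towns.values())
    return total_km / total_students


averages = {name: average_distance(dist) for name, (dist, _) in towns.items()}
best = min(averages, key=averages.get)

for name, avg in averages.items():
    print(f"College at {name}: average distance = {avg:.2f} km")
print("Smallest average distance at town", best)
```

Among these five sites the smallest average is at town C, which is also the median location when students are counted along the road (the 310th of 620 students lives in C).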
Calculation of Mean Deviation from Arithmetic Mean for ungrouped data

Direct Method Steps:
(i) The A.M. of the values is calculated.
(ii) The difference between each value and the A.M. is calculated. All differences are considered positive. These are denoted as |d|.
(iii) The A.M. of these differences (called deviations) is the Mean Deviation.

M.D.(x̄) = Σ|d| / n

Example: Calculate the mean deviation of the following values: 2, 4, 7, 8 and 9.
A.M. = ΣX/N
= 30/5 = 6

X        |d| = |X - x̄|
2        4
4        2
7        1
8        2
9        3
Total    12

M.D.(x̄) = 12/5 = 2.4
Mean Deviation from median for ungrouped data
Using the values of the example above (2, 4, 7, 8 and 9), M.D. from the median can be calculated as follows.

Steps:
(i) Calculate the median, which is 7.
(ii) Calculate the absolute deviations from the median and denote them as |d|.
(iii) Find the average of these absolute deviations.

X        |d| = |X - median|
2        5
4        3
7        0
8        1
9        2
Total    11

M.D.(median) = 11/5 = 2.2
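Both calculations follow the same pattern and differ only in the central value used. A minimal Python sketch, with an illustrative function name:

```python
import statistics


def mean_deviation(values, center):
    """Average of the absolute deviations of the values from center."""
    return sum(abs(x - center) for x in values) / len(values)


values = [2, 4, 7, 8, 9]
md_from_mean = mean_deviation(values, statistics.mean(values))      # center = 6
md_from_median = mean_deviation(values, statistics.median(values))  # center = 7
print(md_from_mean, md_from_median)  # 2.4 2.2
```

The value from the median is the smaller of the two.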

Mean Deviation from Mean for Continuous Distribution
Example:

Profits of companies (Rs in lakh)
Class intervals    Number of companies
10-20              5
20-30              8
30-50              16
50-70              8
70-80              3
Total              40

Steps:
(i) Calculate the mean of the distribution.
(ii) Calculate the absolute deviations |d| of the class mid-points from the mean.
(iii) Multiply each |d| value by its corresponding frequency to get the f|d| values. Sum them up to get Σf|d|.
(iv) Apply the following formula:

M.D.(x̄) = Σf|d| / Σf

Mean = Σfm/Σf, where m is the class mid-point
= (5*15 + 8*25 + 16*40 + 8*60 + 3*75)/40
= 1620/40
= 40.5

Class intervals    Number of companies (f)    Mid-point (m)    |d| = |m - 40.5|    f|d|
10-20              5                          15               25.5                127.5
20-30              8                          25               15.5                124.0
30-50              16                         40               0.5                 8.0
50-70              8                          60               19.5                156.0
70-80              3                          75               34.5                103.5
Total              40                                                              519.0

M.D.(x̄) = Σf|d| / Σf
= 519/40
= 12.975
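The four steps can be sketched in Python as follows. The function name is illustrative; the mean is computed from the class mid-points and works out to 40.5 for this data.

```python
def grouped_mean_deviation(classes, freqs):
    """Mean deviation from the mean for a continuous distribution.

    Returns (mean, sum of f|d|, mean deviation).
    """
    n = sum(freqs)
    mids = [(lower + upper) / 2 for lower, upper in classes]
    mean = sum(f * m for f, m in zip(freqs, mids)) / n            # step (i)
    sum_f_abs_d = sum(f * abs(m - mean)                           # steps (ii), (iii)
                      for f, m in zip(freqs, mids))
    return mean, sum_f_abs_d, sum_f_abs_d / n                     # step (iv)


classes = [(10, 20), (20, 30), (30, 50), (50, 70), (70, 80)]
freqs = [5, 8, 16, 8, 3]

mean, sum_f_abs_d, md = grouped_mean_deviation(classes, freqs)
print(mean, sum_f_abs_d, md)
```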

Mean deviation: comment
Mean deviation is based on all values, so a change in even one value will affect it. Mean deviation is least when calculated from the median, i.e. it will be higher if calculated from the mean. However, it ignores the signs of deviations and cannot be calculated for open-ended distributions.

As the income example at the start of these notes shows, the mean can be the same while the dispersion is very different. If you have another value which reflects the quantum of variation in values, your understanding of a distribution improves considerably. For example, per capita income gives only the average income. A measure of dispersion can tell you about income inequalities, thereby improving the understanding of the relative standards of living enjoyed by different strata of society.





