Part Three ms-95
Statistical Analysis and Interpretation of Data: Nonparametric Tests
by averaging their rank positions. Suppose our sample size is 6, and the di values are
-0.5, -1, -2, -2, 2, 3
The absolute values are 0.5, 1, 2, 2, 2, 3
We assign a rank of (3+4+5)/3 = 4 to each of the third, fourth and fifth pairs.
5) The test statistic T is calculated, which is the smaller sum of like-signed
ranks. T is obtained by totalling all the ranks with positive signs and totalling
separately all the ranks with negative signs. The smaller of these two sums is T.
6) For the purpose of accepting or rejecting the null hypothesis of no difference
between the values of the given pairs of observations at a desired level of
significance, we compare the observed value of T with the tabulated value given
in Table 6 at the end of this unit. If observed (calculated) value of T is less than
or equal to the tabulated value, we reject the null hypothesis.
Let us illustrate the test with the help of the following example.
Example : An experiment is conducted to judge the effect of brand name on quality
perception. 16 subjects are recruited for the purpose and are asked to taste and
compare two samples of product on a set of scale items judged to be ordinal. The
following data are obtained :
Pair Brand A Brand B
1 73 51
2 43 41
3 47 43
4 53 41
5 58 47
6 47 32
7 52 24
8 58 58
9 38 43
10 61 53
11 56 52
12 56 57
13 34 44
14 55 57
15 65 40
16 75 68
Test the hypothesis, using the Wilcoxon matched-pairs signed rank test, that there is no
difference between the perceived quality of the two samples. Use 5% level of
significance.
Solution : The null hypothesis to be tested is
H0 : There is no difference between the perceived quality of the two samples
against the alternative hypothesis
H1 : There is a difference between the perceived quality of the two samples.
The value of T statistic can be worked out as under :
Data Presentation and Analysis
Pair number 8 is dropped as its `d' value is zero, and therefore our sample
size reduces to n = (16 - 1) = 15.
For a two-tailed test, the table value of T at 5% level of significance when n = 15 is
25. The calculated value of T is 18.5, which is less than the table value of 25.
Therefore, the null hypothesis is rejected and we conclude that there is a difference
between the perceived quality of the two samples.
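The calculation of T for this example can be retraced with a short Python sketch. The helper names `average_ranks` and `wilcoxon_t` are illustrative choices, not part of the text; the data are the Brand A and Brand B scores from the table above, and tied absolute differences receive the average of their rank positions as described in the procedure.

```python
def average_ranks(values):
    """Rank values from 1 upwards, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def wilcoxon_t(x, y):
    """Return (n, sum of positive ranks, sum of negative ranks, T) for paired data."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # pairs with d = 0 are dropped
    ranks = average_ranks([abs(d) for d in diffs])
    positive = sum(r for d, r in zip(diffs, ranks) if d > 0)
    negative = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return len(diffs), positive, negative, min(positive, negative)


brand_a = [73, 43, 47, 53, 58, 47, 52, 58, 38, 61, 56, 56, 34, 55, 65, 75]
brand_b = [51, 41, 43, 41, 47, 32, 24, 58, 43, 53, 52, 57, 44, 57, 40, 68]

n, positive, negative, t_stat = wilcoxon_t(brand_a, brand_b)
print(n, positive, negative, t_stat)  # 15 101.5 18.5 18.5
```

The negative ranks belong to pairs 9, 12, 13 and 14, and their sum of 18.5 is the T used in the solution.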
In the case of a large sample where n exceeds 25, the sampling distribution of T is
approximately normal with mean n(n+1)/4 and standard deviation √[n(n+1)(2n+1)/24].
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
Activity 6
The following are the weight gains (in pounds) of two random samples of young
Turkeys fed two different diets but otherwise kept under identical conditions:
Use the Mann-Whitney U test at the 0.01 level of significance to test the null
hypothesis that the two populations sampled have identical distributions against the
alternative hypothesis that on the average the second diet produces a greater gain in
weight.
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
Activity 7
This test for k samples is an extension of the median test for two samples. Here the
elements of all the samples are pooled together and the combined median is found
out. Then the sample elements are tabulated in the form of a 2 X k matrix with
respect to the combined median. For example, in case of a 3-sample study, the
tabulated result would be as follows:
If all the sample elements are from the same population or from populations with
the same median, about equal numbers of observations from each sample lie in the
two classifications: above the median and below the median. The associated probability is given by
If this probability is smaller than α, the level of significance, then the null hypothesis
is rejected, i.e., the samples do not belong to the same population. For large samples,
the χ² test with (k - 1) degrees of freedom is used to accept or reject the null hypothesis.
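The tabulation and the large-sample χ² calculation described above can be sketched in Python as follows. The three samples are hypothetical and chosen only to keep the arithmetic easy to follow; the function names are illustrative. Counting an observation equal to the combined median as "not above" is one possible convention, and is an assumption here. With such small expected frequencies the χ² approximation would be poor in practice.

```python
from statistics import median


def median_test_table(samples):
    """Build the 2 x k table of counts above / not above the combined median."""
    pooled = [x for s in samples for x in s]
    combined_median = median(pooled)
    above = [sum(1 for x in s if x > combined_median) for s in samples]
    not_above = [len(s) - a for s, a in zip(samples, above)]
    return combined_median, above, not_above


def chi_square(above, not_above):
    """Chi-square statistic for the 2 x k table, with (k - 1) degrees of freedom."""
    n = sum(above) + sum(not_above)
    row_totals = [sum(above), sum(not_above)]
    chi2 = 0.0
    for j in range(len(above)):
        column_total = above[j] + not_above[j]
        for i, observed in enumerate((above[j], not_above[j])):
            expected = row_totals[i] * column_total / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Hypothetical data for a 3-sample study (not taken from the text).
samples = [[1, 2, 3, 4], [5, 6, 7, 8], [2, 6, 3, 7]]
combined_median, above, not_above = median_test_table(samples)
chi2 = chi_square(above, not_above)
print(combined_median, above, not_above, chi2)  # 4.5 [0, 4, 2] [4, 0, 2] 8.0
```

For k = 3 the statistic has two degrees of freedom, so it would be compared with the 5 per cent critical value of 5.99.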
The Kruskal-Wallis Test
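In the test, all the elements of different samples are pooled together and they are
ranked with the lowest score receiving a rank value of 1. Ties are treated in the usual
fashion for ranking data. If all the samples belong to the same population (the null
hypothesis), then the sum of the ranks of the elements of each sample would be
about equal (for samples of equal size). Let ri be the sum of the ranks of the elements
of the i th sample. The Kruskal-Wallis test uses the χ²-test to test the null hypothesis.
The test statistic is given by
$$H = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{r_i^2}{n_i} - 3(n+1)$$
where ni is the number of elements in the i th sample, n is the total number of
elements in all the k samples, and H is referred to the χ²-distribution with (k - 1)
degrees of freedom.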
Solution : In order to use the Kruskal-Wallis test, we pool the elements and rank
them. These rankings are given below:
The critical value of H from the χ²-distribution with two degrees of freedom at 5 per cent level of
significance is 5.99. As the calculated value of H is less than the critical value, the null
hypothesis is accepted. Thus, all the three fertilizers yield the same level of output.
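The pooling, ranking and calculation of H can be sketched in Python. This is a minimal illustration, not the worked fertilizer example: the data are hypothetical, the function names are illustrative, and the statistic is the standard Kruskal-Wallis form H = 12/(n(n+1)) Σ ri²/ni - 3(n+1) without any correction for ties.

```python
def average_ranks(values):
    """Rank values from 1 upwards, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def kruskal_wallis_h(samples):
    """Kruskal-Wallis H from the rank sums of the pooled samples (no tie correction)."""
    pooled = [x for s in samples for x in s]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total = 0.0
    start = 0
    for s in samples:
        rank_sum = sum(ranks[start:start + len(s)])  # r_i for this sample
        total += rank_sum ** 2 / len(s)
        start += len(s)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical outputs under three treatments (not the data of the example above).
samples = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h = kruskal_wallis_h(samples)
print(round(h, 2))  # 7.2
```

Here the rank sums are 6, 15 and 24, so H = (12/90)(12 + 75 + 192) - 30 = 7.2, which exceeds 5.99 and would lead to rejection at the 5 per cent level with two degrees of freedom.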
Activity 8
Using the median test for the data given above, draw your conclusions.
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
Activity 9
Use the Kruskal-Wallis H test at 5% level of significance to test the null hypothesis
that a professional bowler performs equally well with the four bowling balls, given
the following results :
Bowling Results in Five Games
With Ball No. A 271 282 257 248 262
With Ball No. B 252 275 302 268 276
With Ball No. C 260 255 239 246 266
With Ball No. D 279 242 297 270 258
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
8.5 SUMMARY
In this unit, we have introduced the role of nonparametric statistical tests in the
analysis of statistical data. When the basic assumptions underlying the parametric
tests are not valid or when one does not have the knowledge of the distribution of the
parameter, then the nonparametric tests are the appropriate tests to draw inferences
about the hypotheses. These tests are more suitable for analysing ranked, scaled or
rated data. However, when the basic assumptions underlying the parametric tests are
valid, the nonparametric tests are less powerful than the parametric tests. Therefore,
if for such problems nonparametric tests are used, there is a greater risk of accepting
a false hypothesis and thus committing a type II error.
These nonparametric tests are grouped into one sample, two sample and k sample
tests. One sample tests are used to study whether there is significant difference
between observed and expected frequencies, or whether it is reasonable to accept that
the sample drawn is a random sample or whether the sample has been drawn from a
specified population. The two sample tests are used to study the effectiveness of two
treatments, two methods or two strategies, etc. In problems of this type, there can be
two independent samples or the same sample elements are studied again. K-sample
tests are an extension of two sample tests. Here, instead of two samples we have
more than two samples being studied simultaneously.
Table 2: Critical values of r in the runs test
Given in the tables are various critical values of r for values of n1 and n2 less than or
equal to 20. For the one-sample runs test, any observed value of r which is less than
or equal to the smaller value, or is greater than or equal to the larger value in a pair, is
significant at the α = .05 level.
UNIT 9 MULTIVARIATE ANALYSIS OF DATA
Objectives
After studying this unit, you should be able to:
• explain the concept of association that takes place between a dependent variable
  and a set of independent variables
• describe the various multivariate procedures available to analyse associative data
  in the context of any research problem
• interpret the findings of multivariate analysis in any research study
• use a particular technique of multivariate analysis suitable for a particular
  business research problem.
Structure
9.1 Introduction
9.2 Regression Analysis
9.3 Discriminant Analysis
9.4 Factor Analysis
9.5 Summary
9.6 Self-assessment Exercises
9.7 Further Readings
9.1 INTRODUCTION
In MS-8 (Quantitative Analysis for Managerial Applications), we have covered the
fundamentals of statistical inference with special emphasis on hypothesis testing as
an effective tool for business decisions. Univariate analysis forms the foundation for
the development of multivariate analysis, which is the topic of discussion in this unit.
While the concept of the univariate analysis will continue to draw our attention time
and again, our focus in this unit will be on procedures of multivariate analysis which
has emerged as the most striking trend in business research methodology.
Description and analysis of associative data involves studying the relationship and
degree of association among several variables and therefore multivariate procedures
become imperative. We shall attempt to highlight the procedures. We have discussed
here three multivariate techniques, namely Regression Analysis, Discriminant
Analysis and Factor Analysis.
Regression analysis finds out the degree of relationship between a dependent variable
and a set of independent variables by fitting a statistical equation through the method
of least squares. Whenever we are interested in the combined influence of several
independent variables upon a dependent variable our study is that of multiple
regression. For example, demand may be influenced not only by price but also by
growth in industrial production, extent of imports, prices of other goods, consumer's
income, taste and preferences etc. Business researchers could use regression for
explaining per cent variation in the dependent variable caused by a number of
independent variables and also for problems involving prediction or forecasting.
Discriminant analysis is useful in situations where a total sample could be classified
into mutually exclusive and exhaustive groups on the basis of a set of predictor
variables. Unlike the regression analysis these predictor variables need not be
independent. For example, one may wish to predict whether sales potential in a
particular marketing territory will be `good' or `bad' based on the territory's personal
disposable income, population density and number of retail outlets. You may like to
classify a consumer as a user or non-user of one of the five brands of a product based
on his age, income and length of time spent in his present job. Here the interest is
in what variables discriminate well between groups.
Factor analysis provides an approach that reduces a set of variables into one or more
underlying variables. The technique groups together those variables that seem to
belong together and simultaneously supplies the weighting scheme. For example, one
may be interested in the identification of factors that determine the company's image.
When the decision maker is overwhelmed by many variables the factor analysis comes
to his help in compressing many variables into a few meaningful dimensions, like
service orientation, quality level and width of assortment in a research project
involving 20 retail chains on 35 factors or variables.
9.2 REGRESSION ANALYSIS
Regression analysis is probably the most widely applied technique amongst the
analytical models of association used in business research. Regression analysis
attempts to study the relationship between a dependent variable and a set of
independent variables (one or more). For example, in demand analysis, demand is
inversely related to price for normal commodities. We may write D = A - BP, where D
is the demand, which is the dependent variable, and P is the unit price of the commodity,
an independent variable. This is an example of a simple linear regression equation.
The multiple linear regression model is the prototype of the single criterion/multiple
predictor association model where we would like to study the combined influence of
several independent variables upon one dependent variable. In the above example if P
is the consumer price index, and Q is the index of industrial production, we may be
able to study demand as a function of two independent variables P and Q and write
D = A - BP + CQ as a multiple linear regression model.
The objectives of the business researchers in using Regression Analysis are :
1) To study a general underlying pattern connecting the dependent variable and
independent variables by establishing a functional relationship between the two.
In this equation the degree of relationship is derived which is a matter of interest
to the researcher in his study.
2) To use the well-established regression equation for problems involving
prediction and forecasting.
3) To study how much of the variation in the dependent variable is explained by the
set of independent variables. This would enable him to remove certain unwanted
variables from the system. For example, if 95% of variation in demand in a study
could be explained by price and consumer rating index, the researcher may drop
other factors like industrial production, extent of imports, substitution effect etc.
which may contribute only 5% of variation in demand provided all the causal
variables are linearly independent.
We proceed by first discussing bivariate (simple) regression involving the dependent
variable as a function of one independent variable and then move on to multiple regression.
Simple linear regression model is given by
Y = β0 + β1X1 + ε
where Y is the dependent variable,
X1 is the independent variable,
ε is a random error term,
β0 and β1 are the regression coefficients to be estimated.
Assumptions of the model
1) The relationship between Y and X1 is linear.
2) Y is a random variable which follows a normal distribution from which sample
values are drawn independently.
3) X1 is fixed and is non-stochastic (non-random).
4) The means of all these normal distributions of Y as conditioned by X1 lie on a
straight line with slope β1.
5) ε is the error term, ε ~ IND (0, σ²), and independent of X1.
The estimated regression equation is Ŷ = a + bX1, where a and b are the
estimates of β0 and β1 obtained through the method of least squares by minimising
the error sum of squares.
We state the normal equations without going into any derivations. The normal
equations are :
$$\sum Y = na + b\sum X_1$$
$$\sum X_1 Y = a\sum X_1 + b\sum X_1^2$$
Strength of association
It is one thing to find the regression equation after validating the linearity
relationship; but at this point we still do not know how strong the association is. In
other words, how well does X1 predict Y?
This is measured by the co-efficient of determination
$$r^2 = \frac{RSS}{TSS}$$
i.e. the variation in Y explained by regression (RSS) compared to the total variation (TSS).
The higher the r², the greater is the degree of relationship.
The product moment correlation or simple correlation co-efficient between Y and X1
is
$$r = \pm\sqrt{\frac{RSS}{TSS}} = \pm\sqrt{r^2}$$
r² lies between 0 and 1, with 0 measuring no correlation and 1 measuring perfect
correlation.
r lies between - 1 and + 1 and the sign of r is determined by the sign of the sample
regression coefficient (b) in the sample regression equation
Ŷ = a + bX1
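The least squares estimates and the strength of association can be illustrated with a small Python sketch. The five observations are hypothetical (the cereal data are not used here), the function name is illustrative, and the slope and intercept are the usual closed-form least squares solution for a and b. RSS is taken, as in the text, to be the variation explained by regression.

```python
import math


def fit_simple_regression(x, y):
    """Least squares fit of Y = a + b*X1; returns a, b, r squared and r."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    # Closed-form least squares estimates of slope and intercept.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    y_bar = sum_y / n
    tss = sum((yi - y_bar) ** 2 for yi in y)                     # total variation
    regression_ss = sum((a + b * xi - y_bar) ** 2 for xi in x)   # explained variation
    r_squared = regression_ss / tss
    r = math.copysign(math.sqrt(r_squared), b)                   # r takes the sign of b
    return a, b, r_squared, r


# Hypothetical data for illustration only.
x1 = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b, r_squared, r = fit_simple_regression(x1, y)
print(round(a, 2), round(b, 2), round(r_squared, 2), round(r, 3))  # 2.2 0.6 0.6 0.775
```

In this illustration 60 per cent of the variation in Y is explained by X1, and r is positive because the slope b is positive.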
Having given a foundation structure with underlying assumptions and possible
analysis of the model, we now turn our attention to a numerical example to clarify the
concepts. It is needless to mention that analysis of data and interpretation of the
results are of paramount importance.
Suppose that a researcher is interested in consumers' attitude towards nutritional diet
of a ready-to-eat cereal.
X1: the amount of protein per standard serving
In the nature of a pretest, the researcher obtains consumers' interval-scaled evaluation
of the ten concept descriptions, on a preference rating scale ranging from 1, dislike
extremely, up to 9, like extremely well. The data is given below.
i) Fit a linear regression model of Y on X1.
ii) Test the validity of the equation statistically.
iii) What do you think of the strength of association?
Answer
This can now be solved as a simple linear regression model for forecast where Z
is dependent variable and t is independent variable as before.
3) Double log form
This can now be solved as a normal bivariate regression equation to forecast sales
for the next period.
It is time to introduce the concept of the multiple regression model
Y = β0 + β1X1 + β2X2 + ........... + βkXk + ε
The assumptions are exactly the same as in simple linear regression except that X1,
X2, ..., Xk take the place of X1, because Y is linearly related to X1, ..., Xk and our
aim is to understand the combined influence of the k factors X1, X2, ..., Xk on Y. To
understand the concept clearly, let us study a case of 2 independent variables and write the model as
Y = β0 + β1X1 + β2X2 + ε
so that Ŷ = a + bX1 + cX2 is the estimated regression equation, where we add
one more independent variable X2 in the model. Suppose we extend the previous
example of bivariate regression on preference rating vs. protein (X1) by adding X2:
the percentage of minimum daily requirements of vitamin D per standard serving. Let
us see how the multiple regression model emerges to explain the variation in the
dependent variable Y caused by X1 and X2. Let us look at the following table giving
the data on Y, X1, and X2.
Correlation Matrix
1 .85 .85
.85 1 .84
.85 .84 1
Variance Covariance Matrix
The program output gives many other statistical analyses which we will not touch
upon now; we come to our important tests straightaway. The residual or error between
Y and Ŷ, i.e. between actual and forecast values, an important measure of reliability
of the model, is printed out for each observation. If you look at the errors, you get a fairly
good idea about the model equation. However, for validity of the regression equation,
you look first at the co-efficient of multiple determination R² and the multiple correlation
co-efficient R. In our example R² = 0.78 and R = 0.89, which is satisfactory,
indicating that the preference rating Y is linearly related to protein intake X1 and
vitamin D intake X2. It tells us that 78% of the variation in Y is explained jointly by the
variations in X1 and X2.
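To make the idea of a program output concrete, the following Python sketch fits Ŷ = a + bX1 + cX2 by solving the least squares normal equations and reports R². It is only an illustration under stated assumptions: the data are hypothetical (constructed so that Y = 1 + 2X1 + 3X2 exactly, not the cereal data), and the helper names are not from any particular package.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_regression(y, x1, x2):
    """Return a, b, c and R squared for the fitted equation Y = a + b*X1 + c*X2."""
    rows = [[1.0, u, v] for u, v in zip(x1, x2)]
    # Normal equations: (X'X) * coefficients = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    a, b, c = solve_linear_system(xtx, xty)
    fitted = [a + b * u + c * v for u, v in zip(x1, x2)]
    y_bar = sum(y) / len(y)
    tss = sum((yi - y_bar) ** 2 for yi in y)
    error_ss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual variation
    return a, b, c, 1 - error_ss / tss


# Hypothetical data generated from Y = 1 + 2*X1 + 3*X2 with no error term.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [9, 8, 19, 18, 29, 28]
a, b, c, r_squared = fit_multiple_regression(y, x1, x2)
print(round(a, 3), round(b, 3), round(c, 3), round(r_squared, 3))
```

Because the hypothetical data contain no error, the fit recovers the coefficients and R² is essentially 1; with real data the residuals Y - Ŷ would be non-zero and R² would be smaller.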
Hypothesis testing for linearity through ANOVA.
Activity 2
In what ways can multiple regression be used to forecast some industry's sales?
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
Activity 3
Zippy Cola is studying the effect of its latest advertising campaign. People chosen at
random were called and asked how many cans of Zippy Cola they had bought in the
past week and how many Zippy Cola advertisements they had either read or seen in
the past week.
X (number of ads) 4 9 3 0 1 6 2 5
Y (cans purchased) 12 14 7 6 3 5 6 10
a) Develop the estimating equation that best fits the data.
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
b) Calculate the sample co-efficient of determination and interpret it.
…………………………………………………………………………………
…………………………………………………………………………………
………………………………………………………………………………….
c) Forecast the number of cans purchased when the numbers of advertisements
seen or read in the past week were 10.
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
9.3 DISCRIMINANT ANALYSIS
It has been pointed out earlier that discriminant analysis is a useful tool for
situations where the total sample is to be divided into two or more mutually exclusive
and collectively exhaustive groups on the basis of a set of predictor variables. For
example, classifying sales people into successful and unsuccessful, or classifying
customers into owners and non-owners of a video tape recorder, are problems for
discriminant analysis.
Objectives of two group discriminant analysis:
1) Finding linear composites of the predictor variables that enable the analyst to
separate the groups by maximising among-groups variation relative to within-groups
variation.
2) Establishing procedures for assigning new individuals, whose profiles but not
group identity are known, to one of the two groups.
3) Testing whether significant differences exist between the mean predictor
variable profiles of the two groups.
4) Determining which variables account most for intergroup differences in mean
profiles.
A numerical example
Let us return to the example involving ready-to-eat cereal that was presented in the
regression analysis. However, in this problem the ten consumer raters are simply
asked to classify the cereal into one of two categories: like versus dislike. The data is
given below. Here again
X1 : The amount of protein (in grams) per standard serving.
X2 : The percentage of minimum daily requirements of vitamin D per standard
serving.
Also shown in the data table are the various sums of squares and cross products, the
means on X1 and X2 of each group, and total sample mean.
Consumer evaluations (like versus dislike) of ten cereals varying in nutritional
content
We first note from the table that the two groups are much more widely separated on
X1 (Protein) than they are on X2 (Vitamin D). If we were forced to choose just one of
the variables, it would appear that X 1 is a better bet than X 2. However, there is
information provided by the group separation on X 2, so we wonder if some linear
composite of both X1 and X2 could do better than X1 alone. Accordingly, we have the
following linear function :
Z = K1X1 + K2X2 where K1 and K2 are the weights that we seek.
But how shall we define variability? In discriminant analysis, we are concerned with the ratio of two
sums of squares after the set of scores on the linear composite has been computed.
One sum of squared deviations represents the variability of the two group means on
the composite around their grand mean. The second sum of squared deviations
represents the pooled variability of the individual cases around their respective group
means, also on the linear composite. One can then find the ratio of the first sum of
squares to the second. It is this ratio that is to be maximised through the appropriate
selection of K1 and K2. Solving for K1 and K2 involves a procedure similar to the one
encountered in the multiple regression. However, in the present case, we shall want
to find a set of sums of squares and cross products that relate to
the variation within groups. For ease of calculation let us define x1 = X1 - X̄1 and
x2 = X2 - X̄2 (i.e. each observation measured from its mean).
Solving for K1 and K2.
Mean corrected sums of squares and cross products
We note that the discriminant function "favours" X1 by giving about 2.5 times the
(absolute value) weight (K1 = 0.368 versus K2 = - 0.147) to X1 as is given to X2.
The discriminant scores of each person are shown below. Each score is computed by
the application of the discriminant function to the person's original X1 and X2 values.
Since the normal equations for solving K1 and K2 are obtained by maximising the
ratio of between-group to within-group variance, the discriminant criterion C as
calculated above (= 3.86) will be the maximum possible ratio. If we suppress X2 in the
discriminant function and calculate another C, it will be less than 3.86. It is rather
interesting that the optimal function Z = 0.368 X1 - 0.147 X2 is a difference function
in which X2 (Vitamin D) receives a negative weight, thereby bringing the importance of
X1 to the highest order. This means protein is much more important than Vitamin D.
Classifying the persons
It is all well and good to find the discriminant function, but the question is how to
assign the persons to the relevant groups.
• Assign all cases with discriminant scores that are on the left of the midpoint
  (1.596) to the disliker group.
• Assign all cases with discriminant scores that are on the right of the midpoint
  (1.596) to the liker group.
That is, all true dislikers will be correctly classified as such and all true likers will be
correctly classified. This can be shown by a 2 X 2 table
Testing Statistical Significance
While the discriminant function does perfectly in classifying the ten cases of the
illustration on protein (X1) and vitamin (X2) into likers and dislikers, we still have not
tested whether the group means differ significantly. This test is based on an F ratio,
which requires calculation of the Mahalanobis D². The calculation of F is a little
complicated, and F is normally an output parameter in standard packages like the
Biomedical computer program and SPSS of IBM. The Biomedical computer program of
the University of California Press is an outstanding software containing all
multivariate procedures. For our illustration let us calculate F
$$F = \frac{n_1 n_2 (n_1 + n_2 - m - 1)}{m (n_1 + n_2)(n_1 + n_2 - 2)} D^2$$
where n1 = number of observations in group 1
n2 = number of observations in group 2
m = number of independent variables
D² = Mahalanobis squared distance
In our problem n1 = 5
n2 = 5
m = 2 (X1 and X2)
A simple way of calculating D² would be to use the discriminant function
D² = (n1 + n2 - 2) (0.368 (5.0) - 0.147 (2))
= 8 (0.368 X 5 - 0.147 X 2) = 12.353
Please note that the expression within brackets is the discriminant function Z =
0.368 X1 - 0.147 X2 where X1 and X2 are substituted by the respective group mean
differences : X̄1 (likers) - X̄1 (dislikers) and X̄2 (likers) - X̄2 (dislikers).
$$F = \frac{5 \times 5\,(5+5-2-1)}{2\,(5+5)(5+5-2)} \times 12.353 = \frac{25 \times 7}{2 \times 10 \times 8} \times 12.353 = 13.511$$
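The assignment rule and the F calculation can be retraced with a short Python sketch. The weights (0.368 and -0.147), the midpoint (1.596) and D² = 12.353 are taken from the text; the function names and the two cereal profiles passed to `classify` are hypothetical, and the treatment of a score exactly at the midpoint (here assigned to the disliker group) is an assumption, since the text does not cover it.

```python
def discriminant_score(x1, x2, k1=0.368, k2=-0.147):
    """Z = K1*X1 + K2*X2 with the weights quoted in the text."""
    return k1 * x1 + k2 * x2


def classify(score, midpoint=1.596):
    """Scores to the right of the midpoint go to the liker group, the rest to dislikers."""
    return "liker" if score > midpoint else "disliker"


def f_ratio(d_squared, n1, n2, m):
    """F = n1*n2*(n1+n2-m-1) / (m*(n1+n2)*(n1+n2-2)) * D squared."""
    return (n1 * n2 * (n1 + n2 - m - 1)) / (m * (n1 + n2) * (n1 + n2 - 2)) * d_squared


# Hypothetical cereal profiles (protein X1, vitamin D X2), for illustration only.
print(classify(discriminant_score(10, 4)))  # liker
print(classify(discriminant_score(2, 4)))   # disliker

f_value = f_ratio(12.353, n1=5, n2=5, m=2)
print(round(f_value, 3))  # 13.511
```

The computed F would then be compared with the tabulated F value for m and (n1 + n2 - m - 1) degrees of freedom, i.e. 2 and 7 in this illustration.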
Can we now collapse the seven variables into three factors? Intuition might suggest
the presence of three primary factors: A maturity factor revealed in age/children/size
of household, physical size as shown by height and weight, and intelligence or
training as revealed by education and IQ.
The sales people data have been analysed by the SAS program. This program accepts
data in the original units, automatically transforming them into standard scores. The
three factors derived from the sales people data by a principal component analysis
(SAS program) are presented below
Three-factor results with seven variables.
Factor Loading: The co-efficients in the factor equations are called "factor loadings".
They appear above in each factor column, corresponding to each variable. The
equations are :
The factor loadings depict the relative importance of each variable with respect to a
particular factor. In all the three equations, education (X 3) and IQ (X7) have got
positive loading factor indicating that they are variables of importance in determining
the success of sales person.
Variance summarised : Factor analysis employs the criterion of maximum
reduction of variance - variance found in the initial set of variables. Each factor
contributes to reduction. In our example Factor I accounts for 51.6 per cent of the
total variance. Factor II for 26.4 per cent and Factor III for 16.5 per cent. Together
the three factors "explain" almost 95 percent of the variance.
Communality : In the ideal solution the factors derived will explain 100 per cent of
the variance in each of the original variables. "Communality" measures the percentage
of the variance in the original variables that is captured by the combination of factors
in the solution. Thus a communality is computed for each of the original variables.
Each variable's communality might be thought of as showing the extent to which it is
revealed by the system of factors. In our example the communality is over 85 per
cent for every variable. Thus the three factors seem to capture the underlying
dimensions involved in these variables.
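The two summary measures can be illustrated with a few lines of Python. The percentages of variance for the three factors are those quoted in the text; the loadings passed to `communality` are hypothetical, since the loading table is not reproduced here. The sketch assumes standardised variables and uncorrelated (orthogonal) factors, in which case a variable's communality is the sum of its squared loadings.

```python
def communality(loadings):
    """Share of a variable's variance captured by the factors (orthogonal factors assumed)."""
    return sum(loading ** 2 for loading in loadings)


# Percentage of total variance accounted for by each factor, as quoted in the text.
variance_explained = {"Factor I": 51.6, "Factor II": 26.4, "Factor III": 16.5}
total_explained = sum(variance_explained.values())
print(round(total_explained, 1))  # 94.5, i.e. "almost 95 per cent"

# Hypothetical loadings of one variable on the three factors (not from the SAS output).
example_loadings = [0.6, 0.7, 0.3]
example_communality = communality(example_loadings)
print(round(example_communality, 2))  # 0.94, i.e. 94 per cent of its variance
```

A communality close to 1 means that the retained factors reproduce nearly all of that variable's variance.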
There is yet another analysis called varimax rotation, after we get the initial results.
This could be employed if needed by the analyst. We do not intend to dwell on this
and those who want to go into this aspect can use SAS program for varimax rotation.
In the concluding remarks, it should be mentioned that there are two important
subjective issues which should be properly resolved before employing factor analysis
model. They are :
1) How many factors should be employed in attempting to reduce the data? What
criteria should be used in establishing that number?
2) The labelling of the factors is purely intuitive and subjective.
Activity 6
Mention briefly the purpose and uses of factor analysis.
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
9.5 SUMMARY
In this unit, we have given a brief introduction of the multivariate tools and their
applicability in the relevant problem areas.
We started the discussion with the regression analysis, which is probably the most
widely used technique amongst the analytical models of association. We have started
with the simple linear regression model first to introduce the concept of regression and
then moved on to the multiple linear regression model. All the underlying
assumptions of the model have been explained. Both the bivariate and multivariate
regression models have been illustrated using the example of preference rating as a
function of protein intakes, and vitamin D intake perception in the case of a ready-to-
eat cereal. The concept of testing the linear equation, contribution made by regression
in explaining variation in dependent variables and strength of association have all
been explained using ANOVA table. A brief account of the role of regression in sales
forecasting involving time series analysis has also been given. The need for resorting
to computer solutions for large number of variables and observations has been
brought out with an actual print out of the example already discussed. The concept of
stepwise regression and the problems encountered in any regression analysis have
also been explained.
Next, we have gone to the discriminant analysis technique, a technique used when the
interest is to classify the groups on the basis of a set of predictor variables. We have
explained the concept of separation by giving examples of classifying sales people
into successful and unsuccessful, customers into owners and non-owners etc. As
before, we have begun the discussion with the discriminant function involving two
predictor variables using the example of the `ready-to-eat cereal problem' but with a
difference: classifying the persons into liker group and disliker group. The
discriminant function, the discriminant criterion and the assignment rule have all
been explained. Testing the statistical significance using F test based on Mahalanobis
D² has also been carried out. We have pointed out that the multiple discriminant
analysis involving more than two predictor variables requires the use of computer
although the basic structure of the model does not change.
Factor analysis is the last multivariate tool that we have discussed in this unit. We
have first mentioned that the fundamental objective of factor analysis is to reduce the
number of variables in the data matrix. Then it has been pointed out that the
computation of any factor analysis involves very complex calculation which will
have to be solved using computer packages like SAS. The concepts of "factor
loading", "variance summarised" and "communality" have been explained using one
practical example that has been solved by SAS program. The subjective issues like
"how many factors?" "what criteria to decide this number?" and "labelling of the
factors" have been mentioned at the end.
As concluding remarks, it may be mentioned here that 1) all multivariate procedures
can be more effectively solved using standard computer packages when the number
of variables and number of observations increase significantly, 2) what is more
important is the ability to interpret the results of the market research study involving
multivariate analysis.
Fig. 2: A logical flow model for material procurement decisions with quantitative discounts
allowed.
Activity 2
Think of a production decision situation and present it diagrammatically using a logical flow
model.
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
Activity 3
Mention below a mathematical model which has been used for sales forecasting by
your organisation or any organisation you know of.
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
The example we will consider here is the case of a co-operative state level milk and
milk products marketing federation. The federation has a number of district level
dairies affiliated to it, each having capacity to process raw milk and convert it into a
number of milk products like cheese, butter, milk powders, ghee, shrikhand, etc. The
diagrammatic model of the processes in this set up is depicted in Figure 3.
The typical problems faced by the managers in such organisations are that : (a) the
amount of milk procurement by the individual district dairies is uncertain, (b) there
are limited processing capacities for different products, and (c) the product demands
are uncertain and show large fluctuations across seasons, months and even weekdays.
The type of decisions which have to be made in such a set up can be viewed as a
combination of short/ intermediate term and long-term ones. The short-term decisions
are typically product-mix decisions like deciding : (1) where to produce which
product and (2) when to produce it. The profitability of the organisation depends to a
great extent on the ability of the management to make these decisions optimally. The
long-term decisions relate to (1) the capacity creation decisions such as which type of
new capacity to create, when, and at which location(s) and (2) which new products to
go in for. Needless to say, this is a rather complex decision-making situation and
intuitive or experience based decisions may be way off from the optimal ones.
Modelling of the decision-making process and the interrelationships here can prove
very useful.
In absence of a large integrated model, a researcher could attempt to model different
subsystems in this set up. For instance, time series forecasting based models could
prove useful for taking care of the milk procurement subsystem; for the product
demand forecasting one could take recourse, again, to time series or regression based
models; and for product-mix decisions one could develop Linear Programming based
models.
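As a rough illustration of the time series idea for the milk procurement subsystem, the sketch below applies simple exponential smoothing to a short series of procurement figures. The figures, the smoothing constant `alpha` and the function name are assumptions made for illustration only; they are not taken from the federation's case.

```python
def exponential_smoothing_forecast(series, alpha=0.3):
    """Return a one-step-ahead forecast using simple exponential smoothing.

    series: past observations, oldest first (hypothetical procurement figures).
    alpha:  smoothing constant in (0, 1]; larger values weight recent data more.
    """
    if not series:
        raise ValueError("series must contain at least one observation")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    level = series[0]  # start the smoothed level at the first observation
    for observed in series[1:]:
        # New level is a weighted average of the latest value and the old level.
        level = alpha * observed + (1 - alpha) * level
    return level


# Hypothetical daily milk procurement (thousand litres) at one district dairy.
procurement = [100.0, 110.0, 105.0, 120.0]
forecast = exponential_smoothing_forecast(procurement, alpha=0.5)
print(forecast)
```

A forecast of this kind could then feed the product-mix model as the assumed milk availability for the next period.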
Macro vs. Micro Models: The terms macro and micro in modelling are also referred
to as aggregative and disaggregative respectively. The macro models present a
holistic picture of a decision-making situation in terms of aggregates. The micro
models include explicit representations of the individual components of the system.
Static vs. Dynamic Models : The difference between the Static and Dynamic models
is vis-à-vis the consideration of time as an element in the model. Static models
assume the system to be in a balance state and show the values and relationships for
that only. Dynamic models, however, follow the changes over time that result from
the system activities. Obviously, the dynamic models are more complex and more
difficult to build than the static models. At the same time, they are more powerful and
more useful for most real life situations.
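The contrast between a static and a dynamic view can be sketched with a toy stock of a single product. This is only an illustrative assumption: the function names and the production and demand figures are invented, and a real dynamic model would carry many more variables.

```python
def static_balance(production_rate, demand_rate):
    """Static view: a single snapshot of net flow, with no time dimension."""
    return production_rate - demand_rate


def dynamic_stock_path(initial_stock, production, demand):
    """Dynamic view: follow the stock level period by period."""
    stock = initial_stock
    path = [stock]
    for produced, demanded in zip(production, demand):
        stock = stock + produced - demanded  # stock carried into the next period
        path.append(stock)
    return path


# Hypothetical figures: constant production, fluctuating demand.
net_flow = static_balance(20, 20)
path = dynamic_stock_path(50, [20, 20, 20], [10, 30, 25])
print(net_flow, path)
```

The static function reports only that the system is in balance on average, whereas the dynamic path shows how the stock rises and falls as demand fluctuates.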
Analytical vs. Numerical Models : The analytical and the numerical models refer to the
procedures used to solve mathematical models. Mathematical models that use
analytical techniques (meaning deductive reasoning) can be classified as analytical
type models. Those which require a numerical computational technique can be called
numerical type mathematical models.
Deterministic vs. Stochastic Models : The final way of classifying models is into
the deterministic and the probabilistic/stochastic ones. The stochastic models
explicitly take into consideration the uncertainty that is present in the decision-
making process being modelled. We have seen this type of situation cropping up in
the case of the milk marketing federation decision-making. The demand for the
products and the milk procurement, in this situation (please refer to Section 10.3) are
uncertain. When we explicitly build up these uncertainties into our milk federation
model then it gets transformed from a deterministic to a stochastic/ probabilistic type
of model.
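A minimal sketch of this transformation is given below. The same profit relationship is evaluated once with demand fixed at its expected value (deterministic) and once with demand drawn at random (stochastic). The prices, costs, the uniform demand distribution and the rule that unsold units are wasted are all assumptions chosen for illustration.

```python
import random


def profit(production, demand, price=10.0, unit_cost=6.0):
    """Profit for one period, assuming unsold units are wasted."""
    units_sold = min(production, demand)
    return price * units_sold - unit_cost * production


def deterministic_profit(production, expected_demand):
    """Deterministic model: demand is treated as known and fixed."""
    return profit(production, expected_demand)


def stochastic_profit(production, low, high, trials=10_000, seed=0):
    """Stochastic model: average profit with demand uniform on [low, high]."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(trials):
        demand = rng.uniform(low, high)
        total += profit(production, demand)
    return total / trials


fixed = deterministic_profit(100, 100)
average = stochastic_profit(100, 60, 140)
print(fixed, average)
```

Under these assumptions the stochastic average comes out below the deterministic figure, because shortfalls in demand reduce sales while surplus demand cannot be served beyond what was produced.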
Activity 5
……………………………………………………………………………………
……………………………………………………………………………………
…………………………………………………………………………………….
(b) Micro Model
……………………………………………………………………………………
……………………………………………………………………………………
……………………………………………………………………………………
(c) Deterministic Model
……………………………………………………………………………………
……………………………………………………………………………………
……………………………………………………………………………………
……………………………………………………………………………………
……………………………………………………………………………………
10.5 OBJECTIVES OF MODELLING
The objectives or purposes which underlie the construction of models may vary from
one decision-making situation to another. In one case it may be used for explanation
purposes whereas in another it may be used to arrive at the optimum course of action.
The first purpose is to describe or explain a system and the processes therein. Such
models help the researcher or the manager in understanding complex, interactive
systems or processes. The understanding, in many situations, results in improved
decision-making. An example of this can be quoted from consumer behaviour
problems in the realm of marketing. Utilising these models the manager can
understand the differences in buying pattern of household groups. This can help him
in designing, hopefully, improved marketing strategies.
The second objective of modelling is to predict future events. Sometimes the models
developed for the description/explanation can be utilised for prediction purposes also.
Of course, the assumption made here is that the past behaviour is an important
indicator of the future. The predictive models provide valuable inputs for managerial
decision-making.
The last major objective of modelling is to provide the manager inputs on what he
should do in a decision-making situation. The objective of modelling here is to
optimize the decision of the manager subject to the constraints within which he is
operating. For instance, a materials manager may like to order the materials for his
organisation in such a manner that the total annual inventory related costs are
minimum, and the working capital never exceeds a limit specified by the top
management or a bank. The objective of modelling, in this situation, would be to
arrive at the optimal material ordering policies.
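The materials manager's problem can be sketched with the classical economic order quantity (EOQ) cost function, which is a standard textbook formulation and is brought in here only as an assumed example. The figures are hypothetical, and the working capital tied up is assumed to equal the value of one full order.

```python
import math


def annual_inventory_cost(order_qty, annual_demand, order_cost, holding_cost):
    """Annual ordering cost plus annual holding cost for a given order size."""
    ordering = (annual_demand / order_qty) * order_cost
    holding = (order_qty / 2) * holding_cost
    return ordering + holding


def best_order_quantity(annual_demand, order_cost, holding_cost,
                        unit_cost, capital_limit):
    """Cost-minimising order size subject to a working capital ceiling."""
    eoq = math.sqrt(2 * annual_demand * order_cost / holding_cost)
    max_qty = capital_limit / unit_cost  # largest order the capital limit allows
    # The cost curve is convex, so if the EOQ is not affordable the best
    # feasible choice is the largest affordable order.
    return min(eoq, max_qty)


# Hypothetical data: 1200 units a year, Rs 50 per order, Rs 3 per unit per year
# holding cost, Rs 20 per unit purchase price, Rs 3000 working capital limit.
qty = best_order_quantity(1200, 50, 3, 20, 3000)
cost = annual_inventory_cost(qty, 1200, 50, 3)
print(qty, cost)
```

With these figures the unconstrained optimum would be 200 units, but the capital limit restricts the order to 150 units at a somewhat higher annual cost.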
Activity 6
You may go through various issues of any management journal(s). It is very likely
that you may come across a regression model for estimating sales, advertisement
expenditure, price or any other variable. Discuss how the model may be used for the
following :
……………………………………………………………………………………
……………………………………………………………………………………
……………………………………………………………………………………
……………………………………………………………………………………
(ii) Prediction of the future value of the dependent variable.
……………………………………………………………………………………
……………………………………………………………………………………
……………………………………………………………………………………
……………………………………………………………………………………
(iii) Helping the decision maker decide what to do to achieve a given object.
……………………………………………………………………………………
……………………………………………………………………………………
……………………………………………………………………………………
10.6 MODEL BUILDING/MODEL DEVELOPMENT
The approach used for model building or model development for managerial
decision-making will vary from one situation to another. However, we can enumerate
a number of generalized steps which can be considered as being common to most
modelling efforts. The steps are
6) Model calibration.
7) Implementation.
The decision problem for which the researcher intends to develop a model needs to
be identified and formulated properly. Precise problem formulation can lead one to
the right type of solution methodology. This process can require a fair amount of
effort. Improper identification of the problem can lead to solutions for problems
which either do not exist or are not important enough. A classic illustration of this is
the case of a manager stating that the cause of bad performance of his company was
the costing system being followed. A careful analysis of the situation by a consultant
indicated that the actual problem lay elsewhere, i.e., the improper product-mix being
produced by the company. One can easily see here the radically different solutions/
models which could emerge for the rather different identifications of the problem!
The next major step in model building is description of the system in terms of blocks.
Each of the blocks is a part of the system which has a few input variables and a few
output variables. The decision-making system as a whole can be described in terms
of interconnections between blocks and can be represented pictorially as a simple
block diagram. For instance, we can represent a typical marketing system in the form of
a block diagram (please refer to Figure 4). However, one should continuously question
the relevance of the different blocks vis-à-vis the problem definition and the
objectives. Inclusion of the not so relevant segments in the model increases the
model complexity and solution effort.
A number of alternative modelling forms or structures may be possible. For instance,
in modelling marketing decision-making situations, one may ask questions such as
whether the model justifies assumptions of linearity, non-linearity (but linearizable)
and so on. Depending upon the modelling situation one may recommend the
appropriate modelling form. The model selection should be made considering its
appropriateness for the situation. One could evaluate it on criteria like theoretical
soundness, goodness of fit to the historical data, and possibility of producing
decisions which are acceptable in the given context.
Fig. 4: A Model of the Product and Information Flows in a Competitive Marketing System.
The final steps in the model development process are related to model calibration and
implementation. Calibration involves assigning values to the parameters in the model. When
sample data is available, we can use statistical techniques for calibration.
However, in situations where little or no data is available, one has to take recourse to
subjective procedures. Model implementation involves training the support personnel
and the management on system use procedures. Documentation of the model and
procedures for continuous review and modifications are also important here.
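Statistical calibration can be sketched with ordinary least squares for a one-variable linear model, together with R² as one measure of goodness of fit to the historical data. The data and the function names below are hypothetical; a real application would normally rely on a statistical package.

```python
def fit_line(x, y):
    """Estimate intercept and slope of y = a + b*x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Share of the variation in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical sample: advertising outlay (x) and sales (y) over five periods.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
intercept, slope = fit_line(x, y)
fit = r_squared(x, y, intercept, slope)
print(intercept, slope, fit)
```

The estimated intercept and slope are the calibrated parameter values. Where no such sample exists, these values would have to be set subjectively, as noted above.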
Activity 7
You are the personnel manager of a construction company. If you were asked to build
a model to forecast the manpower requirements of both skilled and non-skilled
workers for the next five years, list out the steps you may consider for building the
model.
A number of criteria have been proposed for model validation. The ones which are
considered important for managerial applications are face validity, statistical validity
and use validity.
In face validity, among other things, we are concerned about the validity of the model
structure. One attempts to find whether the model does things which are consistent
with managerial experience and intuition. This improves the likelihood of the model
actually being used.
In statistical validity we try to evaluate the quality of relationships being used in the
model. The use validity criteria may vary with the intended use of the model. For
instance, for descriptive models one would place emphasis on face validity and
goodness of fit.
10.8 SIMULATION MODELS
Simulation models are a distinct class of quantitative models, usually computer
based, which are found to be of use for complex decision problems. The term
`simulation' is used to describe a procedure of establishing a model and deriving
a solution numerically. A series of trial and error experiments are conducted on the
model to predict the behaviour of the system over a period of time. In this way the
operation of the real
system can be replicated. This is also a technique which is used for decision-making
under conditions of uncertainty. Generally, simulation is used for modelling in
conditions where mathematical formulation and solution of model are not feasible.
This methodology has been used in numerous types of decision problems ranging
from queuing and inventory management to energy policy modelling. A detailed
discussion of simulation is beyond the purview of this unit. For those of you who
would like to read more on this, we would recommend a comprehensive text like
Gordon (1987) or Shenoy et al. (1983).
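To give a flavour of the approach, the sketch below simulates a single-server queue, one of the problem types mentioned above. The exponential arrival and service times, the rates and the function name are assumptions made for illustration; the texts cited above treat the subject properly.

```python
import random


def simulate_single_server_queue(arrival_rate, service_rate,
                                 customers=10_000, seed=1):
    """Estimate the average waiting time in a first-come-first-served queue.

    Inter-arrival and service times are assumed to be exponential.
    """
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    arrival_time = 0.0
    server_free_at = 0.0
    total_wait = 0.0
    for _ in range(customers):
        arrival_time += rng.expovariate(arrival_rate)
        # Service starts on arrival, or when the server finishes the
        # previous customer, whichever is later.
        start_service = max(arrival_time, server_free_at)
        total_wait += start_service - arrival_time
        server_free_at = start_service + rng.expovariate(service_rate)
    return total_wait / customers


# Two experiments on the same model: a lightly and a heavily loaded server.
light_load = simulate_single_server_queue(arrival_rate=0.5, service_rate=1.0)
heavy_load = simulate_single_server_queue(arrival_rate=0.9, service_rate=1.0)
print(light_load, heavy_load)
```

Repeating such runs with different rates, or with a second server added, is the kind of trial and error experimentation that the term simulation refers to.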
10.9 SUMMARY
In this unit we have briefly examined the role of models in managerial decision-
making research. We have also examined the different types of managerial decisions
and the process of decision-making. This was followed by a discussion on the type of
models and their characteristics. A specific model was discussed describing the
decision-making scenario in a milk marketing federation. We noted that there could
be three types of modelling objectives viz., descriptive, predictive and normative. A
brief description of the model development and validation processes followed.
Finally, a brief exposure to simulation models was provided.
Lilien, Gary L. and Philip Kotler, 1983. Marketing Decision Making: A Model
Building Approach, Harper & Row, New York.
UNIT 11 SUBSTANCE OF REPORTS
Objectives :
4) Making a decision
5) Drawing up an action plan
6) Working out a contingency plan
Problem
The problem is the beginning and the end of decision making. A start with a wrong
problem, a wrong hypothesis, or a wrong assumption will only solve a non-existing
problem or create a new problem.
In defining the problem, identify the following elements :
1) What is the situation, and what should it be? This question sets the overall
objective for the problem.
2) What are the symptoms, and what are the causes?
3) What is the central issue, and what are the subordinate issues?
4) What are the decision areas, and what needs to be done immediately, in the short
term, medium term, or long term?
For analysing a problem, Kepner and Tregoe (See under suggested readings)
recommend sorting out the information under what, where, when and extent across
what is and what is not, the distinctiveness in the situation, and changes which may
have taken place as follows:
After this analysis, compare the deduced causes with the actual or observed ones.
Constructing the Criteria
Words like aim, goal, objective, intention, purpose, and criterion are used sometimes
synonymously or with different meanings. Here the first five words are treated as
synonymous and are recognized at the problem definition stage itself while
identifying what the situation is and should be.
However, to bring the existing situation to what it should be, criteria or yardsticks are
used to evaluate options. Criteria link the problem definition with option
generation and evaluation.
In constructing the criteria, SWOT analysis is useful. Recognize the strengths and the
weaknesses of the decision maker and the organization and the opportunities
available and threats confronting the decision maker and the organization in a given
situation. This analysis helps in constructing the criteria which in turn help in
evaluating the options against the feasibility of implementation.
Further, ensure and explicitly clarify the following:
1) The criteria arise out of the problem definition and are not independent of it.
2) They are measurable or observable as much as possible. However, non-
quantifiable criteria are not ignored merely because they cannot be quantified.
3) They are prioritized and tradeoffs are recognized.
4) They encompass a holistic view: economic, personal, organizational, and
societal considerations.
5) They are not loaded or one-sided. Both pro and con aspects are considered.
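Where criteria are measurable and prioritized, as the points above ask, one possible way of applying them to options is a weighted scoring table. The sketch below is only an assumed illustration: the criteria, weights, options and scores are invented, and criteria that cannot be quantified would still need to be weighed by judgment alongside such a table.

```python
def rank_options(weights, scores):
    """Rank options by the weighted sum of their scores on each criterion.

    weights: criterion -> priority weight (here assumed to sum to 1).
    scores:  option -> {criterion: score on a common scale}.
    """
    totals = {}
    for option, option_scores in scores.items():
        totals[option] = sum(weights[c] * option_scores[c] for c in weights)
    # Highest weighted total first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical criteria and options, scored on a 1 to 10 scale.
weights = {"cost": 0.5, "feasibility": 0.3, "employee morale": 0.2}
scores = {
    "Option A": {"cost": 8, "feasibility": 6, "employee morale": 5},
    "Option B": {"cost": 6, "feasibility": 9, "employee morale": 7},
}
ranking = rank_options(weights, scores)
print(ranking)
```

Changing the weights makes the tradeoffs between criteria explicit, since an option that leads under one set of priorities may fall behind under another.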
Report Writing and
Presentation
Generating and Evaluating the Options
In generating options, creativity is required. Sometimes the options are obvious. But
one can look beyond the obvious.
Once a set of options has been generated, they are shortlisted and ranked by priority
or by their probability of meeting the objectives identified in the problem definition.
Then the options are evaluated against the criteria and possible implications in
implementation without losing track of the main objective of what the situation
should be. The evaluation process demands logical and critical thinking.
The presentation of evaluation is structured by criteria or options depending upon
which structure is easy to understand. For instance, if the criteria are few and
options are many, the presentation will be easy to understand if it is structured by
criteria. But if the options are few and criteria are many, the presentation will be
effective if it is structured according to options.
Making a decision
The decision or recommendation flows out of the evaluation of the options, provided
the thinking process has been logical so far.
The recommendation should be an adequate response to the problem and
implementable.
Drawing up an action plan
Even the best analysis can go waste if attention is not paid to the action plan. The
action steps and their consequences should be visualized to avoid being caught
unawares. Be clear of who does what, when, where, and how. Even at this stage we
have to go through the problem solving steps in a futuristic scenario: what problems
do we anticipate, what objectives and criteria would we like to pursue, what options
would be open to us, and what choices can we make under what circumstances?
Working out a contingency plan
Administrators, executives, and managers thrive on optimism and confidence to get
things done. Yet, if something can go wrong, it is likely to go wrong. They should
have the parachute ready to bail out. The contingency plan must emerge from the
action plan. There is need to think of how to achieve the second best objective if the
first one is not feasible.
Conclusion
The problem solving approach helps only when one can question oneself again and
again at every stage and bring to bear various thought processes to do a
comprehensive analysis and synthesis. Then only will the administrator, executive, or
manager be able to genuinely share his/her thoughts with the reader.
If the problem solving approach and steps are used merely as a form filling exercise,
a superficial analysis and report will result. An attractive package does not
necessarily mean a good product.
An executive report is not a summary of the view and information that a decision
maker has elicited but an analysis and synthesis of an integrated decision or
recommendation. Thinking through a decision making situation is an iterative act.
A good decision report is structured sequentially but reflects comprehensively the
iterative thinking process of the decision maker(s).
Research Reports
Research reports contribute to the growth of subject literature. They pave the way for
new information, significant hypotheses, and innovative and rigorous methods of
research and measurement. They broadly have the following organization:
1) Literature survey to find gaps in knowledge.
2) Nature and scope of the study, hypothesis to be tested, and significance and
utility of the study.
3) Methodology for collecting data, conducting the experiment, and analysing the
data.
4) Description and analysis of the experiment and data.
5) Findings.
6) Conclusions.
7) Recommendations.
8) Suggestions for further research.
9) Backup evidence and data.
Activity 2
Describe a strike or any other serious incident that has recently occurred in your
organisation and check whether your description answers all the questions indicated
under descriptive reporting.
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
Activity 3
Take a report of your organisation and check whether the problem solving approach
or descriptive approach has been used. If you were to rewrite the report, what would
your contents outline be and what steps would you take to improve the report?
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………