Measurement
Contents :
  • Concept & Definitions of Measurement.
  • What is Measured ?
  • Measurement Process.
  • Functions of Measurement.
  • Measurement Techniques/Types.
  • Criteria for Good Measurement.
  • Difficulties in Measurement.
  • Problems in Measurement in Management Research.
  • Validity of Measurement Scale : Validity Techniques.
  • Reliability of Measurement Scales : Reliability Techniques.
  • Level of Measurement : Measurement Scales.

What is Measurement ?


Measurement is the process of quantifying and assigning numerical values to attributes or properties of objects, events, or phenomena in a structured and standardized manner.

Measurement is the mapping of values onto a set of numbers. Organizations make many decisions in functional areas such as production, marketing, finance and personnel management. Many marketing decisions are heavily influenced by the predispositions or attitudes of present and potential customers towards the company and its products.

Measurement is the process by which the organization observes and records the observations that have been gathered as a result of some research activities. In other words, measurement can be defined as "the process of mapping the aspects of an area onto some other areas as per some rules".

For example : Researchers may wish to measure the percentage of people who purchase their company's products. For measurement, a scale must first be developed over a range, which in set-theoretic terms is the set of values onto which observations are mapped. After this, the observations are mapped onto the defined scale.

Definitions of Measurement 


According to G.C. Helmstadter :
"Measurement is a process of obtaining a numerical description of the extent to which a person or object possesses some characteristics".

According to Kerlinger : 
"Measurement is the assignment of numerals to objects or events according to rules".

The major application of such data is in the area of marketing, where measurements are taken regarding the predispositions or attitudes of current and potential customers of a company. By knowing about the attitudes of the customers, the marketing managers can take important decisions which are effective and beneficial to the company. Areas of marketing where measurement techniques are used include product positioning, market segmentation, advertising message effectiveness, etc.

One of the most fundamental uses of research is attitude measurement. The process of attitude measurement relies on measuring the beliefs and emotional feelings of people. By gaining an understanding of the attitudes of customers, marketers gain an advantage for the company in the marketplace. Unlike physical attributes (like weight, height, etc.), attitude is not easy to measure. Hence, the scales which are developed to measure attitudes are not very precise. Attitude measurement scales can be of different types. There is a subtle difference between attitude and trait. While an attitude is directed towards a specific object, a trait is a general disposition of the person. For example, if someone has a hostile attitude towards poor people, then his or her hostility will be directed towards poor people only; however, if one has the trait of hostility, then he or she will be hostile towards everyone.

What is Measured ?


Three categories of things can be measured as per Abraham Kaplan :

1) Direct Observables : 
The things which can directly be observed are called direct observables. For example, by meeting an individual the brand of his/her wrist watch can be directly observed.

2) Indirect Observables :
The things which cannot be directly observed are called 'indirect observables'. More complex and refined observation efforts are required for observing such things. For example, minutes of earlier board meetings of corporations can be used to observe past business decisions.

3) Constructs : 
The things which cannot be observed directly or indirectly are called 'constructs'. These are the theoretical concepts, which are developed by observing different aspects of an operation. For example, IQ is known as a construct. It cannot be directly or indirectly observed. It is determined only by mathematically observing answers of different test questions asked in an IQ test.


Process of Measurement 


Steps in the Measurement Process are as follows :

1) Developing Behavioral Categories :
The measurement process starts with the behavioral categorization of the different events to be measured. It is crucial to categorise events carefully so as to make the measurement process simple and appropriate.

2) Selection of Appropriate Calculation Method :
After successfully categorizing the events, the next step in the measurement process is selecting an appropriate calculation method for the different behavioral categories. The different calculation methods are as follows :

a) Frequency Method : 
In this method, frequency counts are used for calculations. The number of occurrences of a particular event in a definite time period is called its frequency count. Behaviors or events which occur a number of times within a definite time period, occur for short durations, or have a sharp beginning or ending, are calculated with the help of the frequency method.

b) Duration Method : 
In the duration method, the length of time for which an individual is involved in a particular behavior or activity within a fixed time limit is calculated.

c) Interval Method : 
In the interval method, the whole observation period is divided into intervals, and each interval is checked for the presence of a specific behavior or activity.

3) Using Multiple Observers : 
The final step in the measurement process is using multiple observers so as to measure inter-rater reliability. The types of observation used in the measurement process are as follows :

a) Naturalistic Observation : 
Naturalistic observation involves observing in a natural or real environment. Here, the actual behavior of the respondents is observed and recorded, free from manipulation.

b) Participant Observation : 
In this type of observation, the researcher joins the group of participants as an individual participant and thereby observes their behavior.

c) Contrived Observation : 
When a simulated environment is created to observe the natural behavior of the respondents, it is called contrived or structured observation. This type of observation eliminates the need to observe respondents in a natural setting.
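
The three calculation methods described in step 2 can be illustrated with a minimal sketch. The observation log, the 60-second session and the 10-second interval length below are invented for illustration and do not come from any real study; the function names are likewise hypothetical.

```python
# Hypothetical observation log: (start_second, end_second) of each episode of
# the target behavior within one observation session. All values are assumed.
episodes = [(2, 5), (14, 15), (30, 42), (50, 53)]
session_length = 60   # seconds (assumed)
interval_length = 10  # seconds per interval (assumed)


def frequency_count(episodes):
    """Frequency method: number of occurrences in the session."""
    return len(episodes)


def total_duration(episodes):
    """Duration method: total time spent in the behavior."""
    return sum(end - start for start, end in episodes)


def interval_record(episodes, session_length, interval_length):
    """Interval method: for each interval, was the behavior seen at all?"""
    record = []
    for lo in range(0, session_length, interval_length):
        hi = lo + interval_length
        # An episode overlaps the interval [lo, hi) if it starts before the
        # interval ends and ends after the interval starts.
        seen = any(start < hi and end > lo for start, end in episodes)
        record.append(seen)
    return record


print(frequency_count(episodes))
print(total_duration(episodes))
print(interval_record(episodes, session_length, interval_length))
```

The same log gives three different summaries: how often the behavior occurred, how long it lasted in total, and in which intervals it was present. Which summary is appropriate depends on the behavioral category, as described above.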

Functions of Measurement 


1) Allows Information to be Summarised :
The framing of proper measures allows the information to be summarized and presented in a better way. It also allows researchers to use various graphs, tables and charts to represent the data properly. This makes the research and the research findings more presentable and attractive to any potential user of the research report.

2) Provides Better Understanding of a Situation :
Measurement allows a better understanding of a situation as compared to a scenario where there is no measurement at all. For example, various data about a population are obtained only when it is measured. Until and unless the data is measured, it does not provide an in-depth understanding of the situation.

3) Allows to Quantify Data and Statistical Sophistication : 
The process of measurement also allows the researcher to quantify abstract variables and research parameters. The degree of statistical treatment of the data depends upon the measurement scale adopted to quantify the data.

4) Important for Research Approach : 
The selection of measurement techniques also determines the research approach and the way a researcher will tend to solve the research problems. Deciding the measures is thus an essential part of the research activity. The selection of proper measures goes a long way towards making the research a better planned and organised activity.

5) Provides Important Set of Tools : 
The measurement procedures and instruments used provide invaluable information to the researcher, which allows him to reach a decision regarding the research problem. This also has a bearing on policies and programs. However, the measures that are framed are only the means towards the objective of the researcher and not the ends. Measurement helps the researcher in reaching critical decisions regarding a research objective.

Types of Measurement 


There are four possible measurement techniques which are as follows :

1) Questionnaires : 
The questionnaire is an inventory of questions used to seek information from respondents on different topics like behavior, demographic and psychographic details, opinions, attitudes, beliefs, feelings, etc. The questions are designed for a particular study and are validated before being finalised.

2) Attitude Scales : 
Attitude scales seek responses on the feelings of respondents towards a particular object. Attitude scales can be of different types, as follows :
i) Rating scales require a respondent to place an object on a numbered scale.
ii) Ranking scales require the respondents to compare a set of objects and rank them, for example from '1' to '10', where '1' is the highest position and '10' is the lowest position.

3) Depth Interviews : 
In depth interviews, the respondents have complete freedom to express their feelings without any fear of rejection or of meeting opposition from others. The responses received from the respondents are recorded in specially designed formats. This technique is used when the researcher wants to gather in-depth information about the feelings and opinions of respondents or when the researcher wants to examine some new issue or aspect of the study.
Many times, depth interviews are also used to provide clarity or perspective on other gathered data. They help to provide a more comprehensive picture of the data that has been gathered. Depth interviews should be used in lieu of focus group interviews where it is felt that the respondents will not be comfortable talking about the topic in a group atmosphere, or where the researcher wants to differentiate between individual opinions and group opinions on a topic of discussion. Depth interviews are also used where the researcher wants to refine questions for a future study or survey.

4) Observation : 
Observation is a direct technique of examining behavior or the results of behavior. It requires the researcher to observe the behavior of an individual or a group of people. This observation must be done in a natural setting and over an interval of time. The biggest advantage of this method is that it increases the credibility of the research process. It utilizes trained researchers who are unbiased regarding the research topic. By observing the behavior formally, the observers are often able to identify attitudes and predispositions which are often overlooked by researchers. The disadvantage is that observation is a time-consuming process, and the observers often find that their presence influences the behavior of the people being observed and thus affects the reliability of the observation process.

Criteria for Good Measurement 


Seven important criteria are used for evaluating the measurement, which are as follows :

1) Reliability : 
Reliability is an important criterion for testing the measurement. When the results offered by the measuring instrument are consistent, it is called reliable. Although a reliable instrument is not necessarily a valid instrument, reliability contributes to the validity of the measurement.

2) Validity : 
The next criterion used for evaluating the measurement is its validity. The extent to which a particular measuring instrument measures what it is intended to measure is called its validity. It can also be denoted as utility. It also expresses the extent to which differences described by a measuring instrument between two behaviors are true.

3) Practicality :
Practicality is also a criterion for testing the measuring instrument. The extent to which a particular measuring instrument is suitable, cost-effective and interpretable denotes the practicality of the instrument.

4) Sensitivity :
The next criterion for evaluating the measurement instrument is its sensitivity. A particular measuring instrument is said to be sensitive if all the variations in responses are effectively measured by it. Measuring instruments dealing with 'Agree' or 'Disagree' types of responses are not very sensitive. A little modification is required in such instruments so as to record responses more sensitively.

5) Generalisability :
Generalisability is also an important criterion for testing the measuring instrument. The ability of an instrument to collect data from a wide range of respondents, along with flexibility in the interpretation of that data, is called generalisability.

6) Economy : 
The choice of data collection method is also often dictated by economic factors. The rising cost of personal interviewing first led to an increased use of telephone surveys and subsequently to the current rise in Internet surveys. In standardized tests, the cost of test materials alone can be such a significant expense that it encourages multiple reuses.

7) Convenience : 
A measuring device passes the convenience test if it is easy to administer. A questionnaire or a measurement scale with a set of detailed but clear instructions, with examples, is easier to complete correctly than one that lacks these features. In a well-prepared study, it is not uncommon for the interviewer instructions to be several times longer than the interview questions. Naturally, the more complex the concepts and constructs, the greater is the need for clear and complete instructions.

Difficulties in Measurement 


Measurement has the following difficulties :

1) Irrelevant Data : 
Measurement generates an enormous amount of data. However, the data is not always relevant, and at times it may lack purpose. Sometimes, measurement pressures marketers to manipulate the real data for their own purposes.

2) Inaccurate Response : 
Respondents have a tendency to give inaccurate responses in face-to-face interviews. It is very important that the research activity elicits the correct response from the respondents. Nowadays, web-based surveys have made it possible to reach a large target segment quickly and economically.

4) Training in Measurement is Rare :
Measurement requires that people have necessary skills and knowledge in particular field. However, very few organisations invest in knowledge and skill building.

5) Delegating Measurement Strategy : 
Deciding the right metrics often requires that the incumbents not only have a big picture perspective but also the power to challenge the dominant marketing mind-sets of the organisation. This is often not possible for middle managers but requires involvement of top management. Measurement should not be delegated, as the quest for truth will then take a backseat in the organisation. It needs leadership and focus in the organisation so that a congenial environment is created in the organisation.

Problems in Measurement in Management Research 


The precision and practical usefulness of research work are the main concerns of many researchers. They want to know what their research contributes to the concerned field. Therefore, they evaluate their work in the light of two main factors, i.e., reliability and validity.
These reliability and validity issues need to be addressed very carefully, otherwise they may lead to defective statistical decisions and errors (Type I and Type II).
When measuring psychological and behavioral events in management, it is very difficult for researchers to maintain validity in the research. Validity is less of an issue in the physical sciences, as they involve direct measurement, whereas it is problematic in the management sciences, as several indirect measures are involved. For example, different aspects like level of productivity, organisational climate, profit earned, employee satisfaction and industrial relations are measured for evaluating the managerial efficiency of an organisation. Therefore, reliability and validity are the major research problems in management research.

Validity of Measurement Scale 


Validity is the most important criterion and measures the degree to which the instrument measures what it is required to measure. It can also be considered as the utility of the instrument. It measures the extent to which the differences in the test measurements reflect the actual differences. For example, if a researcher is trying to examine the motivation level of employees, then he needs to look into a variety of other factors as well. Say, if he considers only absenteeism, then it is not a valid measure as absenteeism may also happen due to other causes like illness, personal reasons, family reasons, etc.

Validity can be measured by the following techniques :

1) Face Validity : 
Face validity is the easiest among the various types of validity to assess. It looks at a single item on a test, or at all the items, and asks how well the item appears to express the meaning of the test.
For example, the test item "I think I should buy a car" has face validity, as the item measures intention to purchase a car. The downside of face validity is that respondents can easily hide or exaggerate their responses, so that the response becomes manipulated. In fact, many psychometricians prefer tests which do not have face validity but do have general validity. Test items which measure what they are supposed to measure but have no face validity are more difficult for respondents to manipulate. Though items which do not have face validity have some good features, in the long run it is better if the test items have some face validity.

2) Content Validity : 
Content validity refers to how adequately the selected variables cover what is required to be measured. In other words, the scale that is being used should include all the required variables.
For example, if a researcher wants to test the facilities of a hotel and the scale includes variables like locality, number of old customers, number of new customers, turnover, etc., then it is clear that this scale will not have content validity, as these variables are not adequate to answer the research objective. Instead, the researcher should include variables like the ambience, staff, food, cleanliness, maintenance, medical facilities, etc. The selection of the research variables to be included in the scale for the purpose of the research activity is a very difficult task.

3) Criterion Related Validity : 
Criterion validity measures how well the data collected by the scale employed in the research work corresponds with the criterion variables. Criterion variables can take the form of demographic and psychographic data, attitude and behavior variables, or scores derived from other scales. Criterion validity can be of two types, depending on the time period of assessment :

i) Concurrent Validity : 
Concurrent validity is the type of criterion validity in which the data on the scale being employed and the data on the criterion variable are collected at the same time. The researcher can measure concurrent validity by using abridged forms of standardised personality inventories. The original instrument and the abridged version of the personality inventory are administered to the respondents at the same time. Thereafter, the results are compared.

ii) Predictive Validity : 
In predictive validity, the researcher gathers data on the scale and the criterion variable at different points of time.

4) Construct Validity : 
In construct validity, the researcher tries to examine the characteristic or construct that the scale is measuring. When measuring construct validity, the researcher tries to answer questions like how the scale works and what kind of conclusions can be made regarding the research being carried out. Construct validity is the most difficult and sophisticated kind of validity. It includes the following :

i) Convergent Validity : 
Convergent validity measures how well the scale converges or correlates with other measures of the same construct.

ii) Discriminant Validity : 
This measures the opposite of convergent validity. It measures the extent to which the scale does not correlate with measures of other constructs from which it is supposed to differ. It seeks to show the lack of correlation between different constructs.

iii) Nomological Validity : 
Nomological validity measures the degree to which the scale correlates with measures of different but related constructs in a theoretically predictable way. In this method, a theoretical model is developed which leads to further tests, deductions and inferences. This leads to the construction of a nomological net, in which several constructs can be related with each other.
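
Convergent and discriminant validity are both usually examined through correlations, so they can be sketched with a small example. The scores below are invented for illustration, the variable names are hypothetical, and the Pearson correlation is only one common choice of coefficient.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Invented scores for six respondents (assumed data, not from a real study).
new_scale = [12, 15, 9, 20, 17, 11]             # scale being validated
same_construct = [14, 16, 10, 21, 18, 12]       # established measure, same construct
different_construct = [5, 7, 6, 6, 4, 5]        # measure of an unrelated construct

# Convergent validity: the two measures of the same construct should correlate highly.
print(round(pearson(new_scale, same_construct), 3))

# Discriminant validity: the correlation with the unrelated construct should be near zero.
print(round(pearson(new_scale, different_construct), 3))
```

A high first coefficient together with a near-zero second coefficient is the pattern expected of a scale with both convergent and discriminant validity.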

Reliability of Measurement Scales 


Reliability refers to the ability of a measurement tool to provide consistent outcomes. It is one of the most important criteria for sound measurements. Any measuring instrument should fulfill the criterion of reliability. A valid instrument is necessarily reliable, but a reliable instrument is not necessarily valid.
Reliability is all about the precision and the accuracy of the measuring instrument. In other words, the result of the measuring instrument is reproducible or replicable. Reliability of a measuring instrument means that it consistently produces the same or similar results over different time periods and in different conditions.
For example, if a thermometer gives the same or a similar reading every time it measures the same temperature, then the thermometer is considered to be reliable.

In other words, reliability is the ability of a measuring instrument to provide consistent results over time without error.

In some research conditions, poor research methodology or faulty data collection methods can lead to low reliability. The quality of the responses received can be poor if the respondents are unable to understand the wording of the questions or if they give incorrect answers to them.
For example, if a Hindi questionnaire is administered in rural Tamil Nadu, then the respondents will probably give erroneous feedback, impacting the reliability of the questionnaire. Reliability varies between '0' and '1', with a higher reading denoting greater reliability. Reliability is measured in the following manner :

1) Test-Retest Method : 
In this method, the respondents are asked to give their responses on the scale at two different points of time. The correlation between the two sets of responses is then determined. According to this method, if the respondents give the same or similar responses at the two different points of time, then the instrument is said to be reliable.
The drawback of this method is that, because the respondent has already taken the test once, his responses the second time may be influenced. Therefore, before using this method, the researcher needs to ensure that this drawback is negated.
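
As a minimal sketch of the test-retest computation, the correlation between the two administrations can be calculated as below. The five pairs of scores are invented for illustration, and the Pearson coefficient is assumed as the measure of correlation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Invented scores of the same five respondents at two points of time.
scores_time_1 = [3, 4, 2, 5, 4]
scores_time_2 = [3, 5, 2, 4, 4]

test_retest_reliability = pearson(scores_time_1, scores_time_2)
print(round(test_retest_reliability, 2))
```

The closer the coefficient is to 1, the more stable the scale is over time for this group of respondents.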

2) Alternate Forms Method : 
This method is also called 'parallel forms'. It tries to address one of the limitations of the test-retest method. The way of conducting the test is the same, but in this method the respondents are given different versions of the scale in the two tests.
The second test is similar to the first test but contains a different set of questions. The respondent is thus not familiar with the questions and hence his responses are not influenced. The researcher administers the parallel form after having administered the first test.
This method allows the researchers to develop a reliability coefficient in which the effect of the test form is cancelled out, so that the only remaining error is due to the different times of administration. For this, a particular form 'A' is applied to the first group and another form 'B' is applied to the second group. The forms are then swapped and applied to the opposite groups.
The correlation is then found between the scores on the two forms. This correlation reflects the reliability of the test. While performing this test, it should be ensured that responses on the second scale are not affected by responses on the first scale, and that both versions of the scale are essentially the same though not identical in content. In general, the alternate forms method is preferred over the test-retest method.

3) Split-Halves Method : 
A potential drawback of the test-retest and the alternate forms methods is that they require the researchers to administer the two tests at different points of time. This makes the processes time-consuming and also introduces the risk that some natural event may change the responses of the participants. The split-halves method addresses the problem by splitting the scale while conducting the tests at the same point of time.
The scale items are divided into two halves. Scores are then computed for each set of items. The correlation between these two sets of scores is then computed. Unlike in the other methods, this correlation does not directly give the reliability estimate. The reliability estimate is computed using the following formula :
$$p = \frac{2r}{1 + r}$$
where,
p = Reliability estimate,
r = Obtained correlation.

A point to be noted is that if the scale is split differently, then the reliability estimate will also change.
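
The split-halves procedure can be sketched as follows. The response matrix is invented for illustration, the odd-even split is only one of many possible ways of halving the scale, and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def reliability_estimate(r):
    """Apply the formula p = 2r / (1 + r) to the half-scale correlation r."""
    return 2 * r / (1 + r)


def split_half_reliability(responses):
    """responses holds one list of item scores per respondent."""
    # Odd-even split (an assumed choice): items 1, 3, ... versus items 2, 4, ...
    half_a = [sum(row[0::2]) for row in responses]
    half_b = [sum(row[1::2]) for row in responses]
    return reliability_estimate(pearson(half_a, half_b))


# Invented data: five respondents answering a four-item scale (scores 1 to 5).
responses = [
    [4, 5, 4, 5],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 1],
]

print(round(split_half_reliability(responses), 2))
```

Splitting the items differently, for example first half against second half, would generally change both half scores and therefore the estimate.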

4) Internal Consistency : 
Another method of computing the reliability is by using Cronbach's alpha. Cronbach's alpha is the most commonly used method of computing the reliability of a measure. It is more accurate and has the advantage of requiring only a single application of the scale. The drawback is that it is tedious to calculate manually, as it requires calculating the correlations for all pairs of items. However, statistical software provides a means of calculating alpha automatically.
This measures the internal consistency and is given by the formula :
$$\alpha = \frac{N r}{1 + r (N - 1)}$$
where,
α = Cronbach's alpha,
N = Number of items in the scale,
r = Mean inter-item correlation.

The reliability of a scale is greatly dependent on the number of items in the scale. Even if the items are not very consistent internally, the scale can appear reliable if the number of items on the scale is very large.
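
The dependence of alpha on the number of items can be seen by evaluating the formula above for different values of N. This is a minimal sketch: the function name is hypothetical and the mean inter-item correlation of 0.2 is an assumed value chosen only for illustration.

```python
def cronbach_alpha(n_items, mean_r):
    """Alpha from the number of items N and the mean inter-item correlation r."""
    return n_items * mean_r / (1 + mean_r * (n_items - 1))


# Assumed, fairly low mean inter-item correlation, held fixed while N grows.
mean_r = 0.2
for n_items in (5, 10, 20, 40):
    print(n_items, round(cronbach_alpha(n_items, mean_r), 2))
```

With the correlation held fixed, alpha rises steadily as items are added, which is why a long scale can show a high alpha even when its items are only weakly related. For N = 2 the formula reduces to 2r / (1 + r), the same expression used in the split-halves method.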

Level of Measurement : Measurement Scales :


There are basically four types of measurement scales, as follows :

1) Nominal Scale :
This is the most fundamental form of measurement scale. These scales are not used to measure the value of an object, but to categorise or classify the object. The value represented in these scales need not be a quantitative value. This makes the scale qualitative in nature. Nominal scales are used extensively by market researchers.
Nominal scales are essentially arbitrary because a researcher can assign any label to any category without distorting the results. While statistical measures like the mean and median cannot be calculated from the outcome of these scales, the findings of a nominal scale can be subjected to statistical tools like the mode, frequency counts, and binomial and chi-square tests.

Examples of nominal scales can be :
1) What is your gender? (tick 'M'-Male, 'F'-Female)
i) M
ii) F

2) What is your hair colour? (tick 'A'-Brown, 'B'-Black, 'C'-Blonde, 'D'-Gray, 'E'-Other)
i) A
ii) B
iii) C
iv) D
v) E

Numbers can also be assigned to the categories, but those would be treated merely as 'labels' in order to categorise the items or people in a group. Typically, nominal scales are used to ascertain characteristics like gender, ethnicity, backgrounds, tastes, preferences, etc., and not their intensity.

2) Ordinal Scale :
While the nominal scale focuses on categorising or classifying individuals or their responses, the ordinal scale seeks to rank them on the basis of their characteristics. In other words, it deals with the relative positioning of an item on the basis of the characteristics of other items on the scale.
The participants of the research rank the objects as per their preferences and tastes. Therefore, the respondents are required to rank the objects from highest to lowest in order of preference.
Thus, in an ordinal scale, the responses have nominal properties, but these scales go a step further and attempt to relatively quantify them by assigning relative values. It is pertinent to point out here that one cannot quantify the difference between two ranks. These are essentially ranking scales, used extensively by market researchers to determine customer preferences and choices. It may be noted that any data collected through interviews or other methods of interrogation always has ordinal properties.

For example, consumers may be asked to rank five popular brands of ice cream on a scale of 1 to 5, with 1 representing the most preferred and 5 representing the least.
1) Amul ----
2) Kwality Walls ----
3) Mother Dairy ----
4) Cream Bell ----
5) Vadilal ----

3) Interval Scale :
The interval scale is an improvement or refinement over the ordinal scale. Here, numerical values are assigned in such a way that the differences between adjacent points on the scale are equal. The data collected on an interval scale can be summarised with the help of the mean, standard deviation, etc.

For example, the difference between 100 degrees and 110 degrees is the same as between 60 degrees and 70 degrees, i.e., 10 degrees.

Another example of interval scale can be :
What is your age? (tick the correct range)
1) Below 10 years
2) 10 to 20 years
3) 20 to 30 years
4) 30 to 40 years
5) 40 to 50 years
6) 50 to 60 years
7) Above 60 years

The difference between all the values is the same in the above example.

4) Ratio Scale :
The ratio scale is the most refined and sophisticated form of scale, used primarily in the physical sciences. While in interval scales variables have a relative value, in ratio scales the variables have an absolute value. Hence, they incorporate a value for the absence of a characteristic in an object. This absence is denoted by the value 'zero'.
This true zero is the basic differentiator between an interval and a ratio scale. Data obtained through a ratio scale lend themselves to meaningful and scientific interpretation.
Ratio scales are endowed with the characteristics of interval scales, and in addition the ratios of values on these scales are meaningful, which provides a better understanding in the analysis. In fact, when responses on descriptive variables are obtained through open-ended questions, they display ratio scale properties and features.

Examples :
1) The Kelvin Scale : 100 K is twice as hot as 50 K. The differences between values are meaningful and the values can be ordered.
2) Weight : 100 kg is twice as heavy as 50 kg. The values can be arranged in order (ascending/descending).
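
The four levels of measurement differ in which summary statistics are meaningful. A minimal sketch is given below, using Python's standard `statistics` module. All data values are invented for illustration, loosely following the examples in this section.

```python
from statistics import mean, median, mode

# Nominal: labels only, so the mode (most frequent category) is meaningful.
hair_colour = ["Brown", "Black", "Black", "Blonde", "Black"]
most_common_colour = mode(hair_colour)

# Ordinal: ranks can be ordered, so the median is meaningful as well.
ice_cream_ranks = [1, 3, 2, 5, 4, 2, 1]
median_rank = median(ice_cream_ranks)

# Interval: equal differences between points, so the mean is meaningful as well.
temperatures_degrees = [60, 70, 100, 110]
mean_temperature = mean(temperatures_degrees)

# Ratio: a true zero exists, so ratios of values are meaningful as well.
weights_kg = [50, 100]
weight_ratio = weights_kg[1] / weights_kg[0]

print(most_common_colour, median_rank, mean_temperature, weight_ratio)
```

Each level permits the statistics of the levels before it plus one more kind of comparison, which is why the choice of scale limits the statistical treatment that can be applied later.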