BDC Market Research

Analytical Resources

Question

Pose the right question and you will get lots of information. Ask BDC Market Research to help you ask the right questions.

Scale


The right scale for the right question reveals large amounts of information. There are a number of possible scales to choose from, including Likert, Binomial Descriptive and Just About Right, to name a few.

A category scale is anchored at each point, meaning every point on the scale carries its own label.
The following is an example of a category scale:
How likely are you to purchase strawberry ice cream?

  • Definitely will NOT buy
  • Probably will NOT buy
  • Unsure
  • Probably will buy
  • Definitely will buy

DESCRIPTIVE STATISTICS

Descriptive statistics describe the basic features of the data; they show just the facts. They provide a simple summary of the sample and the measures.

Overall Liking

  • Mean
  • Mode
  • Median
  • Threshold

From a simple question such as "On a scale of 1-9, how much do you like strawberry ice cream?" you can find out how much consumers like your product: what the average score is (mean), whether it exceeds the cut-off for a satisfactory result (threshold), which liking score occurs most often (mode) and which value sits in the middle of the data set (median).

e.g. (4 + 11)/2, mean = 7.5
e.g. 1 2 3 3 5 6 6 7 7 7, mode = 7
e.g. 1 2 4 5 6, median = 4
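
The same summaries can be computed in a few lines. The sketch below is illustrative only: it uses Python's standard statistics module, the scores are the ten values from the mode example above, and the threshold of 6 is an assumed cut-off, not a value from any study.

```python
from statistics import mean, median, mode

# Ten liking scores on a 1-9 scale (the values from the mode example above).
scores = [1, 2, 3, 3, 5, 6, 6, 7, 7, 7]

# Assumed cut-off for a "satisfied" result; pick the value that suits your study.
threshold = 6

avg = mean(scores)           # arithmetic average of all scores
mid = median(scores)         # middle value of the sorted scores
most_common = mode(scores)   # score that occurs most often
exceeds_threshold = avg > threshold

print(f"mean={avg}, median={mid}, mode={most_common}, above threshold={exceeds_threshold}")
```

Note that the three measures can disagree: here the mode is 7 while the mean is well below it, which is why it helps to report more than one of them.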

Histograms/Frequency Distributions

A histogram is a great way to see the spread of continuous data visually; it provides information on the:

  1. Centre (i.e., the location) of the data;
  2. Spread (i.e., the scale) of the data;
  3. General shape of the frequency distribution (normal, binomial, Poisson, etc.);
  4. Symmetry of the distribution and whether it is skewed;
  5. Presence of outliers.

The graph's baseline depicts all the possible points on the scale that consumers were able to select from when answering the question. The vertical scale represents the frequency or percentage of each score in the sample.

The chart below shows a 9-point scale on the x-axis and the number of people who gave each rating on the y-axis.
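
A frequency distribution like this can be tallied directly from the raw scores. The following is a minimal sketch with made-up ratings (the list is hypothetical, not survey data); it prints a simple text histogram rather than drawing a chart.

```python
from collections import Counter

# Hypothetical ratings from 12 consumers on a 9-point scale.
scores = [7, 8, 6, 7, 9, 5, 7, 8, 6, 7, 4, 8]

counts = Counter(scores)

# One entry per scale point (1 to 9), including points nobody selected.
frequencies = [counts[point] for point in range(1, 10)]

for point, freq in zip(range(1, 10), frequencies):
    pct = 100 * freq / len(scores)
    # Each '#' is one respondent; the percentage is that score's share of the sample.
    print(f"{point} | {'#' * freq} ({pct:.0f}%)")
```

Listing every scale point, even the empty ones, keeps the baseline complete so gaps and outliers stay visible.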

 

Standard Deviation

Standard deviation tells us how diverse the scores are. If the standard deviation is close to zero, the data points are close to the mean. If it is far from zero, many of the data points are far from the mean. A large standard deviation can also be a hint that the data is skewed or contains outliers, although it does not show this on its own.
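
As a small illustration, the sketch below compares two hypothetical sets of scores that share the same mean but differ in spread. It uses the sample standard deviation (dividing by n - 1) from Python's standard library; the data are invented for the example.

```python
from statistics import mean, stdev

# Two hypothetical samples with the same mean (7) but different spread.
tight = [6, 7, 7, 7, 8]    # scores cluster around the mean
spread = [5, 5, 7, 9, 9]   # scores sit further from the mean

sd_tight = stdev(tight)    # sample standard deviation (n - 1 in the denominator)
sd_spread = stdev(spread)

print(f"tight:  mean={mean(tight)}, sd={sd_tight:.2f}")
print(f"spread: mean={mean(spread)}, sd={sd_spread:.2f}")
```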

Skew

Skewness is a measure of the degree of asymmetry of a distribution. A distribution is skewed if one of its tails is longer than the other. If the variables you are testing are far from normally distributed because they are too skewed, the t test may not be appropriate, particularly with small samples.

A histogram is an effective way of showing the skew of the data set.
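
Skew can also be put into a single number. The sketch below uses one common definition, the third central moment divided by the variance raised to the power 1.5 (population moments); other definitions apply small-sample corrections, so treat this as an assumption rather than the only formula. The data are hypothetical.

```python
from statistics import mean

def skewness(data):
    """Moment-based skewness: negative = longer left tail, positive = longer right tail."""
    n = len(data)
    m = mean(data)
    m2 = sum((x - m) ** 2 for x in data) / n   # second central moment (population variance)
    m3 = sum((x - m) ** 3 for x in data) / n   # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]              # tails of equal length
left_skewed = [1, 5, 8, 8, 9, 9, 9]      # most scores high, a few low ones

print(skewness(symmetric))
print(skewness(left_skewed))
```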

INFERENCE STATISTICS

With inferential statistics, the aim is to reach a conclusion that extends beyond the immediate data alone. The responses that consumers give to questions are not the only information we can obtain. With a bit of elbow grease, more insight can be drawn from a single question. Inferential statistics are powerful because they draw conclusions from the data without having to ask participants further direct questions.

Examples include:

  • T test
  • ANOVA
  • Regression analysis
  • Chi Square

T Test

The t test is used to determine whether there is a significant difference between two sample means. It helps to answer the underlying question: do the two groups come from the same population, and only appear different because of chance errors?

Is there a difference in overall liking for strawberry and chocolate ice cream?
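
To show what the test computes, here is a minimal sketch of the t statistic for two independent groups. It assumes Welch's variant (which does not require equal variances) and uses invented liking scores; the text above does not say which variant is meant. Turning the statistic into a p value needs the t distribution, which is usually left to a statistics package.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """t statistic for two independent samples (Welch's unequal-variance form)."""
    # Standard error of the difference between the two sample means.
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

# Hypothetical overall-liking scores on a 9-point scale.
strawberry = [6, 7, 5, 8, 7, 6, 7, 8]
chocolate = [7, 8, 8, 9, 7, 8, 9, 8]

t_stat = welch_t(strawberry, chocolate)
print(f"t = {t_stat:.2f}")  # negative here because strawberry's mean is lower
```

The further the statistic is from zero, the harder it is to explain the gap between the two means by chance alone.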

ANOVA

ANOVA, also known as the F test, is a powerful set of techniques for testing differences among the means of three or more samples. It compares the variability between the groups with the variability within them. It is used when a factor has more than two levels, or when there is more than one factor.

It differs from the t test in that the t test can compare only two groups, while ANOVA can test multiple groups and summarise the result in one number (the F statistic).

One way ANOVA

Tests differences between groups when only one variable is considered. When there are only two groups, the one way ANOVA is mathematically equivalent to the equal-variance t test (the F statistic is the square of the t statistic).

Example: testing 4 groups of soil to see if any of the four groups differ from each other in their taste.
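
The F statistic for a one way ANOVA can be sketched as the ratio of between-group to within-group variability. The function and the four groups of scores below are hypothetical and only illustrate the arithmetic; a p value would additionally require the F distribution.

```python
from statistics import mean

def one_way_f(groups):
    """F statistic: between-group mean square divided by within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations

    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Four hypothetical groups of taste scores.
groups = [[6, 7, 6, 8], [5, 6, 5, 6], [7, 8, 8, 9], [6, 6, 7, 7]]
f_stat = one_way_f(groups)
print(f"F = {f_stat:.2f}")
```

A large F means the group means differ by more than the scatter inside the groups would suggest.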

Two way ANOVA

A way of studying the effects of two factors separately (their main effects) and, sometimes, together (their interaction effect).

Example: a study to see whether car type and colour affect the average speed a car travels at, with the variables type of car (small or large) and car colour (red or blue).

Chi Square


The Chi Square analysis is used for categorical data, i.e. data put into classes such as red, blue and green.

It tells us whether the observed pattern of frequencies fits well with the expected one, and whether a relationship between two variables reflects a real association in the general population.

Types of Chi Square test:

  • Goodness of fit
  • Pearson's Chi square
  • Chi squared test of independence
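
For the goodness of fit case, the statistic is the sum over categories of (observed - expected)² / expected. The sketch below uses hypothetical colour counts and assumes the expected pattern is an even split. The critical value 5.991 is the standard chi-square table value for 2 degrees of freedom at the 5% level.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts of consumers choosing red, blue and green.
observed = [50, 30, 20]
total = sum(observed)

# Assumed expectation: no preference, so each colour gets an equal share.
expected = [total / len(observed)] * len(observed)

stat = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1

# Chi-square critical value for 2 degrees of freedom at the 5% level.
CRITICAL_05_DF2 = 5.991
fits_expected_pattern = stat < CRITICAL_05_DF2

print(f"chi-square = {stat:.2f}, df = {degrees_of_freedom}, fits = {fits_expected_pattern}")
```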

Significance

In statistics, "significance" means that a result is unlikely to be due to chance alone, and is therefore most probably real.

Example: the difference in liking scores for ice-cream flavours. Chocolate scores 5.5 and vanilla scores 6.5, but are they significantly different? Is the difference between the scores big enough that it is not due to chance alone?

The p value measures the evidence against the null hypothesis: it is the probability of seeing a difference at least as large as the one observed if the null hypothesis were true. The smaller the p value, the greater the evidence against the null hypothesis.

The alpha value is the amount of error one is willing to accept, that is, the probability of rejecting a null hypothesis that is actually true.

If the p value is smaller than the alpha value, the difference is significant: the chance of an error occurring is smaller than the rate you are willing to accept. It can then be concluded that vanilla is liked more than chocolate, and it is not just by chance that vanilla happens to score higher.
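
One way to make "due to chance alone" concrete is an exact permutation test: pool the scores, form every possible split into two groups, and count how often the gap between the group means is at least as large as the observed one. This is a swapped-in technique for illustration, not the t test described earlier, and the scores and the alpha of 0.05 are assumptions.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(a, b):
    """Two-sided exact permutation p value for the difference in means."""
    pooled = a + b
    observed = abs(mean(a) - mean(b))
    total = sum(pooled)
    n, k = len(pooled), len(a)

    extreme = 0
    splits = 0
    # Every way of choosing which k pooled scores are relabelled as group a.
    for idx in combinations(range(n), k):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / k - (total - sum_a) / (n - k))
        if diff >= observed - 1e-12:   # small tolerance for floating-point rounding
            extreme += 1
        splits += 1
    return extreme / splits

# Hypothetical liking scores with means of 5.5 (chocolate) and 6.5 (vanilla).
chocolate = [5, 6, 5, 6, 5, 6]
vanilla = [6, 7, 6, 7, 7, 6]

alpha = 0.05   # assumed acceptable error rate
p_value = permutation_p_value(chocolate, vanilla)
print(f"p = {p_value:.4f}, significant = {p_value < alpha}")
```

The approach enumerates all splits, so it is only practical for small samples like this one.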

Statistical Errors

When analysing data it is important to make sure that the analysis performed minimises Statistical Error.

Type 1 Error

A type one error occurs when the null hypothesis is incorrectly rejected. It could also be thought of as a 'false positive'.

e.g. When we mistakenly think there is a difference in preference between vanilla and chocolate when in truth none exists.

Type 2 Error

A type two error occurs when a false null hypothesis is not rejected. It could also be thought of as a 'false negative'.

e.g. When we fail to notice the difference in preference between vanilla and chocolate when one exists.

COMPLEX DATA MODELLING

Conjoint Analysis

Want to know which features matter to consumers when they make purchasing decisions, and what price to set for your product? Conjoint analysis can determine the relative importance of each feature in the purchasing decision. A purchasing decision is based on a range of product attributes: consumers weigh up the attributes and make trade-offs before they buy. It therefore makes more sense to measure the relative values of attributes jointly than in isolation.

Conjoint analysis is particularly useful because, when asked directly, consumers generally cannot accurately identify the importance of product attributes.
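
Once a conjoint study has estimated a utility (part-worth) for each level of each attribute, a common way to express relative importance is each attribute's utility range as a share of the sum of all ranges. The sketch below shows only that last step. The part-worth values are invented, and estimating them from respondent data is not shown.

```python
# Hypothetical part-worth utilities, as if already estimated from a conjoint study.
part_worths = {
    "flavour": {"vanilla": 0.8, "chocolate": 0.2, "strawberry": -1.0},
    "price": {"$4": 0.6, "$5": 0.0, "$6": -0.6},
    "pack size": {"500 ml": -0.3, "1 L": 0.3},
}

def relative_importance(worths):
    """Share of each attribute's utility range in the total of all ranges."""
    ranges = {
        attribute: max(levels.values()) - min(levels.values())
        for attribute, levels in worths.items()
    }
    total = sum(ranges.values())
    return {attribute: r / total for attribute, r in ranges.items()}

importance = relative_importance(part_worths)
for attribute, share in importance.items():
    print(f"{attribute}: {share:.0%}")
```

With these made-up numbers, flavour drives about half of the decision, price a third and pack size the rest.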

Predictive Modelling

Predictive models estimate the likelihood of future consumer behaviours based on past behaviour.

TURF Analysis

TURF is an acronym for "Total Unduplicated Reach and Frequency".

TURF Analysis helps marketers make decisions about what types of products to launch with knowledge of potential cannibalisation of the existing range.

The TURF algorithm calculates the incremental value of a product to the full product line, and finds the optimal product selection to reach the largest number of consumers with the fewest varieties. The analysis also minimises consumer overlap (duplication) across other products in the range.

The questions that TURF analysis answers are:

  • How many consumers will use each variety in the product line?
  • What is the volume of usage for each offering in the product line?
  • What is the value of an additional variety in the product range? E.g. how many more consumers can we attract by introducing a new product to the line?

The above graph shows that Vanilla has a sales volume of 50%. Adding the Chocolate flavour to the product line will add an incremental volume of 20%, and adding Strawberry to the product line will add another incremental volume of 10%.
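
The incremental logic can be sketched with a simple greedy selection: at each step, add the flavour that reaches the most consumers not already reached. This is a heuristic for illustration only, since a full TURF analysis typically evaluates combinations of products, and it counts unduplicated reach (consumers) rather than sales volume. The respondent data are hypothetical and chosen to mirror the 50%, 20% and 10% pattern.

```python
def greedy_turf(preferences, max_items):
    """Pick items one at a time by largest number of newly reached respondents."""
    remaining = dict(preferences)
    reached = set()
    steps = []
    while remaining and len(steps) < max_items:
        # Item whose buyers add the most people not already reached.
        best = max(remaining, key=lambda item: len(remaining[item] - reached))
        gain = len(remaining[best] - reached)
        if gain == 0:
            break  # nothing left adds any new consumers
        reached |= remaining.pop(best)
        steps.append((best, gain))
    return steps

# Hypothetical panel of 10 respondents; each set holds the ids who would buy that flavour.
panel_size = 10
preferences = {
    "vanilla": {1, 2, 3, 4, 5},
    "chocolate": {4, 5, 6, 7},
    "strawberry": {4, 5, 8},
}

steps = greedy_turf(preferences, max_items=3)
for flavour, gain in steps:
    print(f"{flavour}: +{100 * gain / panel_size:.0f}% incremental reach")
```

Chocolate appeals to four respondents in this example, but two of them already buy vanilla, so it adds only two new consumers; that overlap is the duplication TURF is designed to expose.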


© 2010 BDC Market Research