Class 12 Psychology: Statistics in Psychology - Questions and Answers
Instructions: Please answer all questions to the best of your ability.
Multiple Choice Questions (MCQs) - 1 Mark Each
Choose the most appropriate answer from the given options.
A systematic way of organizing data to show how often each score or group of scores occurs in a dataset is called a:
a) Bar Graph
b) Histogram
c) Frequency Distribution
d) Scatter Plot
Answer: c) Frequency Distribution
When data is organized into intervals (e.g., 10-19, 20-29), this is referred to as a:
a) Simple Frequency Distribution
b) Grouped Frequency Distribution
c) Cumulative Frequency Distribution
d) Relative Frequency Distribution
Answer: b) Grouped Frequency Distribution
The class interval with the highest frequency in a grouped frequency distribution is known as the:
a) Mean
b) Median
c) Modal Class
d) Range
Answer: c) Modal Class
Which measure of central tendency is most affected by extreme scores (outliers)?
a) Mean
b) Median
c) Mode
d) Range
Answer: a) Mean
In a dataset, the score that occurs most frequently is the:
a) Mean
b) Median
c) Mode
d) Standard Deviation
Answer: c) Mode
Which measure of central tendency is the middle score in a numerically ordered dataset?
a) Mean
b) Median
c) Mode
d) Quartile
Answer: b) Median
For a perfectly symmetrical, unimodal distribution (like a normal distribution), which of the following is true?
a) Mean > Median > Mode
b) Mean < Median < Mode
c) Mean = Median = Mode
d) Mean and Median are always different
Answer: c) Mean = Median = Mode
When data is highly skewed, which measure of central tendency is generally the most appropriate to represent the typical score?
a) Mean
b) Median
c) Mode
d) All are equally appropriate
Answer: b) Median
To calculate the Mean of a dataset, you need to:
a) Find the middle value.
b) Find the most frequent value.
c) Sum all the scores and divide by the number of scores.
d) Find the difference between the highest and lowest scores.
Answer: c) Sum all the scores and divide by the number of scores.
A psychologist wants to represent the number of students who scored in different ranges (e.g., 0-10, 11-20) on a test. Which type of frequency distribution would be most suitable?
a) Simple frequency distribution
b) Ungrouped frequency distribution
c) Grouped frequency distribution
d) Cumulative frequency distribution
Answer: c) Grouped frequency distribution
If a distribution has two modes, it is called:
a) Unimodal
b) Bimodal
c) Multimodal
d) Skewed
Answer: b) Bimodal
Which of the following is a disadvantage of using the mode as a measure of central tendency?
a) It is affected by extreme scores.
b) It can only be used with interval data.
c) It may not represent the central tendency well if the most frequent score is far from the middle of the distribution.
d) It requires complex calculations.
Answer: c) It may not represent the central tendency well if the most frequent score is far from the middle of the distribution.
The sum of the deviations of scores from the mean is always:
a) Positive
b) Negative
c) Zero
d) Undefined
Answer: c) Zero
What does 'N' typically represent in statistical formulas related to central tendency?
a) The highest score
b) The total number of scores in the dataset
c) The frequency of the mode
d) The lowest score
Answer: b) The total number of scores in the dataset
If a dataset contains scores: 5, 8, 12, 5, 10, 8, 5, which of the following is the mode?
a) 8
b) 5
c) 10
d) 12
Answer: b) 5
Short Answer Questions - 2 Marks Each
Answer the following questions briefly.
What is a 'frequency distribution' and what is its main purpose?
Answer: A frequency distribution is a tabular or graphical representation of data that shows how often each score or group of scores occurs in a dataset. Its main purpose is to organize raw data into a more meaningful and understandable format, making patterns and trends visible.
Differentiate between a 'simple frequency distribution' and a 'grouped frequency distribution'.
Answer: A simple frequency distribution lists each individual score and its frequency. A grouped frequency distribution organizes data into class intervals (ranges of scores) and shows the frequency for each interval.
Define 'Measures of Central Tendency' and explain their general use in psychology.
Answer: Measures of Central Tendency are statistical measures that describe the center or typical value of a dataset. In psychology, they are used to summarize data and provide a single, representative value that indicates where most scores fall, helping to understand typical behavior or performance.
Calculate the Mean for the following set of scores: 10, 15, 12, 18, 20.
Answer: Sum of scores = 10+15+12+18+20=75
Number of scores (N) = 5
Mean = Sum of scores / N = 75/5=15
The Mean is 15.
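The same calculation can be checked with a few lines of Python. This is only an illustrative sketch; the variable names are our own, and `statistics.mean` from the standard library is used as a cross-check.

```python
from statistics import mean

scores = [10, 15, 12, 18, 20]

total = sum(scores)          # sum of scores: 75
n = len(scores)              # number of scores (N): 5
manual_mean = total / n      # Mean = sum of scores / N

print(total, n, manual_mean)
print(mean(scores))          # library cross-check of the same value
```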
Find the Median for the following set of scores: 7, 12, 5, 10, 8, 15, 6.
Answer: First, arrange the scores in ascending order: 5, 6, 7, 8, 10, 12, 15.
The number of scores (N) is 7 (odd). The median is the middle score.
Position of the median = (N+1)/2 = (7+1)/2 = 4, i.e., the 4th term.
The 4th term is 8.
The Median is 8.
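As a quick check, the steps above (sort, locate the middle position, read off the score) can be sketched in Python. The variable names are illustrative, and `statistics.median` is used only to confirm the manual result.

```python
from statistics import median

scores = [7, 12, 5, 10, 8, 15, 6]

ordered = sorted(scores)                # arrange the scores in ascending order
n = len(ordered)                        # N = 7 (odd)
position = (n + 1) // 2                 # 1-based position of the middle score
manual_median = ordered[position - 1]   # Python lists are 0-based

print(ordered, position, manual_median)
print(median(scores))                   # library cross-check
```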
Identify the Mode for the following dataset: 2, 4, 3, 5, 4, 2, 6, 4, 7.
Answer: The score that appears most frequently is 4 (it appears 3 times).
The Mode is 4.
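Counting how often each score occurs is easy to automate. The sketch below uses `collections.Counter` from the Python standard library; the variable names are our own.

```python
from collections import Counter

scores = [2, 4, 3, 5, 4, 2, 6, 4, 7]

counts = Counter(scores)                            # frequency of every score
mode_value, mode_count = counts.most_common(1)[0]   # most frequent score and its count

print(counts)
print(mode_value, mode_count)
```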
When is the Median a more appropriate measure of central tendency than the Mean? Give an example.
Answer: The Median is more appropriate than the Mean when the data distribution is highly skewed (has extreme outliers), as the Mean can be pulled significantly in the direction of the outliers. For example, in income distribution, the Mean income can be misleadingly high due to a few very wealthy individuals, while the Median income provides a better representation of the typical income.
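To see the effect numerically, consider a small made-up set of incomes. The figures below are hypothetical, in thousands, and chosen only to illustrate the point: one very large value pulls the mean far above the typical income, while the median barely moves.

```python
from statistics import mean, median

# Hypothetical annual incomes (in thousands); the last value is an extreme outlier.
incomes = [30, 32, 35, 38, 40, 42, 45, 1000]

mean_income = mean(incomes)       # pulled upward by the single outlier
median_income = median(incomes)   # average of the two middle values (38 and 40)

print(mean_income)     # 157.75
print(median_income)   # 39.0
```

Seven of the eight people earn 45 or less, so the median of 39 describes the typical income far better than the mean of 157.75.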
What does a 'modal class' indicate in a grouped frequency distribution?
Answer: A modal class indicates the class interval that has the highest frequency of scores within a grouped frequency distribution. It represents the range of scores where the most observations fall.
Why is it useful to represent frequency distributions graphically (e.g., using histograms)?
Answer: Graphical representations like histograms make it easier to visualize the shape, spread, and central tendency of a distribution. They help quickly identify patterns, outliers, and skewness that might not be immediately apparent from a table.
Explain one disadvantage of using the Mode as the sole measure of central tendency.
Answer: One disadvantage of the Mode is that it may not be unique (a dataset can have multiple modes or no mode). Also, it might not be centrally located in the distribution, especially in highly skewed data, and it does not consider the values of all scores in the dataset.
Long Answer Questions - 5 Marks Each
Answer the following questions in detail.
What is a frequency distribution? Describe in detail the process of constructing a 'grouped frequency distribution' from a raw set of data. Explain the importance of class intervals and class limits in this process.
Answer:
Frequency Distribution:
A frequency distribution is a systematic way of organizing raw data to show how often each score or value (or range of values) occurs in a dataset. It summarizes data by grouping scores into categories or intervals and listing the frequency (count) of scores within each. Its main purpose is to make large sets of data more manageable and interpretable, and to reveal patterns, trends, and the shape of the data.
Process of Constructing a Grouped Frequency Distribution:
A grouped frequency distribution is used when there is a wide range of scores, making a simple frequency distribution too long and cumbersome. The process typically involves the following steps:
Determine the Range of Scores: Find the highest score (H) and the lowest score (L) in the raw data.
Range = H - L
Decide on the Number of Class Intervals: This is usually between 5 and 20 intervals, depending on the number of scores. Too few intervals lose detail; too many defeat the purpose of grouping. A general guideline is to use the formula $k = 1 + 3.322 \log_{10} N$, where k is the number of intervals and N is the total number of scores.
Determine the Class Interval Width (i): The width of each interval should be a convenient number (e.g., 5, 10, 20).
i≈Range/k (round up to a convenient number).
All intervals must be of the same width.
Determine the Class Limits: Start with the lowest score and create the intervals. The lower limit of the first interval should be a multiple of the interval width or start slightly below the lowest score to include it. The upper limit of an interval ends just before the lower limit of the next interval to avoid overlap.
Importance of Class Limits: Class limits define the exact boundaries of each interval. They ensure that every score falls into one and only one interval. For example, if the interval width is 10, intervals might be 0-9, 10-19, 20-29.
True Class Limits (Real Limits): For continuous data, it's often more precise to use true class limits (or real limits), which extend 0.5 units below the lower limit and 0.5 units above the upper limit. For example, the interval 10-19 has true limits of 9.5-19.5. This accounts for the continuous nature of measurement.
List the Class Intervals: Create a column listing all the determined class intervals from the lowest to the highest.
Tally the Frequencies: Go through each raw score and place a tally mark in the appropriate class interval.
Count the Frequencies (f): Sum the tally marks for each interval to get the frequency (f) for that class.
Compute Relative Frequencies (Optional): Divide the frequency of each interval by the total number of scores (N) to get the proportion or percentage of scores in that interval.
Compute Cumulative Frequencies (Optional): Add up the frequencies from the lowest interval to the highest, so that each cumulative frequency shows the number of scores at or below the upper limit of that interval.
Importance of Class Intervals and Class Limits:
Class Intervals: They are crucial because they group data into manageable chunks, making the distribution concise and easier to interpret, especially for large datasets with a wide range of scores. They allow for the visualization of the data's overall shape and density in different ranges.
Class Limits: They precisely define the boundaries of each interval, ensuring that there is no ambiguity about which interval a particular score belongs to. They prevent overlap between intervals, ensuring that each score is counted exactly once. True class limits are particularly important for continuous data to reflect the underlying continuity of the variable being measured. Improperly defined limits can lead to misrepresentation of the data and incorrect conclusions.
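The construction steps described in this answer can be sketched as a short Python function. This is a minimal sketch under stated assumptions, not a standard routine: it assumes whole-number scores, the function name `grouped_frequency`, its parameters, and the sample scores are all hypothetical, and when no width is supplied it simply rounds Range / k up to the next whole number rather than to a "convenient" value.

```python
import math

def grouped_frequency(scores, width=None, start=None):
    """Build a grouped frequency distribution for whole-number scores."""
    n = len(scores)
    low, high = min(scores), max(scores)
    data_range = high - low                          # Range = H - L

    if width is None:
        k = math.ceil(1 + 3.322 * math.log10(n))     # guideline for number of intervals
        width = max(1, math.ceil(data_range / k))    # i = Range / k, rounded up
    if start is None:
        start = (low // width) * width               # multiple of the width at or below L

    rows, cumulative, lower = [], 0, start
    while lower <= high:
        upper = lower + width - 1                    # ends just before the next lower limit
        f = sum(lower <= x <= upper for x in scores) # tally the scores in this interval
        cumulative += f
        rows.append({
            "interval": (lower, upper),
            "true_limits": (lower - 0.5, upper + 0.5),
            "f": f,
            "relative": f / n,
            "cumulative": cumulative,
        })
        lower += width
    return rows

# Hypothetical raw scores, grouped with an interval width of 10.
sample = [3, 7, 12, 15, 18, 21, 24, 25, 29, 33]
for row in grouped_frequency(sample, width=10):
    print(row)
```

With a width of 10 the intervals come out as 0-9, 10-19, 20-29, and 30-39, and the interval 10-19 carries the true limits 9.5-19.5, matching the example given above.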
Explain the three main Measures of Central Tendency: Mean, Median, and Mode. For each measure, describe how it is computed and discuss its primary uses and advantages.
Answer:
Measures of Central Tendency are statistical values that represent the typical or central value of a dataset. They provide a single, summary score that gives an idea of where most of the data points lie.
Mean (Arithmetic Mean):
Computation: The mean is calculated by summing all the scores in a dataset and then dividing by the total number of scores (N).
Formula: $\bar{X} = \sum X / N$ (where $\bar{X}$ is the mean, $\sum X$ is the sum of all scores, and $N$ is the number of scores).
Primary Uses:
Most common and widely used measure of central tendency.
Used when the data is interval or ratio level and is approximately symmetrically distributed (e.g., test scores, height, weight).
Forms the basis for many other advanced statistical analyses (e.g., standard deviation, t-tests).
Advantages:
Takes into account every score in the dataset.
Is a stable measure across different samples from the same population.
Has important mathematical properties that make it suitable for further statistical calculations.
Median:
Computation: The median is the middle score in a dataset that has been arranged in numerical order (ascending or descending).
If N (number of scores) is odd, the median is the middle score: the ((N+1)/2)th score.
If N is even, the median is the average of the two middle scores: ((N/2)th score + ((N/2)+1)th score)/2.
Primary Uses:
Used when the data is ordinal, interval, or ratio, especially when the distribution is highly skewed (has extreme outliers).
Appropriate for data where the actual values of extreme scores might distort the mean (e.g., income, house prices).
Advantages:
Not affected by extreme scores (outliers), making it a robust measure for skewed distributions.
Can be calculated for ordinal data.
Provides a good representation of the "typical" value in skewed datasets.
Mode:
Computation: The mode is the score or value that occurs most frequently in a dataset. A dataset can have one mode (unimodal), two modes (bimodal), more than two modes (multimodal), or no mode if all scores occur with the same frequency.
Primary Uses:
The only measure of central tendency that can be used for nominal (categorical) data (e.g., most popular color, favorite fruit).
Useful for identifying the most common category or score in any type of data.
Used when a quick estimate of central tendency is needed.
Advantages:
Easy to compute and understand.
Not affected by extreme scores.
Can be used for all levels of measurement, including nominal data.
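The three computation rules described in this answer can be written out directly in Python. This is a sketch for illustration only: the function names and the sample scores are our own, and returning an empty list for "no mode" is just one possible convention.

```python
from collections import Counter

def mean(scores):
    """Sum of all scores divided by the number of scores (N)."""
    return sum(scores) / len(scores)

def median(scores):
    """Middle score of the ordered data; average of the two middle scores if N is even."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # the ((N+1)/2)th score
    return (ordered[mid - 1] + ordered[mid]) / 2  # (N/2)th and ((N/2)+1)th scores

def modes(scores):
    """All most-frequent scores; empty list if every score is equally frequent."""
    counts = Counter(scores)
    top = max(counts.values())
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []                                 # no mode
    return sorted(value for value, c in counts.items() if c == top)

# Hypothetical scores with an even N and two modes (bimodal).
sample = [3, 5, 5, 7, 9, 9, 4, 6]
print(mean(sample))     # 6.0
print(median(sample))   # 5.5
print(modes(sample))    # [5, 9]
```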
Imagine you are a psychologist conducting a study on the anxiety levels of Class 12 students before their board exams. You collect anxiety scores from 20 students. Explain how you would use frequency distribution and measures of central tendency to analyze and present this data. Discuss the specific insights each statistical tool would provide.
Answer:
Let's assume the raw anxiety scores (on a scale of 1-100) for 20 students are:
75, 82, 68, 90, 55, 78, 85, 70, 92, 65, 80, 72, 88, 60, 76, 83, 79, 95, 62, 73
1. Using Frequency Distribution:
Purpose: To organize the raw scores into a meaningful table to see patterns in anxiety levels. Since there's a range of scores, a grouped frequency distribution would be most appropriate.
Construction Process:
Range: Highest score = 95, Lowest score = 55. Range = 95 - 55 = 40.
Number of Intervals: For 20 scores, 5-7 intervals would be suitable. Let's choose 5.
Interval Width: Range / Number of Intervals = 40 / 5 = 8. Let's choose a convenient width of 10.
Class Intervals: Start from 50 (to include the lowest score 55).
50-59
60-69
70-79
80-89
90-99
Tally and Frequencies:
| Anxiety Score Interval | Tally | Frequency (f) |
| :--------------------- | :----------- | :------------ |
| 50-59 | / | 1 |
| 60-69 | //// | 4 |
| 70-79 | ///// // | 7 |
| 80-89 | ///// | 5 |
| 90-99 | /// | 3 |
| Total | | 20 |
Insights from Frequency Distribution:
Distribution Shape: The scores cluster in the middle of the range. Most students fall in the 70-79 interval (7 students), followed by 80-89 (5) and 60-69 (4), with fewer students at either end.
Range of Scores: Clearly indicates the spread of anxiety levels, from low (50s) to very high (90s).
Common Ranges: Highlights the anxiety score ranges where most students fall, giving a quick overview of the typical anxiety level of the group.
Outliers/Extremes: Can easily identify if there are very few students with extremely low or extremely high anxiety. For instance, only one student scored in the 50-59 range, while three scored in the 90-99 range.
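The grouping for these 20 scores can be reproduced with a short Python sketch, using the interval width of 10 and starting point of 50 chosen above. The variable names are illustrative.

```python
scores = [75, 82, 68, 90, 55, 78, 85, 70, 92, 65,
          80, 72, 88, 60, 76, 83, 79, 95, 62, 73]

width = 10
start = 50

frequencies = {}
for lower in range(start, 100, width):
    upper = lower + width - 1
    # Count the scores that fall inside this class interval.
    frequencies[f"{lower}-{upper}"] = sum(lower <= s <= upper for s in scores)

modal_class = max(frequencies, key=frequencies.get)   # interval with the highest frequency

for interval, f in frequencies.items():
    print(f"{interval}: {'/' * f} ({f})")
print("Modal class:", modal_class)
```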
2. Using Measures of Central Tendency:
Purpose: To provide single, representative scores that describe the "average" or "typical" anxiety level of the group.
Computation and Uses:
Mean:
Computation: Sum all 20 scores and divide by 20.
Sum = 75+82+68+90+55+78+85+70+92+65+80+72+88+60+76+83+79+95+62+73 = 1528
Mean = 1528/20 = 76.4
Insight: The average anxiety score for the Class 12 students before board exams is 76.4. This gives a general idea of the group's central tendency. It is useful for comparing this group's average to other groups or to a theoretical average.
Median:
Computation: First, sort the scores in ascending order:
55, 60, 62, 65, 68, 70, 72, 73, 75, 76, 78, 79, 80, 82, 83, 85, 88, 90, 92, 95
Since N=20 (even), the median is the average of the 10th and 11th scores.
10th score = 76, 11th score = 78
Median = (76+78)/2=154/2=77
Insight: Half of the students scored below 77 and half scored above it. The median is particularly useful when a few students have extremely low or high anxiety scores, because it is not pulled toward those outliers. Here it is close to the mean, which suggests the distribution is roughly symmetric, with at most a slight skew.
Mode:
Computation: Identify the score that appears most frequently.
In our specific raw data (75, 82, 68, 90, 55, 78, 85, 70, 92, 65, 80, 72, 88, 60, 76, 83, 79, 95, 62, 73), each score appears only once. Therefore, there is no mode in this exact raw dataset. If a score appeared multiple times, that would be the mode.
However, if we consider the grouped frequency distribution, the modal class is 70-79, as it has the highest frequency (7).
Insight: In this specific raw dataset, the mode does not provide a useful measure of central tendency because all scores are unique. If there were repeated scores, the mode would indicate the most common anxiety score among students. The modal class (70-79) from the grouped distribution tells us the most common range of anxiety levels.
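All three measures can be recomputed directly from the 20 raw scores with Python's standard `statistics` module. This is only a verification sketch; the variable names are our own, and `multimode` requires Python 3.8 or later.

```python
from statistics import mean, median, multimode

scores = [75, 82, 68, 90, 55, 78, 85, 70, 92, 65,
          80, 72, 88, 60, 76, 83, 79, 95, 62, 73]

total = sum(scores)                 # sum of all 20 scores
avg = mean(scores)                  # total / N
mid = median(scores)                # average of the 10th and 11th ordered scores
most_frequent = multimode(scores)   # every score tied for the highest frequency

# If every score is tied, each occurs equally often, so there is no mode.
has_mode = len(most_frequent) < len(set(scores))

print(total, avg, mid, has_mode)
```

Because each score occurs exactly once, `multimode` returns all 20 values and `has_mode` is `False`, which agrees with the conclusion above that the raw data has no mode.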
By using both frequency distribution and measures of central tendency, the psychologist can gain a comprehensive understanding of the anxiety levels in the group, including the overall pattern, the typical score, and the spread of data, which is crucial for psychological studies.