# Thread: Urgent help with understanding statistics

1. ## Urgent help with understanding statistics

Alright, so here's a question.

Marky-Zingerwitts has 5 music exam results as follows:

24, 15, 6, 18, 12

(i) Find the mean and standard deviation of his results.

Well, this is easy enough: the mean works out at 15 and the s.d. is 6. But...
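
For anyone who wants to check part (i) numerically, here is a minimal Python sketch. It assumes the population form of the standard deviation (divide by n, not n - 1), since that is the form that gives 6 for these results. The names `results`, `mean` and `sd` are only illustrative.

```python
import math
import statistics

results = [24, 15, 6, 18, 12]

# Mean: sum of the results divided by how many there are.
mean = sum(results) / len(results)

# Population standard deviation: square root of the mean squared deviation.
sd = math.sqrt(sum((x - mean) ** 2 for x in results) / len(results))

print(mean, sd)  # 15.0 6.0

# Cross-check against the standard library (pstdev also divides by n).
assert math.isclose(sd, statistics.pstdev(results))
```

With the n - 1 form (`statistics.stdev`) the result would be about 6.71 instead, so the question is evidently using the divide-by-n definition.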

(ii) Zingerwitts wants to change his results by multiplying each by a constant $a$ and adding a constant $b$ so that the new mean is 40 and the new s.d. is doubled. Find $a$ and $b$.

I hate mental blocks

2. $\tilde{x}_i = a x_i + b \\ \mu = \frac{1}{n} \sum_{i=1}^n x_i \\ \tilde{\mu} = \frac{1}{n} \sum_{i=1}^n \tilde{x}_i = \frac{1}{n} \sum_{i=1}^n \left( a x_i + b \right) = \frac{1}{n} \left( a \sum_{i=1}^n x_i \; + \; \sum_{i=1}^n b \right) = a \mu + b \\ \sigma = \sqrt{\frac{1}{n} \left( \sum_{i=1}^n x_i^2 \right) \; - \; \mu^2} \\ \tilde{\sigma} = \sqrt{\frac{1}{n} \left( \sum_{i=1}^n \tilde{x}_i^2 \right) \; - \; \tilde{\mu}^2} = \sqrt{\frac{1}{n} \left( \sum_{i=1}^n \left( a x_i + b \right)^2 \right) \; - \; \left( a \mu + b \right)^2} = \sqrt{\frac{1}{n} \left( a^2 \sum_{i=1}^n x_i^2 \; + 2 a b \sum_{i=1}^n x_i + \sum_{i=1}^n b^2 \right) \; - \; a^2 \mu^2 - 2 a b \mu - b^2 } = \left| a \right| \sqrt{\frac{1}{n} \left( \sum_{i=1}^n x_i^2 \; \right) \; - \; \mu^2 } = \left| a \right| \sigma$

The problem text says the new s.d. is double the old one, so $\left| a \right| = 2$, i.e. $a = \pm 2$.

3. So a = 2 and b = 12? I'm having a problem seeing what you did here, rpenner:
Originally Posted by rpenner
$\sqrt{\frac{1}{n} \left( a^2 \sum_{i=1}^n x_i^2 \; + 2 a b \sum_{i=1}^n x_i + \sum_{i=1}^n b^2 \right) \; - \; a^2 \mu^2 - 2 a b \mu - b^2 } = \left| a \right| \sqrt{\frac{1}{n} \left( \sum_{i=1}^n x_i^2 \; \right) \; - \; \mu^2 } = \left| a \right| \sigma$

4. $\mu = 15 \\ \tilde{\mu} = 40 = a \mu + b \\ b = \tilde{\mu} - a \mu = 40 - 15 a$

So a = 2, b = 10 OR a = -2, b = 70.
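
The two relations used here, $\left| a \right| = 2$ and $b = \tilde{\mu} - a \mu$, can be sketched in a few lines of Python. The function name `solve_linear_rescale` and its parameters are hypothetical, made up for this illustration only.

```python
def solve_linear_rescale(mean, target_mean, sd_factor):
    """Return every (a, b) with |a| == sd_factor and a * mean + b == target_mean."""
    solutions = []
    for a in (sd_factor, -sd_factor):
        # From target_mean = a * mean + b.
        b = target_mean - a * mean
        solutions.append((a, b))
    return solutions


solutions = solve_linear_rescale(mean=15, target_mean=40, sd_factor=2)
print(solutions)  # [(2, 10), (-2, 70)]
```

Both signs of `a` are tried because only the absolute value of `a` affects the standard deviation.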

5. $\tilde{\sigma} = \sqrt{\frac{1}{n} \left( \sum_{i=1}^n \tilde{x}_i^2 \right) \; - \; \tilde{\mu}^2}
= \sqrt{\frac{1}{n} \left( \sum_{i=1}^n \left( a x_i + b \right)^2 \right) \; - \; \left( a \mu + b \right)^2}
= \sqrt{\frac{1}{n} \left( a^2 \sum_{i=1}^n x_i^2 \; + 2 a b \sum_{i=1}^n x_i + \sum_{i=1}^n b^2 \right) \; - \; a^2 \mu^2 - 2 a b \mu - b^2 }
= \sqrt{a^2 \frac{1}{n} \left( \sum_{i=1}^n x_i^2 \; \right) - a^2 \mu^2 + 2 a b \frac{1}{n} \left( \sum_{i=1}^n x_i \right) - 2 a b \mu + b^2 \frac{1}{n} \left( \sum_{i=1}^n 1 \right) - b^2 }
= \sqrt{a^2 \frac{1}{n} \left( \sum_{i=1}^n x_i^2 \; \right) - a^2 \mu^2 + 2 a b \mu - 2 a b \mu + b^2 - b^2 }
= \sqrt{a^2 \frac{1}{n} \left( \sum_{i=1}^n x_i^2 \; \right) - a^2 \mu^2 }
= \sqrt{a^2} \sqrt{\frac{1}{n} \left( \sum_{i=1}^n x_i^2 \; \right) - \mu^2 }
= \left| a \right| \sqrt{\frac{1}{n} \left( \sum_{i=1}^n x_i^2 \; \right) \; - \; \mu^2 }
= \left| a \right| \sigma$

6. iPhones and LaTeX don't mix well.

7. That would indeed be the perfect upgrade for an rpenner-on-the-go: a voice-activated version, in which your utterances are nicely scripted for you, without putting you through too much verbiage. Not to mention, in your voice activated car, you could then direct it to turn right on Main Street with mean of 2 meters from the curb and a standard deviation of 10 centimeters, or, abbreviated, "Right on Main mu two sigma point one" using some of the same parsing algorithm.

Nice presentation as usual. If it were a big commercial enterprise like Oracle they could utilize your presentations for sales demos!

As for the problem itself this was a good basic way to test an understanding of how to calculate mean and standard deviation. Most people remember averaging just from common use but the std dev is not as easy to recall if you haven't been in the habit of using it, or if the notion of a moment has faded from the memory.

We can conclude that scaling the samples by "a" scales the mean by the same factor, and shifting the samples left or right (biasing) by "b" shifts the mean by the same amount. The standard deviation, on the other hand, is immune to biasing (an important property in many cases) but scales by |a| in its spread about the mean, which fits well with our intuitive idea of what it means to measure the spread of a population.

8. Here's a graphical approach. More intuitive, less rigorous.

Imagine the numbers spread out on a number line.
The mean is an indication of the middle of the numbers.
The standard deviation indicates how spread out they are:
$
\picture(600,100){
%% Number line %%
(0,50){\line(600,0)}
(50,40;100,0;6){\line(0,15)}
(c50,30){0}
(c150,30){10}
(c250,30){20}
(c350,30){30}
(c450,30){40}
(c550,30){50}

%% Data points %%
(c110,60){\bullet}
(c170,60){\bullet}
(c200,60){\bullet}
(c230,60){\bullet}
(c290,60){\bullet}

%% Mean %%
(200,45){\line(0,-10)}
(200,45){\line(-3,-5)}
(200,45){\line(3,-5)}
(c200,25){\small{\textit{mean}}}

%% Standard deviation %%
(140,75){\line(120,0)}
(140,75){\line(5,3)}
(140,75){\line(5,-3)}
(260,75){\line(-5,3)}
(260,75){\line(-5,-3)}
(c200,85){\small{\textit{standard deviation}}}
}$

Adding a constant means sliding everything along the line without changing their spread.
This adds the constant to the mean, but doesn't change the standard deviation.
Here's the same data set, with 10 added to everything:
$
\picture(600,100){
%% Number line %%
(0,50){\line(600,0)}
(50,40;100,0;6){\line(0,15)}
(c50,30){0}
(c150,30){10}
(c250,30){20}
(c350,30){30}
(c450,30){40}
(c550,30){50}

%% Data points %%
(c210,60){\bullet}
(c270,60){\bullet}
(c300,60){\bullet}
(c330,60){\bullet}
(c390,60){\bullet}

%% Mean %%
(300,45){\line(0,-10)}
(300,45){\line(-3,-5)}
(300,45){\line(3,-5)}
(c300,25){\small{\textit{mean}}}

%% Standard deviation %%
(240,75){\line(120,0)}
(240,75){\line(5,3)}
(240,75){\line(5,-3)}
(360,75){\line(-5,3)}
(360,75){\line(-5,-3)}
(c300,85){\small{\textit{standard deviation}}}
}$
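
The sliding picture can also be checked numerically. This is only a small sketch using Python's standard `statistics` module, with the data points read off the number line above. `pstdev` is the divide-by-n standard deviation, which is an assumption about the definition intended in this thread.

```python
import statistics

data = [6, 12, 15, 18, 24]
shifted = [x + 10 for x in data]  # slide every point 10 units to the right

# The mean moves by 10; the spread about the mean is untouched.
print(statistics.mean(data), statistics.pstdev(data))
print(statistics.mean(shifted), statistics.pstdev(shifted))
```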

Multiplying by a constant means stretching everything out along the line. The further a data point is from zero, the further it shifts.
This multiplies both the mean and the standard deviation by the constant.
Here's the original dataset, but with everything doubled:
$
\picture(600,100){
%% Number line %%
(0,50){\line(600,0)}
(50,40;100,0;6){\line(0,15)}
(c50,30){0}
(c150,30){10}
(c250,30){20}
(c350,30){30}
(c450,30){40}
(c550,30){50}

%% Data points %%
(c170,60){\bullet}
(c290,60){\bullet}
(c350,60){\bullet}
(c410,60){\bullet}
(c530,60){\bullet}

%% Mean %%
(350,45){\line(0,-10)}
(350,45){\line(-3,-5)}
(350,45){\line(3,-5)}
(c350,25){\small{\textit{mean}}}

%% Standard deviation %%
(230,75){\line(240,0)}
(230,75){\line(5,3)}
(230,75){\line(5,-3)}
(470,75){\line(-5,3)}
(470,75){\line(-5,-3)}
(c350,85){\small{\textit{standard deviation}}}
}$

You want to double the standard deviation (you need to stretch everything by a factor of two).
This tells you that the multiplication constant is 2.

Multiplying by two will also double the mean to 30.
You want the mean to be 40 (you need to slide everything along the number line by 10 units.)
This tells you that the addition constant is 10.

Hope this helps!

Note that this approach doesn't immediately give you rpenner's second answer.
To get that, you need to realize that multiplying everything by -2 also stretches everything out, as well as flipping everything around zero on the number line.
The standard deviation is never negative, so multiplying by a negative constant means multiplying the mean by that negative constant, but multiplying the standard deviation by the absolute value.

Multiplying your dataset by -2 doubles the standard deviation (as required), and changes the mean from 15 to -30.
You then need to add 70 to bring the mean back up to 40.
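
Here is the same two-step route written out as a rough Python check, again assuming the divide-by-n standard deviation (`statistics.pstdev`). The variable names are just for illustration.

```python
import statistics

results = [24, 15, 6, 18, 12]

flipped = [-2 * x for x in results]  # stretch by 2 and reflect through zero
final = [y + 70 for y in flipped]    # slide the mean from -30 up to 40

for label, values in [("original", results), ("times -2", flipped), ("plus 70", final)]:
    print(label, values, statistics.mean(values), statistics.pstdev(values))
```

One thing worth noticing: the a = -2 answer reverses the ranking, so the original top score of 24 ends up as 22, the lowest of the new results.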

9. Thank you Pete! It's really that simple? If I wanted to times the s.d. by 3, all I would have to do is triple the results?

10. Yes, that's as hard as it has to get for this problem.

11. In case this interests anyone, it's possible to simplify rpenner's notation just using linearity of the mean $\langle X \rangle$ of a random variable $X$.

So, $\langle a X + b \rangle \,=\, a \langle X \rangle \,+\, b$.

As for the variance,

\begin{align}
\textrm{Var}(aX + b) \,&=\, \langle \bigl( (aX + b) \,-\, \langle aX + b \rangle \bigr)^{2} \rangle \\
\,&=\, \langle ( aX \,-\, \langle aX \rangle )^{2} \rangle \\
\,&=\, a^{2} \langle ( X \,-\, \langle X \rangle )^{2} \rangle \\
\,&=\, a^{2} \textrm{Var}(X) \,.
\end{align}
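
As a sanity check of both identities on the data from this thread, here is a short sketch using exact rational arithmetic, so there is no rounding to worry about. The helper names `mean` and `variance` and the extra test pairs `(3, 0)` and `(1/2, -7)` are my own choices, not part of the problem.

```python
from fractions import Fraction


def mean(values):
    """Arithmetic mean, kept exact by summing Fractions."""
    return sum(values, Fraction(0)) / len(values)


def variance(values):
    """Population variance: the mean squared deviation from the mean."""
    m = mean(values)
    return mean([(v - m) ** 2 for v in values])


xs = [Fraction(v) for v in (24, 15, 6, 18, 12)]

checks = []
for a, b in [(2, 10), (-2, 70), (3, 0), (Fraction(1, 2), -7)]:
    ys = [a * x + b for x in xs]
    mean_ok = mean(ys) == a * mean(xs) + b
    var_ok = variance(ys) == a ** 2 * variance(xs)
    checks.append(mean_ok and var_ok)

print(checks)  # [True, True, True, True]
```

Taking square roots of the variance identity gives the factor $\left| a \right|$ for the standard deviation, which is why both a = 2 and a = -2 double it.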

12. Pete, I don't know if I'm more impressed with the simplicity of your explanation or your use of TeX in a manner new to me. Nice.

13. Thanks RJ. I've been meaning to put something about that in the tex thread.
