<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Basics</title>
<meta name="generator" content="Org mode">
<meta name="author" content="George Kontsevich">
<meta name="description" content="Basics in Stats using Octave/MATLAB"
>
<link rel="stylesheet" type="text/css" href="../web/worg.css" />
<link rel="shortcut icon" href="../web/panda.svg" type="image/x-icon">
</head>
<body>
<div id="content">
<h1 class="title">Basics</h1>
<div id="table-of-contents">
<h2>Table of Contents</h2>
<div id="text-table-of-contents">
<ul>
<li><a href="#org5ac9d02">Intro</a></li>
<li><a href="#orgf6deb6f">Continious random variables</a></li>
<li><a href="#org7adbf62">Probability density functions (PDFs)</a>
<ul>
<li><a href="#org32b487e">Example: Normal distribution</a></li>
<li><a href="#org1ecfa53">Probability densities</a></li>
<li><a href="#org46baf89">Estimating a PDF</a></li>
<li><a href="#orgea681fb">Example: Estimating the Normal</a></li>
</ul>
</li>
<li><a href="#org8ee8f3e">Descriptive statistics</a>
<ul>
<li>
<ul>
<li><a href="#orgdaeaac3">Expected Value/Mean</a></li>
<li><a href="#org69339b3">Variance</a></li>
<li><a href="#orgc4e8c40">Standard deviation</a></li>
</ul>
</li>
<li><a href="#org82c6112">Estimating the Normal's descriptive statistics</a>
<ul>
<li><a href="#org043132c">Mean of the Normal</a></li>
<li><a href="#org4c2d329">Standard deviation of the normal</a></li>
</ul>
</li>
<li><a href="#orgf6a0a23">SigFigs</a></li>
<li><a href="#org8d7ee41">Adding two random variables</a>
<ul>
<li><a href="#org81b1c18">Expected Value</a></li>
<li><a href="#orgaffa2e4">Variance</a></li>
<li><a href="#org211f5f3">Standard Deviation</a></li>
<li><a href="#org0d67699">SigFigs</a></li>
<li><a href="#orga1ba726">Example: Standard deviation of the mean (SDOM) for a normal distribution</a></li>
</ul>
</li>
<li><a href="#org4dc42f6">Multiplying two random variables</a>
<ul>
<li><a href="#org4647717">Expected Value</a></li>
<li><a href="#org2ae08f5">Variance</a></li>
<li><a href="#org0cd8e57">Standard Deviation</a></li>
<li><a href="#org87de359">SigFigs</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#orgf5e3c80">Other Topics</a>
<ul>
<li><a href="#org84d2205">Data Rejection</a>
<ul>
<li><a href="#org1d664dc">Chauvenet's Criterion</a></li>
</ul>
</li>
<li><a href="#org1c6fc4f">Weighted Average</a>
<ul>
<li><a href="#org233602a">Mean</a></li>
<li><a href="#org3752565">Standard Deviation</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#org83847d2">Multiple Variables</a>
<ul>
<li><a href="#org6e57917">Covariance</a>
<ul>
<li><a href="#orgec3a23a">Example</a></li>
<li><a href="#org1222066">Improved!</a></li>
</ul>
</li>
<li><a href="#orgede6e65">Correlation Coefficient</a>
<ul>
<li><a href="#org804f4f1">Example</a></li>
<li><a href="#org1be3ac5">Improved!</a></li>
<li><a href="#org3b47044">Confidence</a></li>
</ul>
</li>
<li><a href="#org3cb1135">Functions of two variables</a>
<ul>
<li><a href="#org6a14d9d">Mean</a></li>
<li><a href="#org82116f4">Variance</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#orgca534bf">Binomial Distribution</a>
<ul>
<li><a href="#orgc1ebdc7">Binomial Coefficient</a></li>
<li><a href="#org2f39609">Coin toss</a></li>
<li><a href="#org15cfecc">Gaussian approximation</a>
<ul>
<li><a href="#org56dadda">Descriptive Statistics</a></li>
<li><a href="#orgfa2c846">Mean</a></li>
<li><a href="#org91dbba5">Standard Deviation</a></li>
</ul>
</li>
<li><a href="#orga8b002a">Random errors</a>
<ul>
<li><a href="#org68a9613">Null hypothesis test</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#org888a630">Poisson Distribution</a>
<ul>
<li><a href="#org4685be1">Mean</a></li>
<li><a href="#org960de03">Standard Deviation</a></li>
<li><a href="#org087bcad">Gaussian</a></li>
<li><a href="#org4942943">Subtracting a background</a></li>
</ul>
</li>
<li><a href="#org7994465">Chi-Squared</a>
<ul>
<li><a href="#org73e4720">Degrees of Freedom</a></li>
<li><a href="#orgaa6e558">Reduced <b>χ<sup>2</sup></b></a></li>
<li><a href="#org8f9334e">General equation</a>
<ul>
<li><a href="#org2f7a2f5">Example</a></li>
</ul>
</li>
</ul>
</li>
</ul>
</div>
</div>
<div id="outline-container-org5ac9d02" class="outline-2">
<h2 id="org5ac9d02">Intro</h2>
<div class="outline-text-2" id="text-org5ac9d02">
<p>
A digest of basic statistics from reading John Taylor's <i>An Introduction to Error Analysis: The Study of Uncertainties in Physical Measurements</i>
</p>
<p>
The book generally works "backwards": starting from useful principles for working with uncertain values and then working through the justifications and proofs. Since this is a quick digest and a personal refresher/reference, the order is the opposite of that presented in the book. Where it makes sense I have inserted some MATLAB/Octave code to demonstrate things
</p>
</div>
</div>
<div id="outline-container-orgf6deb6f" class="outline-2">
<h2 id="orgf6deb6f">Continious random variables</h2>
<div class="outline-text-2" id="text-orgf6deb6f">
<p>
These are variables that, when measured, give a value of infinite precision - so no value will ever be measured twice
</p>
</div>
</div>
<div id="outline-container-org7adbf62" class="outline-2">
<h2 id="org7adbf62">Probability density functions (PDFs)</h2>
<div class="outline-text-2" id="text-org7adbf62">
<p>
Probability density functions (PDFs) are the basic functions that describe a continuous random variable. The basic property of a PDF is that it can be integrated between any two values; the resulting value is the probability that your measurement will fall between those two values. Since each measurement yields <i>some</i> value, the integral from -∞ to ∞ must equal 1 (ie. 100%). This condition places a constraint on which functions are valid PDFs.
</p>
<div class="org-src-container">
<pre class="src src-octave">plot_range <span style="color: #483d8b;">=</span> 5
sigma <span style="color: #483d8b;">=</span> 1
center <span style="color: #483d8b;">=</span> 3
tau <span style="color: #483d8b;">=</span> 3.5
t <span style="color: #483d8b;">=</span> [<span style="color: #483d8b;">-</span>10<span style="color: #483d8b;">*</span>plot_range<span style="color: #483d8b;">:</span>plot_range<span style="color: #483d8b;">/</span>100<span style="color: #483d8b;">:</span>plot_range]
plot(t<span style="color: #483d8b;">,</span>(1<span style="color: #483d8b;">/</span>tau)<span style="color: #483d8b;">*</span>exp(<span style="color: #483d8b;">-</span>t<span style="color: #483d8b;">/</span>tau))
<span style="color: #b22222;">%</span><span style="color: #b22222;">axis "off"</span>
hold on<span style="color: #483d8b;">;</span>
<span style="color: #b22222;">%</span><span style="color: #b22222;">integration_xs = [a:(plot_range/50):b]</span>
<span style="color: #b22222;">%</span><span style="color: #b22222;">area(integration_xs,normal_dist(integration_xs,center,sigma), "facecolor",[0.74 0.9 1.0])</span>
print <span style="color: #8b2252;">"-S720,160"</span> <span style="color: #8b2252;">"exponential.svg"</span>
hold off<span style="color: #483d8b;">;</span>
</pre>
</div>
<div class="figure">
<p><object type="image/svg+xml" data="./exponential.svg" class="org-svg">
Sorry, your browser does not support SVG.</object>
</p>
</div>
</div>
<div id="outline-container-org32b487e" class="outline-3">
<h3 id="org32b487e">Example: Normal distribution</h3>
<div class="outline-text-3" id="text-org32b487e">
<p>
The most common PDF is the <i>normal distribution</i> and it'll serve as a frequent example
</p>
<blockquote>
<p>
[1/(σ√2π)] * e<sup>-([x-X<sub>center</sub>]<sup>2</sup>)/(2σ<sup>2</sup>)</sup>
</p>
</blockquote>
<p>
It has two parameters, <b>σ</b> and <b>X<sub>center</sub></b>. Other functions may have different parameters. As we will see later, in this case these two happen to correspond to the <i>mean</i> and <i>standard deviation</i>, but this won't be the case for all PDFs.
</p>
<div class="org-src-container">
<pre class="src src-octave"><span style="color: #a020f0;">function</span> [points] <span style="color: #483d8b;">=</span> <span style="color: #0000ff;">normal_dist</span>(xs<span style="color: #483d8b;">,</span> center<span style="color: #483d8b;">,</span> sigma)
normalization_factor <span style="color: #483d8b;">=</span> 1<span style="color: #483d8b;">/</span>(sigma<span style="color: #483d8b;">*</span>sqrt(2<span style="color: #483d8b;">*</span>pi))
points <span style="color: #483d8b;">=</span> normalization_factor <span style="color: #483d8b;">*</span> e<span style="color: #483d8b;">.^</span>(<span style="color: #483d8b;">-</span>((xs<span style="color: #483d8b;">-</span>center)<span style="color: #483d8b;">.^</span>2)<span style="color: #483d8b;">/</span>(2<span style="color: #483d8b;">*</span>sigma<span style="color: #483d8b;">^</span>2))
<span style="color: #a020f0;">end</span>
</pre>
</div>
<p>
We can plot it to see how it looks
</p>
<div class="figure">
<p><object type="image/svg+xml" data="./normal.svg" class="org-svg">
Sorry, your browser does not support SVG.</object>
</p>
</div>
</div>
</div>
<div id="outline-container-org1ecfa53" class="outline-3">
<h3 id="org1ecfa53">Probability densities</h3>
<div class="outline-text-3" id="text-org1ecfa53">
<p>
Again, because the random variable is continuous, no value can be measured twice. We can see this in the PDF as well: if we integrate around some point <i>x<sub>0</sub></i>, the integral from <i>x<sub>0</sub></i> to <i>x<sub>0</sub>+δx</i> will go to zero as <i>δx</i> goes to zero. For very small intervals we can approximate the integral with the area of a skinny rectangle of height <i>f(x<sub>0</sub>)</i> and width <i>δx</i>
</p>
<div class="figure">
<p><object type="image/svg+xml" data="./prob-density.svg" class="org-svg">
Sorry, your browser does not support SVG.</object>
</p>
</div>
<p>
The size of the <i>δx</i> isn't really important, but we can see that given a few points <i>x<sub>0</sub></i>, <i>x<sub>1</sub></i>, <i>x<sub>2</sub></i>, etc. we can use their equivalent <b>probability densities</b> <i>f(x<sub>0</sub>)</i>, <i>f(x<sub>1</sub>)</i>, <i>f(x<sub>2</sub>)</i>, etc. to say, in effect, that one value is more likely than another
</p>
</div>
</div>
<div id="outline-container-org46baf89" class="outline-3">
<h3 id="org46baf89">Estimating a PDF</h3>
<div class="outline-text-3" id="text-org46baf89">
<p>
Often in an experimental scenario we will get a series of measurements from which we would like to estimate a PDF. There is typically some meta-information about the process that generated the values which tells you the general formula for the PDF.
</p>
<p>
We can guess parameters for the PDF and then compute the probability that a value lands in the vicinity of each measurement. For each measurement, the probability of landing in its vicinity is the area of its equivalent skinny rectangle: <i>f(x<sub>n</sub>)δx</i>. And the cumulative probability of all the measurements landing where they did is just the product of the individual probabilities.
</p>
<blockquote>
<p>
f(x<sub>0</sub>) δx × f(x<sub>1</sub>) δx × f(x<sub>2</sub>) δx × …
</p>
</blockquote>
<p>
This of course is a tiny value that gets smaller as <i>δx</i> shrinks and as you make more measurements.
</p>
<p>
However the goal now is to adjust the PDF parameters so that this cumulative probability is maximized. The location of the maximum clearly doesn't depend on the value of <i>δx</i> you've chosen, so we can drop those factors
</p>
<blockquote>
<p>
f(x<sub>0</sub>) × f(x<sub>1</sub>) × f(x<sub>2</sub>) × …
</p>
</blockquote>
<p>
As is most often the case, maximization is done by differentiation and setting the result to zero.
</p>
</div>
</div>
<div id="outline-container-orgea681fb" class="outline-3">
<h3 id="orgea681fb">Example: Estimating the Normal</h3>
<div class="outline-text-3" id="text-orgea681fb">
<p>
As an example, we can plug in the previously introduced <i>normal distribution</i> for the PDF <i>f(x)</i> and then assume some measurements <i>x<sub>0</sub></i>, <i>x<sub>1</sub></i>, <i>x<sub>2</sub></i>, ..
We know the PDF equation:
</p>
<blockquote>
<p>
f(x) = [1/(σ√2π)] e<sup>-([x-X<sub>center</sub>]<sup>2</sup>)/(2σ<sup>2</sup>)</sup><br>
</p>
</blockquote>
<p>
and so the cumulative probability will be:
</p>
<blockquote>
<p>
f(x<sub>0</sub>)×f(x<sub>1</sub>)×f(x<sub>2</sub>)× .. <br>
[1/(σ√2π)]<sup>n</sup> e<sup>-∑ ([x<sub>i</sub>-X<sub>center</sub>]<sup>2</sup>)/(2σ<sup>2</sup>)</sup>
</p>
</blockquote>
<p>
<i>X<sub>center</sub></i> and <i>σ</i> are our two unknown parameters. We could now differentiate and solve to find the maximum
</p>
</div>
</div>
</div>
<div id="outline-container-org8ee8f3e" class="outline-2">
<h2 id="org8ee8f3e">Descriptive statistics</h2>
<div class="outline-text-2" id="text-org8ee8f3e">
<p>
A rather unusual feature of continuous random variables is that we can often skip their probability distributions entirely and summarize them with just the <i>mean</i> and <i>variance</i>
</p>
</div>
<div id="outline-container-orgdaeaac3" class="outline-4">
<h4 id="orgdaeaac3">Expected Value/Mean</h4>
<div class="outline-text-4" id="text-orgdaeaac3">
<p>
The expected value of a random variable is the "average" of the probability distribution. Usually written as <b>E[x]</b>. For continuous functions this translates to the integral
</p>
<blockquote>
<p>
E[x] = ∫ u×f(u) du <i>(from -∞ to ∞)</i>
</p>
</blockquote>
<p>
(the notation is a bit confusing b/c <b>x</b> is a random variable and <b>u</b> is a possible value for <b>x</b> and <b>f(u)</b> is a probability density for that value)
</p>
</div>
</div>
<div id="outline-container-org69339b3" class="outline-4">
<h4 id="org69339b3">Variance</h4>
<div class="outline-text-4" id="text-org69339b3">
<p>
The variance is a value to describe how much the random variable differs from its <i>expected value</i>
</p>
<blockquote>
<p>
Var[x]= E[(E[x]-x)<sup>2</sup>]
</p>
</blockquote>
<p>
The power-of-2 ensures that you are integrating over positive values. If the expected value of the random variable is <b>0</b> then the variance is just <b>E[x<sup>2</sup>]</b> or <i>∫ u<sup>2</sup>×f(u) du</i>
</p>
<dl class="org-dl">
<dt><b>TODO</b></dt><dd>Why not the absolute value? It's not just a mathematical convenience… It has some numerical advantage (I just forget what it is)</dd>
</dl>
</div>
</div>
<div id="outline-container-orgc4e8c40" class="outline-4">
<h4 id="orgc4e8c40">Standard deviation</h4>
<div class="outline-text-4" id="text-orgc4e8c40">
<p>
The <i>standard deviation</i> is just the square root of the variance and denoted with <i>σ</i>:
</p>
<blockquote>
<p>
Var[x] = σ<sub>x</sub><sup>2</sup>
</p>
</blockquote>
</div>
</div>
<div id="outline-container-org82c6112" class="outline-3">
<h3 id="org82c6112">Estimating the Normal's descriptive statistics</h3>
<div class="outline-text-3" id="text-org82c6112">
<p>
The normal distribution is a bit unusual in that its mean and standard deviation are exactly the parameters of the PDF. So when we differentiate and solve for the PDF that best fits our measurements, we automatically end up with the best estimates for the mean and variance
</p>
</div>
<div id="outline-container-org043132c" class="outline-4">
<h4 id="org043132c">Mean of the Normal</h4>
<div class="outline-text-4" id="text-org043132c">
<p>
Finishing what we started before, we want to pick an <b>X<sub>center</sub></b> that maximizes the cumulative probability of our measurements, which was
</p>
<blockquote>
<p>
[1/(σ√2π)]<sup>n</sup> e<sup>-∑ ([x<sub>i</sub>-X<sub>center</sub>]<sup>2</sup>)/(2σ<sup>2</sup>)</sup>
</p>
</blockquote>
<p>
Fortunately, even though the cumulative probability equation is a bit complicated, we only need to look at the exponent b/c everything else is not a function of <b>X<sub>center</sub></b>. And since the exponent carries a minus sign, maximizing the whole expression means <i>minimizing</i> the sum
</p>
<blockquote>
<p>
∑ (x<sub>i</sub>-X<sub>center</sub>)<sup>2</sup>/(2σ<sup>2</sup>)
</p>
</blockquote>
<p>
To find this minimum we differentiate and set the result equal to zero:
</p>
<blockquote>
<p>
∑ (x<sub>i</sub>-X<sub>center</sub>) = 0
</p>
</blockquote>
<p>
Which gives us:
</p>
<blockquote>
<p>
X<sub>center</sub> = ∑x<sub>i</sub>/N
</p>
</blockquote>
</div>
</div>
<div id="outline-container-org4c2d329" class="outline-4">
<h4 id="org4c2d329">Standard deviation of the normal</h4>
<div class="outline-text-4" id="text-org4c2d329">
<p>
To estimate an optimal standard deviation <b>σ</b> we follow the same procedure as with the mean <b>X<sub>center</sub></b>: we differentiate with respect to <b>σ</b> and then solve for when the derivative is equal to zero. Unlike before, we can't just look at the exponent b/c in this case there is a <b>σ</b> in the factor in front as well as in the exponent, so we need to use the product rule when differentiating
</p>
<blockquote>
<p>
0 = d/dσ [1/(σ√2π)]<sup>n</sup> e<sup>-∑ ([x<sub>i</sub>-X<sub>center</sub>]<sup>2</sup>)/(2σ<sup>2</sup>)</sup><br>
σ<sub>best-estimate</sub> = √ (1/N) ∑(x<sub>i</sub>-X<sub>center</sub>)<sup>2</sup>
</p>
</blockquote>
<p>
The next issue we see is that the solution for <b>σ</b> is a function of <b>X<sub>center</sub></b>, which we also don't know. The best we can do here is to substitute in the best estimate we just found in the previous section: <b>∑x<sub>i</sub>/N</b>
</p>
<p>
<b>This unfortunately introduces a <i>bias</i>!</b> <br>
</p>
<p>
The estimated mean is positioned so that the measurements sit "optimally" around it. The true mean, whatever it is, is <i>guaranteed</i> to fit the measurements no better, so in actuality the measurements deviate more from the true mean than from our estimate of it. The true <b>X<sub>center</sub></b> of the underlying random variable will not be perfectly centered in the data, and the true standard deviation will always be a bit higher. The corrected estimate for <b>σ</b> is:
</p>
<blockquote>
<p>
σ<sub>best-estimate</sub> = √ (1/[N-1]) ∑(x<sub>i</sub>-X<sub>center</sub>)<sup>2</sup>
</p>
</blockquote>
<p>
Proof of this is omitted.. but do note that as N gets larger this bias becomes insignificant
</p>
</div>
</div>
</div>
<div id="outline-container-orgf6a0a23" class="outline-3">
<h3 id="orgf6a0a23">SigFigs</h3>
<div class="outline-text-3" id="text-orgf6a0a23">
<p>
We can then use these two descriptive statistics to write out any random variable (normally distributed or otherwise) as a <i>value</i> + <i>standard deviation</i> pair. This is often what you will see on an instrument or quoted in a scientific publication. The underlying probability distribution will often either be inferred or, as we'll soon see, be irrelevant
</p>
<p>
Uncertainties/standard deviations should typically be quoted to one significant digit, and the central value should be rounded to the same decimal place as that digit.
</p>
<blockquote>
<p>
2.45 ± 0.04
</p>
</blockquote>
<p>
When we later see how to estimate these values it'll become clear that it doesn't make much sense to have more digits of precision in your uncertainty than in your expected value. In some cases you may want two sigfigs in the uncertainty - if, for instance, its leading digit is small
</p>
<blockquote>
<p>
1.3 ± 0.4<br>
1.3 ± 0.37
</p>
</blockquote>
<p>
You can also express the descriptive statistics as a percentage. <br>
Starting with:
</p>
<blockquote>
<p>
x<sub>0</sub> ± δx<sub>0</sub>
</p>
</blockquote>
<p>
You could also rewrite it as:
</p>
<blockquote>
<p>
x<sub>0</sub> with a [100*δx<sub>0</sub>/x<sub>0</sub>] percent uncertainty<br>
or<br>
x<sub>0</sub> ± xx%
</p>
</blockquote>
<div class="org-src-container">
<pre class="src src-octave"><span style="color: #a020f0;">function</span> [value<span style="color: #483d8b;">,</span> fractional] <span style="color: #483d8b;">=</span> <span style="color: #0000ff;">uncertainty2fractional</span>(x<span style="color: #483d8b;">,</span> dx)
value <span style="color: #483d8b;">=</span> x
fractional <span style="color: #483d8b;">=</span> (dx<span style="color: #483d8b;">/</span>x)
<span style="color: #a020f0;">end</span>
</pre>
</div>
<div class="org-src-container">
<pre class="src src-octave">[x fx] <span style="color: #483d8b;">=</span> uncertainty2fractional(50<span style="color: #483d8b;">,</span>25)
ans <span style="color: #483d8b;">=</span> [x fx]
</pre>
</div>
<div class="org-src-container">
<pre class="src src-org"><span style="color: #0000ff;">| 50 | 0.5 |</span>
</pre>
</div>
<div class="org-src-container">
<pre class="src src-octave"><span style="color: #a020f0;">function</span> [value<span style="color: #483d8b;">,</span> uncertainty] <span style="color: #483d8b;">=</span> <span style="color: #0000ff;">fractional2uncertainty</span>(x<span style="color: #483d8b;">,</span> fx)
value <span style="color: #483d8b;">=</span> x
uncertainty <span style="color: #483d8b;">=</span> x<span style="color: #483d8b;">*</span>fx
<span style="color: #a020f0;">end</span>
</pre>
</div>
<div class="org-src-container">
<pre class="src src-octave">[x dx] <span style="color: #483d8b;">=</span> fractional2uncertainty(50<span style="color: #483d8b;">,</span>0.5)
ans <span style="color: #483d8b;">=</span> [x dx]
</pre>
</div>
<div class="org-src-container">
<pre class="src src-org"><span style="color: #0000ff;">| 50 | 25 |</span>
</pre>
</div>
</div>
</div>
<div id="outline-container-org8d7ee41" class="outline-3">
<h3 id="org8d7ee41">Adding two random variables</h3>
<div class="outline-text-3" id="text-org8d7ee41">
<p>
Here we begin to see the utility of these <i>descriptive statistics</i>. They don't just serve as a quick summary of the random variable, but are something that can be manipulated directly!
</p>
</div>
<div id="outline-container-org81b1c18" class="outline-4">
<h4 id="org81b1c18">Expected Value</h4>
<div class="outline-text-4" id="text-org81b1c18">
<p>
If we add two random variables <b>z=x+y</b> then, regardless of their underlying probability distributions, their expected values will add directly.
</p>
<blockquote>
<p>
<b>E[z] = E[x+y] = E[x] + E[y]</b>
</p>
</blockquote>
<p>
This is a consequence of integrals of sums <i>∫( f(u)×u+g(u)×u ) = ∫f(u)×u+∫g(u)×u</i> - and is independent of the distributions <i>f()</i> and <i>g()</i> (<i>they don't need to be normal</i>!)
</p>
</div>
</div>
<div id="outline-container-orgaffa2e4" class="outline-4">
<h4 id="orgaffa2e4">Variance</h4>
<div class="outline-text-4" id="text-orgaffa2e4">
<p>
The variance will also add directly, irrespective of distribution, as long as <i>x</i> and <i>y</i> are not correlated. To see this, imagine for simplicity that you have two random variables <b>x</b> and <b>y</b> that each have an expected value of <b>0</b>. We will say the random variable <b>z</b> is their sum <b>z=x+y</b>
</p>
<blockquote>
<p>
E[x] = 0 <br>
Var[x] = E[(0 - x)<sup>2</sup>] = E[x<sup>2</sup>] <br>
E[y] = 0 <br>
Var[y] = E[(0 - y)<sup>2</sup>] = E[y<sup>2</sup>] <br>
E[z] = E[x] + E[y] = 0 <br>
Var[z] = E[(E[z]-z)<sup>2</sup>] = E[(0-(x+y))<sup>2</sup>] = E[(x+y)<sup>2</sup>] = E[x<sup>2</sup>+y<sup>2</sup>+2xy] = E[x<sup>2</sup>] + E[y<sup>2</sup>] + 2E[xy]
</p>
</blockquote>
<p>
As long as <i>x</i> and <i>y</i> are uncorrelated, we can split the last term <b>2E[xy]</b> into <b>2E[x]×E[y]</b> to see that it is equal to <b>0</b> (you can again go back to the integral definition here). So we are left with the fact that the variance of the sum is the sum of the variances
</p>
<blockquote>
<p>
Var[z]= Var[x] + Var[y]
</p>
</blockquote>
</div>
</div>
<div id="outline-container-org211f5f3" class="outline-4">
<h4 id="org211f5f3">Standard Deviation</h4>
<div class="outline-text-4" id="text-org211f5f3">
<p>
This makes the standard deviation the <b>quadrature sum</b> of the two random variables
</p>
<blockquote>
<p>
σ<sub>z</sub> = √Var[z] <br>
= √[E[x<sup>2</sup>] + E[y<sup>2</sup>]] <br>
= √(σ<sub>x</sub><sup>2</sup> + σ<sub>y</sub><sup>2</sup>)
</p>
</blockquote>
</div>
</div>
<div id="outline-container-org0d67699" class="outline-4">
<h4 id="org0d67699">SigFigs</h4>
<div class="outline-text-4" id="text-org0d67699">
<p>
Combining the two, we can now write out the sum of two random variables as:
</p>
<blockquote>
<p>
(x±δx) + (y±δy) = q±δq<br>
q = x+y<br>
δq = √{δx<sup>2</sup>+δy<sup>2</sup>}
</p>
</blockquote>
<div class="org-src-container">
<pre class="src src-octave"><span style="color: #a020f0;">function</span> [z<span style="color: #483d8b;">,</span> dz] <span style="color: #483d8b;">=</span> <span style="color: #0000ff;">uncertainSum</span>(x<span style="color: #483d8b;">,</span> dx<span style="color: #483d8b;">,</span> y<span style="color: #483d8b;">,</span> dy)
z <span style="color: #483d8b;">=</span> x <span style="color: #483d8b;">+</span> y
dz <span style="color: #483d8b;">=</span> sqrt(dx<span style="color: #483d8b;">^</span>2<span style="color: #483d8b;">+</span>dy<span style="color: #483d8b;">^</span>2)
<span style="color: #a020f0;">end</span>
</pre>
</div>
<div class="org-src-container">
<pre class="src src-octave">[z dz] <span style="color: #483d8b;">=</span> uncertainSum(10<span style="color: #483d8b;">,</span> 1<span style="color: #483d8b;">,</span> 20<span style="color: #483d8b;">,</span> 3)
ans <span style="color: #483d8b;">=</span> [z dz]
</pre>
</div>
<div class="org-src-container">
<pre class="src src-org"><span style="color: #0000ff;">| 30 | 3.16227766016838 |</span>
</pre>
</div>
</div>
</div>
<div id="outline-container-orga1ba726" class="outline-4">
<h4 id="orga1ba726">Example: Standard deviation of the mean (SDOM) for a normal distribution</h4>
<div class="outline-text-4" id="text-orga1ba726">
<p>
Using these new distribution-independent properties we can now calculate the accuracy of our previous estimate for the mean of a normally distributed random variable. Remember that the estimate turned out to just be <i>X<sub>center</sub> = ∑x<sub>i</sub>/N</i> - ie. add up the measurements and then divide by the total number of them. We can write out the sum explicitly:
</p>
<blockquote>
<p>
X<sub>center</sub> = ∑x<sub>i</sub>/N <br>
= x<sub>0</sub>/N + x<sub>1</sub>/N + x<sub>2</sub>/N + …
</p>
</blockquote>
<p>
We know each measurement <i>x<sub>0|1|..</sub></i> came from a random variable of mean <b>X<sub>center</sub></b> and standard deviation <b>σ<sub>x</sub></b> so we can write this sum using descriptive statistics:
</p>
<blockquote>
<p>
= (X<sub>center</sub>±σ<sub>X</sub>)/N + (X<sub>center</sub>±σ<sub>X</sub>)/N + (X<sub>center</sub>±σ<sub>X</sub>)/N + … <br>
= (X<sub>center</sub>/N±σ<sub>X</sub>/N) + (X<sub>center</sub>/N±σ<sub>X</sub>/N) + (X<sub>center</sub>/N±σ<sub>X</sub>/N) + …
</p>
</blockquote>
<p>
So this is simply a sum of random variables and we know that the variance of the sum of random variables is always a quadrature sum:
</p>
<blockquote>
<p>
σ<sub>X<sub>center</sub></sub> = √[ (σ<sub>x</sub>/N)<sup>2</sup> + (σ<sub>x</sub>/N)<sup>2</sup> + (σ<sub>x</sub>/N)<sup>2</sup> + … ] <br>
= √[N × (σ<sub>x</sub>/N)<sup>2</sup> ] <br>
= √[σ<sub>x</sub><sup>2</sup>/N ] <br>
= σ<sub>x</sub>/√N
</p>
</blockquote>
<p>
In some situations this can provide a useful error - notably when what you are measuring has very little variation itself, but each measurement carries a relatively large error. For instance if you're measuring a resistor: the resistor has a constant value and doesn't fluctuate (ie. it basically has no variance itself), but your voltmeter/power source will have some errors that affect each measurement you make
</p>
<blockquote>
<p>
<b>Side note..</b> : the fractional uncertainty in <b>σ<sub>x</sub></b> is: <b>1/√[2(N-1)]</b> <br>
<i>Proof omitted for now..</i>
</p>
</blockquote>
</div>
</div>
</div>
<div id="outline-container-org4dc42f6" class="outline-3">
<h3 id="org4dc42f6">Multiplying two random variables</h3>
<div class="outline-text-3" id="text-org4dc42f6">
<p>
Again, we work directly with the descriptive statistics without even looking at the variables' PDFs. Throughout this section we assume <b>x</b> and <b>y</b> are independent
</p>
</div>
<div id="outline-container-org4647717" class="outline-4">
<h4 id="org4647717">Expected Value</h4>
<div class="outline-text-4" id="text-org4647717">
<blockquote>
<p>
E[z] = E[x] × E[y]
</p>
</blockquote>
</div>
</div>
<div id="outline-container-org2ae08f5" class="outline-4">
<h4 id="org2ae08f5">Variance</h4>
<div class="outline-text-4" id="text-org2ae08f5">
<p>
This proof is quite tricky..<br>
We start by directly applying the equation for the variance
</p>
<blockquote>
<p>
Var(xy) = E[(xy-XY)<sup>2</sup>] …
</p>
</blockquote>
<p>
<b>X</b> and <b>Y</b> are shorthand for <b>E[x]</b> and <b>E[y]</b>
</p>
<p>
First we focus on massaging the inner term <b>xy-XY</b>
</p>
<blockquote>
<p>
xy-XY <br>
XY × [x/X × y/Y -1] <br>
XY × [ ([x-X]/X + 1) × ([y-Y]/Y + 1) - 1 ] <br>
XY × [ [x-X]/X + [y-Y]/Y + [x-X]×[y-Y]/XY ]
</p>
</blockquote>
<p>
if <b>δx = [x-X]/X</b> and <b>δy = [y - Y]/Y</b>, then we can rewrite this as
</p>
<blockquote>
<p>
XY × [δx+δy+ δxδy]
</p>
</blockquote>
<p>
So now we can rewrite the original equation
</p>
<blockquote>
<p>
Var(xy) = E[(xy-XY)<sup>2</sup>] <br>
= (XY)<sup>2</sup> × E[(δx + δy + δxδy)<sup>2</sup>] <br>
= (XY)<sup>2</sup> × E[δx<sup>2</sup> + δxδy + δx<sup>2</sup>δy + δyδx + δy<sup>2</sup> + δxδy<sup>2</sup> + δx<sup>2</sup>δy + δxδy<sup>2</sup> + δx<sup>2</sup>δy<sup>2</sup>] <br>
= (XY)<sup>2</sup> × (E[δx<sup>2</sup>] + E[δxδy] + E[δx<sup>2</sup>δy] + E[δyδx] + E[δy<sup>2</sup>] + E[δxδy<sup>2</sup>] + E[δx<sup>2</sup>δy] + E[δxδy<sup>2</sup>] + E[δx<sup>2</sup>δy<sup>2</sup>]) <br>
</p>
</blockquote>
<p>
Not mentioned explicitly till now, but when we want to find the <i>expected value</i> of an expression of multiple random variables we need to integrate across all of the variables. So in this case we want to integrate across <b>x</b> and <b>y</b> from -∞ to ∞. As premised at the beginning, <b>x</b> and <b>y</b> are independent, so when we write out these expected values we can rearrange them, as each integral is independent (the other term is treated as a constant in each integral).
</p>
<blockquote>
<p>
E[x<sup>n</sup>y<sup>m</sup>] <br>
= ∫<sub>x</sub> ∫<sub>y</sub> x<sup>n</sup>y<sup>m</sup><br>
= ∫<sub>x</sub> x<sup>n</sup>∫<sub>y</sub> y<sup>m</sup>
</p>
</blockquote>
<p>
The final trick is to notice that <b>δx</b> and <b>δy</b> have expected values of <i>zero</i> - so if either <b>m</b> or <b>n</b> is equal to <b>1</b> then its integral goes to zero and <b>E[x<sup>n</sup>y<sup>m</sup>]</b> comes out to <b>0</b>. The only nonzero terms that remain are
</p>
<blockquote>
<p>
= (XY)<sup>2</sup> × (E[δx<sup>2</sup>] + E[δy<sup>2</sup>] + E[(δxδy)<sup>2</sup>]) <br>
</p>
</blockquote>
<blockquote>
<p>
= (XY)<sup>2</sup> × (V[x]/X<sup>2</sup> + V[y]/Y<sup>2</sup> + V[x]V[y]/(XY)<sup>2</sup>) <br>
= Y<sup>2</sup>V[x] + X<sup>2</sup>V[y] + V[x]V[y] <br>
</p>
</blockquote>
</div>
</div>
<div id="outline-container-org0cd8e57" class="outline-4">
<h4 id="org0cd8e57">Standard Deviation</h4>
<div class="outline-text-4" id="text-org0cd8e57">
<p>
The standard deviation is the square root of the variance..
</p>
<blockquote>
<p>
σ<sub>xy</sub> ≅ √ [Y<sup>2</sup>V[x] + X<sup>2</sup>V[y] + V[x]V[y]]
</p>
</blockquote>
</div>
</div>
<div id="outline-container-org87de359" class="outline-4">
<h4 id="org87de359">SigFigs</h4>
<div class="outline-text-4" id="text-org87de359">
<p>
Again, summarizing in our standard notation we get
</p>
<blockquote>
<p>
(x±δx) × (y±δy) = q±δq<br>
q = x×y<br>
δq = √[x<sup>2</sup>δy<sup>2</sup> + y<sup>2</sup>δx<sup>2</sup> + δx<sup>2</sup>δy<sup>2</sup>]
</p>
</blockquote>
<div class="org-src-container">
<pre class="src src-octave"><span style="color: #a020f0;">function</span> [z<span style="color: #483d8b;">,</span> dz] <span style="color: #483d8b;">=</span> <span style="color: #0000ff;">uncertainProduct</span>(x<span style="color: #483d8b;">,</span> dx<span style="color: #483d8b;">,</span> y<span style="color: #483d8b;">,</span> dy)
z <span style="color: #483d8b;">=</span> x<span style="color: #483d8b;">*</span>y
dz <span style="color: #483d8b;">=</span> sqrt(x<span style="color: #483d8b;">^</span>2<span style="color: #483d8b;">*</span>dy<span style="color: #483d8b;">^</span>2 <span style="color: #483d8b;">+</span> y<span style="color: #483d8b;">^</span>2<span style="color: #483d8b;">*</span>dx<span style="color: #483d8b;">^</span>2 <span style="color: #483d8b;">+</span> dx<span style="color: #483d8b;">^</span>2<span style="color: #483d8b;">*</span>dy<span style="color: #483d8b;">^</span>2)
<span style="color: #a020f0;">end</span>
</pre>
</div>
</div>
</div>
</div>
</div>
<div id="outline-container-orgf5e3c80" class="outline-2">
<h2 id="orgf5e3c80">Other Topics</h2>
<div class="outline-text-2" id="text-orgf5e3c80">
</div>
<div id="outline-container-org84d2205" class="outline-3">
<h3 id="org84d2205">Data Rejection</h3>
<div class="outline-text-3" id="text-org84d2205">
<p>
Sometimes things go wrong and data is just way out there
</p>
</div>
<div id="outline-container-org1d664dc" class="outline-4">
<h4 id="org1d664dc">Chauvenet's Criterion</h4>
<div class="outline-text-4" id="text-org1d664dc">
<p>
This is a scheme for rejecting data - but it only makes sense if your data is normally distributed.
You first find how many standard deviations the suspect measurement is away from your estimated mean
</p>
<blockquote>
<p>
t<sub>sus</sub> = |x<sub>sus</sub> - X<sub>center</sub>|/σ<sub>x</sub>
</p>
</blockquote>
<p>
Then you look up in a table the probability that a measurement lands that far out. This probability represents the fraction of points expected to lie at the point in question or further out. Multiply it by the number of measurements you've made, <b>N</b>, and you get the number of points you'd expect to lie that far out or further after <b>N</b> measurements.
</p>
<blockquote>
<p>
N × Prob(outside t<sub>sus</sub>σ)
</p>
</blockquote>
<p>
If the value is smaller than <b>1/2</b> then we expect less than half a point out as far as the point in question - so you can reject the point and recalculate your mean/sigma.
</p>
</div>
</div>
</div>
<div id="outline-container-org1c6fc4f" class="outline-3">
<h3 id="org1c6fc4f">Weighted Average</h3>
<div class="outline-text-3" id="text-org1c6fc4f">
<p>
We often want to average different measurements into one. It's clear a measurement with a larger error should have a smaller effect on the outcome than a measurement with a smaller error.
</p>
</div>
<div id="outline-container-org233602a" class="outline-4">
<h4 id="org233602a">Mean</h4>
<div class="outline-text-4" id="text-org233602a">
<p>
To get the combined mean we proceed in a direct manner. We multiply the two PDFs and maximize the result
</p>
<blockquote>
<p>
pdf(x<sub>a</sub>) = (1/(σ<sub>a</sub>√2π)) e<sup>-(x<sub>a</sub>-X<sub>center</sub>)<sup>2</sup>/(2σ<sub>a</sub><sup>2</sup>)</sup><br>
pdf(x<sub>b</sub>) = (1/(σ<sub>b</sub>√2π)) e<sup>-(x<sub>b</sub>-X<sub>center</sub>)<sup>2</sup>/(2σ<sub>b</sub><sup>2</sup>)</sup><br>
</p>
</blockquote>
<p>
Here <b>x<sub>a</sub></b> and <b>x<sub>b</sub></b> are the two prior measurements/estimates we have, and the <b>X<sub>center</sub></b> will be our combined average mean. We need to choose an <b>X<sub>center</sub></b> that will maximize the probability of both functions. We combine probabilities by multiplication
</p>
<blockquote>
<p>
pdf(x<sub>a</sub>) × pdf(x<sub>b</sub>) <br>
(1/(σ<sub>a</sub>√2π)) e<sup>-(x<sub>a</sub>-X<sub>center</sub>)<sup>2</sup>/(2σ<sub>a</sub><sup>2</sup>)</sup> × (1/(σ<sub>b</sub>√2π)) e<sup>-(x<sub>b</sub>-X<sub>center</sub>)<sup>2</sup>/(2σ<sub>b</sub><sup>2</sup>)</sup> <br>
(1/(σ<sub>a</sub>σ<sub>b</sub>2π)) e<sup>-[(x<sub>a</sub>-X<sub>center</sub>)<sup>2</sup>/(2σ<sub>a</sub><sup>2</sup>) + (x<sub>b</sub>-X<sub>center</sub>)<sup>2</sup>/(2σ<sub>b</sub><sup>2</sup>)]</sup> <br>
</p>
</blockquote>
<p>
Again here we just care about the exponent (the only part that's a function of <b>X<sub>center</sub></b>), and maximizing the product means minimizing the sum in the exponent. We take its derivative and set it to zero
</p>
<blockquote>
<p>
0 = d/dX (x<sub>a</sub>-X<sub>center</sub>)<sup>2</sup>/(2σ<sub>a</sub><sup>2</sup>) + (x<sub>b</sub>-X<sub>center</sub>)<sup>2</sup>/(2σ<sub>b</sub><sup>2</sup>)<br>
0 = (x<sub>a</sub>-X<sub>center</sub>)/σ<sub>a</sub><sup>2</sup> + (x<sub>b</sub>-X<sub>center</sub>)/σ<sub>b</sub><sup>2</sup> <br>
X<sub>center</sub> = [x<sub>a</sub>/σ<sub>a</sub><sup>2</sup> + x<sub>b</sub>/σ<sub>b</sub><sup>2</sup>] / [1/σ<sub>a</sub><sup>2</sup> + 1/σ<sub>b</sub><sup>2</sup>]
</p>
</blockquote>
<p>
You can also rewrite this by introducing a new variable called a <b>weight</b>
</p>
<blockquote>
<p>
w = 1/σ<sup>2</sup><br>
X<sub>center</sub> = [w<sub>a</sub>x<sub>a</sub> + w<sub>b</sub>x<sub>b</sub>] / [w<sub>a</sub> + w<sub>b</sub>]
</p>
</blockquote>
</div>
</div>
<div id="outline-container-org3752565" class="outline-4">
<h4 id="org3752565">Standard Deviation</h4>
<div class="outline-text-4" id="text-org3752565">
<p>
The standard deviation we derive from the equation for the mean by applying error propagation.
The simplifying trick is to first look at the variance of the sum: <b>Var[z] = Var[x] + Var[y]</b>.
If you immediately try to calculate the standard deviation and use the quadrature rule then things get overly complicated
(as a general rule, it's easier to work with variances b/c you avoid the quadrature mathematical mess)
</p>
<blockquote>
<p>
X<sub>center</sub> = [x<sub>a</sub>/σ<sub>a</sub><sup>2</sup> + x<sub>b</sub>/σ<sub>b</sub><sup>2</sup>] / [1/σ<sub>a</sub><sup>2</sup> + 1/σ<sub>b</sub><sup>2</sup>] <br>
Var[center] = [Var[x<sub>a</sub>]/σ<sub>a</sub><sup>4</sup> + Var[x<sub>b</sub>]/σ<sub>b</sub><sup>4</sup>] / [1/σ<sub>a</sub><sup>2</sup> + 1/σ<sub>b</sub><sup>2</sup>]<sup>2</sup>
</p>
</blockquote>
<p>
Since <b>Var[x<sub>a</sub>] = σ<sub>a</sub><sup>2</sup></b>, each term on top reduces to <b>1/σ<sup>2</sup></b>, so the numerator cancels one factor of the denominator and you're left with
</p>
<blockquote>
<p>
Var[center] = 1 / [1/σ<sub>a</sub><sup>2</sup> + 1/σ<sub>b</sub><sup>2</sup>] <br>
σ<sub>center</sub> = 1 / √[1/σ<sub>a</sub><sup>2</sup> + 1/σ<sub>b</sub><sup>2</sup>]
</p>
</blockquote>
</div>
</div>
</div>
</div>
<div id="outline-container-org83847d2" class="outline-2">
<h2 id="org83847d2">Multiple Variables</h2>
<div class="outline-text-2" id="text-org83847d2">
</div>
<div id="outline-container-org6e57917" class="outline-3">
<h3 id="org6e57917">Covariance</h3>
<div class="outline-text-3" id="text-org6e57917">
<p>
If we have two variables changing simultaneously we can quantify how much they change "together" with a quantity called the <b>covariance</b>
</p>
<blockquote>
<p>
σ<sub>xy</sub> = 1/N ∑ (x<sub>i</sub> - x<sub>mean</sub>)(y<sub>i</sub> - y<sub>mean</sub>)
</p>
</blockquote>
<p>
If the variables are independent then the <b>covariance</b> will be around zero b/c both terms in the multiplication, <b>(x<sub>i</sub> - x<sub>mean</sub>)</b> and <b>(y<sub>i</sub> - y<sub>mean</sub>)</b>, are bouncing around zero randomly. If the variables are not independent - so that when one strays from its mean the other does as well - then this term will be nonzero
</p>
<dl class="org-dl">
<dt><b>Schwarz Inequality</b></dt><dd>If the two variables perfectly correlate (one goes "high" when the other goes "high") then the maximal value attainable will be <b>σ<sub>x</sub>σ<sub>y</sub></b>. This provides a useful limit to the covariance.</dd>
</dl>
<blockquote>
<p>
σ<sub>xy</sub> ≤ σ<sub>x</sub>σ<sub>y</sub>
</p>
</blockquote>
<p>
While notationally covariances look like standard deviations (ex: <b>σ<sub>x</sub></b>), dimensionally they're equivalent to variances (ie. dimension of <b>x</b> times dimension of <b>y</b>) - hence the name. This is because variances are typically written as <b>σ<sub>x</sub><sup>2</sup></b> - which I've been writing so far as <b>Var[x]</b> to differentiate it and for clarity.
</p>
<div class="org-src-container">
<pre class="src src-octave"><span style="color: #a020f0;">function</span> result <span style="color: #483d8b;">=</span> <span style="color: #0000ff;">covariance</span> (x<span style="color: #483d8b;">,</span> y)
xMean <span style="color: #483d8b;">=</span> mean(x)
yMean <span style="color: #483d8b;">=</span> mean(y)
result <span style="color: #483d8b;">=</span> sum(((x <span style="color: #483d8b;">-</span> xMean) <span style="color: #483d8b;">.*</span> (y <span style="color: #483d8b;">-</span> yMean)))<span style="color: #483d8b;">/</span>length(x)
<span style="color: #a020f0;">end</span>
</pre>
</div>
</div>
<div id="outline-container-orgec3a23a" class="outline-4">
<h4 id="orgec3a23a">Example</h4>
<div class="outline-text-4" id="text-orgec3a23a">
<p>
Problem 9.1
</p>
<div class="org-src-container">
<pre class="src src-octave">x <span style="color: #483d8b;">=</span> [ 20 23 23 22 ]
y <span style="color: #483d8b;">=</span> [ 30 32 35 31 ]
covariance(x<span style="color: #483d8b;">,</span>y)
</pre>
</div>
<div class="org-src-container">
<pre class="src src-org">1.75
</pre>
</div>
<p>
Problem 9.2
</p>
<div class="org-src-container">
<pre class="src src-octave">t <span style="color: #483d8b;">=</span> [ 14 12 13 15 16]