*Submitted by Marcos Woehrmann, Dimitris Skourtis and Jonathan Magasin
20 January 2009*

Our points are in R^{d}. Suppose we have d of them,
**x**_{1} through **x**_{d}. Let each of these
be a row in the d×d matrix **M**. Then for any vector **w**
in R^{d} we can form the matrix product
**M****w**^{T} = **c**, where **c** is a
d×1 vector, as follows:

We can pick **w** so that:

c_{i} > 0 if **x**_{i} is +
c_{i} < 0 if **x**_{i} is -

What if we add one point? Then **M'** is (d+1)×d and
**c'** has d+1 components:

**M'****w**^{T} = **c'**

We know that our original d points are linearly independent
(because we chose them so that the d×d matrix **M** is
invertible). It follows that the new point is a linear
combination of the original d points (theorem mentioned in the
problem). This is the key: Because the new point is a linear
combination of the other points, with elementary row operations we
can zero out its row in **M'** -- just subtract the
appropriately scaled other rows from the new row -- and we apply
the same operations to the **c'** column vector. If we now
examine the equation for the new point it is:

**x**_{d+1}•**w**^{T} = 0(w_{1}) + 0(w_{2}) + ... + 0(w_{d}) = c^{*}_{d+1}

where c^{*}_{d+1} is the last component of **c'** after the row operations. So c^{*}_{d+1} must be 0 no matter which **w** we pick; equivalently, c_{d+1} is completely determined by the other components of **c**, and we cannot choose its sign freely.

*We used several web sites to brush up on our linear algebra for
this problem [2].*
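To make the argument concrete, here is a small numerical sketch (assuming numpy; the points and labels are our own made-up example): two independent points in R^2 can be given any signs by solving for **w**, but a third point that is a linear combination of them has its dot product with **w** forced.

```python
import numpy as np

# Two linearly independent points in R^2 (rows of M).
M = np.array([[1.0, 2.0],
              [3.0, 1.0]])

# Desired signs: x_1 is "+", x_2 is "-"; pick any c with those signs.
c = np.array([1.0, -1.0])

# Because M is invertible we can always solve M w = c exactly.
w = np.linalg.solve(M, c)

# A third point that is a linear combination of the first two:
# x_3 = 2*x_1 - 1*x_2.
x3 = 2 * M[0] - 1 * M[1]

# Its dot product with w is forced to 2*c_1 - 1*c_2; we cannot
# choose its sign independently of the first two labels.
forced = 2 * c[0] - 1 * c[1]
print(np.dot(x3, w), forced)  # both ≈ 3.0
```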

Casella and Berger define a sample space as the set of all possible outcomes [1]. Using this definition Ω has 6 outcomes:

(0,0,0), (1,1,1), (0.5,0,0), (0.5,0,1), (0.5,1,0), (0.5,1,1), and no outcome has probability 0.

Note: We could also define Ω as the cartesian product {0,0.5,1}×{0,1}×{0,1} and assign probability 0 to the impossible outcomes (the 6 not listed above: (0,1,1), (0,1,0), (0,0,1), (1,0,0), (1,1,0), (1,0,1)). In this case Ω has 12 outcomes, 6 of which have probability 0.
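The note above can be checked by enumerating the product space directly; a small Python sketch, treating each outcome as (c, fA, fB) (our naming for the coin bias and the two flips):

```python
from itertools import product

# Cartesian-product view: first component is the coin bias c,
# the last two are the outcomes of flips A and B.
omega = list(product([0, 0.5, 1], [0, 1], [0, 1]))

# Outcomes with nonzero probability: the fair coin can produce any
# flip pair, while c=0 forces both flips to 0 and c=1 forces both to 1.
possible = [(c, fa, fb) for c, fa, fb in omega
            if c == 0.5 or (fa == c and fb == c)]

print(len(omega), len(possible))  # 12 outcomes, 6 possible
```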

P(fB=1|fA=1) = P(fB=1, fA=1) / P(fA=1)

P(fB=1, fA=1) = P(fB=1, fA=1|c=0.5)P(c=0.5) + P(fB=1, fA=1|c=1)P(c=1)
             = (1/4)(1/3) + (1)(1/3) = 5/12

P(fA=1) = P(fA=1|c=1)P(c=1) + P(fA=1|c=0.5)P(c=0.5)
        = (1)(1/3) + (1/2)(1/3) = 1/2

So P(fB=1|fA=1) = (5/12)/(1/2) = 5/6.
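As a sanity check, the same marginalization over the three equally likely coins can be done with exact arithmetic (a quick Python sketch; the variable names are ours):

```python
from fractions import Fraction

# Three coins, each chosen with probability 1/3; c is P(heads).
coins = [Fraction(0), Fraction(1, 2), Fraction(1)]
p_coin = Fraction(1, 3)

# The flips are independent given the coin, so
# P(fA=1) = sum_c c * P(c)  and  P(fB=1, fA=1) = sum_c c^2 * P(c).
p_fA = sum(c * p_coin for c in coins)         # 1/2
p_both = sum(c * c * p_coin for c in coins)   # 5/12

p_fB_given_fA = p_both / p_fA
print(p_fB_given_fA)  # 5/6
```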

P(v=+) = Σ_{h∈H} P(v=+|h)P(h) = P(v=+|h_{1})*0.5 + P(v=+|h_{2})*0.4 + P(v=+|h_{3})*0.1

P(a=+) = 0.8*0.5 + 0.25*0.4 + 0.5*0.1 = 0.55
P(b=+) = 0.8*0.5 + 0.75*0.4 + 0.5*0.1 = 0.75
P(c=+) = 0.2*0.5 + 0.75*0.4 + 0.5*0.1 = 0.45
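These weighted sums are easy to check mechanically; a short Python sketch (variable names are ours, values transcribed from the problem):

```python
# Hypothesis priors P(h_i) and per-instance probabilities P(v=+|h_i).
priors = [0.5, 0.4, 0.1]
p_plus = {'a': [0.8, 0.25, 0.5],
          'b': [0.8, 0.75, 0.5],
          'c': [0.2, 0.75, 0.5]}

# P(v=+) = sum_h P(v=+|h) P(h)
marginals = {v: sum(p * q for p, q in zip(ps, priors))
             for v, ps in p_plus.items()}
print(marginals)
```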

P(x|h) = P(a=+|h) * P(b=-|h) * P(c=+|h)

P(x|h_{1}) = 0.8*0.2*0.2 = 0.032
P(x|h_{2}) = 0.25*0.25*0.75 = 0.046875
P(x|h_{3}) = 0.5^{3} = 0.125

The maximum likelihood hypothesis is h_{3}.
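The likelihoods can be verified the same way (a Python sketch; the (1 - p) factor encodes P(b=-|h)):

```python
# P(v=+|h_i) for each instance, from the problem tables.
p_a = [0.8, 0.25, 0.5]
p_b = [0.8, 0.75, 0.5]
p_c = [0.2, 0.75, 0.5]

# P(x|h) = P(a=+|h) * P(b=-|h) * P(c=+|h)
likelihoods = [pa * (1 - pb) * pc for pa, pb, pc in zip(p_a, p_b, p_c)]

# Index 2 (i.e. h_3) maximizes the likelihood.
mle = max(range(3), key=lambda i: likelihoods[i])
print(likelihoods, mle)
```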

P(h|x) = P(x|h)P(h)/P(x). The constant P(x) is the same for every hypothesis, so to find the MAP hypothesis we only need to compare the numerators:

P(h_{1}|x) ∝ P(x|h_{1})P(h_{1}) = 0.8*0.2*0.2 * 0.5 = 0.016
P(h_{2}|x) ∝ P(x|h_{2})P(h_{2}) = 0.25*0.25*0.75 * 0.4 = 0.01875
P(h_{3}|x) ∝ P(x|h_{3})P(h_{3}) = 0.5^{3} * 0.1 = 0.0125

The maximum a posteriori hypothesis is h_{2}.

P(v=+|x) = Σ_{h∈H} P(v=+,h|x) = Σ_{h∈H} P(v=+|h,x)P(h|x) = Σ_{h∈H} P(v=+|h)P(h|x)

(given h, x doesn't affect the distribution of v). We know P(h|x) = P(x|h)P(h)/P(x), so we need P(x).

Giving us: P(x) = Σ_{h∈H} P(x|h)P(h) = 0.032*0.5 + 0.046875*0.4 + 0.125*0.1 = 0.04725

So we have:

P(h_{1}|x) = 0.016/0.04725 = 0.3386243
P(h_{2}|x) = 0.01875/0.04725 = 0.3968254
P(h_{3}|x) = 0.0125/0.04725 = 0.2645503
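The normalization can be checked numerically (a Python sketch, starting from the unnormalized products P(x|h)P(h)):

```python
# Joint values P(x|h_i) P(h_i) from the MAP step.
joint = [0.032 * 0.5, 0.046875 * 0.4, 0.125 * 0.1]

p_x = sum(joint)                       # P(x) = 0.04725
posteriors = [j / p_x for j in joint]  # P(h_i|x)
print(p_x, posteriors)
```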

P(a=+|x) = P(a=+|h_{1})P(h_{1}|x) + P(a=+|h_{2})P(h_{2}|x) + P(a=+|h_{3})P(h_{3}|x)
         = 0.8*0.3386243 + 0.25*0.3968254 + 0.5*0.2645503 = 0.5023809

P(b=+|x) = P(b=+|h_{1})P(h_{1}|x) + P(b=+|h_{2})P(h_{2}|x) + P(b=+|h_{3})P(h_{3}|x)
         = 0.8*0.3386243 + 0.75*0.3968254 + 0.5*0.2645503 = 0.7007936

P(c=+|x) = P(c=+|h_{1})P(h_{1}|x) + P(c=+|h_{2})P(h_{2}|x) + P(c=+|h_{3})P(h_{3}|x)
         = 0.2*0.3386243 + 0.75*0.3968254 + 0.5*0.2645503 = 0.4976191
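These posterior-weighted predictions can be reproduced in a few lines (a Python sketch; the posteriors are computed exactly from the joint values rather than from the rounded decimals):

```python
# Unnormalized joint P(x|h_i) P(h_i), then the posterior P(h_i|x).
joint = [0.016, 0.01875, 0.0125]
posteriors = [j / sum(joint) for j in joint]

p_plus = {'a': [0.8, 0.25, 0.5],
          'b': [0.8, 0.75, 0.5],
          'c': [0.2, 0.75, 0.5]}

# P(v=+|x) = sum_h P(v=+|h) P(h|x)
predictions = {v: sum(p * q for p, q in zip(ps, posteriors))
               for v, ps in p_plus.items()}
print(predictions)
```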

The prior probabilities of a, b, and c would **not** change, since
priors are unaffected by (new) sample data.

Note: The calculations below were made under the belief that only a single new data point, (a,+), was added. (Part of the problem was cut off. I now see that all three points were to be new data.) For three new data points we would still have to recalculate MLE, MAP and the mean posterior probabilities because they depend on the observed data.

MLE would multiply in a factor for the new event:

P(x'|h) = P(a=+|h)^{2} * P(b=-|h) * P(c=+|h)

P(x'|h_{1}) = 0.8^{2}*0.2*0.2 = 0.0256
P(x'|h_{2}) = 0.25^{2}*0.25*0.75 = 0.01171875
P(x'|h_{3}) = 0.5^{4} = 0.0625

The MLE hypothesis is still h_{3}.

MAP with the new data point:

P(h_{1}|x') ∝ P(x'|h_{1})P(h_{1}) = 0.8^{2}*0.2*0.2 * 0.5 = 0.0128
P(h_{2}|x') ∝ P(x'|h_{2})P(h_{2}) = 0.25^{2}*0.25*0.75 * 0.4 = 0.0046875
P(h_{3}|x') ∝ P(x'|h_{3})P(h_{3}) = 0.5^{4} * 0.1 = 0.00625

The maximum a posteriori hypothesis is now h_{1}.

The mean posterior probabilities for a=+, b=+, c=+ must be
recalculated since they depend on **x'**.

P(x') = Σ_{h∈H} P(x'|h)P(h) = 0.0256*0.5 + 0.01171875*0.4 + 0.0625*0.1 = 0.0237375

P(h_{1}|x') = 0.0128 / 0.0237375 = 0.5392312
P(h_{2}|x') = 0.0046875 / 0.0237375 = 0.1974724
P(h_{3}|x') = 0.00625 / 0.0237375 = 0.2632965

P(a=+|x') = P(a=+|h_{1})P(h_{1}|x') + P(a=+|h_{2})P(h_{2}|x') + P(a=+|h_{3})P(h_{3}|x')
          = 0.8*0.5392312 + 0.25*0.1974724 + 0.5*0.2632965 = 0.6124013

P(b=+|x') = P(b=+|h_{1})P(h_{1}|x') + P(b=+|h_{2})P(h_{2}|x') + P(b=+|h_{3})P(h_{3}|x')
          = 0.8*0.5392312 + 0.75*0.1974724 + 0.5*0.2632965 = 0.7111375

P(c=+|x') = P(c=+|h_{1})P(h_{1}|x') + P(c=+|h_{2})P(h_{2}|x') + P(c=+|h_{3})P(h_{3}|x')
          = 0.2*0.5392312 + 0.75*0.1974724 + 0.5*0.2632965 = 0.3875988
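The whole updated calculation can be re-derived from the original tables in one short Python sketch (names are ours; the extra (a,+) observation shows up as the squared factor):

```python
priors = [0.5, 0.4, 0.1]
p_a = [0.8, 0.25, 0.5]   # P(a=+|h_i)
p_b = [0.8, 0.75, 0.5]   # P(b=+|h_i)
p_c = [0.2, 0.75, 0.5]   # P(c=+|h_i)

# x' has (a,+) twice: P(x'|h) = P(a=+|h)^2 * P(b=-|h) * P(c=+|h)
lik = [pa * pa * (1 - pb) * pc for pa, pb, pc in zip(p_a, p_b, p_c)]
joint = [l * p for l, p in zip(lik, priors)]
p_x = sum(joint)                 # P(x') = 0.0237375
post = [j / p_x for j in joint]  # P(h_i|x')

# Mean posterior predictions P(v=+|x').
pred = {v: sum(p * q for p, q in zip(ps, post))
        for v, ps in {'a': p_a, 'b': p_b, 'c': p_c}.items()}
print(post, pred)
```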

1. Casella, G. and Berger, R. __Statistical Inference__, Second Edition. Wadsworth Group (Thomson Learning), 2002.
2. The following sites were used for problem 1. They appear to have the same content:
   - http://en.wikipedia.org/wiki/Rank_(linear_algebra)
   - http://www.algebra.com/algebra/college/linear/Rank-of-a-matrix.wikipedia