大学数学基礎解説

【不偏推定量】母集団と復元抽出による標本

統計,確率,データサイエンス


Def.

母集団

ある調査または統計的推論において、対象として採用する条件を満たすものの全体を母集団という。

調査対象となりうる対象全体を表す集合 $U$ と、$U$ 上の述語 $C$ が与えられているとき、
$$ \Omega:=\{\omega\in U\mid C(\omega)\} $$
によって定まる部分集合 $\Omega\subseteq U$ を、この調査における母集団という。
ここで、$C$ は各 $\omega\in U$ に対して真偽が定まる条件であり、
$C(\omega)$ は「$\omega$ がこの調査において母集団に含まれるための条件を満たす」という命題である。

本稿では、特に断らない限り、母集団 $\Omega$ は空でない集合であるとする。

母集団 $\Omega$ は集合であり、それ自体は確率空間ではない。
確率を用いて母集団を扱う場合には、母集団 $\Omega$ に $\sigma$-代数 $\mathcal F$ と確率測度 $\mathbb P$ を加えて、
$$ (\Omega,\mathcal F,\mathbb P) $$
という確率空間として扱うことがある。

有限母集団と無限母集団

$\Omega$ を母集団とする。

ある $n\in\mathbb N_0$ が存在して
$$ |\Omega|=n $$
が成り立つとき、$\Omega$ を有限母集団という。
$\Omega$ が有限母集団でないとき、$\Omega$ を無限母集団という。
すなわち、任意の $n\in\mathbb N_0$ に対して
$$ |\Omega|\ne n $$
が成り立つとき、$\Omega$ を無限母集団という。

本稿では、特に断らない限り、母集団 $\Omega$ は空でない集合であると仮定している。
したがって、本稿で有限母集団を扱う場合には、通常、ある $n\in\mathbb N_{\geq 1}$ が存在して
$$ |\Omega|=n $$
が成り立つ。

無限母集団には、可算無限母集団と非可算無限母集団が含まれる。
たとえば、母集団 $\Omega$ が $\mathbb N$ と同じ濃度をもつ場合、$\Omega$ は可算無限母集団である。
また、母集団 $\Omega$ が区間 $[0,1]$ と同じ濃度をもつ場合、$\Omega$ は非可算無限母集団である。

母集団確率空間

$\Omega$ を空でない母集団とする。
$\Omega$ 上の $\sigma$-代数 $\mathcal F$ と、$\mathcal F$ 上の確率測度 $\mathbb P$ が与えられているとする。
すなわち、$\mathcal F\subseteq 2^\Omega$ は $\Omega$ 上の $\sigma$-代数であり、
$$ \mathbb P:\mathcal F\to[0,1] $$
は
$$ \mathbb P(\Omega)=1 $$
を満たす可算加法的な写像であるとする。
このとき、
$$ (\Omega,\mathcal F,\mathbb P) $$
を母集団確率空間という。
また、$\mathcal F$ の元を母集団上の事象という。

母集団 $\Omega$ は調査対象の全体集合である。
一方、確率測度 $\mathbb P$ は、母集団上の事象に確率を割り当てる構造であり、母集団上の各事象をどのような確率的重みで見るかを表す。
この意味で、$\mathbb P$ は可測空間 $(\Omega,\mathcal F)$ 上の確率分布として解釈できる。
ただし、個体 $\omega\in\Omega$ そのものの確率 $\mathbb P(\{\omega\})$ を考えるためには、単点集合 $\{\omega\}$ が $\mathcal F$ に属している必要がある。
特に、$\mathcal F=2^\Omega$ の場合には、任意の $\omega\in\Omega$ に対して $\{\omega\}\in\mathcal F$ であるから、各個体の確率 $\mathbb P(\{\omega\})$ を考えることができる。

複数個体からなる標本の抽出方法を厳密に扱う場合には、
母集団確率空間 $(\Omega,\mathcal F,\mathbb P)$ とは別に、標本全体を表す集合の上に確率構造を定める必要がある。
たとえば、順序付きで $n$ 個体を復元抽出する場合には、標本全体の集合として
$$ \Omega^n $$
を考える。
一方、順序付きで $n$ 個体を非復元抽出する場合には、標本全体の集合として
$$ \{(\omega_1,\ldots,\omega_n)\in\Omega^n\mid i\ne j\Rightarrow \omega_i\ne\omega_j\} $$
を考える。
また、順序を区別しない $n$ 個体の標本を扱う場合には、標本全体の集合として
$$ \{S\subseteq\Omega\mid |S|=n\} $$
を考える。
これらの標本空間の上に、目的に応じて $\sigma$-代数と確率測度を定めることで、標本抽出の確率構造を記述できる。
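Note also that the population probability space is not determined by $\Omega$ alone: equipping the same population $\Omega$ with a different $\sigma$-algebra $\mathcal F$ or a different probability measure $\mathbb P$ yields a different population probability space.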
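The three sample sets described above can be enumerated directly for a small finite population. The following is a minimal sketch; the three-element population `population` and the sample size `n = 2` are hypothetical values chosen only for illustration.

```python
from itertools import combinations, permutations, product

# Hypothetical toy population with three individuals (illustration only).
population = ["a", "b", "c"]
n = 2

# Ordered sampling with replacement: the full product set Omega^n.
with_replacement = list(product(population, repeat=n))

# Ordered sampling without replacement: tuples with pairwise distinct entries.
without_replacement = list(permutations(population, n))

# Unordered samples of size n: the n-element subsets of Omega.
unordered = [frozenset(s) for s in combinations(population, n)]

print(len(with_replacement), len(without_replacement), len(unordered))
```

For $|\Omega|=3$ and $n=2$ the three sets have $3^2=9$, $3\cdot 2=6$, and $\binom{3}{2}=3$ elements respectively.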

(実数値)母集団確率変数

$(\Omega,\mathcal F,\mathbb P)$ を母集団確率空間とする。
このとき、可測写像
$$ X:(\Omega,\mathcal F)\to(\mathbb R,\mathcal B(\mathbb R)) $$
を(実数値)母集団確率変数という。

すなわち、$X$ は写像 $X:\Omega\to\mathbb R$ であり、任意の $B\in\mathcal B(\mathbb R)$ に対して、
$$ X^{-1}(B)\in\mathcal F $$
を満たす。

各 $\omega\in\Omega$ に対して、$X(\omega)$ は個体 $\omega$ における変数 $X$ の値を表す。
母集団確率変数 $X$ が可測であるという条件は、$X$ によって定まる集合
$$ \{\omega\in\Omega\mid X(\omega)\in B\} $$
が、任意の $B\in\mathcal B(\mathbb R)$ に対して母集団上の事象になることを意味する。
したがって、任意の $B\in\mathcal B(\mathbb R)$ に対して、
$$ \mathbb P(X\in B):=\mathbb P(X^{-1}(B)) $$
と定めることができる。
これは、母集団から確率測度 $\mathbb P$ に従って $1$ 個体を選ぶとき、その個体の変数値が $B$ に属する確率を表す。

実数値に限らず、一般の値をとる変数を扱う場合には、可測空間 $(E,\mathcal E)$ に対する可測写像
$$ X:(\Omega,\mathcal F)\to(E,\mathcal E) $$
を考える。
このとき、$X$ を $E$ 値母集団確率変数という。
特に、$E$ が有限集合または可算集合であり、$\mathcal E=2^E$ とすれば、カテゴリ値をとる母集団確率変数も同じ枠組みで扱うことができる。

母集団分布

$(\Omega,\mathcal F,\mathbb P)$ を母集団確率空間とし、
$$ X:(\Omega,\mathcal F)\to(\mathbb R,\mathcal B(\mathbb R)) $$
を実数値母集団確率変数とする。
このとき、$\mathcal B(\mathbb R)$ 上の写像
$$ \mathbb P_X:\mathcal B(\mathbb R)\to[0,1] $$
を
$$ \mathbb P_X(B):=\mathbb P(X^{-1}(B)) \quad (B\in\mathcal B(\mathbb R)) $$
によって定める。
この確率測度 $\mathbb P_X$ を、母集団確率変数 $X$ の母集団分布という。

すなわち、任意の $B\in\mathcal B(\mathbb R)$ に対して、
$$ \mathbb P_X(B) = \mathbb P(\{\omega\in\Omega\mid X(\omega)\in B\}) $$
である。
また、$\mathbb P_X$ は
$$ \mathbb P_X=\mathbb P\circ X^{-1} $$
とも書かれる。
ここで、$X^{-1}$ は逆写像ではなく、逆像(作用素)
$$ X^{-1}:\mathcal B(\mathbb R)\to\mathcal F $$
を表す。

$X$ は可測であるから、任意の $B\in\mathcal B(\mathbb R)$ に対して、
$$ X^{-1}(B)=\{\omega\in\Omega\mid X(\omega)\in B\}\in\mathcal F $$
が成り立つ。
したがって、
$$ \mathbb P_X(B):=\mathbb P(X^{-1}(B)) $$
は矛盾なく定義されている。

$\mathbb P_X$ は可測空間 $(\mathbb R,\mathcal B(\mathbb R))$ 上の確率測度である。
実際、
$$ \mathbb P_X(\mathbb R) = \mathbb P(X^{-1}(\mathbb R)) = \mathbb P(\Omega) = 1 $$
である。
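Moreover, for any sequence $(B_k)_{k\in\mathbb N}$ of pairwise disjoint Borel sets,
$$ X^{-1}\left(\bigcup_{k=1}^{\infty}B_k\right) = \bigcup_{k=1}^{\infty}X^{-1}(B_k) $$
holds, because taking preimages commutes with unions: $\omega\in X^{-1}\left(\bigcup_{k}B_k\right)$ if and only if $X(\omega)\in B_k$ for some $k$, that is, if and only if $\omega\in\bigcup_{k}X^{-1}(B_k)$.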
さらに、$i\ne j$ ならば $B_i\cap B_j=\varnothing$ であるから、
$$ X^{-1}(B_i)\cap X^{-1}(B_j) = X^{-1}(B_i\cap B_j) = X^{-1}(\varnothing) = \varnothing $$
である。
したがって、$(X^{-1}(B_k))_{k\in\mathbb N}$ は互いに素な $\mathcal F$ の元からなる列である。
よって、$\mathbb P$ の可算加法性より、
$$ \begin{aligned} \mathbb P_X\left(\bigcup_{k=1}^{\infty}B_k\right) &= \mathbb P\left(X^{-1}\left(\bigcup_{k=1}^{\infty}B_k\right)\right)\\ &= \mathbb P\left(\bigcup_{k=1}^{\infty}X^{-1}(B_k)\right)\\ &= \sum_{k=1}^{\infty}\mathbb P(X^{-1}(B_k))\\ &= \sum_{k=1}^{\infty}\mathbb P_X(B_k) \end{aligned} $$
が成り立つ。
以上より、$\mathbb P_X$ は可測空間 $(\mathbb R,\mathcal B(\mathbb R))$ 上の確率測度である。
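For a finite population with $\mathcal F=2^\Omega$, the population distribution $\mathbb P_X=\mathbb P\circ X^{-1}$ can be computed by summing the probabilities of all individuals that share the same value. The sketch below assumes a hypothetical four-element population with the uniform measure; the names `omega`, `P`, `X`, `pushforward`, and `prob_X_in` are illustrative and do not appear in the text.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical finite population with F = 2^Omega and the uniform measure P.
omega = ["a", "b", "c", "d"]
P = {w: Fraction(1, len(omega)) for w in omega}

# Hypothetical population random variable X: Omega -> R.
X = {"a": 1, "b": 1, "c": 2, "d": 5}

def pushforward(P, X):
    """Return the point masses of P_X = P o X^{-1} on the range of X."""
    dist = defaultdict(Fraction)
    for w, p in P.items():
        dist[X[w]] += p  # add P({w}) to the mass at the value X(w)
    return dict(dist)

P_X = pushforward(P, X)

def prob_X_in(B):
    """P_X(B) = P(X^{-1}(B)) for a finite set B of real numbers."""
    return sum(p for x, p in P_X.items() if x in B)

print(P_X)
print(prob_X_in({1, 2}))
```

Exact fractions are used so that $\mathbb P_X(\mathbb R)=1$ can be checked without rounding error.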

標本抽出空間

$n\in\mathbb N_{\geq 1}$ とする。
母集団確率空間 $(\Omega,\mathcal F,\mathbb P)$ が与えられているとする。
母集団から確率測度 $\mathbb P$ に従って独立に $n$ 回抽出する順序付き標本抽出を、積確率空間
$$ (\Omega^n,\mathcal F^{\otimes n},\mathbb P^{\otimes n}) $$
によって表す。
この積確率空間を、$\mathbb P$ に従う独立な $n$ 回抽出に対応する標本抽出空間という。

ここで、
$$ \Omega^n = \underbrace{\Omega\times\cdots\times\Omega}_{n\text{個}} $$
であり、$\Omega^n$ の元
$$ (\omega_1,\ldots,\omega_n) $$
は、$1$ 回目に $\omega_1$、$2$ 回目に $\omega_2$、$\ldots$、$n$ 回目に $\omega_n$ が抽出されたことを表す。
したがって、この標本抽出空間は順序付き標本を表す。

$\mathcal F^{\otimes n}$ は
$$ \mathcal F^{\otimes n} := \sigma\left(\{A_1\times\cdots\times A_n\mid A_i\in\mathcal F,\ i=1,\ldots,n\}\right) $$
によって定まる $\Omega^n$ 上の積 $\sigma$-代数である。
また、$\mathbb P^{\otimes n}$ は、任意の $A_1,\ldots,A_n\in\mathcal F$ に対して
$$ \mathbb P^{\otimes n}(A_1\times\cdots\times A_n) = \prod_{i=1}^n\mathbb P(A_i) $$
を満たす $\mathcal F^{\otimes n}$ 上の積確率測度である。

各 $i=1,\ldots,n$ に対して、写像
$$ \pi_i:\Omega^n\to\Omega $$
を
$$ \pi_i(\omega_1,\ldots,\omega_n)=\omega_i $$
によって定める。
Then $\pi_i$ is a measurable map
$$ \pi_i:(\Omega^n,\mathcal F^{\otimes n}) \to (\Omega,\mathcal F) $$
between measurable spaces.
さらに、$\pi_1,\ldots,\pi_n$ は独立であり、任意の $i=1,\ldots,n$ に対して、$\pi_i$ の分布は $\mathbb P$ である。
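Hence the product probability space
$$ (\Omega^n,\mathcal F^{\otimes n},\mathbb P^{\otimes n}) $$
represents independent and identically distributed sampling, that is, $\mathrm{i.i.d.}$ sampling, from the population probability measure $\mathbb P$.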

有限母集団の場合、このモデルは典型的には復元抽出に対応する。
特に、$\Omega$ が有限集合であり、$\mathcal F=2^\Omega$ かつ $\mathbb P$ が $\Omega$ 上の一様分布である場合、
このモデルは順序付き復元抽出としての単純無作為抽出に対応する。
この場合、任意の $(\omega_1,\ldots,\omega_n)\in\Omega^n$ に対して、
$$ \mathbb P^{\otimes n}(\{(\omega_1,\ldots,\omega_n)\}) = \prod_{i=1}^n\mathbb P(\{\omega_i\}) = \left(\frac{1}{|\Omega|}\right)^n $$
である。

有限母集団から非復元抽出を扱う場合には、$\Omega^n$ 全体ではなく、重複を許さない順序付き標本全体
$$ D_n := \{(\omega_1,\ldots,\omega_n)\in\Omega^n\mid i\ne j\Rightarrow \omega_i\ne\omega_j\} $$
を標本空間として用いる。
この場合、通常は
$$ n\leq |\Omega| $$
を仮定する必要がある。
また、$D_n$ の上に、抽出設計に応じた $\sigma$-代数と確率測度を別途定める必要がある。
たとえば、$\Omega$ が有限集合で $\mathcal F=2^\Omega$ である場合には、$D_n$ 上の一様分布を用いることで、順序付き非復元単純無作為抽出を表すことができる。

一般に、標本抽出とは、母集団から統計的な分析や推測を目的として、観測対象となる標本を選び出す過程をいう。
ただし、標本抽出の方法は一意ではない。
たとえば、代表的な標本抽出法として、次のようなものがある。
$$ \begin{array}{|c|c|} \hline \text{標本抽出法} & \text{概要}\\ \hline \text{単純無作為抽出} & \text{母集団の各個体が等しい確率で選ばれるように抽出する方法}\\ \hline \text{系統抽出} & \text{母集団に順序を付け、一定間隔ごとに個体を抽出する方法}\\ \hline \text{層化抽出} & \text{母集団をいくつかの層に分け、各層から標本を抽出する方法}\\ \hline \text{集落抽出} & \text{母集団を集落に分け、選ばれた集落に属する個体を調べる方法}\\ \hline \text{多段抽出} & \text{複数の段階を経て標本を抽出する方法}\\ \hline \end{array} $$
これらの抽出法では、標本空間や確率測度は抽出設計に応じて異なる。
したがって、すべての標本抽出が
$$ (\Omega^n,\mathcal F^{\otimes n},\mathbb P^{\otimes n}) $$
によって表されるわけではない。
この積確率空間による標本抽出は、母集団確率測度 $\mathbb P$ に従って独立に $n$ 回抽出するモデルであり、
一般には単純無作為抽出そのものではなく、独立同分布抽出、すなわち i.i.d. 抽出を表す。

独立同分布の標本確率変数列

$n\in\mathbb N_{\geq 1}$ とする。母集団確率空間 $(\Omega,\mathcal F,\mathbb P)$ と母集団確率変数 $X:(\Omega,\mathcal F)\to(\mathbb R,\mathcal B(\mathbb R))$ が与えられているとする。
また、独立な $n$ 回抽出に対応する標本抽出空間を $(\Omega^n,\mathcal F^{\otimes n},\mathbb P^{\otimes n})$ とする。
各 $i=1,\ldots,n$ に対して、第 $i$ 成分への射影を
$$ \pi_i:\Omega^n\to\Omega, \quad \pi_i(\omega_1,\ldots,\omega_n)=\omega_i $$
とする。
このとき、
$$ X_i:=X\circ\pi_i: (\Omega^n,\mathcal F^{\otimes n})\to(\mathbb R,\mathcal B(\mathbb R)) $$
すなわち
$$ X_i(\omega_1,\ldots,\omega_n)=X(\omega_i) $$
によって定まる確率変数 $X_i$ を、第 $i$ 番目の標本確率変数という。
このようにして得られる確率変数列
$$ X_1,\ldots,X_n $$
を、母集団確率変数 $X$ から得られる大きさ $n$ の標本確率変数列という。

各 $X_i$ は確率変数である。
実際、$\pi_i$ は積 $\sigma$-代数の定義より可測であり、$X$ も可測であるため、合成写像 $X_i=X\circ\pi_i$ も可測である。

上の構成では、$X_1,\ldots,X_n$ は互いに独立であり、すべて母集団分布 $\mathbb P_X$ に従う。
すなわち、任意の $i=1,\ldots,n$ と任意の $B\in\mathcal B(\mathbb R)$ に対して、
$$ \mathbb P^{\otimes n}(X_i\in B)=\mathbb P_X(B) $$
が成り立つ。
また、任意の $B_1,\ldots,B_n\in\mathcal B(\mathbb R)$ に対して、
$$ \mathbb P^{\otimes n}(X_1\in B_1,\ldots,X_n\in B_n) = \prod_{i=1}^n\mathbb P_X(B_i) $$
が成り立つ。
したがって、$X_1,\ldots,X_n$ は母集団分布 $\mathbb P_X$ に従う独立同分布な確率変数列である。
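The product formula above can be checked by brute force on a small example. The sketch below assumes a hypothetical three-element population with a non-uniform measure and $n=2$; all names (`omega`, `P`, `X`, `joint_prob`, `marginal_prob`) are illustrative.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population, non-uniform measure P, and population variable X.
omega = ["a", "b", "c"]
P = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
X = {"a": 0, "b": 1, "c": 1}
n = 2

def product_measure(sample_point):
    """Product measure of the singleton {(omega_1, ..., omega_n)}."""
    prob = Fraction(1)
    for w in sample_point:
        prob *= P[w]
    return prob

def joint_prob(B1, B2):
    """Probability of {X_1 in B1, X_2 in B2}, where X_i = X o pi_i."""
    return sum(
        product_measure(s)
        for s in product(omega, repeat=n)
        if X[s[0]] in B1 and X[s[1]] in B2
    )

def marginal_prob(B):
    """P_X(B) = P(X in B) on the population probability space."""
    return sum(p for w, p in P.items() if X[w] in B)

B1, B2 = {0}, {1}
print(joint_prob(B1, B2), marginal_prob(B1) * marginal_prob(B2))
```

Both printed values agree, which is the statement $\mathbb P^{\otimes 2}(X_1\in B_1, X_2\in B_2)=\mathbb P_X(B_1)\,\mathbb P_X(B_2)$ for this particular choice of $B_1$ and $B_2$.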

観測標本

$n\in\mathbb N_{\geq 1}$ とする。
標本抽出空間
$$ (\Omega^n,\mathcal F^{\otimes n},\mathbb P^{\otimes n}) $$
上の標本確率変数列
$$ X_1,\ldots,X_n $$
が与えられているとする。
ある標本点
$$ \boldsymbol{\omega}=(\omega_1,\ldots,\omega_n)\in\Omega^n $$
に対して、
$$ x_i:=X_i(\boldsymbol{\omega}) \quad (i=1,\ldots,n) $$
によって得られる実数列
$$ (x_1,\ldots,x_n)\in\mathbb R^n $$
を、標本確率変数列 $X_1,\ldots,X_n$ の実現値という。
また、この実現値 $(x_1,\ldots,x_n)$ を、観測標本、観測データ、または単にデータという。
ここで、各 $x_i$ は第 $i$ 番目の標本確率変数 $X_i$ の実現値である。

標本点 $\boldsymbol{\omega}\in\Omega^n$ は抽出結果を表す確率空間上の点であり、観測標本 $(x_1,\ldots,x_n)$ は標本確率変数列
$$ (X_1,\ldots,X_n) $$
によって得られる値である。
したがって、観測標本は
$$ (X_1,\ldots,X_n)(\boldsymbol{\omega}) = (x_1,\ldots,x_n) $$
と表すこともできる。
一般に、標本点 $\boldsymbol{\omega}$ と観測標本 $(x_1,\ldots,x_n)$ は同じものではない。

統計量

$n\in\mathbb N_{\geq 1}$ とする。

母集団確率空間 $(\Omega,\mathcal F,\mathbb P)$ と母集団確率変数
$X:(\Omega,\mathcal F)\to(\mathbb R,\mathcal B(\mathbb R))$ が与えられているとする。
独立な $n$ 回抽出に対応する標本抽出空間を
$$ (\Omega^n,\mathcal F^{\otimes n},\mathbb P^{\otimes n}) $$
とする。
各 $i=1,\ldots,n$ に対して、第 $i$ 成分への射影を
$$ \pi_i:\Omega^n\to\Omega $$
とし、
$$ X_i:=X\circ\pi_i $$
によって標本確率変数 $X_i$ を定める。
標本確率ベクトル $\boldsymbol X$ を
$$ \boldsymbol X:=(X_1,\ldots,X_n): (\Omega^n,\mathcal F^{\otimes n})\to(\mathbb R^n,\mathcal B(\mathbb R^n)) $$
で定める。
ボレル可測写像
$$ T:(\mathbb R^n,\mathcal B(\mathbb R^n))\to(\mathbb R,\mathcal B(\mathbb R)) $$
が与えられているとする。

Then the random variable defined by the composite map
$$ T\circ\boldsymbol X: (\Omega^n,\mathcal F^{\otimes n})\to(\mathbb R,\mathcal B(\mathbb R)) $$
is called a statistic.

この統計量を
$$ T(X_1,\ldots,X_n) $$
と書く。
また、標本点 $\boldsymbol\omega\in\Omega^n$ に対して観測標本
$$ (x_1,\ldots,x_n) = \boldsymbol X(\boldsymbol\omega) $$
が得られたとき、統計量の実現値は
$$ T(x_1,\ldots,x_n) $$
である。

$T$ は可測である必要がある。
$X_1,\ldots,X_n$ は可測であり、したがって標本確率ベクトル
$$ \boldsymbol X=(X_1,\ldots,X_n) $$
も可測である。
さらに、$T$ も可測であるから、合成写像
$$ T\circ\boldsymbol X $$
は可測である。
したがって、$T(X_1,\ldots,X_n)$ は標本抽出空間
$$ (\Omega^n,\mathcal F^{\otimes n},\mathbb P^{\otimes n}) $$
上の実数値確率変数である。

統計量は、母集団の未知量を直接含む関数ではなく、標本確率変数列
$$ X_1,\ldots,X_n $$
またはその実現値
$$ x_1,\ldots,x_n $$
から計算される量である。
したがって、統計量を定める関数 $T$ は、原則として母集団の未知パラメータには依存しないものとして扱う。

推定量

$n\in\mathbb N_{\geq 1}$ とする。
母集団確率空間を $(\Omega,\mathcal F,\mathbb P)$ とし、実数値母集団確率変数を
$$ X:(\Omega,\mathcal F)\to(\mathbb R,\mathcal B(\mathbb R)) $$
とする。
また、独立な $n$ 回抽出に対応する標本抽出空間を
$$ (\Omega^n,\mathcal F^{\otimes n},\mathbb P^{\otimes n}) $$
とし、母集団確率変数 $X$ から得られる標本確率変数列を
$$ X_1,\ldots,X_n $$
とする。
母集団確率空間または母集団分布から定まる実数値(母数)を
$$ \theta\in\mathbb R $$
とする。
統計量
$$ \widehat\theta_n = T(X_1,\ldots,X_n) $$
を、母数 $\theta$ を推定するために用いるとき、$\widehat\theta_n$ を $\theta$ の推定量という。

ここで、$\theta$ は母集団確率空間または母集団分布によって定まる実数値である。
たとえば、$X$ が可積分であり、
$$ \theta=\mathbb E_{\mathbb P}[X] $$
であるとき、$\theta$ を母平均という。
また、$X$ が二乗可積分であり、
$$ \theta=\operatorname{Var}_{\mathbb P}(X) $$
であるとき、$\theta$ を母分散という。
推定量 $\widehat\theta_n$ は標本抽出空間
$$ (\Omega^n,\mathcal F^{\otimes n},\mathbb P^{\otimes n}) $$
上の確率変数である。

不偏推定量

$n\in\mathbb N_{\geq 1}$ とする。
母数 $\theta\in\mathbb R$ の推定量を
$$ \widehat\theta_n = T(X_1,\ldots,X_n) $$
とする。
このとき、$\widehat\theta_n$ が $\mathbb P^{\otimes n}$ に関して可積分であり、
$$ \mathbb E_{\mathbb P^{\otimes n}}[\widehat\theta_n] = \theta $$
が成り立つならば、$\widehat\theta_n$ を $\theta$ の不偏推定量という。

すなわち、
$$ \mathbb E_{\mathbb P^{\otimes n}}[T(X_1,\ldots,X_n)] = \theta $$
が成り立つ統計量を、$\theta$ の不偏推定量という。

不偏性は、推定量の実現値が常に真の値 $\theta$ に等しいことを意味しない。
不偏性が意味するのは、同じ抽出方法を繰り返したとき、推定量の平均的な値が真の値 $\theta$ に等しいということである。
したがって、不偏推定量であっても、個々の観測標本に対しては
$$ \widehat\theta_n\ne\theta $$
となることがある。

バイアス

$n\in\mathbb N_{\geq 1}$ とする。
母数 $\theta\in\mathbb R$ の推定量を
$$ \widehat\theta_n = T(X_1,\ldots,X_n) $$
とし、$\widehat\theta_n$ は $\mathbb P^{\otimes n}$ に関して可積分であるとする。
このとき、
$$ \operatorname{Bias}_{\mathbb P}(\widehat\theta_n;\theta) := \mathbb E_{\mathbb P^{\otimes n}}[\widehat\theta_n]-\theta $$
を、$\widehat\theta_n$ の $\theta$ に対するバイアスという。

したがって、
$$ \widehat\theta_n\text{ が }\theta\text{ の不偏推定量である} \Longleftrightarrow \operatorname{Bias}_{\mathbb P}(\widehat\theta_n;\theta)=0 $$
である。

パラメータ空間 $\Theta$ をもつ統計モデルを扱う場合には、各 $\theta\in\Theta$ に対して標本抽出空間上の確率測度
$$ \mathbb P_\theta^{\otimes n} $$
が定まっていると考える。
推定したい実数値関数を
$$ g:\Theta\to\mathbb R $$
とする。
このとき、統計量
$$ \widehat g_n = T(X_1,\ldots,X_n) $$
が任意の $\theta\in\Theta$ に対して $\mathbb P_\theta^{\otimes n}$ に関して可積分であり、
$$ \mathbb E_{\mathbb P_\theta^{\otimes n}}[\widehat g_n] = g(\theta) $$
を満たすならば、$\widehat g_n$ を $g(\theta)$ の不偏推定量という。
このとき、$\theta\in\Theta$ におけるバイアスを
$$ \operatorname{Bias}_{\theta}(\widehat g_n) := \mathbb E_{\mathbb P_\theta^{\otimes n}}[\widehat g_n]-g(\theta) $$
と定める。
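
As a worked illustration of this definition, consider the following Bernoulli model, which is an assumed example and not part of the original text. Let $\Theta=(0,1)$ and suppose that, under $\mathbb P_\theta^{\otimes n}$, the sample random variables $X_1,\ldots,X_n$ are i.i.d. with
$$ \mathbb P_\theta^{\otimes n}(X_i=1)=\theta, \quad \mathbb P_\theta^{\otimes n}(X_i=0)=1-\theta $$
For $g(\theta)=\theta$ and $\widehat g_n=\frac{1}{n}\sum_{i=1}^n X_i$, linearity of expectation gives
$$ \mathbb E_{\mathbb P_\theta^{\otimes n}}[\widehat g_n] = \frac{1}{n}\sum_{i=1}^n\theta = \theta, \quad \operatorname{Bias}_{\theta}(\widehat g_n)=0 \quad (\theta\in\Theta) $$
so $\widehat g_n$ is an unbiased estimator of $g(\theta)=\theta$.
In contrast, for $g(\theta)=\theta(1-\theta)$ and $\widehat g_n=\overline X_n(1-\overline X_n)$ with $\overline X_n=\frac{1}{n}\sum_{i=1}^n X_i$, using $\operatorname{Var}_{\theta}(\overline X_n)=\theta(1-\theta)/n$ one obtains
$$ \begin{aligned} \mathbb E_{\mathbb P_\theta^{\otimes n}}[\overline X_n(1-\overline X_n)] &= \mathbb E_{\mathbb P_\theta^{\otimes n}}[\overline X_n]-\mathbb E_{\mathbb P_\theta^{\otimes n}}[\overline X_n^2]\\ &= \theta-\left(\frac{\theta(1-\theta)}{n}+\theta^2\right)\\ &= \frac{n-1}{n}\theta(1-\theta) \end{aligned} $$
and therefore
$$ \operatorname{Bias}_{\theta}(\widehat g_n) = -\frac{1}{n}\theta(1-\theta) \ne 0 \quad (\theta\in\Theta) $$
so this estimator is biased for every $\theta\in\Theta$.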

標本平均

$n\in\mathbb N_{\geq 1}$ とする。
母集団確率空間 $(\Omega,\mathcal F,\mathbb P)$ と実数値母集団確率変数 $X:(\Omega,\mathcal F)\to(\mathbb R,\mathcal B(\mathbb R))$ が与えられているとする。
また、独立な $n$ 回抽出に対応する標本抽出空間
$$ (\Omega^n,\mathcal F^{\otimes n},\mathbb P^{\otimes n}) $$
上に、母集団確率変数 $X$ から得られる標本確率変数列
$$ X_1,\ldots,X_n $$
が定められているとする。
このとき、統計量
$$ \overline X_n := \frac{1}{n}\sum_{i=1}^n X_i $$
を、$X_1,\ldots,X_n$ の標本平均という。

標本平均は、ボレル可測写像
$$ T_{\mathrm{mean}}:\mathbb R^n\to\mathbb R $$
を
$$ T_{\mathrm{mean}}(x_1,\ldots,x_n) := \frac{1}{n}\sum_{i=1}^n x_i $$
によって定めたときの統計量
$$ T_{\mathrm{mean}}(X_1,\ldots,X_n) $$
である。すなわち、
$$ \overline X_n = T_{\mathrm{mean}}(X_1,\ldots,X_n) $$
である。

標本点 $\boldsymbol\omega\in\Omega^n$ に対して観測標本
$$ (x_1,\ldots,x_n) = (X_1(\boldsymbol\omega),\ldots,X_n(\boldsymbol\omega)) $$
が得られたとき、標本平均の実現値は
$$ \overline x_n := \frac{1}{n}\sum_{i=1}^n x_i $$
である。

不偏標本分散

$n\in\mathbb N_{\geq 2}$ とする。
母集団確率空間 $(\Omega,\mathcal F,\mathbb P)$ と実数値母集団確率変数 $X:(\Omega,\mathcal F)\to(\mathbb R,\mathcal B(\mathbb R))$ が与えられているとする。
また、
$$ \mathbb E_{\mathbb P}[X^2]<\infty $$
を仮定する。
独立な $n$ 回抽出に対応する標本抽出空間
$$ (\Omega^n,\mathcal F^{\otimes n},\mathbb P^{\otimes n}) $$
上に、母集団確率変数 $X$ から得られる標本確率変数列
$$ X_1,\ldots,X_n $$
が定められているとする。
標本平均を
$$ \overline X_n := \frac{1}{n}\sum_{i=1}^n X_i $$
と定める。
このとき、統計量
$$ U_n^2 := \frac{1}{n-1}\sum_{i=1}^n (X_i-\overline X_n)^2 $$
を、$X_1,\ldots,X_n$ の不偏標本分散という。

$U_n^2$ の分母は $n-1$ であるため、
$$ n\geq 2 $$
が必要である。

不偏標本分散は、ボレル可測写像
$$ T_{\mathrm{var}}:\mathbb R^n\to\mathbb R $$
を
$$ T_{\mathrm{var}}(x_1,\ldots,x_n) := \frac{1}{n-1}\sum_{i=1}^n \left(x_i-\frac{1}{n}\sum_{j=1}^n x_j\right)^2 $$
によって定めたときの統計量
$$ T_{\mathrm{var}}(X_1,\ldots,X_n) $$
である。
すなわち、
$$ U_n^2 = T_{\mathrm{var}}(X_1,\ldots,X_n) $$
である。

母集団確率変数 $X$ が
$$ \mathbb E_{\mathbb P}[X^2]<\infty $$
を満たすとする。
このとき、母分散を
$$ \sigma^2 := \operatorname{Var}_{\mathbb P}(X) $$
とおく。
標本確率変数列 $X_1,\ldots,X_n$ は、標本抽出空間
$$ (\Omega^n,\mathcal F^{\otimes n},\mathbb P^{\otimes n}) $$
上で独立同分布に従い、各 $X_i$ の分布は母集団分布 $\mathbb P_X$ である。
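As proved in the proposition on the unbiasedness of the unbiased sample variance in the Prop&Proof part below,
$$ \mathbb E_{\mathbb P^{\otimes n}}[U_n^2] = \sigma^2 $$
holds.
In this sense, $U_n^2$ is an unbiased estimator of the population variance $\sigma^2$.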

分母を $n$ とする統計量
$$ \frac{1}{n}\sum_{i=1}^n (X_i-\overline X_n)^2 $$
も標本のばらつきを表す量として用いられる。
しかし、母集団確率変数 $X$ が
$$ \mathbb E_{\mathbb P}[X^2]<\infty $$
を満たす場合でも、これは一般には母分散
$$ \operatorname{Var}_{\mathbb P}(X) $$
の不偏推定量ではない。
そのため、本稿で母分散の不偏推定量として扱う標本分散は
$$ U_n^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i-\overline X_n)^2 $$
とする。

不偏標本分散
$$ U_n^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i-\overline X_n)^2 $$
に対して、その平方根
$$ U_n := \sqrt{U_n^2} $$
を標本標準偏差という。すなわち、
$$ U_n = \sqrt{ \frac{1}{n-1}\sum_{i=1}^n (X_i-\overline X_n)^2 } $$
である。標本標準偏差 $U_n$ は、標本のばらつきを元の単位で表す統計量である。
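On the other hand, although $U_n^2$ is an unbiased estimator of the population variance $\sigma^2$, $U_n$ is in general not an unbiased estimator of the population standard deviation $\sigma$.
That is, even though
$$ \mathbb E_{\mathbb P^{\otimes n}}[U_n^2] = \sigma^2 $$
holds, the equality
$$ \mathbb E_{\mathbb P^{\otimes n}}[U_n] = \sigma $$
need not hold. In fact, since the square root is concave, Jensen's inequality gives $\mathbb E_{\mathbb P^{\otimes n}}[U_n]\leq\sqrt{\mathbb E_{\mathbb P^{\otimes n}}[U_n^2]}=\sigma$, with equality only when $U_n^2$ is almost surely constant.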
したがって、「不偏標本分散」という名称は $U_n^2$ に対するものであり、
その平方根 $U_n$ を「不偏標本標準偏差」と呼ぶのは一般には適切ではない。
本稿では、
$$ U_n^2 $$
を不偏標本分散と呼び、
$$ U_n=\sqrt{U_n^2} $$
を標本標準偏差と呼ぶ。
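
The gap between $\mathbb E[U_n^2]=\sigma^2$ and $\mathbb E[U_n]\ne\sigma$ can be seen by exact enumeration. The sketch below assumes a hypothetical two-point population with values $0$ and $1$ under the uniform measure and $n=2$; the helper name `unbiased_variance` is illustrative.

```python
import math
from fractions import Fraction
from itertools import product

# Hypothetical population values X(omega) under the uniform measure P.
values = [0, 1]
n = 2

def unbiased_variance(xs):
    """U_n^2 for one realized sample, with denominator n - 1."""
    m = Fraction(sum(xs), len(xs))
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# All ordered samples drawn with replacement, each with equal probability.
samples = list(product(values, repeat=n))
weight = Fraction(1, len(samples))

mu = Fraction(sum(values), len(values))
sigma2 = sum((v - mu) ** 2 for v in values) / len(values)

# Exact expectation of U_n^2 and floating-point expectation of U_n.
expected_u2 = sum(weight * unbiased_variance(s) for s in samples)
expected_u = sum(float(weight) * math.sqrt(unbiased_variance(s)) for s in samples)

print(expected_u2 == sigma2)
print(expected_u, math.sqrt(sigma2))
```

In this example $\mathbb E[U_2^2]=\sigma^2=1/4$ exactly, while $\mathbb E[U_2]=1/(2\sqrt 2)\approx 0.354$ is smaller than $\sigma=0.5$.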

Prop&Proof

同分布な確率変数の期待値と分散

$(\Omega,\mathcal F,\mathbb P)$ を確率空間とし、
$$ X:(\Omega,\mathcal F)\to(\mathbb R,\mathcal B(\mathbb R)) $$
を実数値確率変数とする。また、$n\in\mathbb N_{\geq 1}$ とし、確率空間
$$ (\Omega^n,\mathcal F^{\otimes n},\mathbb P^{\otimes n}) $$
を考える。
各 $i=1,\ldots,n$ に対して、実数値確率変数
$$ X_i:(\Omega^n,\mathcal F^{\otimes n})\to(\mathbb R,\mathcal B(\mathbb R)) $$
が $X$ と同じ分布に従うとする。すなわち、
$$ (\mathbb P^{\otimes n})_{X_i}=\mathbb P_X $$
が成り立つとする。

1. If $X$ is integrable, then $X_i$ is integrable with respect to $\mathbb P^{\otimes n}$ and
$$ \mathbb E_{\mathbb P^{\otimes n}}[X_i] = \mathbb E_{\mathbb P}[X] $$
holds.
2. If $X$ is square integrable, then $X_i$ is square integrable with respect to $\mathbb P^{\otimes n}$ and
$$ \operatorname{Var}_{\mathbb P^{\otimes n}}(X_i) = \operatorname{Var}_{\mathbb P}(X) $$
holds.

$X$ が可積分であるとする。
$X_i$ と $X$ は同じ分布に従うので、任意の $B\in\mathcal B(\mathbb R)$ に対して、
$$ \mathbb P^{\otimes n}(X_i\in B)=\mathbb P(X\in B) $$
が成り立つ。
したがって、$X_i$ の分布 $\mathbb P_{X_i}$ と $X$ の分布 $\mathbb P_X$ は等しい。
よって、
$$ \mathbb E_{\mathbb P^{\otimes n}}[|X_i|] = \int_{\mathbb R}|x|\,d\mathbb P_{X_i}(x) = \int_{\mathbb R}|x|\,d\mathbb P_X(x) = \mathbb E_{\mathbb P}[|X|] <\infty $$
である。したがって、$X_i$ は $\mathbb P^{\otimes n}$ に関して可積分である。
さらに、
$$ \mathbb E_{\mathbb P^{\otimes n}}[X_i] = \int_{\mathbb R}x\,d\mathbb P_{X_i}(x) = \int_{\mathbb R}x\,d\mathbb P_X(x) = \mathbb E_{\mathbb P}[X] $$
である。
$X$ が二乗可積分であるとする。
このとき、
$$ \mathbb E_{\mathbb P}[X^2]<\infty $$
である。
$X_i$ と $X$ は同じ分布に従うので、
$$ \mathbb E_{\mathbb P^{\otimes n}}[X_i^2] = \int_{\mathbb R}x^2\,d\mathbb P_{X_i}(x) = \int_{\mathbb R}x^2\,d\mathbb P_X(x) = \mathbb E_{\mathbb P}[X^2] <\infty $$
である。したがって、$X_i$ は $\mathbb P^{\otimes n}$ に関して二乗可積分である。
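Moreover, square integrability implies that $X$ and $X_i$ are integrable, so by the first assertion (the integrable case)
$$ \mathbb E_{\mathbb P^{\otimes n}}[X_i] = \mathbb E_{\mathbb P}[X] $$
holds. Denote this common value by $\mu$.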
再び $X_i$ と $X$ は同じ分布に従うので、
$$ \mathbb E_{\mathbb P^{\otimes n}}[(X_i-\mu)^2] = \int_{\mathbb R}(x-\mu)^2\,d\mathbb P_{X_i}(x) = \int_{\mathbb R}(x-\mu)^2\,d\mathbb P_X(x) = \mathbb E_{\mathbb P}[(X-\mu)^2] $$
である。
したがって、
$$ \operatorname{Var}_{\mathbb P^{\otimes n}}(X_i) = \operatorname{Var}_{\mathbb P}(X) $$
が成り立つ。

This proves the claim.
$$ \Box$$

特に、
$$ \mu:=\mathbb E_{\mathbb P}[X], \quad \sigma^2:=\operatorname{Var}_{\mathbb P}(X) $$
とおけば、
$$ \mathbb E_{\mathbb P^{\otimes n}}[X_i]=\mu, \quad \operatorname{Var}_{\mathbb P^{\otimes n}}(X_i)=\sigma^2 $$
である。

標本平均の不偏性

$n\in\mathbb N_{\geq 1}$ とする。
母集団確率空間 $(\Omega,\mathcal F,\mathbb P)$ と実数値母集団確率変数 $X:(\Omega,\mathcal F)\to(\mathbb R,\mathcal B(\mathbb R))$ が与えられているとする。
また、独立な $n$ 回抽出に対応する標本抽出空間 $(\Omega^n,\mathcal F^{\otimes n},\mathbb P^{\otimes n})$ 上に、
母集団確率変数 $X$ から得られる標本確率変数列 $X_1,\ldots,X_n$ が定められているとする。

さらに、$X$ は可積分である、すなわち
$$ \mathbb E_{\mathbb P}[|X|]<\infty $$
を仮定する。
母平均を
$$ \mu:=\mathbb E_{\mathbb P}[X] $$
と定める。
標本平均を
$$ \overline X_n:=\frac{1}{n}\sum_{i=1}^n X_i $$
と定める。

Then
$$ \mathbb E_{\mathbb P^{\otimes n}}[\overline X_n]=\mu $$
holds.

標本確率変数列 $X_1,\ldots,X_n$ は、標本抽出空間 $(\Omega^n,\mathcal F^{\otimes n},\mathbb P^{\otimes n})$ 上で独立同分布に従い、各 $X_i$ の分布は母集団分布 $\mathbb P_X$ である。
したがって、任意の $i=1,\ldots,n$ に対して、
$$ \mathbb E_{\mathbb P^{\otimes n}}[X_i] = \mathbb E_{\mathbb P}[X] = \mu $$
である。
また、$X$ は可積分であり、各 $X_i$ の分布は $X$ の分布と同じであるから、任意の $i=1,\ldots,n$ に対して $X_i$ も可積分である。
よって、標本平均 $\overline X_n$ も可積分である。
By the linearity of expectation,
$$ \begin{aligned} \mathbb E_{\mathbb P^{\otimes n}}[\overline X_n] &= \mathbb E_{\mathbb P^{\otimes n}}\left[\frac{1}{n}\sum_{i=1}^n X_i\right]\\ &= \frac{1}{n}\sum_{i=1}^n\mathbb E_{\mathbb P^{\otimes n}}[X_i]\\ &= \frac{1}{n}\sum_{i=1}^n\mu\\ &= \mu \end{aligned} $$
holds.
以上より、
$$ \mathbb E_{\mathbb P^{\otimes n}}[\overline X_n]=\mu $$
が成り立つ。
$$ \Box$$

したがって、不偏推定量の定義より、$\overline X_n$ は母平均 $\mu$ の不偏推定量である。

標本平均の不偏性の証明では、標本確率変数列 $X_1,\ldots,X_n$ の独立性は本質的には用いられていない。
実際、この証明で必要なのは、各 $X_i$ が可積分であり、
$$ \mathbb E_{\mathbb P^{\otimes n}}[X_i]=\mu \quad (i=1,\ldots,n) $$
を満たすことである。
この条件のもとで、期待値の線形性より、
$$ \mathbb E_{\mathbb P^{\otimes n}}\left[\overline X_n\right] = \mathbb E_{\mathbb P^{\otimes n}}\left[\frac{1}{n}\sum_{i=1}^n X_i\right] = \frac{1}{n}\sum_{i=1}^n\mathbb E_{\mathbb P^{\otimes n}}[X_i] = \frac{1}{n}\sum_{i=1}^n\mu = \mu $$
が従う。
したがって、標本平均の不偏性には、独立性そのものではなく、各 $X_i$ が同じ期待値 $\mu$ をもつことが本質的である。
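
The remark above can be illustrated by exact enumeration: the expected sample mean equals $\mu$ both for sampling with replacement (independent draws) and for uniform ordered sampling without replacement (dependent draws with the same marginal distribution). The sketch assumes a hypothetical population with values $1,2,6$ under the uniform measure and $n=2$; the helper name `expected_sample_mean` is illustrative.

```python
from fractions import Fraction
from itertools import permutations, product

# Hypothetical population values X(omega) under the uniform measure P.
values = [1, 2, 6]
n = 2
mu = Fraction(sum(values), len(values))

def expected_sample_mean(samples):
    """Average of the sample mean over equally likely ordered samples."""
    means = [Fraction(sum(s), n) for s in samples]
    return sum(means) / len(means)

# Independent draws (with replacement) and dependent draws (without replacement).
with_replacement = list(product(values, repeat=n))
without_replacement = list(permutations(values, n))

# Set of values the sample mean can actually take with replacement.
realized_means = {Fraction(sum(s), n) for s in with_replacement}

print(expected_sample_mean(with_replacement), expected_sample_mean(without_replacement), mu)
print(mu in realized_means)
```

In this example both expectations equal $\mu=3$, although no single realized sample mean equals $3$. This matches the earlier observation that unbiasedness is a statement about the expectation, not about individual realizations.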

復元抽出における標本平均の分散

$n\in\mathbb N_{\geq 1}$ とする。
母集団確率空間 $(\Omega,\mathcal F,\mathbb P)$ と実数値母集団確率変数 $X:(\Omega,\mathcal F)\to(\mathbb R,\mathcal B(\mathbb R))$ が与えられているとする。

また、
$$ \mathbb E_{\mathbb P}[X^2]<\infty $$
を仮定し、母平均と母分散をそれぞれ
$$ \mu:=\mathbb E_{\mathbb P}[X], \quad \sigma^2:=\operatorname{Var}_{\mathbb P}(X) $$
と定める。
$X_1,\ldots,X_n$ を、独立な $n$ 回復元抽出に対応する標本抽出空間上で、母集団確率変数 $X$ から得られる標本確率変数列とする。
すなわち、
$$ X_1,\ldots,X_n \overset{\mathrm{i.i.d.}}{\sim} \mathbb P_X $$
であるとする。
標本平均を
$$ \overline X_n:=\frac{1}{n}\sum_{i=1}^nX_i $$
と定める。

Then
$$ \operatorname{Var}_{\mathbb P^{\otimes n}}(\overline X_n)=\frac{\sigma^2}{n} $$
holds.

$\mathbb E_{\mathbb P}[X^2]<\infty$ であるから、$X$ は二乗可積分であり、特に可積分である。
また、各 $X_i$ の分布は $X$ の分布と同じであるから、任意の $i=1,\ldots,n$ に対して
$$ \mathbb E_{\mathbb P^{\otimes n}}[X_i]=\mu $$
かつ
$$ \operatorname{Var}_{\mathbb P^{\otimes n}}(X_i)=\sigma^2 $$
である。
さらに、各 $X_i$ は二乗可積分であるから、$\overline X_n$ も二乗可積分であり、$\operatorname{Var}_{\mathbb P^{\otimes n}}(\overline X_n)$ は有限に定義される。
Since $X_1,\ldots,X_n$ are mutually independent and square integrable,
$$ \operatorname{Cov}_{\mathbb P^{\otimes n}}(X_i,X_j)=0 \quad (i\ne j) $$
holds.
Therefore, by the properties of the variance and of the covariance,
$$ \begin{aligned} \operatorname{Var}_{\mathbb P^{\otimes n}}(\overline X_n) &= \operatorname{Var}_{\mathbb P^{\otimes n}} \left( \frac{1}{n}\sum_{i=1}^nX_i \right)\\ &= \frac{1}{n^2} \operatorname{Var}_{\mathbb P^{\otimes n}} \left( \sum_{i=1}^nX_i \right)\\ &= \frac{1}{n^2} \left( \sum_{i=1}^n\operatorname{Var}_{\mathbb P^{\otimes n}}(X_i) + 2\sum_{1\leq i< j\leq n} \operatorname{Cov}_{\mathbb P^{\otimes n}}(X_i,X_j) \right)\\ &= \frac{1}{n^2} \sum_{i=1}^n\operatorname{Var}_{\mathbb P^{\otimes n}}(X_i)\\ &= \frac{1}{n^2} \sum_{i=1}^n\sigma^2\\ &= \frac{\sigma^2}{n} \end{aligned} $$
holds.
以上より、
$$ \operatorname{Var}_{\mathbb P^{\otimes n}}(\overline X_n)=\frac{\sigma^2}{n} $$
が成り立つ。
$$ \Box$$
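
The identity $\operatorname{Var}(\overline X_n)=\sigma^2/n$ can be confirmed exactly on a small example by enumerating $\Omega^n$. The sketch below assumes a hypothetical population with values $1,2,6$ under the uniform measure; the function name `variance_of_sample_mean` is illustrative.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population values X(omega) under the uniform measure P.
values = [Fraction(v) for v in (1, 2, 6)]
N = len(values)
mu = sum(values) / N
sigma2 = sum((v - mu) ** 2 for v in values) / N

def variance_of_sample_mean(n):
    """Exact variance of the sample mean over all ordered samples with replacement."""
    samples = list(product(values, repeat=n))
    means = [sum(s) / n for s in samples]
    second_moment = sum(m ** 2 for m in means) / len(samples)
    # Var = E[mean^2] - (E[mean])^2, where E[mean] = mu by unbiasedness.
    return second_moment - mu ** 2

for n in (1, 2, 3, 4):
    print(n, variance_of_sample_mean(n), sigma2 / n)
```

Each printed line shows the enumerated variance next to $\sigma^2/n$; the two columns coincide for every $n$ tried.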

分母を $n$ とする標本分散のバイアス

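Let $n\in\mathbb N_{\geq 1}$, and let $X_1,\ldots,X_n$ be the sample random variables obtained from the population random variable $X$ on the sampling space $(\Omega^n,\mathcal F^{\otimes n},\mathbb P^{\otimes n})$, as in the preceding propositions. Assume further that
$$ \mathbb E_{\mathbb P}[X^2]<\infty $$
holds.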
母平均と母分散をそれぞれ
$$ \mu:=\mathbb E_{\mathbb P}[X], \quad \sigma^2:=\operatorname{Var}_{\mathbb P}(X) $$
と定める。
標本平均を
$$ \overline X_n:=\frac{1}{n}\sum_{i=1}^n X_i $$
とし、分母を $n$ とする標本分散を
$$ V_n^2:=\frac{1}{n}\sum_{i=1}^n (X_i-\overline X_n)^2 $$
と定める。

Then
$$ \mathbb E_{\mathbb P^{\otimes n}}[V_n^2]=\frac{n-1}{n}\sigma^2 $$
holds.
したがって、$V_n^2$ の母分散 $\sigma^2$ に対するバイアスは
$$ -\frac{1}{n}\sigma^2 $$
である。

まず、標本確率変数列 $X_1,\ldots,X_n$ は、標本抽出空間
$$ (\Omega^n,\mathcal F^{\otimes n},\mathbb P^{\otimes n}) $$
上で独立同分布であり、各 $X_i$ の分布は母集団分布 $\mathbb P_X$ である。
したがって、任意の $i=1,\ldots,n$ に対して、
$$ \mathbb E_{\mathbb P^{\otimes n}}[X_i]=\mu $$
かつ
$$ \operatorname{Var}_{\mathbb P^{\otimes n}}(X_i)=\sigma^2 $$
である。
また、$X_i$ の分布は $X$ の母集団分布 $\mathbb P_X$ と同じであるから、
$$ \mathbb E_{\mathbb P^{\otimes n}}[X_i^2] = \mathbb E_{\mathbb P}[X^2] < \infty $$
である。
したがって、各 $X_i$ は二乗可積分であり、以下の期待値はすべて有限に定義される。

平方和の恒等式を示す。
任意の $i=1,\ldots,n$ に対して、
$$ X_i-\overline X_n = (X_i-\mu)-(\overline X_n-\mu) $$
である。
したがって、
$$ \begin{aligned} \sum_{i=1}^n (X_i-\overline X_n)^2 &= \sum_{i=1}^n \left((X_i-\mu)-(\overline X_n-\mu)\right)^2\\ &= \sum_{i=1}^n (X_i-\mu)^2 -2(\overline X_n-\mu)\sum_{i=1}^n(X_i-\mu) +\sum_{i=1}^n(\overline X_n-\mu)^2 \end{aligned} $$
である。
ここで、
$$ \sum_{i=1}^n(X_i-\mu) = \sum_{i=1}^n X_i-n\mu = n(\overline X_n-\mu) $$
であり、また
$$ \sum_{i=1}^n(\overline X_n-\mu)^2 = n(\overline X_n-\mu)^2 $$
である。
よって、
$$ \begin{aligned} \sum_{i=1}^n (X_i-\overline X_n)^2 &= \sum_{i=1}^n (X_i-\mu)^2 -2n(\overline X_n-\mu)^2 +n(\overline X_n-\mu)^2\\ &= \sum_{i=1}^n (X_i-\mu)^2 -n(\overline X_n-\mu)^2 \end{aligned} $$
である。
両辺の期待値を計算する。
上の恒等式より、
$$ \mathbb E_{\mathbb P^{\otimes n}}\left[\sum_{i=1}^n (X_i-\overline X_n)^2\right] = \mathbb E_{\mathbb P^{\otimes n}}\left[\sum_{i=1}^n (X_i-\mu)^2\right] - n\mathbb E_{\mathbb P^{\otimes n}}\left[(\overline X_n-\mu)^2\right] $$
である。
First, by the linearity of expectation,
$$ \begin{aligned} \mathbb E_{\mathbb P^{\otimes n}}\left[\sum_{i=1}^n (X_i-\mu)^2\right] &= \sum_{i=1}^n \mathbb E_{\mathbb P^{\otimes n}}\left[(X_i-\mu)^2\right]\\ &= \sum_{i=1}^n \sigma^2\\ &= n\sigma^2 \end{aligned} $$
holds.
Furthermore, again by the linearity of expectation,
$$ \begin{aligned} \mathbb E_{\mathbb P^{\otimes n}}[\overline X_n] &= \mathbb E_{\mathbb P^{\otimes n}}\left[\frac{1}{n}\sum_{i=1}^n X_i\right]\\ &= \frac{1}{n}\sum_{i=1}^n\mathbb E_{\mathbb P^{\otimes n}}[X_i]\\ &= \frac{1}{n}\sum_{i=1}^n\mu\\ &= \mu \end{aligned} $$
holds.
したがって、分散の定義より、
$$ \begin{aligned} \operatorname{Var}_{\mathbb P^{\otimes n}}(\overline X_n) &= \mathbb E_{\mathbb P^{\otimes n}} \left[ \left(\overline X_n-\mathbb E_{\mathbb P^{\otimes n}}[\overline X_n]\right)^2 \right]\\ &= \mathbb E_{\mathbb P^{\otimes n}} \left[ (\overline X_n-\mu)^2 \right] \end{aligned} $$
である。
すなわち、
$$ \mathbb E_{\mathbb P^{\otimes n}} \left[ (\overline X_n-\mu)^2 \right] = \operatorname{Var}_{\mathbb P^{\otimes n}}(\overline X_n) $$
である。
また、$X_1,\ldots,X_n$ は互いに独立であるから、
$$ \begin{aligned} \operatorname{Var}_{\mathbb P^{\otimes n}}(\overline X_n) &= \operatorname{Var}_{\mathbb P^{\otimes n}}\left(\frac{1}{n}\sum_{i=1}^n X_i\right)\\ &= \frac{1}{n^2}\sum_{i=1}^n \operatorname{Var}_{\mathbb P^{\otimes n}}(X_i)\\ &= \frac{1}{n^2}\sum_{i=1}^n \sigma^2\\ &= \frac{\sigma^2}{n} \end{aligned} $$
である。
したがって、
$$ \mathbb E_{\mathbb P^{\otimes n}}\left[(\overline X_n-\mu)^2\right] = \frac{\sigma^2}{n} $$
である。
以上より、
$$ \begin{aligned} \mathbb E_{\mathbb P^{\otimes n}}\left[\sum_{i=1}^n (X_i-\overline X_n)^2\right] &= n\sigma^2 - n\cdot\frac{\sigma^2}{n}\\ &= n\sigma^2-\sigma^2\\ &= (n-1)\sigma^2 \end{aligned} $$
である。
分母を $n$ とする標本分散の期待値を求める。
分母を $n$ とする標本分散の定義より、
$$ V_n^2 = \frac{1}{n}\sum_{i=1}^n (X_i-\overline X_n)^2 $$
であるから、
$$ \begin{aligned} \mathbb E_{\mathbb P^{\otimes n}}[V_n^2] &= \mathbb E_{\mathbb P^{\otimes n}}\left[ \frac{1}{n}\sum_{i=1}^n (X_i-\overline X_n)^2 \right]\\ &= \frac{1}{n} \mathbb E_{\mathbb P^{\otimes n}}\left[ \sum_{i=1}^n (X_i-\overline X_n)^2 \right]\\ &= \frac{1}{n}(n-1)\sigma^2\\ &= \frac{n-1}{n}\sigma^2 \end{aligned} $$
である。

From the above,
$$ \mathbb E_{\mathbb P^{\otimes n}}[V_n^2]=\frac{n-1}{n}\sigma^2 $$
holds.
また、母分散 $\sigma^2$ に対するバイアスは、
$$ \begin{aligned} \operatorname{Bias}_{\mathbb P}(V_n^2;\sigma^2) &= \mathbb E_{\mathbb P^{\otimes n}}[V_n^2]-\sigma^2\\ &= \frac{n-1}{n}\sigma^2-\sigma^2\\ &= -\frac{1}{n}\sigma^2 \end{aligned} $$
である。
$$ \Box$$

この命題より、分母を $n$ とする標本分散 $V_n^2$ は、一般には母分散 $\sigma^2$ の不偏推定量ではない。
実際、$\sigma^2>0$ の場合、
$$ \mathbb E_{\mathbb P^{\otimes n}}[V_n^2] = \frac{n-1}{n}\sigma^2 < \sigma^2 $$
である。
したがって、$V_n^2$ は母分散を平均的に過小評価する。
一方、分母を $n-1$ とする不偏標本分散
$$ U_n^2 := \frac{1}{n-1}\sum_{i=1}^n (X_i-\overline X_n)^2 $$
は、
$$ \mathbb E_{\mathbb P^{\otimes n}}[U_n^2]=\sigma^2 $$
を満たす。
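
Both expectations can be verified exactly by enumerating all ordered samples with replacement. The sketch below assumes a hypothetical population with values $1,2,6$ under the uniform measure and $n=3$; the helper names `sum_of_squares` and `expectation` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population values X(omega) under the uniform measure P.
values = [Fraction(v) for v in (1, 2, 6)]
N = len(values)
mu = sum(values) / N
sigma2 = sum((v - mu) ** 2 for v in values) / N

def sum_of_squares(xs):
    """Sum of squared deviations from the sample mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

def expectation(statistic, n):
    """Exact expectation of a statistic under i.i.d. uniform sampling."""
    samples = list(product(values, repeat=n))
    return sum(statistic(s) for s in samples) / len(samples)

n = 3
expected_v2 = expectation(lambda s: sum_of_squares(s) / n, n)        # denominator n
expected_u2 = expectation(lambda s: sum_of_squares(s) / (n - 1), n)  # denominator n - 1

print(expected_v2, Fraction(n - 1, n) * sigma2)
print(expected_u2, sigma2)
```

The first line compares $\mathbb E[V_n^2]$ with $\frac{n-1}{n}\sigma^2$ and the second compares $\mathbb E[U_n^2]$ with $\sigma^2$; in each line the two values agree.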

不偏標本分散の計算公式

$n\in\mathbb N_{\geq 2}$ とする。
母集団確率空間 $(\Omega,\mathcal F,\mathbb P)$ と実数値母集団確率変数 $X:(\Omega,\mathcal F)\to(\mathbb R,\mathcal B(\mathbb R))$ が与えられているとする。
また、独立な $n$ 回抽出に対応する標本抽出空間 $(\Omega^n,\mathcal F^{\otimes n},\mathbb P^{\otimes n})$ 上に、
母集団確率変数 $X$ から得られる標本確率変数列 $X_1,\ldots,X_n$ が定められているとする。
標本平均を
$$ \overline X_n:=\frac{1}{n}\sum_{i=1}^n X_i $$
と定め、不偏標本分散を
$$ U_n^2 := \frac{1}{n-1}\sum_{i=1}^n (X_i-\overline X_n)^2 $$
と定める。
このとき、
$$ U_n^2 = \frac{1}{n-1} \left( \sum_{i=1}^n X_i^2-n\overline X_n^2 \right) $$
が成り立つ。

不偏標本分散の定義より、
$$ U_n^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i-\overline X_n)^2 $$
である。
各 $i=1,\ldots,n$ に対して、
$$ (X_i-\overline X_n)^2 = X_i^2-2X_i\overline X_n+\overline X_n^2 $$
であるから、
$$ \begin{aligned} \sum_{i=1}^n (X_i-\overline X_n)^2 &= \sum_{i=1}^n X_i^2 -2\overline X_n\sum_{i=1}^n X_i +\sum_{i=1}^n\overline X_n^2 \end{aligned} $$
である。
ここで、標本平均の定義より、
$$ \sum_{i=1}^n X_i=n\overline X_n $$
である。
また、$\overline X_n^2$ は添字 $i$ に依存しないので、
$$ \sum_{i=1}^n\overline X_n^2=n\overline X_n^2 $$
である。
したがって、
$$ \begin{aligned} \sum_{i=1}^n (X_i-\overline X_n)^2 &= \sum_{i=1}^n X_i^2 -2\overline X_n\cdot n\overline X_n +n\overline X_n^2\\ &= \sum_{i=1}^n X_i^2 -2n\overline X_n^2 +n\overline X_n^2\\ &= \sum_{i=1}^n X_i^2 -n\overline X_n^2 \end{aligned} $$
である。
これを不偏標本分散の定義に代入すると、
$$ \begin{aligned} U_n^2 &= \frac{1}{n-1}\sum_{i=1}^n (X_i-\overline X_n)^2\\ &= \frac{1}{n-1} \left( \sum_{i=1}^n X_i^2-n\overline X_n^2 \right) \end{aligned} $$
である。
以上より、
$$ U_n^2 = \frac{1}{n-1} \left( \sum_{i=1}^n X_i^2-n\overline X_n^2 \right) $$
が成り立つ。
$$ \Box$$

観測標本 $(x_1,\ldots,x_n)$ が得られたとき、標本平均の実現値を
$$ \overline x_n:=\frac{1}{n}\sum_{i=1}^n x_i $$
とすると、不偏標本分散の実現値は
$$ u_n^2 = \frac{1}{n-1} \left( \sum_{i=1}^n x_i^2-n\overline x_n^2 \right) $$
である。

この計算公式は、確率分布の性質ではなく、確率変数の間の恒等式である。
したがって、この命題の証明では、標本確率変数列 $X_1,\ldots,X_n$ の独立性や同分布性は本質的には用いられていない。
実際、任意の実数値確率変数列
$$ X_1,\ldots,X_n $$
に対して、標本平均を
$$ \overline X_n:=\frac{1}{n}\sum_{i=1}^n X_i $$
と定めるだけで、
$$ \sum_{i=1}^n (X_i-\overline X_n)^2 = \sum_{i=1}^n X_i^2-n\overline X_n^2 $$
が点ごとに成り立つ。
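
Because the formula is a pointwise identity, it can be checked on any data set. The sketch below uses a hypothetical observed sample and compares the definition, the computational formula, and Python's `statistics.variance` (which uses the denominator $n-1$). Exact fractions are used so that the comparison is not affected by rounding.

```python
import statistics
from fractions import Fraction

# Hypothetical observed sample (x_1, ..., x_n).
data = [Fraction(x) for x in (2, 3, 7, 8)]
n = len(data)
mean = sum(data) / n

# Definition: squared deviations from the sample mean, divided by n - 1.
by_definition = sum((x - mean) ** 2 for x in data) / (n - 1)

# Computational formula: (sum of squares - n * mean^2) / (n - 1).
by_formula = (sum(x ** 2 for x in data) - n * mean ** 2) / (n - 1)

print(by_definition, by_formula, statistics.variance(data))
```

With floating-point data the computational formula subtracts two numbers of similar size and can lose precision, so the definition is often the safer choice in numerical work.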

不偏標本分散の不偏性

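In the setting of the preceding proposition (in particular $n\in\mathbb N_{\geq 2}$), assume further that
$$ \mathbb E_{\mathbb P}[X^2]<\infty $$
holds.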
母平均と母分散をそれぞれ
$$ \mu:=\mathbb E_{\mathbb P}[X], \quad \sigma^2:=\operatorname{Var}_{\mathbb P}(X) $$
と定める。
標本平均を
$$ \overline X_n:=\frac{1}{n}\sum_{i=1}^n X_i $$
とし、不偏標本分散を
$$ U_n^2:=\frac{1}{n-1}\sum_{i=1}^n (X_i-\overline X_n)^2 $$
と定める。

Then
$$ \mathbb E_{\mathbb P^{\otimes n}}[U_n^2]=\sigma^2 $$
holds.

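The proof below repeats the argument already given for the proposition on the bias of the sample variance with denominator $n$; it is reproduced here so that this proof is self-contained.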

平方和の恒等式を示す。
任意の $i=1,\ldots,n$ に対して、
$$ X_i-\overline X_n = (X_i-\mu)-(\overline X_n-\mu) $$
である。
したがって、
$$ \begin{aligned} \sum_{i=1}^n (X_i-\overline X_n)^2 &= \sum_{i=1}^n \left((X_i-\mu)-(\overline X_n-\mu)\right)^2\\ &= \sum_{i=1}^n (X_i-\mu)^2 -2(\overline X_n-\mu)\sum_{i=1}^n(X_i-\mu) +\sum_{i=1}^n(\overline X_n-\mu)^2 \end{aligned} $$
である。
ここで、
$$ \sum_{i=1}^n(X_i-\mu) = \sum_{i=1}^n X_i-n\mu = n(\overline X_n-\mu) $$
であり、また
$$ \sum_{i=1}^n(\overline X_n-\mu)^2 = n(\overline X_n-\mu)^2 $$
である。
よって、
$$ \begin{aligned} \sum_{i=1}^n (X_i-\overline X_n)^2 &= \sum_{i=1}^n (X_i-\mu)^2 -2n(\overline X_n-\mu)^2 +n(\overline X_n-\mu)^2\\ &= \sum_{i=1}^n (X_i-\mu)^2 -n(\overline X_n-\mu)^2 \end{aligned} $$
である。
両辺の期待値を計算する。
上の恒等式より、
$$ \mathbb E_{\mathbb P^{\otimes n}}\left[\sum_{i=1}^n (X_i-\overline X_n)^2\right] = \mathbb E_{\mathbb P^{\otimes n}}\left[\sum_{i=1}^n (X_i-\mu)^2\right] - n\mathbb E_{\mathbb P^{\otimes n}}\left[(\overline X_n-\mu)^2\right] $$
である。
First, by the linearity of expectation,
$$ \begin{aligned} \mathbb E_{\mathbb P^{\otimes n}}\left[\sum_{i=1}^n (X_i-\mu)^2\right] &= \sum_{i=1}^n \mathbb E_{\mathbb P^{\otimes n}}\left[(X_i-\mu)^2\right]\\ &= \sum_{i=1}^n \sigma^2\\ &= n\sigma^2 \end{aligned} $$
holds.
Furthermore, again by the linearity of expectation,
$$ \begin{aligned} \mathbb E_{\mathbb P^{\otimes n}}[\overline X_n] &= \mathbb E_{\mathbb P^{\otimes n}}\left[\frac{1}{n}\sum_{i=1}^n X_i\right]\\ &= \frac{1}{n}\sum_{i=1}^n\mathbb E_{\mathbb P^{\otimes n}}[X_i]\\ &= \frac{1}{n}\sum_{i=1}^n\mu\\ &= \mu \end{aligned} $$
holds.
したがって、分散の定義より、
$$ \begin{aligned} \operatorname{Var}_{\mathbb P^{\otimes n}}(\overline X_n) &= \mathbb E_{\mathbb P^{\otimes n}} \left[ \left(\overline X_n-\mathbb E_{\mathbb P^{\otimes n}}[\overline X_n]\right)^2 \right]\\ &= \mathbb E_{\mathbb P^{\otimes n}} \left[ (\overline X_n-\mu)^2 \right] \end{aligned} $$
である。
すなわち、
$$ \mathbb E_{\mathbb P^{\otimes n}} \left[ (\overline X_n-\mu)^2 \right] = \operatorname{Var}_{\mathbb P^{\otimes n}}(\overline X_n) $$
である。
Moreover, since $X_1,\ldots,X_n$ are mutually independent, the variance of the sum is the sum of the variances, so
$$ \begin{aligned} \operatorname{Var}_{\mathbb P^{\otimes n}}(\overline X_n) &= \operatorname{Var}_{\mathbb P^{\otimes n}}\left(\frac{1}{n}\sum_{i=1}^n X_i\right)\\ &= \frac{1}{n^2}\sum_{i=1}^n \operatorname{Var}_{\mathbb P^{\otimes n}}(X_i)\\ &= \frac{1}{n^2}\sum_{i=1}^n \sigma^2\\ &= \frac{\sigma^2}{n} \end{aligned} $$
holds.
したがって、
$$ \mathbb E_{\mathbb P^{\otimes n}}\left[(\overline X_n-\mu)^2\right] = \frac{\sigma^2}{n} $$
である。
Combining the above,
$$ \begin{aligned} \mathbb E_{\mathbb P^{\otimes n}}\left[\sum_{i=1}^n (X_i-\overline X_n)^2\right] &= n\sigma^2 - n\cdot\frac{\sigma^2}{n}\\ &= n\sigma^2-\sigma^2\\ &= (n-1)\sigma^2 \end{aligned} $$
holds.

We now compute the expectation of the unbiased sample variance.
By its definition,
$$ U_n^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i-\overline X_n)^2 $$
so that
$$ \begin{aligned} \mathbb E_{\mathbb P^{\otimes n}}[U_n^2] &= \mathbb E_{\mathbb P^{\otimes n}}\left[ \frac{1}{n-1}\sum_{i=1}^n (X_i-\overline X_n)^2 \right]\\ &= \frac{1}{n-1} \mathbb E_{\mathbb P^{\otimes n}}\left[ \sum_{i=1}^n (X_i-\overline X_n)^2 \right]\\ &= \frac{1}{n-1}(n-1)\sigma^2\\ &= \sigma^2 \end{aligned} $$
holds (see the linked properties of expectation).

From the above,
$$ \mathbb E_{\mathbb P^{\otimes n}}[U_n^2]=\sigma^2 $$
holds.
$$ \Box$$

Therefore, by the definition of an unbiased estimator, $U_n^2$ is an unbiased estimator of the population variance $\sigma^2$.
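The unbiasedness of $U_n^2$ can also be checked by brute force on a small example. The sketch below is not part of the original argument: the population values `[1, 2, 6, 7]`, the sample size `3`, and all function names are hypothetical choices made for illustration. It enumerates all $N^n$ equally likely ordered samples drawn with replacement and averages $U_n^2$ exactly, using rational arithmetic.

```python
from fractions import Fraction
from itertools import product


def population_moments(values):
    """Mean and variance of the uniform distribution on `values` (divisor N)."""
    N = len(values)
    mu = Fraction(sum(values), N)
    sigma2 = sum((Fraction(v) - mu) ** 2 for v in values) / N
    return mu, sigma2


def unbiased_sample_variance(sample):
    """U_n^2 = (1/(n-1)) * sum of squared deviations from the sample mean."""
    n = len(sample)
    mean = Fraction(sum(sample), n)
    return sum((Fraction(x) - mean) ** 2 for x in sample) / (n - 1)


def expected_unbiased_variance(values, n):
    """Exact E[U_n^2]: average over all N^n ordered samples with replacement."""
    N = len(values)
    total = sum(unbiased_sample_variance(s) for s in product(values, repeat=n))
    return total / N ** n


values = [1, 2, 6, 7]  # hypothetical population, chosen only for illustration
mu, sigma2 = population_moments(values)
expected = expected_unbiased_variance(values, 3)
print(sigma2, expected)  # both are 13/2
```

Because the enumeration is exact, the two printed values agree as fractions rather than only approximately.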

As used in the proof, if the sample random variables $X_1,\ldots,X_n$ obtained from the population random variable $X$ are independent and identically distributed and satisfy
$$ \mathbb E[X_i]=\mu, \quad \operatorname{Var}(X_i)=\sigma^2 $$
then the sample mean
$$ \overline X_n:=\frac{1}{n}\sum_{i=1}^n X_i $$
satisfies
$$ \mathbb E[\overline X_n]=\mu $$
and
$$ \operatorname{Var}(\overline X_n)=\frac{\sigma^2}{n} $$

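The important point is that the variance computation relies essentially on independence, whereas the computation of the expectation uses only linearity.
Under sampling with replacement, the draws are independent and the sample random variables $X_1,\ldots,X_n$ are independent and identically distributed, so the relations above hold as stated.
Under sampling without replacement from a finite population, on the other hand, the sample random variables are in general not independent.
For example, once an individual has been selected in the first draw, it cannot be selected in the second or any later draw, which creates dependence between the draws.
For this reason, under sampling without replacement,
$$ \operatorname{Var}(\overline X_n)=\frac{\sigma^2}{n} $$
does not hold in general.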
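
Let $N$ be the size of the finite population $\mathcal P$ and assume $N\geq 2$. Also let $1\leq n\leq N$.
If the population variance with respect to the uniform distribution on $\mathcal P$ is defined by
$$ \sigma^2 = \frac{1}{N}\sum_{\omega\in\mathcal P}(X(\omega)-\mu)^2 $$
then, under simple random sampling without replacement, the variance of the sample mean is
$$ \operatorname{Var}(\overline X_n) = \frac{\sigma^2}{n}\cdot\frac{N-n}{N-1} $$
Here
$$ \frac{N-n}{N-1} $$
is called the finite population correction factor. It reflects the dependence created by sampling without replacement.
In particular, the closer $n$ is to $N$, the larger the fraction of the population that is observed, and the smaller the variability of the sample mean.
Indeed, when $n=N$,
$$ \frac{N-n}{N-1}=0 $$
and the sample mean coincides with the population mean, so its variance is $0$.
On the other hand, when $N$ is sufficiently large and $n$ is small compared with $N$,
$$ \frac{N-n}{N-1}\approx 1 $$
so that, even under sampling without replacement,
$$ \operatorname{Var}(\overline X_n)\approx \frac{\sigma^2}{n} $$
is a good approximation.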
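The finite population correction can be confirmed on a small example by exhaustive enumeration. The following sketch assumes a hypothetical population `[1, 2, 6, 7]` (so $N=4$); the function names are illustrative and not taken from the text. It lists all ordered samples drawn without replacement, which are equally likely under simple random sampling, and compares the exact variance of the sample mean with $\frac{\sigma^2}{n}\cdot\frac{N-n}{N-1}$.

```python
from fractions import Fraction
from itertools import permutations


def population_variance(values):
    """Variance of the uniform distribution on the population (divisor N)."""
    N = len(values)
    mu = Fraction(sum(values), N)
    return sum((Fraction(v) - mu) ** 2 for v in values) / N


def variance_of_mean_without_replacement(values, n):
    """Exact Var of the sample mean over all ordered samples without replacement."""
    means = [Fraction(sum(s), n) for s in permutations(values, n)]
    center = sum(means) / len(means)
    return sum((m - center) ** 2 for m in means) / len(means)


def fpc_formula(values, n):
    """sigma^2 / n times the finite population correction (N - n) / (N - 1)."""
    N = len(values)
    return population_variance(values) / n * Fraction(N - n, N - 1)


values = [1, 2, 6, 7]  # hypothetical population, chosen only for illustration
for n in range(1, len(values) + 1):
    exact = variance_of_mean_without_replacement(values, n)
    print(n, exact, fpc_formula(values, n))
```

For $n=N$ both columns are $0$, matching the observation that the sample mean then equals the population mean.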

The sample random variables under sampling with replacement are i.i.d.

Let $N,n\in\mathbb N_{\geq 1}$.

Let the finite population be
$$ \mathcal P=\{1,\ldots,N\} $$
Define the probability space
$$ (\Omega,\mathcal F,\mathbb P) $$
that represents sampling with replacement by
$$ \Omega=\mathcal P^n, \quad \mathcal F=2^\Omega, \quad \mathbb P(\{\omega\})=\frac{1}{N^n} \quad (\omega\in\Omega) $$
For each $j=1,\ldots,n$, define the projection
$$ X_j:\Omega\to\mathcal P, \quad X_j(i_1,\ldots,i_n)=i_j $$
Here $\mathcal P$ is equipped with the $\sigma$-algebra $2^{\mathcal P}$.

Then
$$ X_1,\ldots,X_n \overset{\mathrm{i.i.d.}}{\sim} \mathrm{Unif}(\mathcal P) $$
holds.

First, $\Omega=\mathcal P^n$ is a finite set and $\mathcal F=2^\Omega$, so every $X_j$ is measurable.
Also, for any event $A\subseteq\Omega$,
$$ \mathbb P(A) = \sum_{\omega\in A}\mathbb P(\{\omega\}) = \frac{|A|}{N^n} $$
holds.

We show that the $X_j$ are identically distributed.
Let $j\in\{1,\ldots,n\}$ and $B\subseteq\mathcal P$.
Then
$$ \{X_j\in B\} = \{(i_1,\ldots,i_n)\in\mathcal P^n\mid i_j\in B\} $$
holds.
The $j$-th component is chosen from $B$, and each of the remaining $n-1$ components is chosen freely from $\mathcal P$, so
$$ |\{X_j\in B\}| = |B|N^{n-1} $$
holds.
Therefore,
$$ \mathbb P(X_j\in B) = \frac{|B|N^{n-1}}{N^n} = \frac{|B|}{N} $$
holds.
In particular, for any $k\in\mathcal P$, taking $B=\{k\}$ gives
$$ \mathbb P(X_j=k) = \frac{1}{N} $$
Hence each $X_j$ follows the uniform distribution $\mathrm{Unif}(\mathcal P)$ on $\mathcal P$.

We show independence.
Take arbitrary $B_1,\ldots,B_n\subseteq\mathcal P$.
Take an arbitrary $\omega\in\Omega=\mathcal P^n$ and write $\omega=(i_1,\ldots,i_n)$.
Then
$$ \begin{aligned} \omega\in \bigcap_{j=1}^n\{X_j\in B_j\} &\Longleftrightarrow \text{for every }j=1,\ldots,n\text{, }\omega\in\{X_j\in B_j\}\\ &\Longleftrightarrow \text{for every }j=1,\ldots,n\text{, }X_j(\omega)\in B_j\\ &\Longleftrightarrow \text{for every }j=1,\ldots,n\text{, }i_j\in B_j\\ &\Longleftrightarrow (i_1,\ldots,i_n)\in B_1\times\cdots\times B_n\\ &\Longleftrightarrow \omega\in B_1\times\cdots\times B_n \end{aligned} $$
Therefore,
$$ \bigcap_{j=1}^n\{X_j\in B_j\} = B_1\times\cdots\times B_n $$
holds. Hence, by the cardinality formula for a product of finite sets (proof still to be added),
$$ \left|\bigcap_{j=1}^n\{X_j\in B_j\}\right| = |B_1\times\cdots\times B_n| = |B_1|\cdots |B_n| $$
holds.
Therefore,
$$ \begin{aligned} \mathbb P(X_1\in B_1,\ldots,X_n\in B_n) &= \mathbb P\left(\bigcap_{j=1}^n\{X_j\in B_j\}\right)\\ &= \mathbb P(B_1\times\cdots\times B_n)\\ &= \frac{|B_1|\cdots |B_n|}{N^n}\\ &= \prod_{j=1}^n\frac{|B_j|}{N}\\ &= \prod_{j=1}^n\mathbb P(X_j\in B_j) \end{aligned} $$
holds.
Since this holds for arbitrary $B_1,\ldots,B_n\subseteq\mathcal P$ (including the choice $B_j=\mathcal P$ for some indices, which covers every subfamily), $X_1,\ldots,X_n$ are mutually independent.

From the above, $X_1,\ldots,X_n$ are mutually independent and all follow $\mathrm{Unif}(\mathcal P)$.
Therefore,
$$ X_1,\ldots,X_n \overset{\mathrm{i.i.d.}}{\sim} \mathrm{Unif}(\mathcal P) $$
holds.
$$ \Box$$
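The product formula in the proof can be verified mechanically for small parameters. The sketch below is an illustration under assumed values ($N=3$ with $n=2$ and $n=3$) and uses hypothetical helper names. It builds $\Omega=\mathcal P^n$ with the uniform measure and checks $\mathbb P(X_1\in B_1,\ldots,X_n\in B_n)=\prod_j\mathbb P(X_j\in B_j)$ for every choice of subsets $B_1,\ldots,B_n\subseteq\mathcal P$.

```python
from fractions import Fraction
from itertools import combinations, product


def all_subsets(items):
    """Every subset of `items`, as frozensets (including the empty set)."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def check_independence(N, n):
    """True if the projections X_1..X_n on P^n satisfy the product formula."""
    population = range(1, N + 1)
    omega = list(product(population, repeat=n))  # all N^n ordered samples

    def prob(event):
        # Uniform measure: P(A) = |A| / N^n
        return Fraction(sum(1 for w in omega if event(w)), len(omega))

    subsets = all_subsets(population)
    for Bs in product(subsets, repeat=n):
        joint = prob(lambda w: all(w[j] in Bs[j] for j in range(n)))
        marginals = Fraction(1)
        for j in range(n):
            marginals *= prob(lambda w, j=j: w[j] in Bs[j])
        if joint != marginals:
            return False
    return True


print(check_independence(3, 2), check_independence(3, 3))
```

The check is exhaustive only for the chosen small $N$ and $n$; it illustrates the proposition and does not replace the counting argument.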

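We confirm that $\mathbb P$ is a probability measure on $\Omega$.
Since $\Omega=\mathcal P^n$ and $|\mathcal P|=N$, the cardinality formula for a product of finite sets (proof still to be added) gives
$$ |\Omega| = |\mathcal P^n| = |\mathcal P|^n = N^n $$
Also, for any $\omega\in\Omega$,
$$ \mathbb P(\{\omega\}) = \frac{1}{N^n} \geq 0 $$
and
$$ \sum_{\omega\in\Omega}\mathbb P(\{\omega\}) = \sum_{\omega\in\Omega}\frac{1}{N^n} = |\Omega|\frac{1}{N^n} = N^n\frac{1}{N^n} = 1 $$
holds.
Since $\Omega$ is finite, extending $\mathbb P$ to every $A\subseteq\Omega$ by $\mathbb P(A)=\sum_{\omega\in A}\mathbb P(\{\omega\})$ is additive over disjoint events.
Therefore, $\mathbb P$ is the uniform probability measure on $\Omega$.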

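By this proposition,
$$ X_1,\ldots,X_n \overset{\mathrm{i.i.d.}}{\sim} \mathrm{Unif}(\mathcal P) $$
holds.
In this setting, the set of all ordered sample sequences is
$$ \mathcal P^n $$
Since $|\mathcal P|=N$, the cardinality formula for a product of finite sets (proof still to be added) gives
$$ |\mathcal P^n|=|\mathcal P|^n=N^n $$
Furthermore, for any $(i_1,\ldots,i_n)\in\{1,\ldots,N\}^n$, independence and uniformity give
$$ \begin{aligned} \mathbb P(X_1=i_1,\ldots,X_n=i_n) &= \prod_{k=1}^n\mathbb P(X_k=i_k)\\ &= \prod_{k=1}^n\frac{1}{N}\\ &= \left(\frac{1}{N}\right)^n \end{aligned} $$
Therefore, under sampling with replacement there are $N^n$ ordered sample sequences in total, and each of them occurs with the same probability
$$ \left(\frac{1}{N}\right)^n $$
This fact depends on treating the sample as an ordered sequence.
Because the sampling is with replacement, the same individual may appear more than once.
If, on the other hand, the sample is treated as unordered, that is, as a multiset, then the probability of each sample is in general not
$$ \left(\frac{1}{N}\right)^n $$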
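To see concretely why unordered samples are not equally likely, one can group the $N^n$ ordered sequences by their sorted form. The sketch below assumes the small hypothetical case $N=3$, $n=2$, and the function name is illustrative. It counts how many ordered sequences collapse to each multiset and converts the counts into exact probabilities.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def multiset_probabilities(N, n):
    """Probability of each unordered sample (multiset) under sampling with replacement."""
    # Sorting an ordered sample gives a canonical representative of its multiset.
    counts = Counter(tuple(sorted(s)) for s in product(range(1, N + 1), repeat=n))
    total = N ** n  # number of equally likely ordered samples
    return {m: Fraction(c, total) for m, c in counts.items()}


probs = multiset_probabilities(3, 2)
for multiset, p in sorted(probs.items()):
    print(multiset, p)
# (1, 1) has probability 1/9, while (1, 2) has probability 2/9
```

The multiset $\{1,2\}$ arises from the two ordered sequences $(1,2)$ and $(2,1)$, whereas $\{1,1\}$ arises from only one, so their probabilities differ.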

Distinct sample random variables under sampling with replacement are uncorrelated

Let $N,n\in\mathbb N_{\geq 1}$ and consider the finite set
$$ \mathcal P=\{x_1,\ldots,x_N\}\subset\mathbb R $$
where $x_1,\ldots,x_N$ are pairwise distinct.
Define the probability space that represents sampling with replacement by
$$ \Omega=\mathcal P^n, \quad \mathcal F=2^\Omega, \quad \mathbb P(\{\omega\})=\frac{1}{N^n} \quad (\omega\in\Omega) $$
For each $k=1,\ldots,n$, define
$$ X_k:\Omega\to\mathcal P, \quad X_k(\omega_1,\ldots,\omega_n)=\omega_k $$
Then, for any $i,j\in\{1,\ldots,n\}$ with $i\ne j$,
$$ \operatorname{Cov}(X_i,X_j)=0 $$
holds.

The proof of the previous proposition uses only $|\mathcal P|=N$, so it applies to the finite population $\mathcal P=\{x_1,\ldots,x_N\}$ and gives
$$ X_1,\ldots,X_n \overset{\mathrm{i.i.d.}}{\sim} \mathrm{Unif}(\mathcal P) $$
Therefore, for any $i,j\in\{1,\ldots,n\}$ with $i\ne j$, $X_i$ and $X_j$ are independent.
Also, since $\mathcal P=\{x_1,\ldots,x_N\}\subset\mathbb R$ is a finite set,
$$ M:=\max\{|x_1|,\ldots,|x_N|\} $$
exists.
Each $X_i$ takes values only in $\mathcal P$, so for any sample point $\omega\in\Omega$,
$$ |X_i(\omega)|\leq M $$
holds. Similarly,
$$ |X_j(\omega)|\leq M $$
holds.
Therefore, $X_i$ and $X_j$ are bounded, and moreover
$$ |X_i(\omega)X_j(\omega)|\leq M^2 $$
so $X_iX_j$ is also bounded.
Hence $\mathbb E[X_i],\ \mathbb E[X_j],\ \mathbb E[X_iX_j]$ are all finite.
By independence,
$$ \mathbb E[X_iX_j] = \mathbb E[X_i]\mathbb E[X_j] $$
holds (see the linked proof).
Therefore, by the covariance formula (see the linked proof),
$$ \begin{aligned} \operatorname{Cov}(X_i,X_j) &= \mathbb E[X_iX_j]-\mathbb E[X_i]\mathbb E[X_j]\\ &= \mathbb E[X_i]\mathbb E[X_j]-\mathbb E[X_i]\mathbb E[X_j]\\ &= 0 \end{aligned} $$
holds.
From the above, for any $i\ne j$,
$$ \operatorname{Cov}(X_i,X_j)=0 $$
holds.
$$ \Box$$
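The vanishing covariance can be reproduced by direct computation on $\Omega=\mathcal P^n$. In the sketch below, the population `[-2, 1, 5]`, the sample size `3`, and the function names are hypothetical choices for illustration, and indices are 0-based as usual in Python. Expectations are exact averages over all $N^n$ sample points.

```python
from fractions import Fraction
from itertools import product


def expectation(f, omega):
    """Exact expectation of f under the uniform measure on the finite set omega."""
    return sum(Fraction(f(w)) for w in omega) / len(omega)


def covariance(i, j, values, n):
    """Cov(X_i, X_j) for the coordinate projections on values^n (0-based indices)."""
    omega = list(product(values, repeat=n))
    e_i = expectation(lambda w: w[i], omega)
    e_j = expectation(lambda w: w[j], omega)
    e_ij = expectation(lambda w: w[i] * w[j], omega)
    return e_ij - e_i * e_j


values = [-2, 1, 5]  # hypothetical population of distinct reals
n = 3
for i in range(n):
    for j in range(n):
        print(i, j, covariance(i, j, values, n))
```

For $i\ne j$ the printed covariance is $0$, while for $i=j$ it equals the population variance, which is positive here.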

Unbiasedness and variance of the population proportion estimator under sampling with replacement

Let $N,n\in\mathbb N_{\geq 1}$ and let the finite population be $\mathcal P=\{1,\ldots,N\}$.
Also let $A\subseteq\mathcal P$ and define $M:=|A|$.

Define the probability space $(\Omega,\mathcal F,\mathbb P)$ that represents sampling with replacement by
$$ \Omega=\mathcal P^n, \quad \mathcal F=2^\Omega, \quad \mathbb P(\{\omega\})=\frac{1}{N^n} \quad (\omega\in\Omega) $$
For each $i=1,\ldots,n$, define the projection
$$ X_i:\Omega\to\mathcal P, \quad X_i(\omega_1,\ldots,\omega_n)=\omega_i $$
Then $X_1,\ldots,X_n$ is the sequence of sample random variables obtained by drawing $n$ times, independently and with replacement, according to the uniform distribution on $\mathcal P$.
For each $i=1,\ldots,n$, define
$$ I_i:=\mathbf 1_{\{X_i\in A\}} $$
and define the population proportion by
$$ p:=\frac{M}{N} $$
Define the estimator of the population proportion $p$ by
$$ \hat p:=\frac{1}{n}\sum_{i=1}^n I_i $$

Then
$$ \mathbb E[\hat p]=p, \quad \operatorname{Var}(\hat p)=\frac{p(1-p)}{n} $$
holds.

By the earlier proposition on independence and identical distribution, the sample random variables obtained by sampling with replacement satisfy
$$ X_1,\ldots,X_n \overset{\mathrm{i.i.d.}}{\sim} \mathrm{Unif}(\mathcal P) $$
First, for any $i=1,\ldots,n$,
$$ \mathbb P(X_i\in A) = \frac{|A|}{|\mathcal P|} = \frac{M}{N} = p $$
holds.
Therefore,
$$ \mathbb P(I_i=1)=p $$
and
$$ \mathbb P(I_i=0)=1-p $$
hold. Hence
$$ I_i\sim\mathrm{Bernoulli}(p) $$
Also, since $X_1,\ldots,X_n$ are mutually independent and each $I_i$ is a function of $X_i$,
$$ I_1,\ldots,I_n $$
are also mutually independent (see the note after the proof).
Therefore,
$$ I_1,\ldots,I_n \overset{\mathrm{i.i.d.}}{\sim} \mathrm{Bernoulli}(p) $$
holds.

We show unbiasedness.
By the linearity of expectation (see the linked proof),
$$ \begin{aligned} \mathbb E[\hat p] &= \mathbb E\left[\frac{1}{n}\sum_{i=1}^n I_i\right]\\ &= \frac{1}{n}\sum_{i=1}^n\mathbb E[I_i]\\ &= \frac{1}{n}\sum_{i=1}^n p\\ &= p \end{aligned} $$
holds.
Therefore, $\hat p$ is an unbiased estimator of $p$.

We compute the variance.
Since $I_1,\ldots,I_n$ are mutually independent, the additivity of variance (see the linked proof) gives
$$ \operatorname{Var}\left(\sum_{i=1}^n I_i\right) = \sum_{i=1}^n\operatorname{Var}(I_i) $$
Also, $I_i\sim\mathrm{Bernoulli}(p)$ and $I_i$ takes only the values $0$ and $1$, so $I_i^2=I_i$.
Therefore,
$$ \mathbb E[I_i]=p, \quad \mathbb E[I_i^2]=p $$
holds.
Hence, by the variance formula (see the linked proof),
$$ \operatorname{Var}(I_i) = \mathbb E[I_i^2]-(\mathbb E[I_i])^2 = p-p^2 = p(1-p) $$
holds.
Therefore,
$$ \begin{aligned} \operatorname{Var}(\hat p) &= \operatorname{Var}\left(\frac{1}{n}\sum_{i=1}^n I_i\right)\\ &= \frac{1}{n^2}\operatorname{Var}\left(\sum_{i=1}^n I_i\right)\\ &= \frac{1}{n^2}\sum_{i=1}^n\operatorname{Var}(I_i)\\ &= \frac{1}{n^2}\sum_{i=1}^n p(1-p)\\ &= \frac{p(1-p)}{n} \end{aligned} $$
holds.

From the above,
$$ \mathbb E[\hat p]=p $$
and
$$ \operatorname{Var}(\hat p)=\frac{p(1-p)}{n} $$
hold.
$$ \Box$$
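Both formulas for $\hat p$ can be checked exactly on a small case. The sketch below assumes hypothetical values $N=5$, $A=\{1,2\}$, and $n=3$, and the function name is illustrative. It evaluates $\hat p$ on every ordered sample in $\mathcal P^n$ and computes its mean and variance under the uniform measure.

```python
from fractions import Fraction
from itertools import product


def p_hat_moments(N, A, n):
    """Exact mean and variance of p_hat over all N^n ordered samples with replacement."""
    omega = list(product(range(1, N + 1), repeat=n))
    # p_hat(w) = (number of draws that land in A) / n
    p_hats = [Fraction(sum(1 for x in w if x in A), n) for w in omega]
    mean = sum(p_hats) / len(p_hats)
    var = sum((v - mean) ** 2 for v in p_hats) / len(p_hats)
    return mean, var


N, A, n = 5, {1, 2}, 3  # hypothetical population size, subset, and sample size
p = Fraction(len(A), N)
mean, var = p_hat_moments(N, A, n)
print(mean, p)                # both 2/5
print(var, p * (1 - p) / n)   # both 2/25
```

Changing `A` to the empty set or to the whole population gives $p=0$ or $p=1$, and the variance formula then correctly returns $0$.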

When $X_1,\ldots,X_n$ are mutually independent and each $I_i$ is a function of $X_i$,
$$ I_1,\ldots,I_n $$
are also mutually independent.
Indeed, for each $i=1,\ldots,n$,
$$ I_i=\mathbf 1_{\{X_i\in A\}} $$
so $I_i$ is determined by the value of $X_i$ alone.
For any $B_i\subseteq\{0,1\}$, setting
$$ C_i:=\{x\in\mathcal P\mid \mathbf 1_A(x)\in B_i\} $$
we obtain
$$ \{I_i\in B_i\} = \{X_i\in C_i\} $$
Here $C_i\subseteq\mathcal P$.
Therefore, by the independence of $X_1,\ldots,X_n$,
$$ \begin{aligned} \mathbb P(I_1\in B_1,\ldots,I_n\in B_n) &= \mathbb P(X_1\in C_1,\ldots,X_n\in C_n)\\ &= \prod_{i=1}^n\mathbb P(X_i\in C_i)\\ &= \prod_{i=1}^n\mathbb P(I_i\in B_i) \end{aligned} $$
holds.
Since this holds for arbitrary $B_1,\ldots,B_n\subseteq\{0,1\}$,
$$ I_1,\ldots,I_n $$
are mutually independent.

Central limit theorem for the sample mean under sampling with replacement

Let $N\in\mathbb N_{\geq 2}$ and let the finite population be
$$ \mathcal P=\{x_1,\ldots,x_N\}\subset\mathbb R $$
where $x_1,\ldots,x_N$ are pairwise distinct.

Let
$$ X_1,X_2,\ldots $$
be the sequence of random variables obtained by drawing independently and with replacement according to the uniform distribution on $\mathcal P$.
That is, assume
$$ X_1,X_2,\ldots \overset{\mathrm{i.i.d.}}{\sim} \mathrm{Unif}(\mathcal P) $$
Also set
$$ \mu:=\mathbb E[X_1] $$
and
$$ \sigma^2:=\operatorname{Var}(X_1) $$

Then $\sigma^2>0$, and defining, for each $n\in\mathbb N_{\geq 1}$,
$$ \overline X_n:=\frac{1}{n}\sum_{i=1}^n X_i $$
we have
$$ \frac{\overline X_n-\mu}{\sigma/\sqrt n} \xrightarrow{d} \mathcal N(0,1) \quad (n\to\infty) $$

First, since $\mathcal P=\{x_1,\ldots,x_N\}\subset\mathbb R$ is a finite set,
$$ M:=\max\{|x_1|,\ldots,|x_N|\} $$
exists.
Each $X_i$ takes values only in $\mathcal P$, so for any $i\in\mathbb N_{\geq 1}$,
$$ |X_i|\leq M $$
holds.
Therefore, each $X_i$ is bounded, and
$$ \mathbb E[X_i] $$
and
$$ \operatorname{Var}(X_i) $$
are finite.

Next, since $N\geq 2$ and $x_1,\ldots,x_N$ are pairwise distinct, $\mathcal P$ contains at least two different values.
Also, since $X_1\sim\mathrm{Unif}(\mathcal P)$,
$$ \sigma^2 = \operatorname{Var}(X_1) = \frac{1}{N}\sum_{k=1}^N (x_k-\mu)^2 $$
holds.
If $\sigma^2=0$, then every term of this sum of nonnegative numbers must vanish, so $x_k=\mu$ for all $k=1,\ldots,N$.
However, this is impossible because $N\geq 2$ and $x_1,\ldots,x_N$ are pairwise distinct.
Therefore, $\sigma^2>0$.

By assumption,
$$ X_1,X_2,\ldots $$
are mutually independent and follow the same distribution $\mathrm{Unif}(\mathcal P)$, with finite mean $\mu$ and finite variance $\sigma^2>0$.
Hence, by the central limit theorem,
$$ \frac{\sum_{i=1}^n(X_i-\mu)}{\sigma\sqrt n} \xrightarrow{d} \mathcal N(0,1) \quad (n\to\infty) $$
holds.

On the other hand, by the definition of the sample mean,
$$ \begin{aligned} \frac{\overline X_n-\mu}{\sigma/\sqrt n} &= \frac{\frac{1}{n}\sum_{i=1}^n X_i-\mu}{\sigma/\sqrt n}\\ &= \frac{\frac{1}{n}\sum_{i=1}^n(X_i-\mu)}{\sigma/\sqrt n}\\ &= \frac{\sum_{i=1}^n(X_i-\mu)}{\sigma\sqrt n} \end{aligned} $$
holds.

Therefore,
$$ \frac{\overline X_n-\mu}{\sigma/\sqrt n} \xrightarrow{d} \mathcal N(0,1) \quad (n\to\infty) $$
holds.
$$ \Box$$

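This proposition states the asymptotic normality of the sample mean obtained by sampling with replacement from a finite population.
Under sampling with replacement, the draws are independent and each $X_i$ follows the same population distribution.

Therefore, a sample sequence obtained by sampling with replacement from a finite population falls within the framework of the classical central limit theorem for independent and identically distributed sequences.
Under sampling without replacement, on the other hand, the sample random variables are in general not independent, so this form of the central limit theorem cannot be applied directly.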
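The convergence in distribution can be observed numerically without simulation. The sketch below is only an illustration: the population `[1, 2, 3, 4, 5, 6]`, the sample sizes, and the function names are hypothetical. It computes the exact distribution of $\sum_{i=1}^n X_i$ by repeated convolution and measures the largest gap between the distribution function of the standardized sample mean and the standard normal distribution function.

```python
import math


def exact_distribution_of_sum(values, n):
    """Distribution of X_1 + ... + X_n for i.i.d. uniform draws (dict: sum -> prob)."""
    dist = {0: 1.0}
    weight = 1.0 / len(values)
    for _ in range(n):
        new_dist = {}
        for s, pr in dist.items():
            for v in values:
                new_dist[s + v] = new_dist.get(s + v, 0.0) + pr * weight
        dist = new_dist
    return dist


def normal_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def max_cdf_gap(values, n):
    """Largest gap between the CDF of the standardized mean and the normal CDF."""
    N = len(values)
    mu = sum(values) / N
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / N)
    gap, cdf = 0.0, 0.0
    dist = exact_distribution_of_sum(values, n)
    for s in sorted(dist):
        z = (s - n * mu) / (sigma * math.sqrt(n))
        phi = normal_cdf(z)
        gap = max(gap, abs(cdf - phi))  # just before the jump at z
        cdf += dist[s]
        gap = max(gap, abs(cdf - phi))  # just after the jump at z
    return gap


values = [1, 2, 3, 4, 5, 6]  # hypothetical population (a fair die)
gaps = {n: max_cdf_gap(values, n) for n in (4, 16, 64, 256)}
for n, g in gaps.items():
    print(n, round(g, 4))
```

Because the population is discrete, the distribution function of the standardized mean is a step function, so the gap shrinks as $n$ grows but is never exactly zero for finite $n$.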

Posted: June 2

Updated: June 9


Author

Kagura
