# Linear Regression (STAT452)

## Variable Selection

Variable selection, also called subset or model selection, is the search for a subset of predictor variables that can be deleted without important loss of information. A model for variable selection in multiple linear regression can be described by
$$Y=\boldsymbol{x}^T \boldsymbol{\beta}+e=\boldsymbol{\beta}^T \boldsymbol{x}+e=\boldsymbol{x}_S^T \boldsymbol{\beta}_S+\boldsymbol{x}_E^T \boldsymbol{\beta}_E+e=\boldsymbol{x}_S^T \boldsymbol{\beta}_S+e$$
where $e$ is an error, $Y$ is the response variable, $\boldsymbol{x}=\left(\boldsymbol{x}_S^T, \boldsymbol{x}_E^T\right)^T$ is a $p \times 1$ vector of predictors, $\boldsymbol{x}_S$ is a $k_S \times 1$ vector, and $\boldsymbol{x}_E$ is a $\left(p-k_S\right) \times 1$ vector. Given that $\boldsymbol{x}_S$ is in the model, $\boldsymbol{\beta}_E=\mathbf{0}$ and $E$ denotes the subset of terms that can be eliminated given that the subset $S$ is in the model.

Since $S$ is unknown, candidate subsets will be examined. Let $\boldsymbol{x}_I$ be the vector of $k$ terms from a candidate subset indexed by $I$, and let $\boldsymbol{x}_O$ be the vector of the remaining predictors (out of the candidate submodel). Then
$$Y=\boldsymbol{x}_I^T \boldsymbol{\beta}_I+\boldsymbol{x}_O^T \boldsymbol{\beta}_O+e .$$
Definition 3.7. The model $Y=\boldsymbol{x}^T \boldsymbol{\beta}+e$ that uses all of the predictors is called the full model. A model $Y=\boldsymbol{x}_I^T \boldsymbol{\beta}_I+e$ that only uses a subset $\boldsymbol{x}_I$ of the predictors is called a submodel. The full model is always a submodel. The sufficient predictor (SP) is the linear combination of the predictor variables used in the model. Hence the full model has $S P=\boldsymbol{x}^T \boldsymbol{\beta}$ and the submodel has $S P=\boldsymbol{x}_I^T \boldsymbol{\beta}_I$.
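As a minimal numerical sketch of the full model versus a submodel (all data here is simulated for illustration; the true $\boldsymbol{\beta}_E = \mathbf{0}$ for the last two predictors, so the subset $S$ consists of the first two):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: p = 4 predictors, but only the first k_S = 2
# (the subset S) actually influence the response, i.e. beta_E = 0.
n, p = 100, 4
X = rng.normal(size=(n, p))
beta = np.array([2.0, -1.0, 0.0, 0.0])
y = X @ beta + rng.normal(scale=0.5, size=n)

# Full model: SP = x^T beta, estimated by least squares.
beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)

# Submodel indexed by I = {0, 1}: SP = x_I^T beta_I.
I = [0, 1]
beta_I, *_ = np.linalg.lstsq(X[:, I], y, rcond=None)

# When beta_E is (near) 0, the residual sums of squares are close,
# so the terms in O can be deleted without important loss of information.
rss_full = np.sum((y - X @ beta_full) ** 2)
rss_sub = np.sum((y - X[:, I] @ beta_I) ** 2)
print(rss_full, rss_sub)
```

The submodel's residual sum of squares is never smaller than the full model's, but here the two are close because the deleted terms carry no information.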

## Bootstrapping Variable Selection

The bootstrap will be described and then applied to variable selection. Suppose there is data $\boldsymbol{w}_1, \ldots, \boldsymbol{w}_n$ collected from a distribution with cdf $F$ into an $n \times p$ matrix $\boldsymbol{W}$. The empirical distribution, with cdf $F_n$, gives each observed data case $\boldsymbol{w}_i$ probability $1 / n$. Let the statistic $T_n=t(\boldsymbol{W})=t\left(F_n\right)$ be computed from the data. Suppose the statistic estimates $\boldsymbol{\mu}=t(F)$. Let $t\left(\boldsymbol{W}^*\right)=t\left(F_n^*\right)=T_n^*$ indicate that $t$ was computed from an iid sample from the empirical distribution $F_n$: a sample of size $n$ was drawn with replacement from the observed sample $\boldsymbol{w}_1, \ldots, \boldsymbol{w}_n$.
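The resampling step above can be sketched as follows (simulated data; the statistic $t$ is taken to be the sample mean for concreteness):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated sample w_1, ..., w_n as an n x p matrix W; take the
# statistic t to be the sample mean, so T_n = t(F_n) = w-bar.
n, p = 50, 2
W = rng.normal(size=(n, p))
T_n = W.mean(axis=0)

# Each bootstrap replicate draws n rows with replacement from W
# (an iid sample from the empirical distribution F_n), then
# recomputes the statistic to get T_n^*.
B = 200
T_star = np.empty((B, p))
for b in range(B):
    idx = rng.integers(0, n, size=n)   # sample with replacement
    T_star[b] = W[idx].mean(axis=0)

# The replicates T_1^*, ..., T_B^* approximate the sampling
# distribution of T_n about mu = t(F).
print(T_star.mean(axis=0))
```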

Some notation is needed to give the Olive (2013a) prediction region used to bootstrap a hypothesis test. Suppose $\boldsymbol{w}_1, \ldots, \boldsymbol{w}_n$ are iid $p \times 1$ random vectors with mean $\boldsymbol{\mu}$ and nonsingular covariance matrix $\boldsymbol{\Sigma}_{\boldsymbol{w}}$. Let a future test observation $\boldsymbol{w}_f$ be independent of the $\boldsymbol{w}_i$ but from the same distribution. Let $(\overline{\boldsymbol{w}}, \boldsymbol{S})$ be the sample mean and sample covariance matrix where
$$\overline{\boldsymbol{w}}=\frac{1}{n} \sum_{i=1}^n \boldsymbol{w}_i \text { and } \boldsymbol{S}=\boldsymbol{S}_{\boldsymbol{w}}=\frac{1}{n-1} \sum_{i=1}^{n}\left(\boldsymbol{w}_{i}-\overline{\boldsymbol{w}}\right)\left(\boldsymbol{w}_{i}-\overline{\boldsymbol{w}}\right)^{T} .$$
Then the $i$th squared sample Mahalanobis distance is the scalar
$$D_{\boldsymbol{w}}^2=D_{\boldsymbol{w}}^2(\overline{\boldsymbol{w}}, \boldsymbol{S})=(\boldsymbol{w}-\overline{\boldsymbol{w}})^T \boldsymbol{S}^{-1}(\boldsymbol{w}-\overline{\boldsymbol{w}}) .$$
Let $D_i^2=D_{\boldsymbol{w}_i}^2$ for each observation $\boldsymbol{w}_i$. Let $D_{(c)}$ be the $c$th order statistic of $D_1, \ldots, D_n$. Consider the hyperellipsoid
$$\mathcal{A}_n=\left\{\boldsymbol{w}: D_{\boldsymbol{w}}^2(\overline{\boldsymbol{w}}, \boldsymbol{S}) \leq D_{(c)}^2\right\}=\left\{\boldsymbol{w}: D_{\boldsymbol{w}}(\overline{\boldsymbol{w}}, \boldsymbol{S}) \leq D_{(c)}\right\} . \quad (3.10)$$
If $n$ is large, we can use $c=k_n=\lceil n(1-\delta)\rceil$. If $n$ is not large, using $c=U_n$, where $U_n$ decreases to $k_n$, can improve small sample performance. Olive (2013a) showed that (3.10) is a large sample $100(1-\delta) \%$ prediction region for a large class of distributions, although regions with smaller volumes may exist.
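The prediction region can be sketched numerically (simulated data; $c=k_n=\lceil n(1-\delta)\rceil$ is used, which assumes $n$ is large):

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated iid sample of p x 1 vectors.
n, p, delta = 200, 3, 0.05
W = rng.normal(size=(n, p))

wbar = W.mean(axis=0)
S = np.cov(W, rowvar=False)          # divides by n - 1
S_inv = np.linalg.inv(S)

# Squared Mahalanobis distances D_i^2 = (w_i - wbar)^T S^{-1} (w_i - wbar).
centered = W - wbar
D2 = np.einsum('ij,jk,ik->i', centered, S_inv, centered)

# Cutoff: the c-th order statistic of the D_i with c = ceil(n (1 - delta)).
c = int(np.ceil(n * (1 - delta)))
D2_c = np.sort(D2)[c - 1]

# A future w_f lies inside the hyperellipsoid A_n iff D_wf^2 <= D2_c.
w_f = rng.normal(size=p)
d2_f = (w_f - wbar) @ S_inv @ (w_f - wbar)
print(d2_f <= D2_c)
```

By construction, a fraction $c/n$ of the observed sample falls inside the region; the large sample result says a future $\boldsymbol{w}_f$ falls inside with probability approaching $1-\delta$.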
