# Statistics | Linear Regression | STAT452

## Linear Regression | Diagnostics

> Automatic or blind use of regression models, especially in exploratory work, all too often leads to incorrect or meaningless results and to confusion rather than insight. At the very least, a user should be prepared to make and study a number of plots before, during, and after fitting the model.
>
> Chambers et al. (1983, p. 306)

Diagnostics are used to check whether model assumptions are reasonable. This section focuses on diagnostics for the unimodal MLR model $Y_i=\boldsymbol{x}_i^T \boldsymbol{\beta}+e_i$ for $i=1, \ldots, n$, where the errors are iid from a unimodal distribution that is not highly skewed, with $\mathrm{E}\left(e_i\right)=0$ and $\operatorname{VAR}\left(e_i\right)=\sigma^2$. See Definition 2.6. It is often useful to use notation that separates the constant from the nontrivial predictors. Assume that $\boldsymbol{x}_i=\left(1, x_{i, 2}, \ldots, x_{i, p}\right)^T \equiv\left(1, \boldsymbol{u}_i^T\right)^T$, where the $(p-1) \times 1$ vector of nontrivial predictors is $\boldsymbol{u}_i=\left(x_{i, 2}, \ldots, x_{i, p}\right)^T$. In matrix form,
$$\begin{gathered} \boldsymbol{Y}=\boldsymbol{X} \boldsymbol{\beta}+\boldsymbol{e}, \\ \boldsymbol{X}=\left[X_1, X_2, \ldots, X_p\right]=[\mathbf{1}, \boldsymbol{U}] \end{gathered}$$

Here $\mathbf{1}$ is an $n \times 1$ vector of ones, and $\boldsymbol{U}=\left[X_2, \ldots, X_p\right]$ is the $n \times(p-1)$ matrix of nontrivial predictors. The $k$th column of $\boldsymbol{U}$ is the $n \times 1$ vector of the $j$th predictor $X_j=\left(x_{1, j}, \ldots, x_{n, j}\right)^T$, where $j=k+1$. The sample mean and covariance matrix of the nontrivial predictors are
$$\overline{\boldsymbol{u}}=\frac{1}{n} \sum_{i=1}^n \boldsymbol{u}_i$$
and
$$\boldsymbol{C}=\operatorname{Cov}(\boldsymbol{U})=\frac{1}{n-1} \sum_{i=1}^n\left(\boldsymbol{u}_i-\overline{\boldsymbol{u}}\right)\left(\boldsymbol{u}_i-\overline{\boldsymbol{u}}\right)^T$$
respectively, where $\boldsymbol{u}_i^T$ is the $i$th row of $\boldsymbol{U}$.
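
The notation above can be made concrete with a small sketch. It uses only the Python standard library; the function names `sample_mean`, `sample_cov`, and `design_matrix` and the data in `U` are hypothetical, chosen for illustration rather than taken from the text.

```python
def sample_mean(U):
    """Sample mean vector of the rows u_i^T of U (a list of n rows)."""
    n = len(U)
    q = len(U[0])
    return [sum(row[j] for row in U) / n for j in range(q)]


def sample_cov(U):
    """Sample covariance matrix of the rows of U, with divisor n - 1."""
    n = len(U)
    q = len(U[0])
    ubar = sample_mean(U)
    C = [[0.0] * q for _ in range(q)]
    for row in U:
        d = [row[j] - ubar[j] for j in range(q)]  # u_i - ubar
        for a in range(q):
            for b in range(q):
                C[a][b] += d[a] * d[b] / (n - 1)
    return C


def design_matrix(U):
    """Prepend the constant 1 to each row u_i^T, giving X = [1, U]."""
    return [[1.0] + list(row) for row in U]


# Hypothetical data: n = 4 cases, p - 1 = 2 nontrivial predictors.
U = [[1.0, 2.0], [2.0, 4.0], [3.0, 5.0], [4.0, 9.0]]
ubar = sample_mean(U)
C = sample_cov(U)
X = design_matrix(U)
```

For this toy data the mean is $(2.5, 5)^T$ and $\boldsymbol{C}$ is symmetric, as the outer-product form of the definition requires.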

## Linear Regression | Outlier Detection

Definition 3.12. Outliers are cases that lie far from the bulk of the data. Hence $Y$ outliers are cases that have unusually large vertical distances from the MLR fit to the bulk of the data while $\boldsymbol{x}$ outliers are cases with predictors $\boldsymbol{x}$ that lie far from the bulk of the $\boldsymbol{x}_i$. Suppose that some analysis to detect outliers is performed. Masking occurs if the analysis suggests that one or more outliers are in fact good cases. Swamping occurs if the analysis suggests that one or more good cases are outliers.

The residual and response plots are very useful for detecting outliers. If there is a cluster of cases with outlying $Y$s, the identity line will often pass through the outliers. If there are two clusters with similar $Y$s, then the two plots may fail to show the clusters. In that case, methods for detecting $\boldsymbol{x}$ outliers may be useful.
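
As a rough illustration, the coordinates for both plots can be computed directly. The sketch below assumes the usual convention that the response plot shows fitted values against $Y$ (with the identity line added) and the residual plot shows fitted values against residuals. To stay short it fits a model with a single nontrivial predictor by ordinary least squares; the function `ols_simple` and the data are hypothetical.

```python
def ols_simple(x, y):
    """Least squares fit of y = b0 + b1 * x for one nontrivial predictor."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1


# Hypothetical data: y = 2x for the first five cases; the last case is a Y outlier.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0, 30.0]

b0, b1 = ols_simple(x, y)
fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Points to draw: (fitted, Y) for the response plot, (fitted, residual) for the residual plot.
response_plot = list(zip(fitted, y))
residual_plot = list(zip(fitted, residuals))

# Index of the case with the largest absolute residual.
largest = max(range(len(x)), key=lambda i: abs(residuals[i]))
```

Here the last case has the largest absolute residual, but the least squares line is itself pulled toward that case, so its residual understates the distance from the fit to the bulk of the data. This is one reason to study the plots rather than rely on residual size alone.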

Let the $q$ continuous predictors in the MLR model be collected into vectors $\boldsymbol{u}_i$ for $i=1, \ldots, n$. Let the $n \times q$ matrix $\boldsymbol{W}$ have $n$ rows $\boldsymbol{u}_1^T, \ldots, \boldsymbol{u}_n^T$. Let the $q \times 1$ column vector $T(\boldsymbol{W})$ be a multivariate location estimator, and let the $q \times q$ symmetric positive definite matrix $\boldsymbol{C}(\boldsymbol{W})$ be a covariance estimator. Often $q=p-1$ and only the constant is omitted from $\boldsymbol{x}_i$ to create $\boldsymbol{u}_i$.

Definition 3.13. The $i$th squared Mahalanobis distance is
$$D_i^2=D_i^2(T(\boldsymbol{W}), \boldsymbol{C}(\boldsymbol{W}))=\left(\boldsymbol{u}_i-T(\boldsymbol{W})\right)^T \boldsymbol{C}^{-1}(\boldsymbol{W})\left(\boldsymbol{u}_i-T(\boldsymbol{W})\right)$$
for each point $\boldsymbol{u}_i$. Notice that $D_i^2$ is a random variable (scalar valued).
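
A minimal sketch of Definition 3.13 follows. It assumes the classical choice of estimators, with $T(\boldsymbol{W})$ the sample mean and $\boldsymbol{C}(\boldsymbol{W})$ the sample covariance matrix; the definition itself allows any location and covariance estimator. The helper names `mean_and_cov`, `solve`, and `squared_mahalanobis` and the data in `W` are hypothetical. Rather than forming $\boldsymbol{C}^{-1}$ explicitly, the code solves $\boldsymbol{C} \boldsymbol{z}=\boldsymbol{u}_i-T$ and then takes the inner product with $\boldsymbol{u}_i-T$.

```python
def mean_and_cov(W):
    """Classical estimators: sample mean T and sample covariance C (divisor n - 1)."""
    n, q = len(W), len(W[0])
    T = [sum(row[j] for row in W) / n for j in range(q)]
    C = [[sum((row[a] - T[a]) * (row[b] - T[b]) for row in W) / (n - 1)
          for b in range(q)] for a in range(q)]
    return T, C


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    q = len(b)
    M = [list(A[r]) + [b[r]] for r in range(q)]  # augmented matrix [A | b]
    for col in range(q):
        pivot = max(range(col, q), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, q):
            factor = M[r][col] / M[col][col]
            for c in range(col, q + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * q
    for r in range(q - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * z[c] for c in range(r + 1, q))
        z[r] = (M[r][q] - s) / M[r][r]
    return z


def squared_mahalanobis(W):
    """Return D_i^2 = (u_i - T)^T C^{-1} (u_i - T) for every row u_i^T of W."""
    T, C = mean_and_cov(W)
    distances = []
    for row in W:
        d = [uj - tj for uj, tj in zip(row, T)]
        z = solve(C, d)  # z = C^{-1} (u_i - T)
        distances.append(sum(dj * zj for dj, zj in zip(d, z)))
    return distances


# Hypothetical data: n = 5 cases, q = 2 predictors; the last case lies far from the rest.
W = [[1.0, 2.0], [2.0, 4.0], [3.0, 5.0], [4.0, 9.0], [10.0, 1.0]]
D2 = squared_mahalanobis(W)
farthest = max(range(len(W)), key=lambda i: D2[i])
```

With these classical estimators the squared distances always sum to $(n-1) q$, which gives a quick sanity check on an implementation. Because the sample mean and covariance are themselves affected by outlying cases, distances based on them can be subject to masking.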
