# Machine Learning | COMP30027

## Evaluating Classification Models

So far, when developing classifiers, we have focused on maximizing the alignment between the labels and the model’s outputs. For example, in the case of logistic regression, we want the predicted probability $p_\theta\left(y_i=1 \mid x_i\right)$ to be as close as possible to the label $y_i$. Implicitly, when doing so, we are trying to maximize the model’s accuracy:
$$\operatorname{accuracy}\left(y, f_\theta(X)\right)=\frac{1}{|y|} \sum_{i=1}^{|y|} \delta\left(f_\theta\left(x_i\right)=y_i\right),$$
where $\delta$ is an indicator function, and $f_\theta\left(x_i\right)$ is the binarized output of the model (e.g., in the case of logistic regression, $f_\theta\left(x_i\right)=\delta\left(x_i \cdot \theta>0\right)$). Equivalently, we are minimizing the error, that is,
$$\operatorname{error}\left(y, f_\theta(X)\right)=1-\operatorname{accuracy}\left(y, f_\theta(X)\right) .$$
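As a minimal sketch of these two definitions, the following computes both quantities from a list of labels and a list of binarized predictions. The function names `accuracy` and `error` and the toy data are illustrative assumptions, not taken from the original text.

```python
def accuracy(y, predictions):
    """Fraction of instances whose binarized prediction equals the label."""
    assert len(y) == len(predictions) and len(y) > 0
    correct = sum(1 for label, pred in zip(y, predictions) if label == pred)
    return correct / len(y)


def error(y, predictions):
    """The error is one minus the accuracy."""
    return 1 - accuracy(y, predictions)


# Toy labels and binarized model outputs (made up for illustration)
y = [1, 0, 0, 1, 0]
predictions = [1, 0, 1, 1, 0]

print(accuracy(y, predictions))
print(error(y, predictions))
```

Here four of the five predictions match their labels, so the accuracy is four fifths and the error is the remaining fifth.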
To motivate the difficulty of properly evaluating classifiers, consider the following classification task. We saw in Figure 2.9 that there was a slight relationship between gender and review length; now, let us see if we can develop a simple classifier that attempts to predict gender based on review length.

Surprisingly, the resulting classifier is $98.5 \%$ accurate. This result might seem implausible, but it turns out to reflect a limitation of the error measure itself. Counting the number of negative labels in the dataset reveals that the data is $98.5 \%$ male (i.e., $98.5 \%$ negative labels). Not only does this show that accuracy is unlikely to be an informative metric in this case, it also shows that our goal of optimizing accuracy caused us to learn a trivial classifier: the model simply predicts zero everywhere.
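The effect can be reproduced without any model at all. The sketch below is not the original experiment: it uses a hypothetical label vector with the same 98.5% share of negative labels and a "classifier" that always predicts zero, and all names are illustrative.

```python
def accuracy(y, predictions):
    """Fraction of predictions that match the labels."""
    return sum(1 for label, pred in zip(y, predictions) if label == pred) / len(y)


# Hypothetical labels: 985 negatives and 15 positives (98.5% negative)
y = [0] * 985 + [1] * 15

# A trivial "classifier" that ignores its input and always predicts zero
trivial_predictions = [0] * len(y)

trivial_accuracy = accuracy(y, trivial_predictions)
positives_found = sum(
    1 for label, pred in zip(y, trivial_predictions) if label == 1 and pred == 1
)

print(trivial_accuracy)  # equals the fraction of negative labels
print(positives_found)   # no positive instance is ever identified
```

The accuracy matches the share of negative labels exactly, even though the predictor never identifies a single positive instance.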

The aforementioned example demonstrates the problem with naively computing (or optimizing) model accuracy. Several situations where we might need more nuanced evaluation measures include:

• Datasets whose labels are highly imbalanced, such as the previous example.
• Situations where different types of errors have different associated costs. For example, failing to detect dangerous luggage in an airport is a more severe mistake than an erroneous positive identification.
• When we use classifiers for search or retrieval (as we will often do when developing recommender systems), we often care about the ability of the model to confidently identify a few positive instances (e.g., those surfaced on a results page), and are not interested in its overall accuracy.

Below we will develop error measures designed to handle each of these scenarios.

## Optimizing the Balanced Error Rate

Having argued that the balanced error rate (BER) may be preferable to accuracy if we wish to avoid rewarding trivial solutions, we next ask how to train a classifier that avoids producing such solutions in the first place.

Intuitively, the degenerate solutions we saw in Section 3.3 (i.e., a classifier which predicted zero everywhere) arose due to an imbalance in our training data (that is, a large majority of either positive or negative labels). Trivially, we might correct this by re-sampling our training data: that is, either keeping only a fraction of the instances from the over-represented class, or sampling instances from the under-represented class (with replacement) until we have an equal number of positive and negative instances.
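Both re-sampling strategies can be sketched in a few lines. The helper names `oversample_minority` and `undersample_majority`, the fixed seed, and the toy data are assumptions made for illustration. The oversampling variant shown here keeps every minority instance once and then draws the remainder with replacement, which is one of several reasonable choices.

```python
import random


def _split_by_size(y):
    """Return (minority_indices, majority_indices) for binary labels."""
    positives = [i for i, label in enumerate(y) if label == 1]
    negatives = [i for i, label in enumerate(y) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    minority, majority = sorted([positives, negatives], key=len)
    return minority, majority


def oversample_minority(X, y, seed=0):
    """Draw minority instances with replacement until the classes are balanced."""
    rng = random.Random(seed)
    minority, majority = _split_by_size(y)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    indices = majority + minority + extra
    rng.shuffle(indices)
    return [X[i] for i in indices], [y[i] for i in indices]


def undersample_majority(X, y, seed=0):
    """Keep a random subset of the majority class matching the minority size."""
    rng = random.Random(seed)
    minority, majority = _split_by_size(y)
    kept = rng.sample(majority, len(minority))
    indices = minority + kept
    rng.shuffle(indices)
    return [X[i] for i in indices], [y[i] for i in indices]


# Hypothetical data: one feature per instance, 98.5% negative labels
y = [0] * 985 + [1] * 15
X = [[i] for i in range(len(y))]

X_over, y_over = oversample_minority(X, y)
X_under, y_under = undersample_majority(X, y)
print(len(y_over), sum(y_over))    # total size and number of positives
print(len(y_under), sum(y_under))
```

Oversampling keeps all of the data at the cost of repeated instances, while undersampling discards most of the majority class.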

While this is a common and reasonably effective strategy, the same goal can be achieved more directly by weighting the positive and negative instances. Note that in our objective for logistic regression:
$$\sum_{y_i=1} \log \left(\frac{1}{1+e^{-x_i \cdot \theta}}\right)+\sum_{y_i=0} \log \left(\frac{e^{-x_i \cdot \theta}}{1+e^{-x_i \cdot \theta}}\right),$$
the two summations (over $y_i=1$ and $y_i=0$ ) essentially reward the model for correctly predicting positive instances and negative instances. The issue with this objective is that one of the two terms can dominate the expression in the event that positive or negative instances are over-represented in our dataset.
To address this, we can normalize the two expressions by the number of samples in the positive and negative classes:
$$\frac{|y|}{2\left|\left\{i \mid y_i=1\right\}\right|} \sum_{y_i=1} \log \left(\frac{1}{1+e^{-x_i \cdot \theta}}\right)+\frac{|y|}{2\left|\left\{i \mid y_i=0\right\}\right|} \sum_{y_i=0} \log \left(\frac{e^{-x_i \cdot \theta}}{1+e^{-x_i \cdot \theta}}\right) .$$
By doing so, the two terms have equal importance, so that the positively labeled instances collectively carry the same weight as the negatively labeled instances; in other words, the two terms (after normalization) roughly correspond to the True Positive Rate and True Negative Rate, as in Equation (3.20). Note that in addition to normalizing by the number of samples, both terms are multiplied by $\frac{|y|}{2}$; this is not strictly necessary but is done by convention so that the total 'weight' of all instances is still $|y|$.
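The weighted objective can be written out directly. The following is a minimal sketch in plain Python, assuming binary labels in {0, 1} and feature vectors given as lists; the function names `class_weights`, `log_sigmoid`, and `balanced_log_likelihood` and the toy data are illustrative, not part of the original text.

```python
import math


def class_weights(y):
    """Per-class weights |y| / (2 * |{i : y_i = c}|) for c in {0, 1}."""
    n = len(y)
    n_pos = sum(1 for label in y if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    return {1: n / (2 * n_pos), 0: n / (2 * n_neg)}


def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def balanced_log_likelihood(theta, X, y):
    """Class-weighted logistic regression objective (to be maximized)."""
    weights = class_weights(y)
    total = 0.0
    for x_i, y_i in zip(X, y):
        z = sum(a * b for a, b in zip(x_i, theta))
        # log p(y=1 | x) = log_sigmoid(z); log p(y=0 | x) = log_sigmoid(-z)
        log_prob = log_sigmoid(z) if y_i == 1 else log_sigmoid(-z)
        total += weights[y_i] * log_prob
    return total


# Hypothetical data: one feature per instance, 98.5% negative labels
y = [0] * 985 + [1] * 15
X = [[float(i % 7)] for i in range(len(y))]

weights = class_weights(y)
total_weight = sum(weights[label] for label in y)
print(weights)
print(total_weight)  # the per-instance weights sum to |y|
print(balanced_log_likelihood([0.0], X, y))
```

Each class contributes a total weight of $|y|/2$, so neither term can dominate simply because its class is over-represented. Libraries often expose the same idea as an option; for example, scikit-learn's `class_weight='balanced'` setting is documented as using `n_samples / (n_classes * np.bincount(y))`, which has the same form as the weights above in the binary case.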
