# Electronic Engineering | Data Management and Data Systems | CMSC724

## Landslide Conditioning Factor Selection Results

The predictive capability of the collected conditioning factors was evaluated using a tree-based feature importance method. As shown in Figs. 6 and 7, rainfall has the highest predictive capability among all conditioning factors in both study areas. This result agrees with previous research identifying rainfall as a primary cause of landslides [15], with the other conditioning factors having less impact. In comparison to Ireland, the aspect and curvature factors in Ratnapura have very small average importance values.

If a conditioning factor scores a negligible Average Importance (AI), it should be removed from the feature set [22]. Since curvature and aspect showed very low importance in the Ratnapura area, they were not used in building the susceptibility map for Ratnapura. The remaining factors were selected to train the models, since they all obtained significant AI scores. This observation supports the hypothesis from previous research that the impact of these conditioning factors can vary with geological location [23]. Hence, these average feature importance values are likely to change for a new landslide zone with different geological properties. It is therefore highly recommended that researchers building susceptibility maps use the proposed feature importance technique to quantify the importance of the factors afresh for each zone.
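The selection rule described above (average each factor's importance, then drop the factors whose average is negligible) can be sketched as follows. This is only an illustration: the per-tree importance values are made up, `slope` is a hypothetical placeholder factor, and the cut-off of 0.05 is an assumption, since the text does not state what counts as negligible.

```python
from statistics import mean


def average_importance(per_tree_importances):
    """Average each factor's importance score over all trees."""
    factors = per_tree_importances[0].keys()
    return {f: mean(tree[f] for tree in per_tree_importances) for f in factors}


def select_factors(avg_importance, threshold):
    """Keep only the factors whose average importance reaches the threshold."""
    return [f for f, score in avg_importance.items() if score >= threshold]


# Made-up per-tree importances for illustration only (not the study's values).
per_tree = [
    {"rainfall": 0.50, "slope": 0.45, "aspect": 0.02, "curvature": 0.03},
    {"rainfall": 0.46, "slope": 0.49, "aspect": 0.01, "curvature": 0.04},
    {"rainfall": 0.54, "slope": 0.41, "aspect": 0.03, "curvature": 0.02},
]

ai = average_importance(per_tree)
# The 0.05 threshold is an assumed value chosen for this sketch.
selected = select_factors(ai, threshold=0.05)
print(selected)
```

With these illustrative numbers, aspect and curvature fall below the assumed threshold and are dropped, mirroring the Ratnapura case.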

In landslide modeling, it is essential to assess the quality and performance of the trained models. The $F$-score, precision, and recall achieved by the trained models on both the training and test datasets are reported in Tables 2 and 3 for Ratnapura and Ireland, respectively. All three models exhibit reasonably good predictive capability. On the test set, the random forest classifier scored the highest $F$-score and precision, while XGBoost produced a higher recall than the random forest and rotation forest classifiers. These observations hold in both study areas. The cross-validation results indicate that the random forest and XGBoost models are not overfitted. However, the rotation forest model can be considered overfitted, since it performs worse on the test set than on its training set.

## Model Validation

In machine learning-based modeling, one of the most critical phases is the validation of the prediction model. Validation helps in two ways: it quantifies the ability of the model to generalize to unseen examples, and it quantifies how accurately the model performs on both seen and unseen examples. In this research, all models were tested thoroughly to verify that each is properly fitted to the training dataset without overfitting or underfitting.

Models can be assessed by comparing their predictions with known historical landslide data. A common approach is to split the dataset into training and testing subsets with an 80-20 split: 80% of the data is used as the training set, while the remaining unseen 20% is used for testing. Since landslide datasets are imbalanced, the SMOTE sampling method was applied before model training in this study. This technique increases the number of samples in the minority class by interpolation.
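A minimal sketch of the 80-20 split followed by SMOTE-style oversampling is given below, using only the Python standard library. It is not the study's implementation: the toy two-feature data, the function names, and the choice of `k=3` neighbours are assumptions, and the oversampling is applied to the training portion only so that the test portion stays untouched.

```python
import random


def train_test_split_80_20(rows, seed=0):
    """Shuffle the rows and split them into 80% training and 20% test."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # Sort the other minority samples by squared distance to the base point.
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Toy imbalanced data: 10 landslide cells (label 1), 90 non-landslide cells (label 0).
data_rng = random.Random(42)
landslide = [((data_rng.gauss(2, 0.5), data_rng.gauss(2, 0.5)), 1) for _ in range(10)]
stable = [((data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)), 0) for _ in range(90)]

train, test = train_test_split_80_20(landslide + stable)
minority = [x for x, y in train if y == 1]
majority = [x for x, y in train if y == 0]

# Add enough synthetic landslide samples to balance the training set.
extra = smote_like(minority, n_new=len(majority) - len(minority))
balanced_train = train + [(x, 1) for x in extra]
```

In practice, a maintained implementation such as the `SMOTE` class in the imbalanced-learn package would normally be preferred over a hand-written version like this one.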

However, overfitting can still be present in the trained model, making it less accurate on unseen data. Tenfold cross-validation was conducted to make sure that the models are not overfitted. The data is divided into ten subsets; each time, one subset is used as the test set while the other nine are combined to form the training set.
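The fold construction described above can be sketched as an index generator. The function name `k_fold_indices` is hypothetical, and the sketch omits shuffling, which is commonly done before the folds are formed.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs so that each fold is the test set once."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so that sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(25, k=10))
for train_idx, test_idx in folds:
    # A model would be trained on train_idx and scored on test_idx here.
    assert not set(train_idx) & set(test_idx)
```

Averaging the score over the ten test folds gives an estimate that depends less on any single split; a large gap between training and fold scores is the usual sign of overfitting.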

In this study, landslide mapping was treated as a binary classification problem with two possible outputs: landslide occurrence or non-landslide occurrence.
The four possible prediction types are shown in the confusion matrix in Fig. 5.
TP (True Positive) and TN (True Negative) are the numbers of cells correctly classified as landslide and non-landslide, respectively, while FP (False Positive) and FN (False Negative) are the numbers of cells incorrectly classified as landslide and non-landslide, respectively.
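Precision, recall, and the $F$-score follow directly from these four counts. The sketch below uses the standard definitions and assumes the $F$-score is the balanced $F_1$ variant, which the text does not state explicitly; the labels are illustrative, with 1 marking a landslide cell and 0 a non-landslide cell.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN), treating label 1 as the landslide class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def precision_recall_f1(tp, fp, fn):
    """Standard definitions; a ratio with a zero denominator is reported as 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative labels only (not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(tp, tn, fp, fn)         # 3 3 1 1
print(precision, recall, f1)  # 0.75 0.75 0.75
```

Because landslide cells are the minority class, recall (the fraction of real landslide cells that are detected) is often the more informative of the two ratios, which is why it is reported alongside precision rather than accuracy alone.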
