```
library(nntrf)
library(mlr)
library(mlrCPO)
library(FNN)
```

**nntrf** has several hyper-parameters that are important for obtaining good results. These are:

- **size:** the number of hidden neurons
- **maxit:** the number of training iterations
- **repetitions:** the number of training repetitions
- **use_sigmoid:** whether the transformation should use the sigmoid or not
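Given those hyper-parameters, **nntrf** can also be called directly, outside any pipeline. The snippet below is only a sketch: the values chosen for **size**, **maxit**, and **repetitions** are arbitrary examples, not recommendations.

```
# Direct use of nntrf (sketch; hyper-parameter values are arbitrary examples)
library(nntrf)
data("doughnutRandRotated")
nnpo <- nntrf(formula = V11 ~ .,
              data = doughnutRandRotated,
              size = 4, maxit = 100, repetitions = 2,
              trace = FALSE)
# The returned object exposes $trf, which applies the learned
# transformation to a feature matrix
trf_x <- nnpo$trf(x = as.matrix(doughnutRandRotated[, -11]),
                  use_sigmoid = FALSE)
```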

Machine learning pipelines usually contain two kinds of steps: pre-processing and a classifier/regressor. Both kinds of steps have hyper-parameters, and they are optimized together. **nntrf** is a pre-processing step. The classifier used after pre-processing is KNN, whose main hyper-parameter is the number of neighbors (**k**). Hyper-parameter tuning could be programmed from scratch, but it is more convenient to use the procedures already available in machine learning packages such as **mlr** or **caret**. In this case, **mlr** will be used. The code to do this is described below.

The next piece of code has nothing to do with **nntrf**. It just establishes that the doughnutRandRotated dataset is going to be used (with “V11” as the target variable), that grid search will be used for hyper-parameter tuning, and that an outer 3-fold crossvalidation will be used to evaluate models, while an inner 3-fold crossvalidation will be used for hyper-parameter tuning.

```
data("doughnutRandRotated")
doughnut_task <- makeClassifTask(data = doughnutRandRotated, target = "V11")
control_grid <- makeTuneControlGrid()
inner_desc <- makeResampleDesc("CV", iters = 3)
outer_desc <- makeResampleDesc("CV", iters = 3)
set.seed(0)
outer_inst <- makeResampleInstance(outer_desc, doughnut_task)
```

A mlr companion package, called **mlrCPO**, is going to be used to combine pre-processing and learning into a single pipeline. In order to do that, **nntrf** must be defined as a pipeline step, as follows. Basically, it defines **cpo.train** and **cpo.retrafo** functions. The former trains the neural network and stores the hidden layer weights; the latter applies the transformation to a dataset. **pSS** is used to define the main **nntrf** hyper-parameters. The piece of code below can simply be copied for use in other scripts.

```
cpo_nntrf = makeCPO("nntrfCPO",
                    # Here, the hyper-parameters of nntrf are defined
                    pSS(repetitions = 1 : integer[1, ],
                        size: integer[1, ],
                        maxit = 100 : integer[1, ],
                        use_sigmoid = FALSE: logical),
                    dataformat = "numeric",
                    cpo.train = function(data, target,
                                         repetitions, size, maxit, use_sigmoid) {
                      data_and_class <- cbind(as.data.frame(data), class = target[[1]])
                      # The trained nntrf object (which stores the hidden layer
                      # weights) becomes the control object used by cpo.retrafo
                      nnpo <- nntrf(repetitions = repetitions,
                                    formula = class ~ .,
                                    data = data_and_class,
                                    size = size, maxit = maxit, trace = FALSE)
                    },
                    cpo.retrafo = function(data, control,
                                           repetitions, size, maxit, use_sigmoid) {
                      # control holds the nntrf object trained by cpo.train
                      trf_x <- control$trf(x = data, use_sigmoid = use_sigmoid)
                      trf_x
                    })
```

Next, the pipeline of pre-processing + classifier method (KNN in this case) is defined.

```
# knn is the machine learning method. The knn available in the FNN package is used
knn_lrn <- makeLearner("classif.fnn")
# Then, knn is combined with nntrf's preprocessing into a pipeline
knn_nntrf <- cpo_nntrf() %>>% knn_lrn
# Just in case, we fix the values of the hyper-parameters that we do not require to optimize
# (not necessary, because they already have default values. Just to make their values explicit)
knn_nntrf <- setHyperPars(knn_nntrf, nntrfCPO.repetitions = 1, nntrfCPO.maxit = 100,
                          nntrfCPO.use_sigmoid = FALSE)
# However, we are going to use 2 repetitions here, instead of 1 (the default):
knn_nntrf <- setHyperPars(knn_nntrf, nntrfCPO.repetitions=2)
```

Next, the hyper-parameter space for the pipeline is defined. Only two hyper-parameters will be optimized: the number of KNN neighbors (**k**), from 1 to 7, and the number of hidden neurons (**size**), from 1 to 10. The remaining hyper-parameters keep the values fixed above.

```
ps <- makeParamSet(makeDiscreteParam("k", values = 1:7),
                   makeDiscreteParam("nntrfCPO.size", values = 1:10))
```

Next, a mlr wrapper is used to give the **knn_nntrf** pipeline the ability to do hyper-parameter tuning.

```
knn_nntrf_tune <- makeTuneWrapper(knn_nntrf, resampling = inner_desc, par.set = ps,
                                  control = control_grid, measures = list(acc),
                                  show.info = FALSE)
```

Finally, the complete process (3-fold inner hyper-parameter tuning plus 3-fold outer model evaluation) is run. It takes some time.

```
set.seed(0)
# Please note that, in order to save time, results have been precomputed
cached <- system.file("extdata", "error_knn_nntrf_tune.rda", package = "nntrf")
if (file.exists(cached)) {
  load(cached)
} else {
  error_knn_nntrf_tune <- resample(knn_nntrf_tune, doughnut_task, outer_inst,
                                   measures = list(acc),
                                   extract = getTuneResult, show.info = FALSE)
  # save(error_knn_nntrf_tune, file="../inst/extdata/error_knn_nntrf_tune.rda")
}
```

Errors and optimal hyper-parameters are as follows (the 3-fold inner hyper-parameter tuning crossvalidation accuracy is also shown as **acc.test.mean**). **nntrfCPO.size** is the number of hidden neurons selected by hyper-parameter tuning. Although the optimal value would be 2 (the actual doughnut is defined in two dimensions only), hyper-parameter tuning is not able to reduce dimensionality that much in this case. But it will be shown later that the accuracy obtained by **nntrf+knn** is good.

```
print(error_knn_nntrf_tune$extract)
#> [[1]]
#> Tune result:
#> Op. pars: k=4; nntrfCPO.size=10
#> acc.test.mean=0.9602523
#>
#> [[2]]
#> Tune result:
#> Op. pars: k=4; nntrfCPO.size=8
#> acc.test.mean=0.9631010
#>
#> [[3]]
#> Tune result:
#> Op. pars: k=3; nntrfCPO.size=5
#> acc.test.mean=0.9708971
```

The final outer 3-fold crossvalidation accuracy is displayed in the next cell. Please note that this **acc.test.mean** corresponds to the outer 3-fold crossvalidation, while the **acc.test.mean** values above correspond to the inner 3-fold crossvalidation accuracy (computed during hyper-parameter tuning).

```
print(error_knn_nntrf_tune$aggr)
#> acc.test.mean
#> 0.9655999
```
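The per-fold outer accuracies, rather than just their average, can be inspected as well:

```
# Accuracy on each of the three outer folds
print(error_knn_nntrf_tune$measures.test)
```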

Although not required, mlr makes it possible to display the results of the different hyper-parameter values, sorted by the **inner** 3-fold crossvalidation accuracy, from best to worst.

```
library(dplyr)
results_hyper <- generateHyperParsEffectData(error_knn_nntrf_tune)
head(arrange(results_hyper$data, -acc.test.mean))
#> k nntrfCPO.size acc.test.mean iteration exec.time nested_cv_run
#> 1 3 5 0.9708971 31 2.821 3
#> 2 7 4 0.9668467 28 2.604 3
#> 3 4 8 0.9631010 53 4.002 2
#> 4 7 8 0.9610013 56 3.876 2
#> 5 4 10 0.9602523 67 4.363 1
#> 6 5 8 0.9601016 54 4.108 2
```

We can also check directly what would happen with only 4 hidden neurons (and 5 neighbors).

```
knn_nntrf <- cpo_nntrf() %>>% makeLearner("classif.fnn")
knn_nntrf <- setHyperPars(knn_nntrf, nntrfCPO.repetitions = 2, nntrfCPO.maxit = 100,
                          nntrfCPO.use_sigmoid = FALSE, k = 5, nntrfCPO.size = 4)
set.seed(0)
# Please note that, in order to save time, results have been precomputed
cached <- system.file("extdata", "error_knn_nntrf.rda", package = "nntrf")
if (file.exists(cached)) {
  load(cached)
} else {
  error_knn_nntrf <- resample(knn_nntrf, doughnut_task, outer_inst,
                              measures = list(acc), show.info = FALSE)
  # save(error_knn_nntrf, file="../inst/extdata/error_knn_nntrf.rda")
}
```

```
# First, the three evaluations of the outer 3-fold crossvalidation, one per fold:
print(error_knn_nntrf$measures.test)
#> iter acc
#> 1 1 0.9564956
#> 2 2 0.9741974
#> 3 3 0.9271146
# Second, their average
print(error_knn_nntrf$aggr)
#> acc.test.mean
#> 0.9526025
```

In order to compare a supervised transformation method (**nntrf**) with an unsupervised one (PCA), it is very easy to carry out exactly the same pre-processing with PCA. In this case, the main hyper-parameters are **k** (the number of KNN neighbors) and **pca.rank** (the number of PCA components to be used, which is the counterpart of **size**, the number of hidden neurons used by **nntrf**).

```
knn_pca <- cpoPca(center=TRUE, scale=TRUE, export=c("rank")) %>>% knn_lrn
ps_pca <- makeParamSet(makeDiscreteParam("k", values = 1:7),
                       makeDiscreteParam("pca.rank", values = 1:10))
knn_pca_tune <- makeTuneWrapper(knn_pca, resampling = inner_desc, par.set = ps_pca,
                                control = control_grid, measures = list(acc),
                                show.info = FALSE)
```

```
set.seed(0)
# Please note that, in order to save time, results have been precomputed
cached <- system.file("extdata", "error_knn_pca_tune.rda", package = "nntrf")
if (file.exists(cached)) {
  load(cached)
} else {
  error_knn_pca_tune <- resample(knn_pca_tune, doughnut_task, outer_inst,
                                 measures = list(acc),
                                 extract = getTuneResult, show.info = FALSE)
  # save(error_knn_pca_tune, file="../inst/extdata/error_knn_pca_tune.rda")
}
```

It can be seen below that while **nntrf** was able to reach a high accuracy, **PCA** only gets to nearly 0.64. Also, the number of components required by **PCA** is the maximum allowed (**pca.rank**=10).

```
print(error_knn_pca_tune$extract)
#> [[1]]
#> Tune result:
#> Op. pars: k=2; pca.rank=10
#> acc.test.mean=0.6338697
#>
#> [[2]]
#> Tune result:
#> Op. pars: k=6; pca.rank=10
#> acc.test.mean=0.6401682
#>
#> [[3]]
#> Tune result:
#> Op. pars: k=6; pca.rank=10
#> acc.test.mean=0.6398140
print(error_knn_pca_tune$aggr)
#> acc.test.mean
#> 0.6384994
results_hyper <- generateHyperParsEffectData(error_knn_pca_tune)
head(arrange(results_hyper$data, -acc.test.mean))
#> k pca.rank acc.test.mean iteration exec.time nested_cv_run
#> 1 6 10 0.6401682 69 1.880 2
#> 2 6 10 0.6398140 69 1.760 3
#> 3 4 10 0.6380138 67 1.634 3
#> 4 4 10 0.6362675 67 1.763 2
#> 5 2 10 0.6347687 65 1.762 2
#> 6 2 10 0.6338697 65 1.475 1
```
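As a quick side-by-side summary, the two outer crossvalidation accuracies can be printed together (assuming both resample results from above are still in the session):

```
# nntrf+knn vs PCA+knn, outer 3-fold crossvalidation accuracy
# (0.9655999 vs 0.6384994 in the precomputed runs above)
c(nntrf = unname(error_knn_nntrf_tune$aggr),
  pca = unname(error_knn_pca_tune$aggr))
```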

For completeness' sake, below are the results with no pre-processing, just KNN (results are very similar to the ones with PCA):

```
ps_knn <- makeParamSet(makeDiscreteParam("k", values = 1:7))
knn_tune <- makeTuneWrapper(knn_lrn, resampling = inner_desc, par.set = ps_knn,
                            control = control_grid, measures = list(acc),
                            show.info = FALSE)
set.seed(0)
# Please note that, in order to save time, results have been precomputed
cached <- system.file("extdata", "error_knn_tune.rda", package = "nntrf")
if (file.exists(cached)) {
  load(cached)
} else {
  error_knn_tune <- resample(knn_tune, doughnut_task, outer_inst,
                             measures = list(acc),
                             extract = getTuneResult, show.info = FALSE)
  # save(error_knn_tune, file="../inst/extdata/error_knn_tune.rda")
}
```

```
print(error_knn_tune$extract)
#> [[1]]
#> Tune result:
#> Op. pars: k=6
#> acc.test.mean=0.6362696
#>
#> [[2]]
#> Tune result:
#> Op. pars: k=6
#> acc.test.mean=0.6343180
#>
#> [[3]]
#> Tune result:
#> Op. pars: k=4
#> acc.test.mean=0.6336634
print(error_knn_tune$aggr)
#> acc.test.mean
#> 0.6383997
```