## Initializing Model

Because of the R6 API, we first have to create a new class object that takes the data, the name of the target as a character string, and the loss to use. Note that the loss must be passed as an initialized object.

Passing an initialized loss object makes it possible to use a loss that was initialized with a custom offset.
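To make the role of an offset concrete, here is a minimal base-R sketch that does not depend on the package. It assumes labels coded as -1/+1 and the binomial loss log(1 + exp(-2yf)) that the model prints in its summary; the toy labels and the helper names `binomial_loss` and `constant_risk` are made up for illustration. Under these assumptions the constant score that minimizes the empirical risk is half the log-odds of the positive class. Whether the package derives its default offset in exactly this way is not confirmed here.

```r
# Binomial loss L(y, f) = log(1 + exp(-2 * y * f)), labels assumed in {-1, +1}.
binomial_loss = function(y, f) log(1 + exp(-2 * y * f))

# Empirical risk of a constant score: the mean loss over all observations.
constant_risk = function(const, y) mean(binomial_loss(y, const))

# Toy labels (made up): 60 positives and 40 negatives.
y = c(rep(1, 60), rep(-1, 40))

# Closed-form minimizer of the constant risk: half the log-odds.
p = mean(y == 1)
offset_closed_form = 0.5 * log(p / (1 - p))

# Numerical check with a one-dimensional optimizer.
offset_numeric = optimize(constant_risk, interval = c(-5, 5), y = y)$minimum

print(c(closed_form = offset_closed_form, numeric = offset_numeric))
```

A custom offset replaces this starting value with one chosen by the user, for example to continue from a previously fitted score.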

New base-learners are also added by passing a character string that names the feature. The second argument is an identifier for the factory, which is needed because we can define multiple base-learners on the same source.

### Numerical Features

For instance, we can define both a spline and a linear base-learner on the same feature.

Additional arguments can be specified after naming the base-learner. For a complete list, see the functionality on the project page.

### Categorical Features

When adding categorical features, each group is added as a single base-learner to avoid biased feature selection. Also note that we don’t need an intercept here.

Finally, we can check which factories are registered.
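The steps described so far can be sketched as follows. This is a hedged reconstruction, not the original code: the data frame is a synthetic stand-in for df_train, `oob_fraction = 0.3` is an illustrative value, and the class and argument names (`Compboost$new`, `LossBinomial`, `BaselearnerPSpline`, `BaselearnerPolynomial`, `intercept`) are recalled from the package's R6 interface and may differ between versions. The package calls only run if compboost is installed.

```r
# Synthetic stand-in for df_train (made-up data, not the original data set).
set.seed(1)
df_train = data.frame(
  Age = runif(200, 1, 80),
  Sex = factor(sample(c("female", "male"), 200, replace = TRUE)),
  Survived = factor(sample(c("no", "yes"), 200, replace = TRUE))
)

if (requireNamespace("compboost", quietly = TRUE)) {
  library(compboost)

  # Model object: data, target name as character, and an initialized loss.
  cboost = Compboost$new(data = df_train, target = "Survived",
    loss = LossBinomial$new(), oob_fraction = 0.3)

  # Two base-learners on the same feature, told apart by their identifier.
  cboost$addBaselearner("Age", "spline", BaselearnerPSpline)
  cboost$addBaselearner("Age", "linear", BaselearnerPolynomial)

  # Categorical feature: one base-learner per group, without an intercept.
  cboost$addBaselearner("Sex", "categorical", BaselearnerPolynomial,
    intercept = FALSE)

  # List the registered factories.
  print(cboost$getBaselearnerNames())
}
```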

## Define Logger

### Time Logger

This logger records the elapsed time. The time unit can be microseconds, seconds, or minutes. When used as a stopper, the logger stops the training once max_time is reached. Here we only log the time and do not use the logger as a stopper.
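A time logger along these lines could be registered as sketched below. The argument names (`logger`, `use_as_stopper`, `logger_id`, `max_time`, `time_unit`) and the `LoggerTime` class are recalled from the package's R6 interface and should be checked against the installed version; the data and the logger identifier "time" are placeholders. The package calls only run if compboost is installed.

```r
# Small synthetic data set (made up) so the example is self-contained.
set.seed(1)
df_train = data.frame(
  Age = runif(100, 1, 80),
  Survived = factor(sample(c("no", "yes"), 100, replace = TRUE))
)

if (requireNamespace("compboost", quietly = TRUE)) {
  library(compboost)

  cboost = Compboost$new(data = df_train, target = "Survived",
    loss = LossBinomial$new())
  cboost$addBaselearner("Age", "linear", BaselearnerPolynomial)

  # Log the elapsed time in microseconds. With use_as_stopper = FALSE the
  # value of max_time is not used to stop the training.
  cboost$addLogger(logger = LoggerTime, use_as_stopper = FALSE,
    logger_id = "time", max_time = 0, time_unit = "microseconds")
}
```

Setting `use_as_stopper = TRUE` together with a positive `max_time` would, under the same assumptions, turn the logger into a time budget for the training.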

## Train Model and Access Elements

```r
cboost$train(2000, trace = 100)
#> 1/2000 risk = 0.73 oob_risk = 0.76 time = 0
#> 100/2000 risk = 0.64 oob_risk = 0.69 time = 15622
#> 200/2000 risk = 0.62 oob_risk = 0.67 time = 31711
#> 300/2000 risk = 0.61 oob_risk = 0.66 time = 47901
#> 400/2000 risk = 0.6 oob_risk = 0.65 time = 64569
#> 500/2000 risk = 0.6 oob_risk = 0.65 time = 81588
#> 600/2000 risk = 0.59 oob_risk = 0.65 time = 99339
#> 700/2000 risk = 0.59 oob_risk = 0.65 time = 117371
#> 800/2000 risk = 0.59 oob_risk = 0.65 time = 135977
#> 900/2000 risk = 0.59 oob_risk = 0.65 time = 153953
#> 1000/2000 risk = 0.59 oob_risk = 0.65 time = 172255
#> 1100/2000 risk = 0.59 oob_risk = 0.65 time = 190418
#> 1200/2000 risk = 0.59 oob_risk = 0.65 time = 208840
#> 1300/2000 risk = 0.59 oob_risk = 0.65 time = 228773
#> 1400/2000 risk = 0.59 oob_risk = 0.65 time = 250424
#> 1500/2000 risk = 0.59 oob_risk = 0.65 time = 271120
#> 1600/2000 risk = 0.59 oob_risk = 0.65 time = 290963
#> 1700/2000 risk = 0.59 oob_risk = 0.65 time = 311443
#> 1800/2000 risk = 0.59 oob_risk = 0.65 time = 332204
#> 1900/2000 risk = 0.59 oob_risk = 0.65 time = 353300
#> 2000/2000 risk = 0.59 oob_risk = 0.65 time = 374042
#>
#>
#> Train 2000 iterations in 0 Seconds.
#> Final risk based on the train set: 0.59

cboost
#> Component-Wise Gradient Boosting
#>
#> Trained on df_train with target Survived
#> Number of base-learners: 5
#> Learning rate: 0.05
#> Iterations: 2000
#> Offset: 0.2069
#>
#> LossBinomial Loss:
#>
#> Loss function: L(y,x) = log(1 + exp(-2yf(x))
#>
#>
```

Objects of the Compboost class have member functions such as getEstimatedCoef(), getInbagRisk(), or predict() to access the results.

To obtain a vector of the selected base-learners, call getSelectedBaselearner().

We can also access predictions directly from the response objects cboost$response and cboost$response_oob. Note that $response_oob was created automatically because an oob_fraction was defined in the constructor.
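The accessors named above can be tried on a small model, as in the following sketch. The data are synthetic, the iteration count is reduced to keep the run short, and the shape of each return value is not documented here, so the example only inspects the results with `str()` and `table()`. Constructor arguments are recalled from the package's R6 interface and may differ between versions; the package calls only run if compboost is installed.

```r
# Small synthetic data set (made up) so the example is self-contained.
set.seed(1)
df_train = data.frame(
  Age = runif(200, 1, 80),
  Survived = factor(sample(c("no", "yes"), 200, replace = TRUE))
)

if (requireNamespace("compboost", quietly = TRUE)) {
  library(compboost)

  cboost = Compboost$new(data = df_train, target = "Survived",
    loss = LossBinomial$new(), oob_fraction = 0.3)
  cboost$addBaselearner("Age", "spline", BaselearnerPSpline)
  cboost$addBaselearner("Age", "linear", BaselearnerPolynomial)

  # Train for a few iterations and print the risk every 25 iterations.
  cboost$train(100, trace = 25)

  # Inspect the results through the member functions.
  str(cboost$getEstimatedCoef())   # estimated coefficients
  str(cboost$getInbagRisk())       # risk on the training part
  str(cboost$predict())            # predictions on the training data

  # How often each base-learner was selected.
  print(table(cboost$getSelectedBaselearner()))

  # Response object for the out-of-bag part, created via oob_fraction.
  print(class(cboost$response_oob))
}
```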

## Visualizing Base-Learner

To visualize a base-learner, use its name exactly as returned by getBaselearnerNames():

```r
gg1 = cboost$plot("Age_spline")
gg2 = cboost$plot("Age_spline", iters = c(50, 100, 500, 1000, 1500))
```
