Daniel Schalk

Statistician / Data Scientist

Personal Profile

I am a statistician and data scientist with strong interests in machine learning, statistical modelling, programming, automated reporting, reproducibility, and data visualization. Besides the theoretical aspects, which I find very important, I am also interested in the practical applications of machine learning. To that end, I am the author of compboost, a model-based boosting package written in C++ and R.


Ludwig Maximilian University Munich

PhD candidate in machine learning

June 2018 - Present

  • Research focus: Modern approaches to component-wise boosting and distributed computing/federated learning.
  • Responsible for the courses "Introduction to Machine Learning" and "Predictive Modeling".
  • Development and integration of interactive teaching websites.

Ludwig Maximilian University Munich

Statistics, M.Sc.

April 2016 - May 2018

Thesis: Efficient and Distributed Model-Based Boosting for Large Datasets.

University of Applied Sciences Rosenheim

Business Mathematics and Actuarial Sciences, B.Sc.

October 2012 - April 2016

Thesis: Parameteridentifikation mittels linearer und nicht-linearer Regressionsmodelle zur Bestimmung thermischer Kennwerte. (en: Parameter identification using linear and non-linear regression models to determine thermal characteristics.)

Work Experience


Essential Data Science Training (former Munich R Courses)

February 2018 - Present

  • Trainer for several courses (among others "Unsupervised Learning", "Programming in R", "Machine Learning in R", and "Statistik Grundlagenkurs").
  • Creation of teaching material.

Consulting Project

Munich Re

October 2016 - November 2017

  • Estimation and validation of transition probabilities between customer states using machine learning algorithms as well as classic statistical models.
  • Visualisation of the results through an interactive web application written in R using shiny.

Research Assistant

University of Applied Sciences Rosenheim

August 2015 - October 2016

  • Modeling and estimation of thermal characteristics under consideration of time dependencies.
  • Estimation of residuals of motor vehicle damages in non-life insurance.
  • Implementation, validation, and runtime optimization using R.


Stat-Up Munich

March 2015 - July 2015

  • Data analyses in several projects.
  • Creation of training courses, among others for data mining.

Technical Skills


  • R
  • C++
  • HTML
  • CSS
  • Docker
  • JavaScript
  • Python
  • LaTeX


  • MS Office
  • Adobe Photoshop
  • WordPress

Software Projects


Author and maintainer

R package for fast and flexible component-wise boosting.

GitHub link: https://www.github.com/schalkdaniel/compboost



R package to provide hyperparameter tuning for mlr3.

GitHub link: https://www.github.com/mlr-org/mlr3tuning


Author and maintainer

R package for distributed model evaluation.

GitHub link: https://www.github.com/difuture-lmu/dsBinVal


Best statistics graduate from LMU in the year 2018/19

May 2019

MRDataThon 2017 Best Overall Solution

November 2017

Scholarship "Deutschlandstipendium"

April 2017 - October 2017

TEFDataChallenge 2017 Best Overall Solution

October 2017

DataFest Germany 2017 Best Visualisation

April 2017