PhD Courses in Denmark

Biostatistical modelling for Ag. Science

Graduate School of Technical Sciences at Aarhus University

Objectives of the course:

In the field of Agroecology, rigorous data-driven research has become crucial for addressing complex agricultural and environmental challenges. The ability to effectively manage and analyse data is paramount for producing high-quality research outcomes and making informed decisions.

Main course: Hands-on data: an applied statistics PhD course

Aim: Provide participants with a place and the tools for methodological discussion around data analysis in the context of Agroecological research. Participants will finish the course able to recognise the elements of experimental or observational designs in their own research and in examples, and to model them to answer specific research questions.

ECTS: 3.5. One week of classes, 90 hours (40 hours in class and 50 hours of preparation).

Pre-workshop (necessary to complete 5 ECTS):  Handling Ag Data through R 

Aim: Reinforce the understanding and skill development on data handling and coding in R while revisiting the basic notions on descriptive statistics necessary to the course. 

ECTS: 1.5 (48 hours, three day-long sessions)

 

Learning outcomes and competencies:

At the end of the course, the student should be able to:

  • Use the R statistical software to reshape and merge data of various types.
  • Choose and apply exploratory data tools to discover patterns in data. 
  • Recognise experimental designs and appropriately handle them statistically. 
  • Employ common statistical methods (e.g., General Linear Mixed Models, GLMM) for data analysis and understand their strengths and weaknesses. 
  • Describe, interpret and discuss the results and shortcomings of an analysis based on statistical modelling. 
  • Build attractive and informative graphics and tables from applied statistical analysis. 
     
  • Report the results of an applied statistical analysis according to ethics of science. 
     

Specifications:

Language: English

Level of course: PhD course

Time of year: 3 to 13 and 27 February 2026

No. of contact hours/hours in total incl. preparation, assignment(s) or the like: 125

Capacity limits: 20

We present a complete course in data handling and analysis. It comprises a three-day pre-course on handling data through R and a week-long course on Biostatistics. Completing both courses yields a total of 5 ECTS; attending only the week-long course yields 3.5 ECTS. The course is planned for a maximum of 20 students.
 

Course contents:

Hands-on data: an applied statistics PhD course

  1. Data handling and exploratory analysis
     1. Data types and variables.
     2. Descriptive qualitative data analysis.
     3. Descriptive quantitative data analysis.
  2. Introduction to statistical modelling
     1. Notation for statistical modelling.
     2. Modelling experimental and observational data.
     3. Statistical modelling and uncertainty.
     4. Models for different types of covariates.
  3. Experimental design in agricultural science
     1. Randomization and replication. Randomization restriction strategies.
     2. Single-factor experiments. Experiments with factorial treatment structure. Crossed and nested factors. Number of required replications for desired power.
     3. Experiments with plot structures. Completely randomized designs, blocked designs, split-plot designs. Combining factorial treatment structures with plot structures.
     4. Experiments with temporal and spatial structures.
     5. Types of sampling for observational studies.
  4. General Linear Mixed Model
     1. Linear models. Simple and multiple linear regression. Estimation and confidence intervals. Hypothesis testing. Predicted values, confidence bands, and prediction intervals. Residual analysis. Model adequacy.
     2. Random effect models. General concepts. Marginal models and subject-specific models. Models for residual covariance structure. Estimation of covariances in normal populations. Inference on random effects. Best Linear Unbiased Predictor (BLUP). Goodness-of-fit criteria and model selection. Models for longitudinal data.
  5. Results and analysis communication
     1. Good practices in presenting results.
     2. Principles of reproducible research (Open Data, version control through Git and GitHub).
     3. Discussion on the limitations of biostatistical modelling.

Handling Ag Data through R

  1. Getting started with R
     1. Introduction to the R working environment
        1. What is R? What is RStudio? Download and installation of R and RStudio. Packages, documentation, and help in R.
        2. Starting a work session in R. Creating work projects in RStudio. First functions. R as a calculator. Language syntax. Statements. Assignments.
        3. Mathematical operations. Comments. Saving commands (scripts) and projects.
     2. Data files (data frame)
        1. Reading data, importing data from Excel and other formats, reading files from a working directory. Exporting data, saving objects, and R workspace.
        2. Handling files and data, sorting, selecting rows, selecting columns, creating data subsets. Best practices for processing data and making the process reproducible.
        3. Exploring data frames with specialized packages, identifying missing values. Lists.
     3. Graphics
        1. The plot() function. Scatter plots. Graphical attributes. Histogram. Box plot. Bar chart.
        2. Introduction to the Grammar of Graphics and advanced plotting using the ggplot2 package.
     4. Functions
        1. Defining functions in R. Function arguments. Writing code to create functions.
        2. Functions from the apply family and loops.
  2. A refresher on descriptive statistics
     1. Descriptive statistics
        1. Variability in data. Distribution. Skew and kurtosis.
        2. Different measures of central tendency and when to use them.
        3. Different measures of variability and when to use them.
        4. Standardisation. Correlation.
        5. Populations. Difference between describing a sample and statistical estimation.
     2. Data model
        1. Data model. Handling data collected from different sources with different methodologies.
        2. Principles of databases. Best practices for data management.
        3. Preparing a dataset for analysis.
     

Prerequisites: PhD students conducting quantitative data analysis as part of their research.

Course leader: René Gislum, Associate Professor, Head of Crop Health Section, Department of Agroecology. 

Name of lecturers:

Maarit Mäenpää, Doctoral degree in Evolutionary Ecology, Master's degree in Ecology, Statistician in the Department of Agroecology. 

Franca Giannini-Kurina, Doctoral degree in Agricultural Science, Master's degree in Applied Statistics. Postdoc at Soil Fertility Section, Department of Agroecology.    

Simon Riley, Doctoral degree in Agronomy, Master's degree in Agronomy, Statistician in the Department of Agroecology.

Type of course/teaching methods:

This course differs from other postgraduate training programs in basic statistics by applying a student-centered pedagogical strategy that combines lecturing with case-based teaching. The cases are based on the PhD students' own research problems and data, which makes them more relatable, relevant, and authentic, and they are presented and discussed during in-class activities.

Literature:

  • Casler, M. D. (2015). Fundamentals of experimental design: Guidelines for designing successful experiments. Agronomy Journal, 107(2), 692-705. 
  • Efron, B., & Hastie, T. (2021). Computer age statistical inference, student edition: algorithms, evidence, and data science (Vol. 6). Cambridge University Press. 
  • Glaz, B., & Yeater, K. M. (2020). Applied statistics in agricultural, biological, and environmental sciences (Vol. 172). John Wiley & Sons. 
  • Piepho, H. P., Büchse, A., & Emrich, K. (2003). A hitchhiker's guide to mixed models for randomized experiments. Journal of Agronomy and Crop Science, 189(5), 310-322. 
  • Piepho, H. P., Gabriel, D., Hartung, J., Büchse, A., Grosse, M., Kurz, S., ... & Wittenburg, D. (2022). One, two, three: Portable sample size in agricultural research. The Journal of Agricultural Science, 160(6), 459-482. 
  • Kozak, M., & Piepho, H. P. (2018). What's normal anyway? Residual plots are more telling than significance tests when checking ANOVA assumptions. Journal of Agronomy and Crop Science, 204(1), 86-98.
  • West, B. T., Welch, K. B., & Galecki, A. T. (2022). Linear mixed models: a practical guide using statistical software. CRC Press.
  • Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L. D. A., François, R., ... & Yutani, H. (2019). Welcome to the Tidyverse. Journal of Open Source Software, 4(43), 1686.
  • Wickham, H., & Grolemund, G. (2017). R for data science (Vol. 2). Sebastopol: O'Reilly. https://r4ds.hadley.nz/ 
  • Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J., Appleton, G., Axton, M., Baak, A., ... & Mons, B. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3(1), 1-9.

 

Course assessment:

Each student will be evaluated individually (pass/not pass) on their own data analysis, which they will present in a poster session.

The course will culminate in a collective assessment in an open-door session where participants will present the report of their own analysis. Students need to demonstrate the ability to effectively analyse, interpret and communicate their results in the context of their research questions, making their learning experience more meaningful and applicable to their future research endeavours.

Special comments on this course: The course fee is 300 (incl. lunch and coffee). Participants are responsible for arranging their own accommodation and transportation to the campus.

Time: February 3-13, 2026, from 9:00 to 17:00

Place: Aarhus University Viborg, Blichers Alle 20, 8830 Tjele, Denmark

Registration:

The deadline for registration is 16 January 2026. Admission information will be sent shortly after. Please note the capacity limit (20 participants); places will be allotted on a first-come, first-served basis.

The assessment takes place on 27 February 2026; participation, either in person or online, is mandatory.