Data Science using R (2025)
Doctoral School of Engineering and Science at Aalborg University
Description: In today's data-driven world, the ability to handle quantitative data systematically and reproducibly using reliable software is essential. This course, Data Science with R, is designed to introduce you to R, a powerful and free programming environment that has been a cornerstone of data analysis and statistical methodology for over two decades. R is a versatile tool that is widely used in academia, industry, and government for a variety of data analysis tasks. These include, for example, biostatistics, bioinformatics, and the analysis of data from the life sciences, medicine, biology, engineering, physics, and the social sciences.
Data Science with R will equip you with the skills to tackle a variety of data challenges through a blend of practical programming and theoretical knowledge.
The curriculum encompasses:
- Introduction to R: Discover R as a premier tool for statistical programming and data analysis.
- Efficient Data Management: Master techniques for managing and manipulating data efficiently.
- High-Level Graphics: Create stunning, informative visualizations with R's advanced graphical capabilities.
- Statistical Modeling: Dive into theoretical and practical aspects of statistical modeling.
- Big Data Analytics: Learn how to extract insights from large datasets.
- Reproducible Research: Implement reproducible research practices to ensure your work can be trusted and verified.
- R Programming: Develop robust programming skills in R to solve complex data problems.
- Application areas: Examples include biostatistics and analyzing data from life sciences, medicine, social sciences, spatial data, physics, engineering, and more.
Prerequisites: A working knowledge of elementary statistical methods is required. This includes topics such as: summarizing data, estimation of e.g. mean and variance, hypothesis testing, confidence intervals, linear regression models, and analysis of variance. Such topics are covered in many courses, including the "Applied Statistics" course offered at Aalborg University.
Learning objectives: Upon completing the course, participants will:
- Master R Programming: Confidently solve programming tasks using R.
- Data Management and Visualization: Effectively manage, visualize, and analyze data using R's powerful tools.
- Model Fitting: Fit and interpret statistical models to extract meaningful insights from data.
- Independent Learning: Acquire sufficient knowledge to continue exploring and advancing their R programming skills independently.
- Competitive Edge: Be at the forefront of modern data science methods, enhancing their competitiveness in the field.
Teaching methods: The course employs a dynamic blend of instructional approaches, including:
- Instructive Videos: Engaging video tutorials to introduce and explain key concepts.
- Computer Practicals: Hands-on sessions to apply learning in real-time, practical scenarios.
- Lectures: Comprehensive lectures to provide theoretical foundations and contextual understanding.
Criteria for assessment: Active participation in the practicals and approval of a major exercise (to be handed in after the course).
Key literature: R for Data Science by Wickham, Çetinkaya-Rundel and Grolemund
Organizer: Ege Rubak
Lecturers: Ege Rubak
ECTS: 4.0
Time: 5 - 6 February, 26 - 27 February and 12 - 13 March 2025
Place: Aalborg University
Zip code: 9220
City: Aalborg
Number of seats: 50
Deadline: 15 January 2025