Introduction

Welcome to the class Modern Applied Data Analysis (MADA)!

The course is listed as EPID8060E/BIOS8060E. It doesn’t matter if you are enrolled under the EPID or BIOS label. I will generally refer to the course as MADA.

Module Overview

This first module provides a brief introduction to the course, the tools we will be using, and the topic.

Learning Objectives

The specific learning objectives for this unit are:

  • Know what this course is all about.
  • Know how this course is set up and what you are expected to do.

Course Goals

The main goal for this course is for you to learn the whole process of performing a data analysis project. The focus is on applied analysis of real world, messy data.

A second goal is to introduce you to some modern analysis approaches commonly referred to as Machine Learning.

A related goal is to introduce you to a set of tools that allow for a modern, reproducible workflow of your analysis.

For more detailed learning objectives, see the course syllabus.

In this course, I randomly switch back and forth between singular and plural. Source: phdcomics.com.

Course Philosophy

Here are my goals, promises, and expectations for this course.

  • I expect you to be self-motivated and committed to learning the material by putting in the effort needed to succeed.

  • I will try to maximize what you get out of the course by teaching methods that are useful to you, and I will try to provide as much help as you need to learn them.

  • This class strives to be challenging but non-threatening. As such, I’ll make you work hard, and expect you to do the assigned tasks by the deadlines, but in the end, I usually don’t grade hard - unless you fail to keep up your end of the agreement and don’t put in the work.

  • This class is open everything. You can use the internet, ask your classmates, me, or others, and get help from wherever you can. I trust you will find the right balance of getting help when you need it while still putting in enough effort to experience real learning.

  • I will not perform any policing to try to prevent you from taking shortcuts (i.e., not doing work yourself). The class contains graded assessments with deadlines, but those are meant to help you stay on track. If you somehow cheat - and cheating will be easy - you are mainly cheating yourself out of learning.

Overall, I hope this course is going to be useful, interesting, challenging and also interactive. Online courses are always a bit tricky with interaction/participation. I hope we can create something online that feels like a classroom. Please participate, ask questions, etc. The more you engage in the course, the more you’ll get out of it.

Course Setup

  • The course is split into modules. Each module will usually be covered in a week. The Schedule document gives an outline. The schedule is not fully finalized and will change, so check frequently.
  • Each module consists of one or several units/documents containing a mix of things I wrote, and writings or videos by others. They are listed in the order you should go through them.
  • All material for the course can be accessed through this course website. Some material might not be available yet and will be unlocked as the course proceeds.
  • There is generally a lot of material for each unit. You are expected to go through the main components at a level that allows you to get the big picture and be able to answer the quiz questions. Once you get the overall idea, consider the materials to be resources you can visit on demand, e.g., when you are working on the exercises.
  • You will be placed in groups throughout the course. The hope is that this will create a support group of classmates where you can help each other. Assignments often require interaction among group members (like in a real-world team), and I hope you will interact with your group members outside the assignments as well. You are welcome and encouraged to interact with anyone else in the course.

Course Tools

This is a brief overview of the tools we will be using for this class. The next module, which you can and should start right after finishing this one, describes all these tools in more detail and gives instructions on how to set them all up.

  • We will use the R software in this course.
  • We will also use RStudio (a graphical frontend to R).
  • We will use Quarto, which works well with R/RStudio and lets you create easy, automated workflows.
  • We will use Git/GitHub for exercises and the project.
  • We will use Slack for discussions and help.
  • We will use an online system for quizzes.

Assessments

For details on assessments, see the Assessments page.

Course Resources

We’ll be drawing on a lot of different resources. The Resources section of the class website lists the ones we’ll use, along with others you might find helpful.

Getting help

I do not expect you to figure it all out yourselves. You will get stuck and are encouraged to seek help. You can ask for help from your classmates or your instructor. Also, use the wider community online. For specific places to get help, see the Resources pages.