Welcome to Quantitative Methods.

This course introduces common concepts and tools for data analysis in economics and business. The emphasis is on practical data skills, as well as on the ability to formulate a question appropriately, provide data analysis relevant to that question, and interpret the results correctly. As part of the course you will be asked to produce a small piece of data analysis.

Class participation...

To get your participation mark, please take part in the Datathon Forum by posting questions or answers, or by sharing relevant material (e.g. an interesting dataset or article). You can find the forum here. You should already be enrolled; if not, let me know. Please participate only with your official Imperial email address.

Topic 1 - Introduction

Learning Objectives

Understand what causal relationships are
Become aware of factors that lead to biased estimates of causal relationships
Understand upward and downward bias
Understand the concept of endogenous and exogenous variation.
Discuss causal identification and biases in the context of a univariate linear model
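
To make upward bias concrete, here is a minimal simulation sketch. The course itself works in R; this sketch uses Python with only the standard library. All numbers are hypothetical assumptions: a true effect of 2, and an unobserved confounder that raises both x and y. With exogenous variation in x, the univariate regression slope recovers the true effect. With the confounder present, the slope is biased upward.

```python
import random


def ols_slope(x, y):
    """Slope of a univariate least-squares regression of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


rng = random.Random(1)   # fixed seed so the sketch is reproducible
n = 20000
true_effect = 2.0        # hypothetical causal effect of x on y

# Exogenous variation: x is unrelated to anything else that drives y.
x_exo = [rng.gauss(0, 1) for _ in range(n)]
y_exo = [true_effect * x + rng.gauss(0, 1) for x in x_exo]

# Endogenous variation: an unobserved confounder raises both x and y.
confounder = [rng.gauss(0, 1) for _ in range(n)]
x_endo = [c + rng.gauss(0, 1) for c in confounder]
y_endo = [true_effect * x + 3.0 * c + rng.gauss(0, 1)
          for x, c in zip(x_endo, confounder)]

slope_exo = ols_slope(x_exo, y_exo)     # close to the true effect of 2
slope_endo = ols_slope(x_endo, y_endo)  # population value: 2 + 3 * (1/2) = 3.5

print("exogenous x:", round(slope_exo, 2))
print("endogenous x:", round(slope_endo, 2))
```

Changing the confounder's coefficient in `y_endo` from `3.0` to a negative number produces downward bias instead.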

Slides

Exercises

Exercises 1

Topic 2 - Rrrr

We take a first look at the R software package and programming language, which we will be using throughout this module. For those who have not programmed before, this will be a steep learning curve. However, it should be worth it.

You will not be able to analyse data effectively without being familiar with basic programming concepts. What’s more, an understanding of programming is key to grasping many aspects of today’s world. Even if you don’t envisage a career as a master coder or data analyst, trying these things for yourself will be useful, as you will very likely have to manage coders and analysts or rely on their work at some point in your career.

Learning Objectives

Install R and RStudio on your computer.
Use R as a glorified pocket calculator
Load a dataset and provide simple analysis of it.
Have an understanding of programming concepts such as loops
Be able to merge or join different data sources using code.
Learn how to create data yourself using random number generators.

Slides

Additional Material

Video Guide: R scripts

Video Guide: Installing R Packages
Video Guide: Functions in R

Exercises

Exercises 2

Topic 3 - Visions

Some of the best ways to understand data are graphs and visualisations. R has powerful tools for that purpose.

We will be deepening our ability to understand the R programming language by looking at some of its graphics commands. We will also discuss some of the pitfalls as well as deliberate manipulations that people engage in when producing data visualisations.

Learning Objectives

Use the R programming language to create basic and some more advanced graphs and visualisations based on data.
Avoid common pitfalls that lead to misleading or uninformative visualisations.

Slides

Additional Material

The R Graph Gallery

Exercises

Exercises 3

Topic 4 - Testing Times

We have discussed how to estimate the parameters of a simple regression model. In this topic we will discuss how to work out the reliability of such estimates. We will examine what determines the distribution of the estimates and how we can use hypothesis testing to explore the data and estimates.
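
The idea that an estimate has a distribution can be illustrated with a small simulation. The sketch below is in Python with only the standard library, whereas the course itself uses R. The model (intercept 1, slope 0.5, standard normal noise) and the sample sizes are hypothetical assumptions. It repeatedly draws a fresh sample, re-estimates the slope, and compares the spread of the estimates for small and large samples.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope of a univariate least-squares regression of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


def simulate_slopes(n, reps, rng):
    """Draw `reps` samples of size `n` and return the slope estimate from each."""
    slopes = []
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [1.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]  # hypothetical model
        slopes.append(ols_slope(x, y))
    return slopes


rng = random.Random(42)  # fixed seed so the sketch is reproducible
small = simulate_slopes(n=25, reps=500, rng=rng)
large = simulate_slopes(n=400, reps=500, rng=rng)

# The standard deviation across repeated samples is what a standard error estimates.
sd_small = statistics.stdev(small)
sd_large = statistics.stdev(large)

print("spread of estimates, n = 25: ", round(sd_small, 3))
print("spread of estimates, n = 400:", round(sd_large, 3))
```

The estimates centre on the true slope in both cases, but they scatter far less with the larger sample. This is why the same point estimate can be strong evidence in one dataset and weak evidence in another.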

Learning Objectives

Work out the reliability of estimates from a simple regression model.
Understand what determines the distribution of the estimates.
Use hypothesis testing to explore the data and estimates.

Slides

Additional Material

Exercises

Exercises 4

Topic 5 - Multivariate Regressions

We look at multivariate regressions, i.e. regressions where the dependent variable depends on several explanatory variables rather than just one.

Learning Objectives

Estimate regression models that depend on multiple explanatory variables
Be able to interpret output from multivariate regressions correctly
Understand which variables to include and which ones not to include to improve causal identification.

Slides

Additional Material

Exercises

Exercises 5

Topic 6 - Econometrics for Dummies

This might be a course about quantitative analysis, but that doesn’t mean we can’t handle qualitative issues as well. To capture qualitative aspects (e.g. an individual in our dataset having a job, the gender of a person, the location of the headquarters of a company) we can use so-called dummy or binary variables; i.e. we simply set a variable equal to 1 if the qualitative aspect is true and to 0 if not. We will be discussing what this means for regression analysis. Moreover, we will be looking at some nonlinear regression models.
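
As a minimal sketch of the 0/1 coding, consider a toy dataset, invented purely for illustration, that records whether a person has a job alongside their income. The course itself works in R; the sketch uses Python with only the standard library. Regressing income on the dummy gives an intercept equal to the mean income of the group coded 0, and a slope equal to the difference in mean income between the two groups.

```python
def ols(x, y):
    """Intercept and slope of a univariate least-squares regression of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    slope = cov / var
    return my - slope * mx, slope


# Hypothetical toy data, not taken from any course dataset.
has_job = ["yes", "no", "yes", "no", "no", "yes"]
income = [30.0, 20.0, 34.0, 22.0, 18.0, 32.0]

# Dummy variable: 1 if the qualitative aspect is true, 0 if not.
job_dummy = [1 if answer == "yes" else 0 for answer in has_job]

intercept, slope = ols(job_dummy, income)

# Group means, for comparison with the regression coefficients.
mean_without = sum(i for i, d in zip(income, job_dummy) if d == 0) / job_dummy.count(0)
mean_with = sum(i for i, d in zip(income, job_dummy) if d == 1) / job_dummy.count(1)

print("intercept:", intercept, "| mean income without a job:", mean_without)
print("slope:", slope, "| difference in means:", mean_with - mean_without)
```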

Learning Objectives

Conduct regression analysis with dummy variables
Interpret regression results correctly.

Slides

Additional Material

Exercises

Exercises 6

Topic 7 - Instrumental Variables

Suppose you want to know the causal effect of a variable X on a variable Y, but there are concerns that X might be endogenous, for instance because of omitted variable bias. Instrumental variables are variables that you are not necessarily interested in for their own sake, but that can help you identify the causal effect that you are interested in despite the endogeneity.
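
The following simulation is a minimal sketch of this idea, written in Python with only the standard library, whereas the course itself uses R. Everything in it is a hypothetical assumption: the true effect of 2, an unobserved variable U that drives both X and Y, and an instrument Z that moves X but affects Y only through X. It uses the simplest single-instrument estimator, cov(Z, Y) / cov(Z, X), rather than any particular routine from the course.

```python
import random


def cov(a, b):
    """Sample covariance of two equally long lists."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)


rng = random.Random(7)  # fixed seed so the sketch is reproducible
n = 20000
true_effect = 2.0       # hypothetical causal effect of X on Y

z = [rng.gauss(0, 1) for _ in range(n)]  # instrument: shifts X, unrelated to U
u = [rng.gauss(0, 1) for _ in range(n)]  # omitted variable: drives both X and Y
x = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [true_effect * xi + 3.0 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

# Ordinary least squares slope: biased, because X is correlated with U.
ols_estimate = cov(x, y) / cov(x, x)

# Instrumental variables estimate: uses only the variation in X that comes from Z.
iv_estimate = cov(z, y) / cov(z, x)

print("OLS:", round(ols_estimate, 2))
print("IV: ", round(iv_estimate, 2))
```

In this setup the OLS slope converges to 3 rather than 2, while the IV estimate stays close to the true effect. The IV result relies on Z being correlated with X and unrelated to U, which holds here by construction.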

Learning Objectives

Identify the requirements and conditions that make the Instrumental Variables approach feasible
Implement Instrumental Variables regressions.

Slides

Additional Material

Exercises

Exercises 7

Topic 8 - Learning like a machine

You might think studying for this course is so hard that it could only be done by a machine. After this topic, you'll hopefully think again.

You will get a brief introduction to Machine Learning. Compared to the rest of the course, Machine Learning is essentially econometrics where we are more concerned with prediction than with the identification of causal relationships.

Learning Objectives

Explain the basic principles of machine learning
Implement simple machine learning applications.

Slides

Additional Material

Machine Learning Competition

Topic 9 - Time for Series

To do any data analysis we need different datapoints. Most of the data we have looked at so far were cross-sectional, i.e. datapoints derived from several data units (e.g. individuals, countries, cities, firms). Alternatively, we might have data for just one data unit but over many different periods of time (e.g. years, quarters, days or sometimes even seconds). This creates some specific issues, which we will discuss. Given that time itself causes a form of confounding and correlation from one datapoint to the next, we might get some very biased estimates if we are not careful.
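
A minimal sketch of how time can act as a confounder, in Python with only the standard library (the course itself uses R). The two series are hypothetical and unrelated by construction, but both trend upward over time. Regressing one on the other in levels suggests a strong relationship. Regressing period-to-period changes, used here as one simple illustrative remedy, does not.

```python
import random


def ols_slope(x, y):
    """Slope of a univariate least-squares regression of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


rng = random.Random(3)  # fixed seed so the sketch is reproducible
periods = 500

# Two hypothetical series that share nothing except an upward time trend.
x = [0.5 * t + rng.gauss(0, 1) for t in range(periods)]
y = [0.3 * t + rng.gauss(0, 1) for t in range(periods)]

# Levels: the common trend makes x look like a strong driver of y.
slope_levels = ols_slope(x, y)

# First differences: removing the trend removes the apparent relationship.
dx = [x[t] - x[t - 1] for t in range(1, periods)]
dy = [y[t] - y[t - 1] for t in range(1, periods)]
slope_diff = ols_slope(dx, dy)

print("slope in levels:     ", round(slope_levels, 2))
print("slope in differences:", round(slope_diff, 2))
```

The levels slope is close to 0.3 / 0.5 = 0.6 even though x has no effect on y, which is the kind of badly biased estimate the paragraph above warns about.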

Learning Objectives

Identify the problems related to time series data
Correctly implement causal analysis in a time series context.

Slides

Additional Material

Exercises

Exercises 9

Course Summary Slides

Here

Alternative download location

summary2021.pptx

Group Coursework

As part of this course you are asked to hand in a piece of group coursework: a short, simple report on anything you like, as long as it involves a discussion of a dataset and some of the methods we discussed in this class. That is, think of a good question, some data, and a strategy for using the data to say something towards an answer to the question.

You can find a template with further instructions here (RMarkdown Version).

Past Exam Papers

To prepare, also look at Exercises 10.

Datasets for past papers (and more)

- [ back2country_set.dta ]( https://mondpanther.github.io/datastorieshub/data/back2country_set.dta ) 
- [ driving.csv ]( https://mondpanther.github.io/datastorieshub/data/driving.csv ) 
- [ ets_thres_final.csv ]( https://mondpanther.github.io/datastorieshub/data/ets_thres_final.csv ) 
- [ foreigners.csv ]( https://mondpanther.github.io/datastorieshub/data/foreigners.csv ) 
- [ guns.csv ]( https://mondpanther.github.io/datastorieshub/data/guns.csv ) 
- [ hals1prep.csv ]( https://mondpanther.github.io/datastorieshub/data/hals1prep.csv ) 
- [ house.csv ]( https://mondpanther.github.io/datastorieshub/data/house.csv ) 
- [ lfsclean.dta ]( https://mondpanther.github.io/datastorieshub/data/lfsclean.dta ) 
- [ maketable1.csv ]( https://mondpanther.github.io/datastorieshub/data/maketable1.csv ) 
- [ maketable2.csv ]( https://mondpanther.github.io/datastorieshub/data/maketable2.csv ) 
- [ maketable4.csv ]( https://mondpanther.github.io/datastorieshub/data/maketable4.csv ) 
- [ migrants.csv ]( https://mondpanther.github.io/datastorieshub/data/migrants.csv ) 
- [ NH.Ts+dSST.csv ]( https://mondpanther.github.io/datastorieshub/data/NH.Ts+dSST.csv ) 
- [ oj.csv ]( https://mondpanther.github.io/datastorieshub/data/oj.csv ) 
- [ populationclean.csv ]( https://mondpanther.github.io/datastorieshub/data/populationclean.csv ) 
- [ populationdata.csv ]( https://mondpanther.github.io/datastorieshub/data/populationdata.csv ) 
- [ prod.csv ]( https://mondpanther.github.io/datastorieshub/data/prod.csv ) 
- [ prod_balanced.csv ]( https://mondpanther.github.io/datastorieshub/data/prod_balanced.csv ) 
- [ proddata_clean.csv ]( https://mondpanther.github.io/datastorieshub/data/proddata_clean.csv ) 
- [ production2.csv ]( https://mondpanther.github.io/datastorieshub/data/production2.csv ) 
- [ statistic_id183497_population-in-the-states-of-the-us-2019.csv ]( https://mondpanther.github.io/datastorieshub/data/statistic_id183497_population-in-the-states-of-the-us-2019.csv ) 
- [ statistic_id183497_population-in-the-states-of-the-us-2019.xlsx ]( https://mondpanther.github.io/datastorieshub/data/statistic_id183497_population-in-the-states-of-the-us-2019.xlsx ) 
- [ TableWages.csv ]( https://mondpanther.github.io/datastorieshub/data/TableWages.csv ) 
- [ TeachingRatings.csv ]( https://mondpanther.github.io/datastorieshub/data/TeachingRatings.csv ) 
- [ UK Gender Pay Gap Data - 2017 to 2018.csv ]( <https://mondpanther.github.io/datastorieshub/data/UK Gender Pay Gap Data - 2017 to 2018.csv> ) 
- [ unempprep.csv ]( https://mondpanther.github.io/datastorieshub/data/unempprep.csv ) 
- [ unempprep.dta ]( https://mondpanther.github.io/datastorieshub/data/unempprep.dta ) 
- [ us-states.csv ]( https://mondpanther.github.io/datastorieshub/data/us-states.csv )

CCMF - Quantitative Methods 2021
