Exercises 7

Instrumental Variables

Ralf Martin https://mondpanther.github.io/wwwmondpanther/
2021-11-25

Exercise 7.1

Use the dataset4.dta dataset. It contains weekly prices for rail transport of grain in the US Midwest during the 1880s, and the quantity shipped. The railroad companies at the time operated a cartel, called the Joint Executive Committee (JEC), which is believed to have raised prices above the level that would otherwise have prevailed. This practice was legal before the Sherman Act of 1890 (antitrust legislation) was passed. From time to time, cheating by cartel members brought about a temporary collapse of the collusive price-setting agreement. A dummy variable, “cartel”, in the dataset indicates the periods when price fixing was in effect.

library(haven)
library(AER)  # also attaches lmtest, sandwich and car (coeftest, vcovHC, linearHypothesis)
d4=read_dta("https://www.dropbox.com/sh/rqmo1hvij1veff0/AABQXwWvdZdvGlT6Y8LIfoMha/dataset4.dta?dl=1")
names(d4)
 [1] "week"     "price"    "cartel"   "quantity" "seas1"    "seas2"   
 [7] "seas3"    "seas4"    "seas5"    "seas6"    "seas7"    "seas8"   
[13] "seas9"    "seas10"   "seas11"   "seas12"   "ice"     

Part (a)

Run an OLS regression of log quantity on log price, controlling for ice (a dummy indicating that the Great Lakes were frozen, preventing transport by ship) and a set of seasonal dummy variables (to capture seasonality in demand; note that the dataset has 12 seasonal dummies, i.e. they treat every month as a season).

What is the estimated price elasticity?

Do you think you are estimating a demand curve? Explain.

What is the economic rationale for including the variable “ice” in the regression?

m1=lm(log(quantity)~log(price)+ice+seas1+seas2+seas3+seas4+seas5+seas6+seas7+seas8+seas9+seas10+seas11+seas12, data=d4)
coeftest(m1, vcov=vcovHC)

t test of coefficients:

              Estimate Std. Error t value  Pr(>|t|)    
(Intercept)  8.8612335  0.1888154 46.9307 < 2.2e-16 ***
log(price)  -0.6388847  0.0754054 -8.4727 9.516e-16 ***
ice          0.4477537  0.1453460  3.0806  0.002249 ** 
seas1       -0.1328219  0.0987687 -1.3448  0.179671    
seas2        0.0668882  0.0935005  0.7154  0.474909    
seas3        0.1114364  0.0998924  1.1156  0.265464    
seas4        0.1554218  0.1367509  1.1365  0.256604    
seas5        0.1096585  0.1369305  0.8008  0.423836    
seas6        0.0468326  0.1890620  0.2477  0.804521    
seas7        0.1225525  0.2118166  0.5786  0.563290    
seas8       -0.2350079  0.1874514 -1.2537  0.210886    
seas9        0.0035606  0.1849417  0.0193  0.984652    
seas10       0.1692468  0.1855618  0.9121  0.362430    
seas11       0.2151843  0.1853410  1.1610  0.246519    
seas12       0.2196331  0.1825211  1.2033  0.229758    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The regression implies a price elasticity of -0.64; i.e. a 1% increase in price is associated with a 0.64% reduction in the quantity shipped. We can interpret this as a demand curve only if the price coefficient really represents the effect of a price change on demand, holding all other things constant. One reason why this might not be the case is that shocks to demand also cause price changes. On the other hand, if we can control for the big potential shocks to demand, it becomes more plausible that we recover unbiased estimates. A big potential demand shock here is the freezing of the lakes: because rail transport becomes the only option, demand for it increases, which might in turn lead to price changes. Indeed, we see that “ice” has a significant positive effect on quantity.

Part (b)

Consider using cartel as an instrument for price in order to identify the demand curve.

Is the instrument likely to satisfy the conditions for a valid instrument?

Can you use the data to check these conditions?

Exogenous? Having a cartel is clearly a supply side factor that will by and large depend on the number of firms in the market and their respective managers’ ability to strike a deal, all factors unrelated to demand. However, the incentive to form a cartel is also driven by the amount of money to be made from having a cartel, which is a demand side factor. Having said that, it is more likely that long-term structural aspects of demand are relevant for this rather than the weekly changes over a relatively short time period (6 years). Hence, it is plausible that we can use cartel as an IV.

Driving explanatory variable? It is almost the definition of a cartel that it has a positive impact on price. At any rate this is something we can check via a first stage regression.

Exclusion? It is fairly plausible that the main effect of cartel on demand is via price. However, we can also imagine scenarios where the exclusion restriction might not hold in such a setting: imagine that the mere fact that the industry forms a cartel leads to some kind of negative press and consumer backlash perhaps with a boycott. Having said that, this is probably more of a concern with consumer goods rather than railway cargo transport.
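The three conditions above can be written compactly. The notation below is an assumption introduced for illustration and does not appear in the exercise: \(X_t\) collects ice and the seasonal dummies, and \(\varepsilon_t\) is the unobserved demand shock in week \(t\).

\[\ln Q_t = \alpha + \beta \ln P_t + X_t'\gamma + \varepsilon_t\]

\[\textrm{Relevance: } Cov(cartel_t, \ln P_t \mid X_t) \neq 0 \qquad \textrm{Exogeneity and exclusion: } Cov(cartel_t, \varepsilon_t \mid X_t) = 0\]

Relevance involves only observed variables, so it can be checked with the first stage regression. The second condition involves the unobserved \(\varepsilon_t\); with a single instrument for a single endogenous regressor it cannot be tested with the data and has to be argued on economic grounds, as in the discussion above.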

Part (c)

Estimate the first stage and reduced form equations.

What is the effect of cartel on price in the first stage?

What is the effect of cartel on quantity in the reduced form?

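The first stage suggests a highly significant positive effect of cartel on price: the coefficient of 0.358 on log price corresponds to prices that are roughly 43% higher in cartel periods, and the F-statistic for the instrument is 207, far above the common rule-of-thumb threshold of 10.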

fs=lm(log(price)~cartel+ice+seas1+seas2+seas3+seas4+seas5+seas6+seas7+seas8+seas9+seas10+seas11+seas12, data=d4)
summary(fs)

Call:
lm(formula = log(price) ~ cartel + ice + seas1 + seas2 + seas3 + 
    seas4 + seas5 + seas6 + seas7 + seas8 + seas9 + seas10 + 
    seas11 + seas12, data = d4)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.49765 -0.13625  0.01362  0.13616  0.55689 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -1.693741   0.078361 -21.615  < 2e-16 ***
cartel       0.357898   0.024862  14.395  < 2e-16 ***
ice          0.035003   0.064252   0.545  0.58629    
seas1        0.038725   0.059084   0.655  0.51268    
seas2        0.136288   0.059084   2.307  0.02173 *  
seas3        0.189049   0.059319   3.187  0.00158 ** 
seas4        0.089523   0.059357   1.508  0.13251    
seas5        0.017863   0.069869   0.256  0.79838    
seas6       -0.025741   0.085529  -0.301  0.76364    
seas7       -0.067126   0.085529  -0.785  0.43314    
seas8       -0.035837   0.085709  -0.418  0.67614    
seas9       -0.005776   0.086321  -0.067  0.94670    
seas10      -0.100211   0.086321  -1.161  0.24656    
seas11      -0.086751   0.085362  -1.016  0.31028    
seas12       0.011693   0.085362   0.137  0.89113    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2114 on 313 degrees of freedom
Multiple R-squared:  0.4881,    Adjusted R-squared:  0.4652 
F-statistic: 21.32 on 14 and 313 DF,  p-value: < 2.2e-16
#coeftest(fs,vcov=vcovHC) This would account for heteroskedasticity which we haven't covered in the course. So don't worry about it for now.
linearHypothesis(fs,"cartel=0")
Linear hypothesis test

Hypothesis:
cartel = 0

Model 1: restricted model
Model 2: log(price) ~ cartel + ice + seas1 + seas2 + seas3 + seas4 + seas5 + 
    seas6 + seas7 + seas8 + seas9 + seas10 + seas11 + seas12

  Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
1    314 23.251                                  
2    313 13.989  1    9.2616 207.22 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The reduced form shows a strong negative effect of cartel on quantity (a coefficient of -0.31 on log quantity, i.e. roughly 27% lower quantity in cartel periods), as would be expected.

rf=lm(log(quantity)~cartel+ice+seas1+seas2+seas3+seas4+seas5+seas6+seas7+seas8+seas9+seas10+seas11+seas12, data=d4)
summary(rf)

Call:
lm(formula = log(quantity) ~ cartel + ice + seas1 + seas2 + seas3 + 
    seas4 + seas5 + seas6 + seas7 + seas8 + seas9 + seas10 + 
    seas11 + seas12, data = d4)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.37111 -0.22843 -0.01303  0.30380  0.79350 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 10.04131    0.15098  66.506  < 2e-16 ***
cartel      -0.31015    0.04790  -6.474 3.68e-10 ***
ice          0.39260    0.12380   3.171  0.00167 ** 
seas1       -0.16453    0.11384  -1.445  0.14938    
seas2       -0.02715    0.11384  -0.239  0.81163    
seas3       -0.02796    0.11429  -0.245  0.80693    
seas4        0.07493    0.11437   0.655  0.51283    
seas5        0.05808    0.13462   0.431  0.66644    
seas6        0.01624    0.16479   0.099  0.92155    
seas7        0.11840    0.16479   0.718  0.47299    
seas8       -0.26254    0.16514  -1.590  0.11289    
seas9       -0.05337    0.16632  -0.321  0.74852    
seas10       0.17265    0.16632   1.038  0.30004    
seas11       0.22697    0.16447   1.380  0.16858    
seas12       0.16852    0.16447   1.025  0.30633    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.4073 on 313 degrees of freedom
Multiple R-squared:  0.2774,    Adjusted R-squared:  0.2451 
F-statistic: 8.582 on 14 and 313 DF,  p-value: 9.777e-16
#coeftest(rf,vcov=vcovHC)
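The two cartel coefficients above can be combined by hand. The sketch below only reuses the numbers printed in the first stage and reduced form outputs (the variable names are made up for illustration). It converts the dummy coefficients into approximate percentage effects and forms the ratio of reduced form to first stage; with one instrument and the same controls in both regressions, this ratio equals the IV coefficient on log(price).

```r
# Coefficients on cartel copied from the two regression outputs above
fs_cartel <- 0.357898   # first stage: effect of cartel on log(price)
rf_cartel <- -0.31015   # reduced form: effect of cartel on log(quantity)

# Percentage effect of a 0/1 dummy in a regression with a logged outcome
pct_price    <- 100 * (exp(fs_cartel) - 1)
pct_quantity <- 100 * (exp(rf_cartel) - 1)

# Indirect least squares (Wald) ratio: reduced form / first stage
wald <- rf_cartel / fs_cartel

print(round(c(pct_price = pct_price, pct_quantity = pct_quantity, wald = wald), 3))
```

The ratio comes out at about -0.867: the cartel lowers quantity only through the price increase it causes, so dividing the quantity effect by the price effect gives the implied elasticity.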

Part (d)

Estimate the demand function by IV. What is your estimated demand elasticity?

How does it differ from your OLS estimate in (a)?

The IV regression suggests a price elasticity of -0.87; i.e. the estimate has become stronger (more elastic, i.e. more negative) compared to OLS. This is consistent with positive demand shocks exerting a positive influence on price (i.e. the OLS estimate is biased upward, so that it is less negative than the true elasticity).

library(AER)
iv=ivreg(log(quantity)~log(price)+ice+seas1+seas2+seas3+seas4+seas5+seas6+seas7+seas8+seas9+seas10+seas11+seas12| cartel+ice+seas1+seas2+seas3+seas4+seas5+seas6+seas7+seas8+seas9+seas10+seas11+seas12, data=d4)
summary(iv)

Call:
ivreg(formula = log(quantity) ~ log(price) + ice + seas1 + seas2 + 
    seas3 + seas4 + seas5 + seas6 + seas7 + seas8 + seas9 + seas10 + 
    seas11 + seas12 | cartel + ice + seas1 + seas2 + seas3 + 
    seas4 + seas5 + seas6 + seas7 + seas8 + seas9 + seas10 + 
    seas11 + seas12, data = d4)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.38295 -0.27275  0.07318  0.27703  1.09320 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  8.573535   0.216445  39.611  < 2e-16 ***
log(price)  -0.866587   0.132123  -6.559 2.24e-10 ***
ice          0.422934   0.121569   3.479 0.000575 ***
seas1       -0.130973   0.112307  -1.166 0.244420    
seas2        0.090952   0.113167   0.804 0.422181    
seas3        0.135872   0.113194   1.200 0.230912    
seas4        0.152511   0.112094   1.361 0.174632    
seas5        0.073562   0.132494   0.555 0.579148    
seas6       -0.006064   0.163277  -0.037 0.970397    
seas7        0.060232   0.164392   0.366 0.714319    
seas8       -0.293599   0.163930  -1.791 0.074259 .  
seas9       -0.058372   0.164343  -0.355 0.722689    
seas10       0.085811   0.167514   0.512 0.608831    
seas11       0.151791   0.164530   0.923 0.356941    
seas12       0.178656   0.162119   1.102 0.271306    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.4021 on 313 degrees of freedom
Multiple R-Squared: 0.2959, Adjusted R-squared: 0.2644 
Wald test: 8.807 on 14 and 313 DF,  p-value: 3.5e-16 
#robust.se(iv)
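The direction of the OLS bias described in this part can be illustrated with a small simulation. This is a sketch on made-up data, not the railroad dataset: the true elasticity of -1, the supply equation and all variable names are assumptions chosen for illustration. A positive demand shock raises both quantity and price, so OLS is pulled toward zero, while the instrument (a supply shifter playing the role of cartel) recovers the true slope.

```r
set.seed(123)
n    <- 5000
beta <- -1                  # true demand elasticity (assumed for the simulation)

z <- rbinom(n, 1, 0.5)      # supply shifter, plays the role of the cartel dummy
u <- rnorm(n)               # unobserved demand shock
v <- rnorm(n)               # unobserved supply (cost) shock

# Equilibrium of demand q = beta * p + u and supply p = z + 0.5 * q + v
p <- (z + 0.5 * u + v) / (1 - 0.5 * beta)
q <- beta * p + u

# OLS of quantity on price: biased because p depends on the demand shock u
b_ols <- unname(coef(lm(q ~ p))["p"])

# Simple IV estimator with one instrument: Cov(q, z) / Cov(p, z)
b_iv <- cov(q, z) / cov(p, z)

print(round(c(truth = beta, ols = b_ols, iv = b_iv), 2))
```

In this setup the OLS slope lands well above the true value (less negative), while the IV estimate is close to -1, which mirrors the move from -0.64 to -0.87 in the exercise.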

Part (e)

Microeconomic theory suggests that a monopolist (like the cartel) should operate in a region of the demand curve where demand is elastic (i.e. the elasticity is <= -1). The estimate in (d) is larger than -1. Can we therefore conclude in a statistically significant way that the demand curve is inelastic and therefore at odds with economic theory?

The estimated elasticity in the IV result is -0.87, which is larger than -1 and therefore in the inelastic range. That said, even if demand is actually elastic (i.e. the elasticity is <= -1), we might still end up with an inelastic estimate when estimating the elasticity from a sample of data. More formally, we want to test the hypothesis that the elasticity is equal to -1 (i.e. the highest value that would still allow us to claim that demand is elastic) against the alternative that demand is inelastic.

Hence we are testing \(H_0: \beta_{lnprice}=-1 \textrm{ vs } H_1: \beta_{lnprice}>-1\). Note that this is a one-sided test, implying that we only reject if the estimate is sufficiently far above -1. This only affects which critical threshold we require: for a 5% significance level we don’t need to divide 5% by 2 to find the threshold. Rather, the threshold becomes qnorm(0.95)=1.6448536. Hence, we would reject the hypothesis that the elasticity is -1, in favour of inelastic demand, if we find that our t-value is larger than this threshold. We can calculate our t-value here as:

\[t = \frac{-0.8666-(-1)}{0.131} \approx 1.02\]

which is not bigger than 1.6448536, so we cannot reject the hypothesis that the elasticity is -1. In other words, we cannot conclude in a statistically significant way that demand is inelastic, and the estimate is not at odds with economic theory.
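The same one-sided test can be reproduced in a few lines of R. The sketch below plugs in the coefficient and the (non-robust) standard error exactly as printed in the summary(iv) output, so the t-value can differ slightly in the second decimal from the rounded figure in the text; the variable names are made up for illustration.

```r
beta_hat  <- -0.866587   # coefficient on log(price) from summary(iv)
se_hat    <- 0.132123    # its standard error from summary(iv)
beta_null <- -1          # H0: unit-elastic demand

# t-statistic for H0: beta = -1 against H1: beta > -1
t_stat <- (beta_hat - beta_null) / se_hat

# One-sided 5% critical value and p-value (normal approximation)
crit    <- qnorm(0.95)
p_value <- 1 - pnorm(t_stat)
reject  <- t_stat > crit

cat("t =", round(t_stat, 2), " critical value =", round(crit, 3),
    " reject H0:", reject, "\n")
```

Because the t-value stays below the critical value, the p-value is above 5% and the null of unit elasticity is not rejected.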

As a further illustration for this exercise, consider also the following diagram: Note that we can work out the shape of a demand curve if we know that the demand curve stays fixed and prices vary for reasons other than shifts in demand (such as shifts in marginal costs or price increases due to monopoly/cartel power). However, in practice we don’t know what exactly moved prices and quantities from one data point to the next. We can make progress by controlling for (some of) the factors that clearly shift demand (e.g. ice preventing alternative forms of transport). In addition, we can focus exclusively on price movements that originate on the supply side if we have variables (such as the existence of a cartel) that we can assume to affect only the supply side.

Citation

For attribution, please cite this work as

Martin (2021, Nov. 25). Datastories Hub: Exercises 7. Retrieved from https://mondpanther.github.io/datastorieshub/posts/exercises/exercises7/

BibTeX citation

@misc{martin2021exercises,
  author = {Martin, Ralf},
  title = {Datastories Hub: Exercises 7},
  url = {https://mondpanther.github.io/datastorieshub/posts/exercises/exercises7/},
  year = {2021}
}