ECMT1020 Introduction to Econometrics Week 8, 2021S1

Lecture 7: Dummy Variables

Instructor: Ye Lu

Please read Chapter 5 of the textbook.

Contents

1 Motivation: two groups in the data
  1.1 Chow test: pool or separate
  1.2 Limitation of separate regressions
2 Regression with dummy variable
  2.1 Separating two groups in one regression: dummy variable
    2.1.1 Intercept dummy
    2.1.2 Slope dummy
  2.2 Dummy variable trap: perfect multicollinearity
3 More than two groups: more than one dummy variable
  3.1 More than two groups from one grouping criterion
    3.1.1 M categories: M − 1 dummies
    3.1.2 Change of reference category
  3.2 Multiple grouping criteria

Figure 1: Cost against number of students for 74 secondary schools in Shanghai, among

which 34 are occupational schools and 40 are regular academic schools.


1 Motivation: two groups in the data

Running example:

Yi = β1 + β2Xi + ui, i = 1, . . . , n, (1)

where

• Y = COST is the annual recurrent cost for running a school;

• X = N is the number of students in a school.

The question is to quantify the effect of the number of students on the cost of running a school.

The scatter plot of the annual cost against number of students for 74 secondary schools

in Shanghai is given in Figure 1. In the plot we can see that data points are divided into

two categories/types:

• occupational school: schools that aim to provide skills for specific occupations,

• regular school: regular general academic schools.

There are in total 34 occupational schools and 40 regular schools in this sample.

It is clear from Figure 1 that the red dots (occupational schools) are generally above the grey dots (regular schools). This means that overall it is more expensive to run an occupational school than a regular school1, or that the overhead cost of occupational schools is higher than that of regular schools. Therefore, if we were to fit a linear regression like (1) separately for the two types of schools, we would expect the intercept estimate from the occupational school regression to be higher than that from the regular school regression.

Moreover, we can also see from Figure 1 that the marginal cost of each additional student seems to be higher for occupational schools than for regular schools. In other words, if we were to fit a linear regression like (1) separately for the two types of schools, we would expect the slope estimate from the occupational school regression to be higher than that from the regular school regression.

Given the above observations, we have good reason to think about fitting separate regressions

for these two distinct subsamples:

• subsample A: observations for occupational schools (sample size nA),

• subsample B: observations for regular schools (sample size nB),

where the subsample sizes must satisfy nA + nB = n.

The separate regressions for subsamples A and B are respectively

Regression A: Y^A_i = α1 + α2 X^A_i + u^A_i,  i = 1, . . . , nA,  (2)

and

Regression B: Y^B_i = γ1 + γ2 X^B_i + u^B_i,  i = 1, . . . , nB.  (3)

Accordingly, we call regression (1) with all the observations the ‘pooled regression’.

1This is reasonable because occupational schools tend to be more expensive to run due to the need to maintain specialized workshops.


First of all, we should understand that if we fit separate regressions A and B rather than a pooled regression, we must get a better model fit in terms of a smaller total sum of the squares of the residuals. In other words, we should expect2

RSSA + RSSB ≤ RSSP  (4)

where RSSA, RSSB and RSSP denote, respectively, the sums of the squares of the residuals for regression A, regression B and the pooled regression (1), which we call regression P.

The question is: is the improvement specified in (4) statistically significant or not? Put differently, is the quantity

RSSP − (RSSA + RSSB)

significantly different from zero or not? The Chow test (Chow, 1960) provides a formal statistical procedure to answer this question.

1.1 Chow test: pool or separate

Recall how we perform an F test in the multiple regression model to test whether adding extra regressor(s) significantly improves the model fit. The logic for the Chow test is the same, as it is essentially an F test.

When we fit the pooled regression (1), we have k = 2 parameters to estimate, whereas the total number of parameters doubles to 2k = 4 if we fit separate regressions (2) and (3). So the scenario here is again that we improve the model fit by adding more parameters, or in other words, by sacrificing degrees of freedom (DF). Remember that extra parameters cost/use up extra degrees of freedom – no free lunch!

The general formula for the F test statistic is

F(extra DF, DF remaining) = (improvement in fit / extra DF) / (RSS remaining / DF remaining)  (5)

where the improvement in fit comes from using a model with more parameters (or fewer DF) versus a model with fewer parameters (or more DF); and the RSS remaining and DF remaining are the RSS and DF of the model with more parameters. The null hypothesis is

H0 : no improvement in fit from a model with more parameters

and we reject the null hypothesis if the F test statistic is greater than the critical value at a certain significance level.

The Chow test aims to decide whether the separate Regressions A and B provide a significantly better fit than the pooled Regression P. Note that we have

• Regression A

– sample size is nA;

– k parameters which use up k degrees of freedom in fitting Regression A;

– remaining DF is nA − k;

2Think about why. See the textbook companion slides on the Chow test for a nice illustration, and also equations (5.41)–(5.43) in the textbook.


– residual sum of squares is RSSA.

• Regression B

– sample size is nB;

– k parameters which use up k degrees of freedom in fitting Regression B;

– remaining DF is nB − k;

– residual sum of squares is RSSB.

• Regression P

– sample size is n = nA + nB;

– k parameters which use up k degrees of freedom in fitting Regression P;

– remaining DF is n− k;

– residual sum of squares is RSSP .

To put the Chow test into the F test context, we may consider Regressions A and B together as one model with 2k parameters in total, and Regression P itself as another model with only k parameters. We want to use the F test given by (5) to test whether the former model with 2k parameters provides a significantly better fit than the latter with k parameters.

Let’s fit these quantities into the formula in (5):

• improvement in fit = RSSP − (RSSA +RSSB);

• extra DF used up = 2k − k = k;

• RSS remaining = RSSA +RSSB;

• DF remaining = (nA − k) + (nB − k) = (nA + nB)− 2k = n− 2k.

Therefore, we have the F test statistic for the Chow test as

F(k, n − 2k) = [(RSSP − (RSSA + RSSB))/k] / [(RSSA + RSSB)/(n − 2k)],

and this test statistic follows an F distribution with k and n − 2k degrees of freedom under the null hypothesis that there is no significant improvement in fit.

An example of the Chow test for the cost regression of the 74 secondary schools is given in the textbook and companion slides.

• Regression A (occupational schools)

– nA = 34;

– k = 2 parameters which use up k = 2 degrees of freedom in fitting Regression A;

– remaining DF is nA − k = 34− 2 = 32;

– residual sum of squares is RSSA = 3.49(×1011).

• Regression B (regular schools)

– nB = 40;

– k = 2 parameters which use up k = 2 degrees of freedom in fitting Regression B;

– remaining DF is nB − k = 40− 2 = 38;


– residual sum of squares is RSSB = 1.22(×1011).

• Regression P

– n = 34 + 40 = 74;

– k = 2 parameters which use up k = 2 degrees of freedom in fitting Regression P;

– remaining DF is n− k = 74− 2 = 72;

– residual sum of squares is RSSP = 8.92(×1011).

Let’s again fit these quantities into the formula in (5):

• improvement in fit = RSSP − (RSSA + RSSB) = 8.92 − (3.49 + 1.22) = 8.92 − 4.71 = 4.21(×10^11);

• extra DF used up = 2k − k = k = 2;

• RSS remaining = RSSA + RSSB = 3.49 + 1.22 = 4.71(×10^11);

• DF remaining = n − 2k = 74 − 4 = 70(= 32 + 38).

Therefore, the F test statistic for the Chow test is

F(2, 70) = [(RSSP − (RSSA + RSSB))/k] / [(RSSA + RSSB)/(n − 2k)] = (4.21/2)/(4.71/70) = 31.3.

The critical value of the F(2, 70) distribution at the 0.1% level is 7.64 < 31.3. So we reject the null hypothesis at the 0.1% significance level, and conclude that we should run separate regressions for the two types of schools.
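As a quick sanity check, the arithmetic above can be reproduced in a few lines. A minimal Python sketch, using the RSS values quoted above (in units of 10^11):

```python
# Chow test for the school cost example (all RSS values in units of 10^11)
rss_p = 8.92               # RSS of the pooled regression P
rss_a, rss_b = 3.49, 1.22  # RSS of subsample regressions A and B
n, k = 74, 2               # sample size and parameters per regression

improvement = rss_p - (rss_a + rss_b)   # extra fit from separate regressions
rss_remaining = rss_a + rss_b
df_remaining = n - 2 * k                # 70

F = (improvement / k) / (rss_remaining / df_remaining)
print(round(F, 1))   # 31.3, well above the 0.1% critical value 7.64
```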

1.2 Limitation of separate regressions

When we have the motivation to run separate regressions, we often want to see how different αˆ1 and γˆ1 are, and how different αˆ2 and γˆ2 are, in regressions (2) and (3). The estimates will certainly differ in value, but we cannot see how statistically significant the differences are. This is the major drawback of separate regressions. In particular, we draw your attention to the two problems below.

1. (major problem) How can we tell how different the coefficients are? How statistically

significant are these differences?

2. When you run regressions with two small samples (nA < n, nB < n) instead of running

one large pooled regression, there is an adverse effect on the precision of the estimates

of the coefficients.

2 Regression with dummy variable

The usual solution to the above problems is to fit a single regression with an extra dummy variable. A dummy variable is a binary variable which takes only the values 0 and 1, and it indicates the category that an observation belongs to. Figure 2 illustrates a dummy variable ‘OCC’ which takes value 1 if the observation is an occupational school, and 0 if not. We can see that the dummy variable is essentially an indicator we use to mark each observation as belonging to a certain category in the data set – like how we mark the data points using different colors (red and grey) in Figure 1. The texts listed in the second column of the table in Figure 2 are often called ‘categorical data’. So, a dummy variable is actually a numerical variable which represents categorical data.

Figure 2: Illustration of dummy variable: OCC is a 0–1 variable indicating the type of school in the data set.

2.1 Separating two groups in one regression: dummy variable

Below we consider the intercept dummy and slope dummy, respectively, to address the two

issues we raised above:

1. Overhead costs for running occupational schools and regular schools can be different;

2. Marginal costs of each additional student can also be different for running occupational

schools and regular schools.

2.1.1 Intercept dummy

Let’s first see what happens if we introduce the above dummy variable OCC as an extra

explanatory variable into the simple regression (1). Now we have a multiple regression:

Yi = β1 + β2Xi + β3Di + ui, i = 1, . . . , n, (6)

where I use the generic notations Y,X,D for the dependent variable, the first regressor and

the dummy variable. In particular,

• Yi = COSTi is the annual recurrent cost for running the ith school;

• Xi = Ni is the number of students in the ith school.

• Di = OCCi = 1 if the ith school is an occupational school, and Di = OCCi = 0 if the

ith school is a regular school.

Despite being binary, D can be treated the same as any other explanatory variable in the regression. We can fit the regression with all n = 74 observations using the OLS method,


and obtain the OLS estimates βˆ1, βˆ2 and βˆ3. The fitted regression is written as

Yi = βˆ1 + βˆ2Xi + βˆ3Di, i = 1, . . . , n.

How do we interpret our parameter estimates? It becomes clear if we look at the fitted regressions (interpreted as estimated cost functions here) separately for the two types of schools:

• Di = 1 : occupational school

Yi = βˆ1 + βˆ2Xi + βˆ3 · 1

= (βˆ1 + βˆ3) + βˆ2Xi, (7)

where the index i here runs through the indices of the occupational schools in the

sample.

• Di = 0 : regular school

Yi = βˆ1 + βˆ2Xi + βˆ3 · 0

= βˆ1 + βˆ2Xi, (8)

where the index i here runs through the indices of the regular schools in the sample.

Now, by comparing (7) and (8), we can interpret the parameter estimates as follows:

1. βˆ1 is the estimated annual overhead cost for regular schools,

2. βˆ2 is the estimated annual marginal cost of each additional student for both regular

schools and occupational schools.3

3. βˆ3 is the estimated extra annual overhead cost for occupational schools over regular

schools. Note that the intercept in the fitted regression (7) for occupational schools is

βˆ1 + βˆ3 which means the overhead cost for occupational schools is estimated as βˆ1 + βˆ3.

The interpretation of βˆ3 is the key to understanding how the dummy variable D works in regression (6). Implicitly, we have set the regular schools as the ‘reference’. We get the overhead cost estimate for the reference type of school as βˆ1, the estimate of the intercept in regression (6); and βˆ3 tells us how much extra overhead cost we need for the other type of school.

Since the coefficient βˆ3 on the dummy variable D turns out to be the difference between the two estimated intercepts in the fitted regressions (7) and (8), capturing the parallel shift from one fitted regression line to the other, we often call the dummy variable D in regression (6) an ‘intercept dummy’.

Obtaining standard errors and conducting hypothesis tests in a regression with a dummy variable is no different than usual. It can be very useful to perform a t test on the coefficient of the dummy variable to see whether there is a significant difference in the overhead costs of the two types of school. See the textbook and companion slides for more discussion.

3Note that it is a restriction of this model that the marginal costs for the two types of schools have to be

the same. Since this restriction sounds unrealistic, we will relax it in the next section.
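To make the interpretation concrete, here is a minimal Python sketch using hypothetical coefficient estimates b1, b2, b3 (illustrative values, not the textbook’s actual estimates) for regression (6):

```python
# Hypothetical OLS estimates for regression (6): COST = b1 + b2*N + b3*OCC
# (illustrative values only, not the textbook's actual estimates)
b1, b2, b3 = 50_000.0, 300.0, 120_000.0

def fitted_cost(n_students, occ):
    """Fitted annual cost for a school with n_students; occ is the 0/1 dummy."""
    return b1 + b2 * n_students + b3 * occ

# The dummy shifts the intercept only: both lines share the slope b2,
# so the fitted cost gap between the two school types is exactly b3 at any N.
regular = fitted_cost(1000, occ=0)       # b1 + b2*1000
occupational = fitted_cost(1000, occ=1)  # (b1 + b3) + b2*1000
print(occupational - regular)            # 120000.0
```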


2.1.2 Slope dummy

The intercept dummy can only shift the fitted regression line in a parallel manner, and it

cannot allow the two fitted regression lines to have different slopes. As mentioned above, this

is the limitation of the intercept dummy model, where the marginal costs for both types of

school have to be the same. The latter is clearly not a plausible assumption from the visual

inspection of Figure 1: the fitted regression (cost function) for the occupational schools

should be steeper, and that for the regular schools should be flatter.

To allow the slopes to be different, we introduce another ‘slope dummy variable’ X ·D

into the regression (6), and get

Yi = β1 + β2Xi + β3Di + β4XiDi + ui, i = 1, . . . , n, (9)

where the variables Y,X and D are the same as before.

Again, we can fit the regression with all n = 74 observations and obtain the OLS estimates βˆ1, βˆ2, βˆ3 and βˆ4. The fitted regression is written as

Yi = βˆ1 + βˆ2Xi + βˆ3Di + βˆ4XiDi, i = 1, . . . , n.

We look at the fitted regressions separately for the two types of school:

• Di = 1 : occupational school

Yi = βˆ1 + βˆ2Xi + βˆ3 · 1 + βˆ4Xi · 1

= (βˆ1 + βˆ3) + (βˆ2 + βˆ4)Xi, (10)

where the index i here runs through the indices of the occupational schools in the

sample.

• Di = 0 : regular school

Yi = βˆ1 + βˆ2Xi + βˆ3 · 0 + βˆ4Xi · 0

= βˆ1 + βˆ2Xi, (11)

where the index i here runs through the indices of the regular schools in the sample.

The fitted regressions (10) and (11) clearly show that now both intercepts and slopes can

be different, and the differences are captured by βˆ3 and βˆ4, respectively. Specifically, the

parameter estimates are now interpreted as follows:

1. βˆ1 is the estimated annual overhead cost for regular schools (the reference),

2. βˆ2 is the estimated annual marginal cost of each additional student for regular schools

(the reference).

3. βˆ3 is the estimated extra annual overhead cost for occupational schools.

4. βˆ4 is the estimated extra annual marginal cost for occupational schools.

Again, we can perform t tests as usual. The t test for the significance of the coefficient β4 can tell us whether the marginal cost per student in an occupational school is significantly higher than that in a regular school.
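A minimal Python sketch of regression (9)’s fitted lines, again with hypothetical estimates b1–b4 (illustrative values only, not the textbook’s):

```python
# Hypothetical OLS estimates for regression (9):
# COST = b1 + b2*N + b3*OCC + b4*(N*OCC)   (illustrative values only)
b1, b2, b3, b4 = 50_000.0, 300.0, 120_000.0, 150.0

def fitted_cost(n_students, occ):
    return b1 + b2 * n_students + b3 * occ + b4 * n_students * occ

# Regular schools (OCC=0): intercept b1, slope b2.
# Occupational schools (OCC=1): intercept b1+b3, slope b2+b4.
slope_reg = fitted_cost(1001, 0) - fitted_cost(1000, 0)
slope_occ = fitted_cost(1001, 1) - fitted_cost(1000, 1)
print(slope_reg, slope_occ)   # 300.0 450.0 -> the slopes differ by b4
```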


We can also perform an F test of the joint explanatory power of the intercept dummy

and slope dummy in regression (9) by testing

H0 : β3 = β4 = 0. (12)

To do this, we compare the RSS from regression (9) where both dummies are included and

the RSS from regression (1) where they are not, and use the usual F test statistic and critical

values to make testing decision. If we reject the null, then it means that at least one of β3

and β4 is different from zero.

In fact, the Chow test we introduced at the beginning of this lecture is equivalent to this F test. We verify this in the same data example. Recall that we had RSSA + RSSB = 4.71 × 10^11 in Section 1.1. For regression (9) on the whole sample, with both intercept and slope dummy variables, the residual sum of squares is also 4.71 × 10^11. Note that the number of parameters in regression (9) is 4, which is the same as the total number of parameters from subsample Regressions A and B. Therefore, the following two model fit comparisons are equivalent:

• compare running (the pooled) regression (1) and running separate regressions A and

B in (2) and (3).

– Both regressions have no dummy variables.

– Regression (1) uses the whole sample, while regressions A and B are separately

fit using two subsamples.

• compare regression (1) and regression (9)

– Both regressions are based on whole sample.

– Regression (1) has no dummies, while regression (9) has both intercept and slope

dummies.

2.2 Dummy variable trap: perfect multicollinearity

Wait. There are two types of school in the data, but why do we use only one dummy variable to separate the two groups? Can we use two dummies, one for occupational schools and another for regular schools? In other words, can we set

D^o = 1 if occupational school, 0 if regular school;
D^r = 1 if regular school, 0 if occupational school;

and include both of them in a regression

Y = β1 + β2X + β3D^o + β4D^r + u?  (13)

The answer is NO. We fall into the classic ‘dummy variable trap’ if we do this, and the reason is that we essentially run into a special case of the ‘perfect multicollinearity’ discussed in the lecture on multiple regressions.

Note what’s special about the two dummies D^o and D^r defined above – they always add up to one, simply because every school in the sample is either an occupational school or a regular school.


So we have4

D^o + D^r = 1.  (14)

Look at the 1 on the right-hand side of this equation: can you see that it also hides in regression (13)5 as one of the regressors? Yes, the constant 1 is the first regressor in the regression. Note we can always write (13) as

Y = β1X1 + β2X2 + β3X3 + β4X4 + u

where

X1 = 1, X2 = X, X3 = D^o, X4 = D^r.  (15)

Then from (15) and (14), it is clear that we have

X1 = X3 +X4,

which is a perfect linear relationship among the regressors! With such perfect multicollinear-

ity, we will not be able to perform the OLS estimation.
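The singularity can be seen numerically in a tiny Python sketch (the four made-up observations are purely illustrative, and X is omitted for brevity since it does not affect the argument):

```python
# Tiny illustration of the dummy variable trap: with an intercept and BOTH
# dummies Do, Dr, the regressor columns are perfectly collinear.
ones = [1, 1, 1, 1]                 # intercept column (X1 = 1)
Do   = [1, 1, 0, 0]                 # occupational dummy
Dr   = [0, 0, 1, 1]                 # regular dummy

# The exact linear relation Do + Dr = 1 holds for every observation:
assert all(o + r == c for o, r, c in zip(Do, Dr, ones))

# Consequently X'X is singular, so the OLS normal equations have no unique solution.
cols = [ones, Do, Dr]
gram = [[sum(a * b for a, b in zip(u, v)) for v in cols] for u in cols]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

print(det3(gram))   # 0 -> perfect multicollinearity
```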

As you might have imagined, we can actually escape from the dummy variable trap by excluding the intercept term from the regression. For example, instead of (13), we may run the following regression

Y = β2X + β3D^o + β4D^r + u,  (16)

which will be perfectly fine. Without perfect multicollinearity we can obtain the OLS estimates βˆ2, βˆ3 and βˆ4. Following a similar analysis as before (but noticing that D^o and D^r can logically never both be one or both be zero), we have

• D^o = 1 and D^r = 0: occupational school fitted regression

Y = βˆ2X + βˆ3.  (17)

• D^o = 0 and D^r = 1: regular school fitted regression

Y = βˆ2X + βˆ4.  (18)

Now the interpretations of the parameters are different in general:

1. βˆ2 is the estimated annual marginal cost of each additional student for both regular

schools and occupational schools. −→ This is the same as in the case before with only

intercept dummy.

2. βˆ3 is the estimated annual overhead cost for occupational schools.

3. βˆ4 is the estimated annual overhead cost for regular schools.

4To be more explicit, we actually have D^o_i + D^r_i = 1 for all i = 1, . . . , n in the sample.

5Well, it hides in any regression that includes an intercept term as one of the regressors.


The difference here is that there is no reference type of school any more! Neither of the coefficients on the dummy variables is interpreted as the ‘extra’ overhead cost of one type over the other. They simply estimate the overhead costs of the two types of school separately.

Summary: If we would like to keep the intercept term in the regression, then to separate two groups we need only one dummy variable. The general rule is that, if there is an intercept in the regression, we need M − 1 dummy variables to separate M categories obtained from one grouping criterion.

3 More than two groups: more than one dummy variable

In practice, there are often cases where the data contain more than two distinct groups, either because one grouping criterion divides the observations into more than two categories, or because we use multiple criteria to group the observations.

• One grouping criterion: for example, when we group the 74 secondary schools in Shanghai based on the type of curriculum, we can do a finer job than just classifying them as occupational or regular. In fact, there are two types of occupational school: technical schools training technicians, and skilled workers’ schools training craftsmen. There are also two types of regular secondary school in Shanghai: general schools, which provide the usual academic education, and vocational schools. So, in total there are 4 types of school among the 74 schools. Figure 3 marks these 4 types with 4 different colors.

• Multiple grouping criteria: suppose we also want to take into account the fact that some schools are residential and some are not. Then we can use two grouping criteria: residential or not, and occupational or not. Since each of the two grouping criteria has two categories, in total we again divide the observations into 4 groups. This is illustrated in Figure 4.

In the following we consider these two cases.

3.1 More than two groups from one grouping criterion

This is the case as illustrated in Figure 3, and the example is a straightforward extension of

the example we discussed before with two groups.

3.1.1 M categories: M − 1 dummies

To separate the four groups of schools shown in Figure 3, we need 4−1 = 3 dummy variables.

They are illustrated in the last three columns of the table in Figure 5.

Where is the general school? Yes, it is chosen as the reference type/category. Therefore, we only see dummy variables for the other three types: technical schools, workers’ schools, and vocational schools. The reference category is hence usually described as the ‘omitted’ category.

The regression we run is

Yi = β1 + β2Xi + β3D^T_i + β4D^W_i + β5D^V_i + ui,  i = 1, . . . , n,  (19)


Figure 3: Cost against number of students for 74 secondary schools in Shanghai classified

into four categories.

Figure 4: Cost against number of students for 74 secondary schools in Shanghai: classified

into two sets of categories: residential/nonresidential and regular/occupational.

where

• Yi = COSTi is the annual recurrent cost for running the ith school;

• Xi = Ni is the number of students in the ith school.

• D^T_i = TECHi = 1 if the ith school is a technical school, and 0 otherwise.

• D^W_i = WORKERi = 1 if the ith school is a workers’ school, and 0 otherwise.

• D^V_i = VOCi = 1 if the ith school is a vocational school, and 0 otherwise.

Keeping in mind that the reference category is the general school, we can easily obtain the

below interpretations of the parameter estimates:

1. βˆ1 is the estimated annual overhead cost for general school (the reference),


2. βˆ2 is the estimated annual marginal cost of each additional student for all schools (because there is no slope dummy yet).

3. βˆ3 is the estimated extra annual overhead cost for technical schools over general schools.

4. βˆ4 is the estimated extra annual overhead cost for workers’ schools over general schools.

5. βˆ5 is the estimated extra annual overhead cost for vocational schools over general schools.

The standard errors and hypothesis tests are no different than usual. The analysis done for two groups with one dummy variable generalizes straightforwardly to this case with more than two groups.

3.1.2 Change of reference category

In the above regression we chose the general school as the reference category, and thus we can compare the overhead costs of other schools with those of general schools, and test whether the differences are significant using a t test on each individual parameter β3, β4 and β5.

What if we were interested in testing whether the overhead costs of workers’ schools were different from those of the other types of schools? −→ The easiest way to do this is to re-run the regression making workers’ schools the reference category. This is simple: we just need to drop the dummy for workers’ schools and add a dummy for general schools in regression (19)!

What do we expect to see from the estimation of the regression with the new reference?

• The parameter estimates for the intercept and slope coefficients are certainly different

except for βˆ2 (which is the estimated marginal cost for all school types).

• The fitted regression (cost function) for each category should remain the same!

See more detailed discussion in the textbook and the companion slides.
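A small numeric sketch (Python, hypothetical estimates only) of what changing the reference category does: the reported coefficients change, but each category’s fitted overhead cost does not:

```python
# Hypothetical intercept-dummy estimates with 'general' as the reference.
# (illustrative values only, not the textbook's estimates)
b1 = 40_000.0                                    # intercept: general schools
extras = {"technical": 90_000.0, "workers": 70_000.0, "vocational": 20_000.0}

# Fitted overhead cost per category under this parameterization:
overhead = {"general": b1, **{c: b1 + d for c, d in extras.items()}}

# Re-parameterize with 'workers' as the new reference category.
b1_new = overhead["workers"]                     # new intercept
extras_new = {c: overhead[c] - b1_new for c in overhead if c != "workers"}
overhead_new = {"workers": b1_new, **{c: b1_new + d for c, d in extras_new.items()}}

# Coefficients changed, but every category's fitted overhead cost is identical.
print(overhead == overhead_new)   # True
```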

3.2 Multiple grouping criteria

To separate the four groups of schools shown in Figure 4, which are formed by two grouping criteria, we need two sets of dummy variables. They are illustrated in the last two columns of the table in Figure 6. Since there are only two categories under each grouping criterion, we need one (two minus one) dummy variable for each criterion. In total we need two dummy variables.

The regression we run is

Yi = β1 + β2Xi + β3OCCi + β4RESi + ui, i = 1, . . . , n, (20)

where

• Yi = COSTi is the annual recurrent cost for running the ith school;

• Xi = Ni is the number of students in the ith school.

• OCCi = 1 if the ith school is an occupational school.


Figure 5: Three dummy variables for separating four types of secondary schools in Shanghai.

Figure 6: Two sets of dummy variables: OCC and RES.


• RESi = 1 if the ith school is a residential school.

Obviously, the ith school can be both occupational and residential, or neither. This means OCCi and RESi can both be one, or both be zero.

The fitted regression is written as (omitting the subscript i for observations)

Y = βˆ1 + βˆ2X + βˆ3OCC + βˆ4RES.

To interpret our parameter estimates, we consider:

• OCC = 0, RES = 0 : regular, nonresidential school cost function

Y = βˆ1 + βˆ2X. (21)

• OCC = 0, RES = 1 : regular, residential school cost function

Y = (βˆ1 + βˆ4) + βˆ2X. (22)

• OCC = 1, RES = 0 : occupational, nonresidential school cost function

Y = (βˆ1 + βˆ3) + βˆ2X. (23)

• OCC = 1, RES = 1 : occupational, residential school cost function

Y = (βˆ1 + βˆ3 + βˆ4) + βˆ2X. (24)

Interpretations:

1. βˆ1 is the overhead cost for regular, nonresidential school −→ see (21).

2. βˆ2 is the marginal cost of each additional student for all schools (because there is no

slope dummy yet) −→ see (21)–(24).

3. βˆ3 is the extra overhead cost for an occupational, nonresidential school over a regular, nonresidential school (compare (21) and (23)), and also the extra overhead cost for an occupational, residential school over a regular, residential school (compare (22) and (24)).

4. βˆ4 is the extra overhead cost for a regular, residential school over a regular, nonresidential school (compare (21) and (22)), and also the extra overhead cost for an occupational, residential school over an occupational, nonresidential school (compare (23) and (24)).

Clearly, βˆ3 estimates the extra overhead cost for occupational schools over regular schools, irrespective of whether the school is residential. Likewise, βˆ4 estimates the extra overhead cost for residential schools over nonresidential schools, irrespective of whether the school is occupational. This additive structure is one of the restrictions of the model.
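The additive restriction can be sketched numerically in Python (hypothetical estimates for regression (20), not the textbook’s):

```python
# Hypothetical estimates for regression (20): COST = b1 + b2*N + b3*OCC + b4*RES
# (illustrative values only)
b1, b2, b3, b4 = 40_000.0, 300.0, 110_000.0, 60_000.0

def overhead(occ, res):
    """Fitted overhead cost (the intercept) for a school of the given type."""
    return b1 + b3 * occ + b4 * res

# The extra cost of being occupational is b3 regardless of residential status...
assert overhead(1, 0) - overhead(0, 0) == b3
assert overhead(1, 1) - overhead(0, 1) == b3
# ...and the extra cost of being residential is b4 regardless of occupational status.
assert overhead(0, 1) - overhead(0, 0) == b4
assert overhead(1, 1) - overhead(1, 0) == b4
print(overhead(1, 1))   # 210000.0 = b1 + b3 + b4
```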

