Regression Through Origin

Introduction to Regression Through Origin Models

So far we have studied models of the form

Y_i = \beta_0 + \beta_1 X_i + u_i

where an intercept term is present. An economic example of such a model is the Keynesian consumption function, written as:

\text{Consumption} = \beta_0 + \beta_1 \text{Income} + u

where \beta_0 is autonomous consumption, i.e., the level of consumption when income is zero.

In some cases, we wish to impose the restriction that, when X=0, the expected value of Y is zero. There are certain relationships for which this is reasonable. For example, consider the following model, in which tax revenue depends on income:

\text{Tax Revenue} = \beta_1 \text{Income} + u

If income (X) is zero, then income tax revenues (Y) must also be zero.

Consider another example from production theory:

\text{Variable Cost} = \beta_1 \text{Output} + u

If output is zero (X=0), the variable cost will also be zero (Y=0).

Definition of Regression Through Origin Models

A regression model in which the intercept term (\beta_0) is absent or zero is called a regression through the origin model, because the regression line passes through the origin, i.e., the point where X=0 and Y=0. It can be written as:

Y_i = \beta_1 X_i + u_i

Economic Examples of Regression Through Origin Models

Instances where the zero-intercept model may be appropriate include:

  • Milton Friedman’s permanent income hypothesis, which states that permanent consumption, C_P, is proportional to permanent income, Y_P, that is, C_P = k Y_P + u.
  • Cost theory which postulates that the variable cost of production is proportional to output.
  • Some versions of monetary theory which state that the rate of change of prices (i.e., the rate of inflation) is proportional to the rate of change of the money supply.

Estimation of Regression Through Origin Models

Corresponding to the zero-intercept model above, we can write the sample regression function as:

Y_i = \hat{\beta}_1 X_i + \hat{u}_i

Or

\hat{Y}_i = \hat{\beta}_1 X_i

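To obtain the slope coefficient, we still rely on the method of ordinary least squares, which in this case minimizes the sum of squared residuals, \sum \hat{u}_i^2 = \sum (Y_i - \hat{\beta}_1 X_i)^2, with respect to \hat{\beta}_1 alone. Setting the derivative with respect to \hat{\beta}_1 equal to zero gives \sum X_i (Y_i - \hat{\beta}_1 X_i) = 0, and therefore: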

\hat{\beta}_1 = \frac{\sum X_i Y_i}{\sum X_i^2}

\text{Var}(\hat{\beta}_1) = \frac{\sigma^2}{\sum X_i^2}

And

\hat{\sigma}^2 = \frac{\sum \hat{u}_i^2}{n - 1}
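The three formulas above can be checked by hand on a small data set. The following is a minimal sketch in plain Python; the function name `fit_through_origin` and the four data points are made up for illustration and do not come from any real data set. Since \sigma^2 is unknown in practice, the sketch plugs \hat{\sigma}^2 into the variance formula.

```python
def fit_through_origin(x, y):
    """OLS for Y = beta1 * X + u (no intercept), using raw sums."""
    n = len(x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))   # raw cross product
    sum_xx = sum(xi * xi for xi in x)               # raw sum of squares
    beta1 = sum_xy / sum_xx
    residuals = [yi - beta1 * xi for xi, yi in zip(x, y)]
    # Only one parameter is estimated, so the divisor is n - 1.
    sigma2_hat = sum(e * e for e in residuals) / (n - 1)
    # Estimated variance of the slope: sigma^2 replaced by its estimate.
    var_beta1 = sigma2_hat / sum_xx
    return beta1, sigma2_hat, var_beta1

# Hypothetical data, roughly proportional with slope near 2.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

beta1, sigma2_hat, var_beta1 = fit_through_origin(x, y)
print(beta1, sigma2_hat, var_beta1)
```

Here \sum X_i Y_i = 59.7 and \sum X_i^2 = 30, so the estimated slope is 59.7 / 30 = 1.99.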

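It is interesting to compare these formulas with those obtained when the intercept term is included in the model. In the formulas below, lowercase letters denote deviations from the sample means, i.e., x_i = X_i - \bar{X} and y_i = Y_i - \bar{Y}.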

\hat{\beta}_1 = \frac{\sum x_i y_i}{\sum x_i^2}

\text{Var}(\hat{\beta}_1) = \frac{\sigma^2}{\sum x_i^2}

\hat{\sigma}^2 = \frac{\sum \hat{u}_i^2}{n - 2}

Differences between Models with and without Intercept

  1. In the model without an intercept, we use raw sums of squares and cross products, whereas in the model with an intercept we use sums of squares and cross products of deviations from the mean.
  2. The degrees of freedom used to estimate the error variance are n-2 in the model with an intercept, but n-1 in the model without an intercept, because only one parameter is estimated.
  3. The coefficient of determination, r^2, is always non-negative for the conventional model, but for the interceptless model it can turn out to be negative. This anomalous result arises because the conventional r^2 formula explicitly assumes that the intercept is included in the model.
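The third point can be illustrated numerically. The sketch below assumes the conventional r^2 is computed as 1 - RSS/TSS, with TSS measured around the mean of Y; the data are hypothetical and chosen so that the true line has a large intercept and a negative slope.

```python
def conventional_r2(y, fitted):
    """Conventional r^2 = 1 - RSS/TSS, with TSS taken around the mean of y."""
    y_bar = sum(y) / len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - rss / tss

# Hypothetical data lying exactly on Y = 11 - X.
x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 9.0, 8.0, 7.0]
n = len(x)

# Model with intercept: deviations from the means.
x_bar, y_bar = sum(x) / n, sum(y) / n
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
r2_with = conventional_r2(y, [intercept + slope * xi for xi in x])

# Model through the origin: raw sums.
beta1_origin = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
r2_origin = conventional_r2(y, [beta1_origin * xi for xi in x])

print(r2_with, r2_origin)
```

With these numbers the line through the origin fits worse than a horizontal line at the mean of Y, so RSS exceeds TSS and the conventional formula returns a negative value.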

Consequences of Using Zero Intercept Models

  1. In models with an intercept, the residuals always sum to zero, but in models without an intercept the residuals need not sum to zero, so their mean need not be zero. Thus, we should use the interceptless model only when it is appropriate.
  2. If we omit the constant term when the true intercept is not zero, the impact of the constant is forced into the estimates of the other coefficients, causing potential bias.
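Both consequences can be seen on a small example. The following sketch uses made-up data in which the intercept is clearly different from zero; the variable names are illustrative only.

```python
# Hypothetical data with an intercept well away from zero.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [5.6, 5.9, 6.6, 7.1, 7.4]
n = len(x)

# Fit with intercept (deviations from the means).
x_bar, y_bar = sum(x) / n, sum(y) / n
slope_int = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope_int * x_bar
resid_sum_int = sum(yi - intercept - slope_int * xi for xi, yi in zip(x, y))

# Fit through the origin (raw sums).
slope_origin = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
resid_sum_origin = sum(yi - slope_origin * xi for xi, yi in zip(x, y))

print(slope_int, resid_sum_int)        # residuals sum to zero up to rounding
print(slope_origin, resid_sum_origin)  # residuals do not sum to zero
```

In this sketch the slope with an intercept is 0.48, while the slope forced through the origin is about 1.87, because the omitted constant is absorbed into the slope.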

Example:

Suppose the true regression line has a constant term quite different from zero and a slope close to zero, and we estimate the model both with and without an intercept. Estimating the regression equation with a constant term would likely produce an estimated regression line very similar to the true regression line. The slope of this estimated line is very low, and the t-score of the estimated slope coefficient may be very close to zero.

However, if the researcher estimates the model without an intercept, the estimated regression line is forced to pass through the origin. As a result, the estimated slope coefficient is biased upward compared with the true slope coefficient. The t-score is biased upward as well, and it may indicate that the estimated slope coefficient is significantly positive. Such a conclusion would be incorrect.
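The situation described in this example can be reproduced with a few lines of code. The sketch below uses hypothetical data that hover around Y = 10 with essentially no slope, and computes the slope and its t-score under both specifications using the formulas from the estimation section (divisor n-2 with an intercept, n-1 without). The function names are illustrative.

```python
import math

def ols_with_intercept(x, y):
    """Return (slope, t-score of slope) for Y = b0 + b1*X + u."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / se

def ols_through_origin(x, y):
    """Return (slope, t-score of slope) for Y = b1*X + u."""
    n = len(x)
    sum_xx = sum(xi * xi for xi in x)
    slope = sum(xi * yi for xi, yi in zip(x, y)) / sum_xx
    rss = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 1) / sum_xx)
    return slope, slope / se

# Hypothetical data: large intercept, slope close to zero.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [10.2, 9.7, 10.1, 9.9, 10.3, 9.8, 10.0, 10.1]

slope_int, t_int = ols_with_intercept(x, y)
slope_org, t_org = ols_through_origin(x, y)
print(slope_int, t_int)
print(slope_org, t_org)
```

With the intercept included, the slope and its t-score are both close to zero. Forcing the line through the origin produces a slope well above 1 and a large t-score, which would wrongly suggest a significant positive relationship.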

Regression with and without intercept.

Coefficient of Determination for Regression Through Origin Model

To calculate r^2 for models without an intercept, we can use the following formula:

R^2_{\text{raw}} = \frac{\left(\sum X_i Y_i\right)^2}{\left(\sum X_i^2\right)\left(\sum Y_i^2\right)}

Note that here the cross products and sums of squares are in raw form, i.e., not mean corrected. That is why the r^2 of a regression through the origin is called the raw r^2.

Although the raw r^2 satisfies the relation 0 ≤ r^2 ≤ 1, it is not directly comparable to the conventional r^2 value. For this reason, some authors do not report the r^2 value for zero intercept regression models.
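As a rough illustration, the raw r^2 formula can be computed directly from raw sums. The function name `raw_r2` and the data below are hypothetical. The sketch also checks the equivalent form 1 - RSS / \sum Y_i^2, which follows from the through-origin OLS slope and shows that the total variation is measured around zero rather than around the mean of Y.

```python
def raw_r2(x, y):
    """Raw r^2 = (sum XY)^2 / (sum X^2 * sum Y^2), no mean correction."""
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    sum_yy = sum(yi * yi for yi in y)
    return sum_xy ** 2 / (sum_xx * sum_yy)

# Hypothetical data, roughly proportional with slope near 2.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

r2_raw = raw_r2(x, y)

# Equivalent form: 1 - RSS / sum(Y^2), using the through-origin OLS slope.
beta1 = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
rss = sum((yi - beta1 * xi) ** 2 for xi, yi in zip(x, y))
r2_alt = 1 - rss / sum(yi * yi for yi in y)

print(r2_raw, r2_alt)
```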

Conclusion

Because of these special features, we need to exercise great caution in using the zero-intercept regression model. Unless there is a very strong a priori expectation, one would be well advised to use the conventional, intercept-present model. Doing so has a dual advantage.

  • First, if the intercept term is included in the model but it turns out to be statistically insignificant (i.e., statistically equal to zero), for all practical purposes we have a regression through the origin.
  • Second, and more important, if in fact there is an intercept in the model, but we insist on fitting a regression through the origin, we would be committing a specification error.
