Understanding Types of Economic Data with Real Examples

Types of Economic Data

Data is central to any empirical analysis. An empirical analysis uses data to test a hypothesis or theory, or to estimate a relationship. The first step in any empirical analysis is to formulate a research question; the next is to specify a model, followed by obtaining data. Many different kinds of economic data sets are used in empirical economic analysis. Some econometric techniques can be applied across a variety of data sets with minimal adjustments, but certain data sets have unique characteristics that must be carefully considered, sometimes necessitating the development of new econometric methods. There are four main types of data sets:

  • Cross-Section data
  • Time series data
  • Pooled data
  • Panel or longitudinal data

Cross-Section data

Cross-sectional data consists of observations collected from various entities, such as individuals, households, firms, cities, states, and countries, at a given point in time. Examples include the GDP of Asian countries for the year 2023, the number of deaths due to the coronavirus pandemic in 2020, the number of car accidents recorded in different big cities in 2023, and data on the wages, education, experience, and gender of 100 individuals in district Khushab.

Sources of cross section dataset in Pakistan are:

In econometrics, cross-sectional variables are usually denoted by the subscript i, with i taking the values 1, 2, 3, …, N for N cross-sectional units. For example, if Y denotes income data collected for N individuals, the observations are denoted by Yi for i = 1, 2, 3, …, N. In economics, the analysis of cross-sectional data is mainly used in microeconomics, labor economics, state and local public finance, business economics, demographic economics, and health economics. An important problem that economists face when dealing with cross-sectional data is heterogeneity, that is, differences across the units being observed.

Obs.   Wage    Education   Experience   Female
1      11.55   12          20           1
2      5       9           9            0
3      12      16          15           0
4      7       14          38           0
5      21.15   16          19           1
96     8       12          6            0
97     12      11          32           0
98     2       5           14           1
99     5       9           12           0
100    7       12          3            0

Table 1: Example of Cross-Section Data
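
The subscript notation and the idea of heterogeneity can be made concrete with a small sketch. The Python snippet below stores the first five observations of Table 1 as one record per individual i and compares mean wages by the Female indicator. The names `cross_section` and `mean_wage` are illustrative assumptions, and five observations are far too few to draw any conclusion; the point is only to show how cross-sectional records are organized.

```python
from statistics import mean

# First five observations of Table 1: one record per individual i,
# all measured at the same point in time.
cross_section = [
    {"obs": 1, "wage": 11.55, "education": 12, "experience": 20, "female": 1},
    {"obs": 2, "wage": 5.0, "education": 9, "experience": 9, "female": 0},
    {"obs": 3, "wage": 12.0, "education": 16, "experience": 15, "female": 0},
    {"obs": 4, "wage": 7.0, "education": 14, "experience": 38, "female": 0},
    {"obs": 5, "wage": 21.15, "education": 16, "experience": 19, "female": 1},
]


def mean_wage(records, female):
    """Average wage over the individuals whose Female indicator equals `female`."""
    wages = [r["wage"] for r in records if r["female"] == female]
    return mean(wages)


# Y_i for i = 1, ..., N: the order of the records carries no information.
N = len(cross_section)
wage_women = mean_wage(cross_section, female=1)
wage_men = mean_wage(cross_section, female=0)
print(f"N = {N}, mean wage (women) = {wage_women:.2f}, mean wage (men) = {wage_men:.2f}")
```

Because the records can be shuffled without losing information, sorting or reindexing a cross-section does not change any result, which is not true of time series data.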

Time-Series Data

Time series data consists of observations collected over multiple time periods for a single entity. For example, data on real GDP (RGDP), inflation (INF), the unemployment rate (UR), and life expectancy (LE) of Pakistan from 1991 to 2019 form a time series. In this data set, Pakistan is a single entity observed over multiple time periods from 1991 to 2019, giving a total of 29 observations. Time series data is often denoted by the subscript t, where t indexes a specific time period.

The order of time series data is very important because the data is collected in chronological order, that is, in the order in which the observations occur over time. Time series data is collected at various frequencies, such as daily, weekly, monthly, and annually. An important feature of time series data is that past observations often affect current observations. A primary use of time series data is forecasting based on past information. Many forecasting methods assume that the data is stationary, but most economic time series are non-stationary. Time series data has four components: trend, cyclical, seasonal, and irregular.

Year   RGDP         Inflation   UR      LE
1991   344102.656   11.791      0.961   60.259
1992   370616.781   9.509       0.961   60.116
1993   377133.375   9.9737      0.969   59.934
1994   391228.438   12.368      0.973   60.116
1995   410643.563   12.344      0.973   59.878
2015   885411.938   2.5293      6.676   65.697
2016   934346.313   3.7651      5.16    65.88
2017   986242.563   4.0854      6.506   66.297
2018   1043742.88   5.0781      7.849   66.482
2019   1078572.63   10.578      9.574   66.756

Table 2: Example of Time Series Data
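
Because each observation is tied to the one before it, many time series calculations compare period t with period t-1. The sketch below uses the RGDP values for 1991 to 1995 as read from Table 2 and computes first differences and percentage growth rates. Differencing is shown here only as one common way of working with a trending series; the function names are illustrative assumptions, not a prescribed method.

```python
# Real GDP of Pakistan, 1991-1995, as read from Table 2 (chronological order).
years = [1991, 1992, 1993, 1994, 1995]
rgdp = [344102.656, 370616.781, 377133.375, 391228.438, 410643.563]


def first_difference(series):
    """Return y_t - y_(t-1) for t = 2, ..., T."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


def growth_rate(series):
    """Return the percentage change from period t-1 to period t."""
    return [100 * (curr - prev) / prev for prev, curr in zip(series, series[1:])]


diffs = first_difference(rgdp)
growth = growth_rate(rgdp)

# The first year has no predecessor, so one observation is lost.
for year, d, g in zip(years[1:], diffs, growth):
    print(f"{year}: change = {d:.3f}, growth = {g:.2f}%")
```

Shuffling the list before calling these functions would give meaningless results, which is the practical sense in which the order of a time series matters.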

Pooled Data

Pooled (or combined) data has features of both cross-sectional and time series data, but the cross-sectional units need not be the same in each time period. For example, suppose that two cross-sectional household surveys are taken in Pakistan, one in 1985 and one in 1990. In 1985, a random sample of households is surveyed for variables such as income, savings, family size, and so on. In 1990, a new random sample of households is taken using the same survey questions. To increase the sample size, we can form a pooled cross section by combining the two years.
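
The household survey example can be sketched in a few lines of Python. All values below are hypothetical and chosen only for illustration; the household labels, the variable names, and the `pool` helper are assumptions, not real survey data or a standard API.

```python
# Hypothetical survey records (illustrative values only, not real data).
survey_1985 = [
    {"household": "A", "income": 1200, "family_size": 6},
    {"household": "B", "income": 900, "family_size": 4},
]
survey_1990 = [
    {"household": "C", "income": 1500, "family_size": 5},
    {"household": "D", "income": 1100, "family_size": 7},
    {"household": "E", "income": 1300, "family_size": 3},
]


def pool(samples_by_year):
    """Stack independent cross sections, tagging each record with its survey year."""
    pooled = []
    for year, sample in samples_by_year.items():
        for record in sample:
            pooled.append({**record, "year": year})
    return pooled


pooled = pool({1985: survey_1985, 1990: survey_1990})

# Households are drawn afresh each year, so no household appears twice.
ids_1985 = {r["household"] for r in survey_1985}
ids_1990 = {r["household"] for r in survey_1990}
print(len(pooled), "pooled observations; overlap:", ids_1985 & ids_1990)
```

Keeping the `year` field on every record matters: it lets the analyst allow for differences between the two survey years instead of treating the pooled sample as a single cross section.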

Panel Data

Panel (or longitudinal) data is a combination of cross-sectional and time series data in which data on the same cross-sectional units is collected over multiple time periods. For example, we might collect data on GPA, attendance ratio, and study hours for the same BS Economics students across all semesters. Similarly, data on GDP, inflation, the unemployment rate, money supply, and investment for all developing countries from 1970 to 2023 forms a panel. For panel data, both subscripts i and t are used: i indexes the cross-sectional unit and t indexes the time period. Examples of panel datasets:

In a balanced panel, the number of time observations is the same for all cross-sectional units. In an unbalanced panel, the number of time observations differs across cross-sectional units. For example, if GPA and attendance data are available for all BS Economics students for every semester, with no missing observations, the panel is balanced. If some students drop out after a given semester, we do not have complete data for all students in all semesters, and the panel is unbalanced.

Name       Panel ID   Year   GDPG       FDI        P
Pakistan   1          2016   5.526736   0.924442   2.086326
Pakistan   1          2017   5.554277   0.819523   2.077578
Pakistan   1          2018   5.836417   0.552187   2.057546
Pakistan   1          2019   0.988829   0.802956   2.022967
Pakistan   1          2020   0.525527   1.053726   1.97832
India      2          2016   8.256306   1.937363   1.090459
India      2          2017   6.795383   1.507317   1.063359
India      2          2018   6.532989   1.559264   1.037828
India      2          2019   4.041554   1.763128   1.013261
India      2          2020   -7.96461   1.966992   0.989414
Srilanka   3          2016   4.486635   1.088638   1.104984
Srilanka   3          2017   3.57817    1.57023    1.13022
Srilanka   3          2018   3.272      1.835095   1.048393
Srilanka   3          2019   2.255177   0.885336   0.611876
Srilanka   3          2020   -3.56908   -0.06442   0.530627

Table 3: Example of Balanced Panel Dataset

Name       Panel ID   Year   GDPG       FDI        P
Pakistan   1          2016   5.526736   0.924442   2.086326
Pakistan   1          2017   5.554277   0.819523   2.077578
Pakistan   1          2018   5.836417   0.552187   2.057546
Pakistan   1          2019   0.988829   0.802956   2.022967
Pakistan   1          2020   0.525527   1.053726   1.97832
India      2          2016   8.256306   1.937363   1.090459
India      2          2017   6.795383   1.507317   1.063359
India      2          2018   6.532989   1.559264   1.037828
India      2          2019   4.041554   1.763128   1.013261
Srilanka   3          2016   4.486635   1.088638   1.104984
Srilanka   3          2017   3.57817    1.57023    1.13022
Srilanka   3          2018   3.272      1.835095   1.048393

Table 4: Example of Unbalanced Panel Dataset
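
Whether a panel is balanced can be checked mechanically from its (i, t) pairs. The sketch below is a minimal illustration using only the country and year columns of Tables 3 and 4; the function name `is_balanced` is an assumption made for this example and does not come from any particular library.

```python
from collections import defaultdict


def is_balanced(observations):
    """Return True if every unit is observed in exactly the same set of periods."""
    periods_by_unit = defaultdict(set)
    for unit, period in observations:
        periods_by_unit[unit].add(period)
    all_periods = set().union(*periods_by_unit.values())
    return all(periods == all_periods for periods in periods_by_unit.values())


years = range(2016, 2021)
# (i, t) pairs of Table 3: three countries, each observed 2016-2020.
table_3 = [(c, y) for c in ("Pakistan", "India", "Srilanka") for y in years]
# (i, t) pairs of Table 4: India lacks 2020, Srilanka lacks 2019 and 2020.
missing = {("India", 2020), ("Srilanka", 2019), ("Srilanka", 2020)}
table_4 = [pair for pair in table_3 if pair not in missing]

print("Table 3 balanced:", is_balanced(table_3))
print("Table 4 balanced:", is_balanced(table_4))
```

The check compares sets of periods rather than just counts, so two units with the same number of observations in different years would still be flagged as unbalanced.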

Other Types of Data

Sources and Nature of Economic Data

Experimental vs Non-Experimental Data

Experimental data is collected through controlled experiments in which researchers manipulate one or more independent variables to observe their effects on a dependent variable, while controlling for the effects of other variables. This method is often used to establish cause-and-effect relationships.

Examples:

  • Testing the effectiveness of a new vaccine
  • Analyzing the effect of fertilizer on plant growth

Non-experimental (or observational) data is collected by observing and recording events, behaviors, or phenomena as they naturally occur, without manipulation. This method is used where experiments are not possible, unethical, or too expensive.
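
Why experimental data is better suited to establishing cause and effect can be illustrated with a small simulation. The sketch below is entirely hypothetical: the effect size, the unobserved background factor, and the `simulate` function are assumptions made for illustration, not results from any real study. In the experimental case the researcher assigns treatment at random; in the observational case units with a higher background factor are more likely to be treated.

```python
import random
from statistics import mean

TRUE_EFFECT = 2.0  # assumed effect of the treatment on the outcome


def simulate(n, randomized, seed):
    """Simulate n units and return the treated-minus-untreated mean outcome."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        background = rng.gauss(0, 1)  # unobserved factor that also drives the outcome
        if randomized:
            # Experimental data: the researcher assigns treatment by coin flip.
            treatment = rng.random() < 0.5
        else:
            # Observational data: units with a higher background factor
            # are more likely to end up treated.
            treatment = background + rng.gauss(0, 1) > 0
        outcome = TRUE_EFFECT * treatment + 3.0 * background + rng.gauss(0, 1)
        (treated if treatment else untreated).append(outcome)
    return mean(treated) - mean(untreated)


experimental = simulate(20_000, randomized=True, seed=1)
observational = simulate(20_000, randomized=False, seed=1)
print(f"true effect = {TRUE_EFFECT}, experimental = {experimental:.2f}, "
      f"observational = {observational:.2f}")
```

Under these assumptions the simple difference in means from the randomized data lands close to the assumed effect, while the same comparison on the observational data is pushed upward by the background factor. This is the reason observational data usually calls for additional econometric methods before a causal claim is made.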

Sources of Data Collection

Primary data is data collected for the first time by a researcher for a specific research purpose. This data has not been published before, and it is often considered more reliable because the researcher controls how it is collected. Primary data is collected through surveys, interviews, experiments, observations, and questionnaires. Since the researcher collects the data directly from the (sample) respondents, the data can be tailored precisely to the needs of the research project.

For example, a researcher wants to study the effect of online learning on students’ academic performance. He designs a questionnaire and personally surveys 150 university students, asking about their study hours, internet usage, GPA, and learning experience. The information he collects directly from these students is primary data.

Secondary data is data that has already been collected by an institution or another researcher for a different purpose. It can be obtained from sources such as books, reports, articles, online databases, and surveys. The researcher does not need to prepare a schedule or questionnaire to collect data from sample respondents. Therefore, secondary data is often less expensive and less time-consuming to obtain and analyze. For example, a researcher studying CO₂ emissions and economic growth uses data published by the World Bank and Our World in Data. Since the data were collected earlier by these organizations for their own purposes, they are secondary data.

When to use Secondary Data

Although secondary data can be less valid than primary data, it is preferred in the following cases:

  • When primary data is difficult to obtain and secondary data is easier to get.
  • When primary data does not exist, so the researcher has to depend on secondary data alone.
  • When primary data exists, but the respondents are not willing to reveal the information.
  • When the budget is too limited.
  • When the researcher faces a time constraint in collecting primary data.

Answer the following questions

  1. Define cross section data with example.
  2. Define time series data with example.
  3. Define panel data with example.
  4. Differentiate between pooled and panel data.
  5. What is observational data? Give example.
  6. What is experimental data? Give example.
  7. Define primary data with example.
  8. Define secondary data with example.