Year | Observations: Original | Observations: CS | Observations: LX | Observations: H | % women: Original | % women: CS | % women: LX | % women: H |
---|---|---|---|---|---|---|---|---|
1991 | 983,476 | 530,283 | 378,260 | - | 44.92 | 43.14 | 41.76 | - |
1992 | 979,065 | 527,550 | 380,470 | - | 44.96 | 43.42 | 42.13 | - |
1993 | 977,567 | 533,715 | 386,543 | 320,466 | 44.91 | 43.68 | 42.51 | 41.25 |
1994 | 989,879 | 545,664 | 395,941 | 323,470 | 45.03 | 43.96 | 42.86 | 41.64 |
1995 | 1,012,618 | 562,889 | 409,693 | 331,145 | 45.4 | 44.31 | 43.23 | 41.99 |
1996 | 1,034,423 | 590,827 | 426,375 | 341,581 | 45.74 | 44.48 | 43.51 | 42.31 |
1997 | 1,045,595 | 600,838 | 432,706 | 352,414 | 46.01 | 44.78 | 43.7 | 42.69 |
1998 | 1,048,281 | 609,306 | 434,783 | 362,646 | 46.01 | 44.97 | 43.83 | 42.88 |
1999 | 1,056,571 | 616,042 | 439,602 | 367,774 | 46.12 | 45.19 | 44.08 | 43.13 |
2000 | 1,076,253 | 626,512 | 446,972 | 372,791 | 46.31 | 45.48 | 44.34 | 43.28 |
2001 | 1,095,857 | 635,920 | 453,911 | 376,923 | 46.64 | 45.69 | 44.48 | 43.46 |
2002 | 1,112,807 | 640,395 | 457,142 | 382,168 | 47 | 45.79 | 44.63 | 43.76 |
2003 | 1,138,673 | 643,056 | 465,497 | 389,958 | 47.44 | 45.9 | 44.74 | 43.9 |
2004 | 1,171,995 | 652,977 | 469,845 | 391,155 | 47.79 | 46.07 | 44.89 | 43.97 |
2005 | 1,205,964 | 666,143 | 477,674 | 395,078 | 47.89 | 46.14 | 44.97 | 44.06 |
2006 | 1,235,593 | 679,819 | 488,542 | 402,392 | 47.96 | 46.34 | 45.08 | 44.16 |
2007 | 1,269,997 | 696,736 | 503,394 | 412,287 | 47.98 | 46.55 | 45.32 | 44.31 |
2008 | 1,318,165 | 725,584 | 516,141 | 419,451 | 47.86 | 46.65 | 45.45 | 44.5 |
2009 | 1,327,342 | 733,132 | 521,422 | 427,969 | 48.06 | 46.7 | 45.5 | 44.68 |
2010 | 1,340,228 | 739,348 | 528,695 | 439,966 | 48.05 | 46.7 | 45.53 | 44.81 |
2011 | 1,363,749 | 755,250 | 538,667 | 446,686 | 48.05 | 46.62 | 45.65 | 44.95 |
2012 | 1,370,301 | 771,205 | 546,236 | 451,853 | 47.67 | 46.7 | 45.81 | 45.04 |
2013 | 1,370,705 | 779,184 | 552,304 | 459,918 | 47.58 | 46.7 | 45.9 | 45.14 |
2014 | 1,403,134 | 788,363 | 559,697 | 467,106 | 47.76 | 46.76 | 46.18 | 45.39 |
2015 | 1,432,924 | 798,600 | 564,879 | 470,454 | 47.99 | 47.01 | 46.57 | 45.67 |
2016 | 1,467,041 | 808,594 | - | - | 48.22 | 47.24 | - | - |
2017 | 1,499,854 | 819,852 | - | - | 48.47 | 47.42 | - | - |
2018 | 1,527,016 | 833,686 | - | - | 48.58 | 47.64 | - | - |
2019 | 1,556,649 | 848,159 | - | - | 48.78 | 47.92 | - | - |
2020 | 1,557,642 | 854,916 | - | - | 49 | 48.22 | - | - |
Earnings Dynamics and Mobility in Australia: New Evidence from ALife Data 1991-2020
Introduction
This appendix provides detailed statistics on earnings inequality and dynamics in Australia from 1991 to 2020 using ALife data. We follow the methodology developed by the Global Repository of Income Dynamics (GRID). Please note, this project is not officially associated with GRID.
For details on the data source, sample selection and variable construction, please refer to the main paper.
Data
Data description
Our primary data source is the ATO Longitudinal Information Files (ALife). This consists of a 10% random sample of individual tax filers in ATO’s 2016 client register. The data contains tax records for each individual over the period. Each year, a 10% random sample of new tax filers is added to the sample.¹ Our unit of measurement is the individual. In the Australian income tax system, all income tax liabilities are at the individual level, and there is no joint filing of tax returns. Our cross-sectional sample gives us a point-in-time snapshot of annual income, tax, and public transfer data for each year from 1991 to 2020.
Variable construction
Our main earnings measure is real total labour income indexed by the 2020 CPI. We construct the following measures of earnings for worker \(i\) in year \(t\):
Raw real earnings in levels \(y_{it}\), and logs, \(\log(y_{it})\).
Residualized log earnings, \(\epsilon_{it}\). We regress log real earnings on a full set of age dummies, separately for each year and gender. This controls for differences in earnings across workers at different stages of the life cycle and, because the regression is run separately by year, for aggregate movements over the business cycle.
Permanent earnings, \(P_{it-1} = \left( \sum_{s=t-3}^{t-1} y_{is} \right) / 3\), defined as the average earnings over the previous three years. We compute percentiles of permanent earnings.
Residualized permanent earnings, \(\epsilon_{it}^P\), computed from \(P_{it-1}\) using the same method as we used to compute \(\epsilon_{it}\).
1-year change in residualized log earnings, \(g_{it}^1 = \Delta \epsilon_{it} = \epsilon_{it+1} - \epsilon_{it}\). This represents the 1-year forward change in \(\epsilon_{it}\).
5-year change in residualized log earnings, \(g_{it}^5 = \Delta^5 \epsilon_{it} = \epsilon_{it+5} - \epsilon_{it}\). This represents the 5-year forward change in \(\epsilon_{it}\).
We use the consumer price index (CPI) to convert variables to 2020 Australian dollars.
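The measures above can be sketched in pandas as follows. This is a minimal illustration, not the code used for the paper: the column names (`id`, `year`, `age`, `gender`, `y`), the helper names, and the choice to residualize the log of \(P_{it-1}\) are assumptions. It relies on the fact that a regression on a full set of age dummies, run separately by year and gender, leaves the deviation from the (year, gender, age) cell mean as its residual. Earnings are assumed to be strictly positive and each (`id`, `year`) pair to appear once.

```python
import numpy as np
import pandas as pd


def to_real(nominal: pd.Series, cpi: pd.Series, cpi_base: float) -> pd.Series:
    """Express nominal amounts in base-year dollars: nominal * CPI_base / CPI_t."""
    return nominal * cpi_base / cpi


def value_at_offset(df: pd.DataFrame, col: str, offset: int) -> pd.Series:
    """Value of `col` for the same id in year + offset (NaN if that year is absent)."""
    lookup = df.set_index(["id", "year"])[col]
    keys = pd.MultiIndex.from_arrays([df["id"], df["year"] + offset])
    return pd.Series(lookup.reindex(keys).to_numpy(), index=df.index)


def cell_demean(df: pd.DataFrame, values: pd.Series) -> pd.Series:
    """Residual from a saturated age-dummy regression run by year and gender.

    With a full set of dummies this equals the deviation from the
    (year, gender, age) cell mean.
    """
    cells = [df["year"], df["gender"], df["age"]]
    return values - values.groupby(cells).transform("mean")


def build_earnings_measures(df: pd.DataFrame) -> pd.DataFrame:
    """Add log_y, eps, P, eps_P, g1 and g5 to a person-year panel."""
    out = df.sort_values(["id", "year"]).copy()
    out["log_y"] = np.log(out["y"])
    out["eps"] = cell_demean(out, out["log_y"])

    # Permanent earnings: average of y over t-1, t-2, t-3 (NaN if any year is missing).
    lags = [value_at_offset(out, "y", -k) for k in (1, 2, 3)]
    out["P"] = (lags[0] + lags[1] + lags[2]) / 3
    # Assumption: P is residualized in logs, mirroring the treatment of y.
    out["eps_P"] = cell_demean(out, np.log(out["P"]))

    # Forward changes: g^h_it = eps_{i,t+h} - eps_{it} for h = 1 and h = 5.
    for h in (1, 5):
        out[f"g{h}"] = value_at_offset(out, "eps", h) - out["eps"]
    return out
```

Looking up year \(t+h\) by key rather than shifting rows keeps the forward changes correct when an individual has gaps in their filing history.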
Samples
We construct the following three samples for our analysis.
Cross-sectional (CS) sample: All individuals who satisfy the two sample-selection criteria (see the main paper) in a given year \(t\) form the CS sample for that year. The CS sample covers the years 1991-2020 and is used to compute cross-sectional inequality statistics.
Longitudinal (LX) sample: In order to study the distribution of earnings changes we restrict our CS sample to those individuals who have 1-year and 5-year forward earnings changes. This forms our longitudinal sample (LX) for the years 1991-2015.
Heterogeneity (H) sample: We further restrict the LX sample to individuals for whom a permanent earnings measure can be constructed (see Variable construction). This restricts the sample to those who were observed in each of the three preceding years. The H sample includes years 1993-2015 and is used to study variation across demographic groups.
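The nesting of the three samples can be illustrated with a short pandas sketch. It assumes a person-year panel that already carries a boolean `in_cs` column (the CS criteria themselves are taken as given), the forward changes `g1` and `g5`, permanent earnings `P`, and a boolean `female` column. All of these names, and the two helper functions, are hypothetical.

```python
import pandas as pd


def flag_samples(df: pd.DataFrame) -> pd.DataFrame:
    """Add boolean in_lx and in_h columns to a person-year panel."""
    out = df.copy()
    # LX: CS observations with both a 1-year and a 5-year forward change.
    out["in_lx"] = out["in_cs"] & out["g1"].notna() & out["g5"].notna()
    # H: LX observations for which permanent earnings can be constructed.
    out["in_h"] = out["in_lx"] & out["P"].notna()
    return out


def sample_summary(flagged: pd.DataFrame) -> pd.DataFrame:
    """Observations and percentage of women by year for each sample."""
    columns = {}
    for name in ("in_cs", "in_lx", "in_h"):
        grouped = flagged[flagged[name]].groupby("year")
        columns[(name, "n")] = grouped.size()
        columns[(name, "pct_women")] = 100 * grouped["female"].mean()
    return pd.DataFrame(columns)
```

Because each flag is built from the previous one, the H sample is a subset of the LX sample, which is a subset of the CS sample, so the observation counts can only fall from one sample to the next within a year.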
The following table compares the number of observations and the percentage of women in each sample with the raw data.
Summary statistics
The following table presents summary statistics for our cross-sectional sample. y denotes the earnings variable, and log denotes the log of earnings.
The following table presents summary statistics for the 1-year and 5-year changes in residualized log earnings for the longitudinal sample.
The following table presents summary statistics for our heterogeneity sample. y denotes the permanent earnings variable, and log denotes the log of permanent earnings.
Cross-sectional statistics
Earnings by key percentiles
The main paper plots log earnings by key percentiles relative to their respective values in 1991. In this section we display trends in levels.
Earnings at the top
Earnings inequality
We provide some further metrics to measure inequality in addition to those displayed in the main paper.
Earnings dynamics over time
In this section, we use our longitudinal sample to explore earnings dynamics over time. In addition to the statistics on 1-year earnings changes provided in the main paper, we show the corresponding statistics for 5-year earnings changes.
Dispersion of earnings changes
Skewness of earnings change distribution
Kurtosis of earnings change distribution
Earnings dynamics by age and permanent income rank
In this section, we show how earnings dynamics vary by age and permanent income rank. The main paper examined 1-year changes in earnings by age and permanent income rank. Here, we show those for 5-year changes.
Dispersion of earnings changes
Skewness of earnings change distribution
Kurtosis of earnings change distribution
Earnings mobility
In this section, we use our heterogeneity (H) sample to compute some further measures of earnings mobility. We examine 5-year mobility in addition to the 10-year mobility examined in the main paper.
Mean rank by quantiles of permanent income
Rank-rank slope
The rank-rank slope (RRS) is the coefficient \(\beta\) of the following regression:
\[ R_{i, t+5} = \alpha + \beta R_{i, t} + \epsilon_{i, t}. \]
This indicator, also common in the literature on intergenerational mobility (Chetty, Hendren, Kline, and Saez, 2014), measures rank persistence over the life cycle. In addition, we calculate a set of mobility indicators conditional on various initial positions within the distribution.
In a world without any rank persistence, \(\beta = 0\), and the indicators of bottom and top mobility would all be equal to 50 (the median rank). In a world with maximum persistence, \(\beta = 1\), and ranks would be perfectly preserved: absolute upward mobility (AUM) would equal 25 and absolute downward mobility (ADM) would equal 75.
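As a rough illustration of the rank-rank regression, the sketch below converts two earnings series for the same individuals into percentile ranks and estimates \(\beta\) by ordinary least squares. The function names are hypothetical, and the choice of which earnings measure is ranked, and within which cells, is left to the caller rather than taken from the paper.

```python
import numpy as np
import pandas as pd


def percentile_rank(x: pd.Series) -> pd.Series:
    """Map values to percentile ranks in (0, 100], averaging ties."""
    return 100 * x.rank(method="average", pct=True)


def rank_rank_slope(rank_t, rank_t5) -> float:
    """OLS slope from regressing the rank at t+5 on the rank at t."""
    x = np.asarray(rank_t, dtype=float)
    y = np.asarray(rank_t5, dtype=float)
    # np.polyfit returns coefficients from the highest degree down: [slope, intercept].
    slope, _intercept = np.polyfit(x, y, deg=1)
    return float(slope)
```

If the two rank vectors are identical the slope is 1, and if ranks at \(t+5\) are unrelated to ranks at \(t\) the slope is close to 0, matching the two benchmark cases described above.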
Absolute upward mobility
AUM is the expected rank at \(t + 5\) for individuals who are below the median at time \(t\):
\[ \text{AUM} = \mathbb{E}[R_{i, t+5} | R_{i, t} \leq 50]. \]
Absolute downward mobility
ADM is the expected rank at \(t + 5\) for individuals who are above the median at time \(t\):
\[ \text{ADM} = \mathbb{E}[R_{i, t+5} | R_{i, t} > 50]. \]
Mobility at top 1%
We also measure mobility at the very top of the earnings distribution with an indicator for those in the top 1% (M99), defined as the expected rank at \(t + 5\) for individuals in the top 1% at time \(t\):
\[ M99 = \mathbb{E}[R_{i, t+5} | R_{i, t} > 99]. \]
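The three conditional indicators are sample averages of the rank at \(t+5\) over different groups defined by the rank at \(t\). A minimal sketch, assuming two aligned arrays of percentile ranks on a 0 to 100 scale (the function name and the dictionary layout are illustrative):

```python
import numpy as np


def mobility_indicators(rank_t, rank_t5) -> dict:
    """Mean rank at t+5 conditional on the position at t."""
    r0 = np.asarray(rank_t, dtype=float)
    r5 = np.asarray(rank_t5, dtype=float)
    return {
        "AUM": float(r5[r0 <= 50].mean()),  # started at or below the median
        "ADM": float(r5[r0 > 50].mean()),   # started above the median
        "M99": float(r5[r0 > 99].mean()),   # started in the top 1%
    }


# Benchmark: perfect persistence, so the rank at t+5 equals the rank at t.
ranks = np.arange(1, 10001) / 100
perfect = mobility_indicators(ranks, ranks)
```

Under perfect persistence this gives an AUM of about 25 and an ADM of about 75, since each indicator is then the average starting rank of its own group. If ranks at \(t+5\) were instead drawn independently of ranks at \(t\), all three indicators would be close to 50.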