Projections of future income distributions at subnational levels are becoming increasingly important for a variety of analyses and evaluations. However, relevant datasets are currently limited. This study presents a methodological framework that integrates machine learning algorithms into a top-down approach for generating income distribution datasets. We project per capita disposable income and income inequality for 31 Chinese provinces from 2020 to 2100 under different scenarios based on China's local circumstances, and then estimate income distributions from these two projections. After ensuring the necessary consistency between the provincial, urban, and rural income datasets, we further generate the same data products at the urban and rural levels for each province. We validate our projection results against data on China's disposable income from 2007 to 2023, data on provincial income inequality in China from 2007 to 2019, and national income inequality data covering the past 20 to 60 years from selected developed countries. The proposed methodology is flexible enough to generate similar data products tailored to a user's specific needs. The resulting datasets have several potential applications and can serve as inputs for research on drivers and impacts across social, economic, and environmental domains.
© 2024. The Author(s).