Building simulations of real-time economic activity

During college I read tons of micro and macroeconomic textbooks. I learned all the classical frameworks, math equations, and models of rational behavior used to explain and predict the decisions of individuals and organizations. The books were always heavy on math and social sciences but were often light on one essential thing: strong evidence that the theories, equations, and models correspond to the real world.

In academia there have been more attempts at grand theories of micro and macroeconomic behavior than empirical answers to questions like “What is the actual measured effect of X on Y?” The fact that economics is categorized as a social science has always suggested a distance from “hard” sciences like physics and statistics. This raises the question: what would it take to simulate real-world economic activity with the type of data available in games like SimCity and Age of Empires?

Epstein and Axtell’s 1996 landmark book Growing Artificial Societies is what originally drew me to this question. The Sugarscape models it introduced were not revolutionary in their predictions (a bunch of macro models had been able to make comparable predictions for two decades prior), but the way in which basic computer code could produce markets that emerge from individual agent decisions was incredible. The potential was not immediately obvious to early readers, but recent advancements in neural networks have made large-scale agent-based modeling more practical.
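To give a sense of how little code this style of model needs, here is a minimal sketch loosely modeled on Sugarscape's movement-and-harvest rule. It is a simplification, not the book's implementation: the grid size, the 0 to 4 sugar capacities, the agent parameter ranges, and every name (Agent, step, run) are assumptions chosen for illustration, and the trade rules that produce markets are left out entirely.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    x: int
    y: int
    sugar: float      # current wealth
    metabolism: int   # sugar burned per step
    vision: int       # how many cells the agent can see in each direction


def step(agents, sugar, capacity, size, rng):
    """Advance the world by one tick and return the surviving agents."""
    rng.shuffle(agents)  # random activation order
    occupied = {(a.x, a.y) for a in agents}
    survivors = []
    for a in agents:
        # Candidate cells: stay put, or any free cell within vision
        # along the four lattice directions (the grid wraps around).
        candidates = [(a.x, a.y)]
        for d in range(1, a.vision + 1):
            for dx, dy in ((d, 0), (-d, 0), (0, d), (0, -d)):
                cell = ((a.x + dx) % size, (a.y + dy) % size)
                if cell not in occupied:
                    candidates.append(cell)
        # Move to the visible cell holding the most sugar.
        best = max(candidates, key=lambda c: sugar[c[1]][c[0]])
        occupied.discard((a.x, a.y))
        a.x, a.y = best
        occupied.add(best)
        # Harvest the cell, then pay the metabolic cost.
        a.sugar += sugar[a.y][a.x]
        sugar[a.y][a.x] = 0
        a.sugar -= a.metabolism
        if a.sugar > 0:
            survivors.append(a)
        else:
            occupied.discard(best)  # the agent starves and leaves the grid
    # Sugar grows back by one unit per tick, up to each cell's capacity.
    for y in range(size):
        for x in range(size):
            sugar[y][x] = min(capacity[y][x], sugar[y][x] + 1)
    return survivors


def run(steps=50, size=20, n_agents=100, seed=0):
    rng = random.Random(seed)
    capacity = [[rng.randint(0, 4) for _ in range(size)] for _ in range(size)]
    sugar = [row[:] for row in capacity]
    all_cells = [(x, y) for x in range(size) for y in range(size)]
    agents = [
        Agent(x, y, sugar=rng.randint(5, 25),
              metabolism=rng.randint(1, 4), vision=rng.randint(1, 6))
        for x, y in rng.sample(all_cells, n_agents)
    ]
    for _ in range(steps):
        agents = step(agents, sugar, capacity, size, rng)
    return agents


if __name__ == "__main__":
    final = run()
    wealth = sorted(a.sugar for a in final)
    print(f"{len(final)} agents survive; poorest={wealth[0]}, richest={wealth[-1]}")
```

No agent is told anything about the aggregate. Whatever wealth distribution and survival pattern appears at the end comes only from the local rule "move to the best visible cell, eat, pay your metabolism."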

For decades, the main limitations in simulating this type of activity for commercial use were the type of data available and the time and cost of experimentation. In the past, if you wanted scanner data, the only place to get it was the checkout line at retail stores. Today, everything is digital: the ERP, ordering, POS, inventory management, and supply chain systems. The number of places you can pick up data from is materially different than it was 10 years ago, and it’s going to change even more with the rise of 5G and the ability to put sensors on every economic good. Generative agent-based models (ABMs) present an opportunity to use this type of data in a setting where experimentation becomes very cheap and simulations can easily be replayed in millions of configurations. However, replicating actual human behavior, even in small environments, is surprisingly difficult.

The problem is not a data engineering one. As much as a big, unsolved technical challenge would make for a satisfying explanation, the amount of data required for realistic representations of human behavior is not particularly large compared to the amount of data processed in other fields such as cloud computing, financial services, and the energy sector. Modern advancements in core tools like data warehouses (Snowflake, Databricks), BI (Alteryx, Tableau), and streaming (Confluent, WarpStream) have given software engineers and data scientists ways to process the modest amount of data required to analyze activity like commerce.

One of the keys to all of this is the integration of external datasets. Corporations around the world have tremendous amounts of data measuring what many aspects of the economy are doing in real time. But it’s not the scale of the data they have been collecting for decades that is so valuable; it’s all of the real observational human information and entropy that it contains. Specifically, the historical purchasing behavior, foot traffic, web browsing patterns, and product preferences. If agents were able to realistically adopt the behavioral personas of consumers in corporate environments, it could revolutionize how large organizations and governments make decisions. Rather than taking a broad, top-down view of their business or economy, they could take a bottom-up approach to figure out which variables change consumer behavior and which aggregate outcomes would have the biggest impact on financial performance or GDP growth.
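A toy version of that bottom-up loop can be sketched in a few lines. Everything here is a hypothetical stand-in: each consumer agent is reduced to a single willingness-to-pay number drawn from a lognormal distribution, whereas the personas described above would be fitted to observed purchasing data. The function names (make_consumers, simulate_revenue, sweep) are illustrative, not from any library.

```python
import random


def make_consumers(n, seed=0):
    """Create n consumer agents, each reduced to a willingness to pay.

    Assumption: the lognormal(2.0, 0.5) draw is a placeholder for
    personas estimated from real transaction histories.
    """
    rng = random.Random(seed)
    return [rng.lognormvariate(2.0, 0.5) for _ in range(n)]


def simulate_revenue(consumers, price):
    """Each agent decides individually; revenue is the aggregate result."""
    buyers = sum(1 for willingness in consumers if willingness >= price)
    return buyers * price


def sweep(consumers, prices):
    """Replay the same population under many price configurations."""
    return {price: simulate_revenue(consumers, price) for price in prices}


if __name__ == "__main__":
    population = make_consumers(10_000)
    results = sweep(population, [p / 2 for p in range(2, 41)])
    best_price = max(results, key=results.get)
    print(f"best price in sweep: {best_price}, revenue: {results[best_price]:.0f}")
```

Because each replay is just a function call over the same population, trying another thousand configurations costs almost nothing. The hard part is making the individual decision rule behave like a real person, which this sketch does not attempt.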