Global Migration Flows
Estimates of international migration patterns derived from anonymized, aggregated platform data; an open dataset and interactive visualization.
Mapping social capital across U.S. communities. Public data covers nearly every ZIP code, high school, and college.
A public measure of social connectedness between geographies. Used widely to study migration, trade, COVID-19, and inequality.
Mapping social capital across the United Kingdom from around six billion friendships, with the Behavioural Insights Team and partners.
Published in Proceedings of the 21st International Conference on World Wide Web (WWW '12), 111–120, 2012
Advertisers are demanding more accurate estimates of the impact of targeted advertisements, yet no study proposes an appropriate methodology to analyze the effectiveness of a targeted advertising campaign, and there is a dearth of empirical evidence on the effectiveness of targeted advertising as a whole. The targeted population is more likely to convert from advertising so the response lift between the targeted and untargeted group to the advertising is likely an overestimate of the impact of targeted advertising. We propose a difference-in-differences estimator to account for this selection bias by decomposing the impact of targeting into selection bias and treatment effects components. Using several large-scale online advertising campaigns, we test the effectiveness of targeted advertising on brand-related searches and clickthrough rates. We find that the treatment effect on the targeted group is about twice as large for brand-related searches, but naively estimating this effect without taking into account selection bias leads to an overestimation of the lift from targeting on brand-related searches by almost 1,000%.
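The decomposition described in this abstract can be illustrated with a minimal sketch. It assumes a design the abstract does not spell out: each audience (targeted and untargeted) has an exposed group and a held-out control group. The class `GroupRates`, the function `decompose_targeting_lift`, and all numbers are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class GroupRates:
    """Response rates (e.g. share making a brand-related search) for one audience."""
    exposed: float  # rate among users who were shown the ad
    control: float  # rate among held-out users who were not shown the ad (assumed design)


def decompose_targeting_lift(targeted: GroupRates, untargeted: GroupRates) -> dict:
    """Split the naive targeted-vs-untargeted lift into selection bias and a
    difference-in-differences term."""
    # Naive comparison: exposed targeted users versus exposed untargeted users.
    naive_lift = targeted.exposed - untargeted.exposed
    # Selection bias: targeted users respond more even without any ad.
    selection_bias = targeted.control - untargeted.control
    # Treatment effect within each audience.
    effect_targeted = targeted.exposed - targeted.control
    effect_untargeted = untargeted.exposed - untargeted.control
    # Difference-in-differences: extra treatment effect attributable to targeting.
    did = effect_targeted - effect_untargeted
    return {
        "naive_lift": naive_lift,
        "selection_bias": selection_bias,
        "effect_targeted": effect_targeted,
        "effect_untargeted": effect_untargeted,
        "did": did,
    }


# Hypothetical rates, chosen only to illustrate the arithmetic.
result = decompose_targeting_lift(
    targeted=GroupRates(exposed=0.10, control=0.08),
    untargeted=GroupRates(exposed=0.03, control=0.02),
)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

In this toy example the naive lift equals the selection bias plus the difference-in-differences term, so most of the apparent lift comes from who was targeted rather than from the ad itself.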
Published in International Conference on World Wide Web (WWW), 2015
One of the largest challenges for a recommender system is building a ranking of “quality” or “relevance” in situations where these features cannot be observed directly. These models are often trained on various types of survey data, including Likert-scale quality ratings or pairwise comparison surveys, but there has been little work detailing the efficiency of these techniques for eliciting quality ranking and a paucity of work on how to analyze and interpret pairwise choice data. We present techniques for using pairwise choice data for quality ranking and we find, under simulation, that Likert scale elicitation is more efficient under the best possible conditions, but in the presence of differential item functionality (i.e., the fact that different scale points may mean different things to different people) or low quality inputs (e.g., lack of attention or understanding by survey participants or noisily measured input features), pairwise comparison becomes a more efficient survey method. We confirm this finding by using different survey techniques to infer the relevance of individuals’ Facebook News Feed stories. Pairwise choice elicitation can be finished quickly by survey participants, is easy to implement and scale, produces models with interpretable results, and is robust to noise and interpretational issues. Thus, we argue, pairwise choice surveys have wide potential for application.
Published in Proceedings of the 25th International Conference on World Wide Web (WWW '16), 1103–1111, 2016
Identifying the same internet user across devices or over time is often infeasible. This presents a problem for online experiments, as it precludes person-level randomization. Randomization must instead be done using imperfect proxies for people, like cookies, email addresses or device identifiers. Cookies present a unique problem for randomized experiments because a user may be associated with multiple cookies, some of which might be assigned to the test group and some to the control group during an experiment, making inference at the person level difficult. We find the cookie treatment effect estimator converges to a weighted average of the marginal effects of treating more of a user’s cookies. If the marginal effects of cookie treatment exposure are positive and constant, it underestimates the person level treatment effect by a factor equal to the number of cookies per user. Using cookie assignment data from Atlas and advertising exposure and purchase data from Facebook, we compare simulated cookie and person level advertising effectiveness experiments. The effect on statistical power is substantial: we find that sample sizes in a cookie test need to be two to three times larger to achieve the same power as an experiment with perfect treatment assignment.
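The dilution result for constant marginal effects can be checked with a small exact enumeration. This is a sketch under assumptions that go beyond the abstract: every user has the same number of cookies, each cookie is assigned to treatment independently with probability 1/2, and a user's outcome is linear in the number of treated cookies. The function name `cookie_vs_person_effect` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def cookie_vs_person_effect(cookies_per_user: int, marginal_effect) -> tuple:
    """Return (cookie-level estimate, person-level effect) for a single user type.

    Enumerates every equally likely treatment assignment of the user's cookies
    and attributes the user's outcome to each of their cookies.
    """
    beta = Fraction(marginal_effect)
    treated_sum, treated_n = Fraction(0), 0
    control_sum, control_n = Fraction(0), 0
    for assignment in product((0, 1), repeat=cookies_per_user):
        # Assumed outcome model: linear in the number of treated cookies.
        outcome = beta * sum(assignment)
        for is_treated in assignment:
            if is_treated:
                treated_sum += outcome
                treated_n += 1
            else:
                control_sum += outcome
                control_n += 1
    # Difference in mean outcomes between treated and control cookies.
    cookie_effect = treated_sum / treated_n - control_sum / control_n
    # Effect of treating all of a user's cookies versus none of them.
    person_effect = beta * cookies_per_user
    return cookie_effect, person_effect


for k in (1, 2, 3, 4):
    cookie_effect, person_effect = cookie_vs_person_effect(k, 1)
    print(k, cookie_effect, person_effect, person_effect / cookie_effect)
```

Under these assumptions the cookie-level estimate recovers only the marginal effect of one more treated cookie, so the ratio of person-level to cookie-level effect equals the number of cookies per user.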
Published in Journal of Economic Perspectives, 32(3), 259–280, 2018
We introduce a new measure of social connectedness between U.S. county pairs, as well as between U.S. counties and foreign countries. Our measure, which we call the Social Connectedness Index (SCI), is based on the number of friendship links on Facebook, the world’s largest online social network. Within the U.S., social connectedness is strongly decreasing in geographic distance between counties. The population of counties with more geographically-dispersed social networks is richer, more educated, and has higher life expectancy. Region-pairs that are more socially connected have higher trade flows, even after controlling for geographic distance and the similarity of regions along other demographic and socioeconomic measures. Higher social connectedness is also associated with more cross-county migration and patent citations. Social connectedness between U.S. counties and foreign countries is correlated with past migration patterns, with social connectedness decaying in the time since the primary migration wave from that country. Trade with foreign countries is also strongly related to the social connectedness with those countries. These results suggest that the SCI captures an important role of social networks in facilitating economic and social interactions. Our findings highlight the potential for the SCI to mitigate the measurement challenges that pervade empirical literatures that study the role of social interactions across the social sciences.
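As a rough illustration of how a connectedness measure can be built from friendship counts, the sketch below divides the number of friendship links between two regions by the product of their user counts. This normalization is an assumption made here for illustration; the abstract states only that the index is based on the number of friendship links. The function name and all figures are hypothetical.

```python
def relative_connectedness(friend_links: int, users_a: int, users_b: int) -> float:
    """Friendship links between two regions per possible cross-region user pair.

    Assumed normalization: links / (users_a * users_b). This is not taken
    from the abstract above.
    """
    if users_a <= 0 or users_b <= 0:
        raise ValueError("user counts must be positive")
    if friend_links < 0:
        raise ValueError("friend_links must be non-negative")
    return friend_links / (users_a * users_b)


# Hypothetical county pairs: (friendship links, users in county A, users in county B).
pairs = {
    ("County A", "County B"): (200, 100, 50),
    ("County A", "County C"): (200, 100, 500),
}
for (a, b), (links, n_a, n_b) in pairs.items():
    print(a, b, relative_connectedness(links, n_a, n_b))
```

Both hypothetical pairs share the same raw link count, but the pair with the smaller second county is more connected once size is taken into account, which is the point of normalizing.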
Published in Journal of Political Economy, 126(6), 2224–2276, 2018
We show how data from online social networking services can help researchers better understand the effects of social interactions on economic decision making. We use anonymized data from Facebook, the world’s largest online social network, to first explore heterogeneity in the structure of individuals’ social networks. We then exploit the rich variation in the data to analyze the effects of social interactions on housing market investments. To do this, we combine the social network information with housing transaction data. Variation in the geographic dispersion of social networks, combined with time-varying regional house price changes, induces heterogeneity in the house price experiences of different individuals’ friends. We show that individuals whose geographically distant friends experienced larger recent house price increases are more likely to transition from renting to owning. They also buy larger houses and pay more for a given house. Similarly, when homeowners’ friends experience less positive house price changes, these homeowners are more likely to become renters, and more likely to sell their property at a lower price. We find that these relationships are driven by the effect of social interactions on individuals’ housing market expectations. Survey data show that individuals whose geographically distant friends experienced larger recent house price increases consider local property a more attractive investment, with bigger effects for individuals who regularly discuss such investments with their friends.
Published in Proceedings of the 2019 ACM Conference on Economics and Computation (EC '19), 199–213, 2019
Decision makers in health, public policy, technology, and social science are increasingly interested in going beyond ‘one-size-fits-all’ policies to personalized ones. Thus, they are faced with the problem of estimating heterogeneous causal effects. Unfortunately, estimating heterogeneous effects from randomized data requires large amounts of statistical power and while observational data is often available in much larger quantities the presence of unobserved confounders can make using estimates derived from it highly suspect. We show that under some assumptions estimated heterogeneous treatment effects from observational data can preserve the rank ordering of the true heterogeneous causal effects. Such an approach is useful when observational data is large, the set of features is high-dimensional, and our priors about feature importance are weak. We probe the effectiveness of our approach in simulations and show a real-world example in a large-scale recommendations problem.
Published in Review of Economic Studies, 86(6), 2403–2452, 2019
We study the relationship between homebuyers’ beliefs about future house price changes and their mortgage leverage choices. Whether more pessimistic homebuyers choose higher or lower leverage depends on their willingness and ability to reduce the size of their housing market investments. When households primarily maximize the levered return of their property investments, more pessimistic homebuyers reduce their leverage to purchase smaller houses. On the other hand, when considerations such as family size pin down the desired property size, pessimistic homebuyers reduce their financial exposure to the housing market by making smaller downpayments to buy similarly-sized homes. To determine which scenario better describes the data, we investigate the cross-sectional relationship between house price beliefs and mortgage leverage choices in the U.S. housing market. We use plausibly exogenous variation in house price beliefs to show that more pessimistic homebuyers make smaller downpayments and choose higher leverage, in particular in states where default costs are relatively low, as well as during periods when house prices are expected to fall on average. Our results highlight the important role of heterogeneous beliefs in explaining households’ financial decisions.
Published in Journal of Urban Economics, 118, 103264, 2020
We use anonymized and aggregated data from Facebook to explore the spatial structure of social networks in the New York metro area. We find that a substantial share of urban residents’ connections are to individuals who are located nearby. We also highlight the importance of transportation infrastructure in shaping urban social networks by showing that social connectedness declines faster in travel time and travel cost than it does in geographic distance. We find that areas that are more socially connected with each other have stronger commuting flows, even after controlling for geographic distance and ease of travel. We also document significant heterogeneity in the geographic breadth of social networks across New York zip codes, and show that this heterogeneity correlates with access to public transit. Zip codes with geographically broader social networks also have higher incomes, higher education levels, and more high-quality entrepreneurial activity. We also explore the social connections between New York zip codes and foreign countries, and highlight how these are related to past migration movements.
Published in Social Informatics (SocInfo 2020), 2020
We use aggregated data from Facebook to study the structure of social networks across European regions. Social connectedness declines strongly in geographic distance and at country borders. Historical borders and unions — such as the Austro-Hungarian Empire, Czechoslovakia, and East/West Germany — shape present-day social connectedness over and above today’s political boundaries and other controls. All else equal, social connectedness is stronger between regions with residents of similar ages and education levels, as well as between regions that share a language and religion. In contrast, region-pairs with dissimilar incomes tend to be more connected, likely due to increased migration from poorer to richer regions.
Published in Working paper (arXiv:2101.04737), 2021
Due to their essential role as places for socialization, “third places” - social places where people casually visit and communicate with friends and neighbors - have been studied by a wide range of fields including network science, sociology, geography, urban planning, and regional studies. However, the lack of a large-scale census on third places kept researchers from systematic investigations. Here we provide a systematic nationwide investigation of third places and their social networks, by using Facebook pages. Our analysis reveals a large degree of geographic heterogeneity in the distribution of the types of third places, which is highly correlated with baseline demographics and county characteristics. Certain types of pages like “Places of Worship” demonstrate a large degree of clustering suggesting community preference or potential complementarities to concentration. We also found that the social networks of different types of social place differ in important ways: The social networks of ‘Restaurants’ and ‘Indoor Recreation’ pages are more likely to be tight-knit communities of pre-existing friendships whereas ‘Places of Worship’ and ‘Community Amenities’ page categories are more likely to bridge new friendship ties. We believe that this study can serve as an important milestone for future studies on the systematic comparative study of social spaces and their social relationships.
Published in Journal of International Economics, 129, 103418, 2021
We use de-identified data from Facebook to construct a new and publicly available measure of the pairwise social connectedness between 170 countries and 332 European regions. We find that two countries trade more when they are more socially connected, especially for goods where information frictions may be large. The social connections that predict trade in specific products are those between the regions where the product is produced in the exporting country and the regions where it is used in the importing country. Once we control for social connectedness, the estimated effects of geographic distance and country borders on trade decline substantially.
Published in American Economic Journal: Applied Economics, 14(3), 488–526, 2022
We use de-identified data from Facebook to study the nature of peer effects in the market for cell phones. To identify peer effects, we exploit variation in friends’ new phone acquisitions resulting from random phone losses. A new phone purchase by a friend has a large and persistent effect on an individual’s own demand for phones of the same brand. While peer effects increase the overall demand for phones, a friend’s purchase of a particular phone brand can reduce an individual’s own demand for phones from competing brands, in particular if they are running on a different operating system.
Published in Nature, 608(7921), 108–121, 2022
Social capital—the strength of an individual’s social network and community—has been identified as a potential determinant of outcomes ranging from education to health. However, efforts to understand what types of social capital matter for these outcomes have been hindered by a lack of social network data. Here, in the first of a pair of papers, we use data on 21 billion friendships from Facebook to study social capital. We measure and analyse three types of social capital by ZIP (postal) code in the United States: (1) connectedness between different types of people, such as those with low versus high socioeconomic status (SES); (2) social cohesion, such as the extent of cliques in friendship networks; and (3) civic engagement, such as rates of volunteering. These measures vary substantially across areas, but are not highly correlated with each other. We demonstrate the importance of distinguishing these forms of social capital by analysing their associations with economic mobility across areas. The share of high-SES friends among individuals with low SES—which we term economic connectedness—is among the strongest predictors of upward income mobility identified to date. Other social capital measures are not strongly associated with economic mobility. If children with low-SES parents were to grow up in counties with economic connectedness comparable to that of the average child with high-SES parents, their incomes in adulthood would increase by 20% on average. Differences in economic connectedness can explain well-known relationships between upward income mobility and racial segregation, poverty rates, and inequality. To support further research and policy interventions, we publicly release privacy-protected statistics on social capital by ZIP code at https://www.socialcapital.org.
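The notion of economic connectedness, the share of high-SES friends among individuals with low SES, can be made concrete with a toy computation. This sketch averages each low-SES person's share of high-SES friends with equal weight. The paper's exact weighting and normalization are not given in the abstract, so treat those choices, the function name, and the toy network as assumptions.

```python
def economic_connectedness(friends: dict, ses: dict) -> float:
    """Mean share of high-SES friends among low-SES individuals.

    friends: person -> list of that person's friends
    ses:     person -> "low" or "high"
    People with no friends are skipped (an assumption of this sketch).
    """
    shares = []
    for person, friend_list in friends.items():
        if ses[person] != "low" or not friend_list:
            continue
        high_friends = sum(1 for friend in friend_list if ses[friend] == "high")
        shares.append(high_friends / len(friend_list))
    if not shares:
        raise ValueError("no low-SES individuals with at least one friend")
    return sum(shares) / len(shares)


# Hypothetical four-person network.
ses = {"ana": "low", "ben": "high", "cal": "low", "dee": "high"}
friends = {
    "ana": ["ben", "cal"],          # 1 of 2 friends is high-SES
    "ben": ["ana", "cal", "dee"],
    "cal": ["ana", "ben", "dee"],   # 2 of 3 friends are high-SES
    "dee": ["ben", "cal"],
}
print(economic_connectedness(friends, ses))
```

Computing the same quantity separately for each area would give the kind of area-level measure that the abstract relates to upward income mobility.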
Published in Nature, 608(7921), 122–134, 2022
Low levels of social interaction across class lines have generated widespread concern and are associated with worse outcomes, such as lower rates of upward income mobility. Here we analyse the determinants of cross-class interaction using data from Facebook, building on the analysis in our companion paper. We show that about half of the social disconnection across socioeconomic lines—measured as the difference in the share of high-socioeconomic status (SES) friends between people with low and high SES—is explained by differences in exposure to people with high SES in groups such as schools and religious organizations. The other half is explained by friending bias—the tendency for people with low SES to befriend people with high SES at lower rates even conditional on exposure.
Published in Working paper, 2023
Despite potentially large economic returns, rates of internal migration remain low in many developing countries. This paper uses new, de-identified data from Facebook to quantify the role of social networks in explaining this development puzzle. We study this question in India, a country that exhibits substantial wage dispersion across regions but remains relatively under-urbanized. Detailed records of nearly 20 million individuals on the evolution of social connections and residential choice reveal that networks and migration are strongly linked. Across several identification strategies, a model of migration suggests that social networks account for roughly 20% of the relationship between migration and distance. We develop a simple, static model of spatial equilibrium, which suggests that equalizing social connections across locations increases average wages by 3% (24% for the bottom wage-quartile) through increased migration. This impact is larger than fully removing the marginal effect of distance in migration decisions, akin to building rapid transport infrastructure. Taken together, our data suggest that - by reducing migration frictions - increasing social connections across space may have considerable economic gains. We provide suggestive evidence for economic and emotional support mechanisms underlying network effects and show that college attendance can boost the size and diversity of social networks by 20%.
Published in Proceedings of the National Academy of Sciences (PNAS), 120(28), e2211062120, 2023
Social networks shape and reflect economic life. Prior studies have identified long ties, which connect people who lack mutual contacts, as a correlate of individuals’ success within firms and places’ economic prosperity. However, we lack population-scale evidence of the individual-level link between long ties and economic prosperity, and why some people have more long ties remains obscure. Here, using a social network constructed from interactions on Facebook, we establish a robust association between long ties and economic outcomes and study disruptive life events hypothesized to cause formation of long ties. Consistent with prior aggregated results, administrative units with a higher fraction of long ties tend to have higher income and economic mobility. Individuals with more long ties live in higher-income places and have higher values of proxies for economic prosperity (e.g., using more Internet-connected devices and making more donations). Furthermore, having stronger long ties (i.e., with higher intensity of interaction) is associated with better outcomes, consistent with an advantage from the structural diversity constituted by long ties, rather than them being weak ties per se. We then study the role of disruptive life events in the formation of long ties. Individuals who have migrated between US states, have transferred between high schools, or have attended college out-of-state have a higher fraction of long ties among their contacts many years after the event. Overall, these results suggest that long ties are robustly associated with economic prosperity and highlight roles for important life experiences in developing and maintaining long ties.
Published in Working paper, 2024
We develop a framework for group-based intervention targeting when learning spillovers are mediated by networks and apply it to place-based entrepreneurship policy using a new dataset linking U.S. entrepreneurs on Facebook to the firms they own. Motivated by evidence of disparities across aspiring entrepreneurs’ networks, the model combines network position with place-specific spillovers to predict local treatment effects. We estimate spillovers using a new quasi-experimental design that exploits the timing of friends’ migration. The estimated model implies that policies aimed at reducing geographic deficits in firm creation face an equity-efficiency trade-off due to large positive spillovers in high-entrepreneurship places.
Published in Journal of Political Economy Microeconomics, 2(3), 463–494, 2024
We use de-identified data from Facebook to study how social connections affect beliefs and behaviors in high-stakes settings. During the Covid-19 pandemic, individuals with friends in areas currently experiencing worse disease outbreaks reduced their mobility substantially more than their otherwise similar neighbors with friends in less affected areas. To explore the mechanisms through which social connections shape behaviors, we show that individuals with higher friend exposure to Covid-19 are more likely to publicly post in support of social distancing measures and less likely to be members of groups seeking to ‘reopen’ the economy. These findings suggest that friends influence individuals’ behaviors in part through their beliefs, even in the presence of ubiquitous information from expert sources.
Published in Working paper, 2025
Social capital is widely believed to impact a wide range of outcomes including subjective well-being, social mobility, and community health. We aggregate data on over 20 million Facebook users in the United Kingdom to construct several measures of social capital including cross-type connectedness, social network clustering, and civic engagement and volunteering. We find that social networks in the UK bridge class divides, with people below the median of the socioeconomic status distribution (low-SES people) having about half (47%) of their friendships with people above the median (high-SES people). Despite the presence of these cross-cutting friendships, we find evidence of homophily by class: high-SES people have a 28% higher share of high-SES friends. In part, this gap is due to the fact that high-SES individuals live in neighbourhoods, attend schools, and participate in groups that are wealthier on average. However, up to two thirds of the gap is due to the fact that high-SES people are more likely to befriend other high-SES peers, even within a given setting. Cross-class connections vary by region but are positively associated with upward income mobility: low-SES children who grew up in the top 10% most economically connected local authorities in England earn 38% more per year on average (£5,100) as adults relative to low-SES children in the bottom 10% local authorities. The relationship between upward mobility and connectedness is robust to controlling for other measures of social connection and neighbourhood measures of income, education, and health. We also connect measures of subjective well-being and related concepts with individual social capital measures. We find that individuals with more connections to high-SES people and more tightly-knit social networks report higher levels of happiness, trust, and lower feelings of isolation and social disconnection. 
We make our aggregated social capital metrics publicly available on the Humanitarian Data Exchange to support future research.
Published in Proceedings of the National Academy of Sciences (PNAS), 122(18), e2409418122, 2025
Existing estimates of human migration are limited in their scope, reliability, and timeliness, prompting the United Nations and the Global Compact on Migration to call for improved data collection. Using privacy protected records from three billion Facebook users, we estimate country-to-country migration flows at monthly granularity for 181 countries, accounting for selection into Facebook usage. Our estimates closely match high-quality measures of migration where available but can be produced nearly worldwide and with less delay than alternative methods. We estimate that 39.1 million people migrated internationally in 2022 (0.63% of the population of the countries in our sample). Migration flows significantly changed during the COVID-19 pandemic, decreasing by 64% before rebounding in 2022 to a pace 24% above the pre-crisis rate. We also find that migration from Ukraine increased tenfold in the wake of the Russian invasion. To support research and policy interventions, we release these estimates publicly through the Humanitarian Data Exchange.
Published in AEA Papers and Proceedings, 115, 132–138, 2025
We introduce, analyze, and describe subnational data on cross-gender friendships for nearly 200 countries and territories, using data from 1.38 trillion ties between 1.8 billion Facebook users. Homophily by gender exists nearly everywhere, with individuals’ strongest ties exhibiting less homophily than their peripheral connections. Across countries, cross-gender friendship rates align with existing measures of gender disparities. Within countries, cross-gender friending rates correlate with support for gender equality. In the US, cross-gender friendships are rarer in areas with a larger White share of the population, higher incomes, and more per-capita religious congregations. We share our data at the Humanitarian Data Exchange.
Forthcoming in Journal of Political Economy, 134(4), 1159–1209, 2026
We use de-identified friendship data from Facebook to study the social integration of Syrian migrants in Germany. Our analysis establishes five key findings: (1) Places differ substantially in their propensities to socially integrate migrants. This regional variation in integration outcomes largely reflects causal place-based effects. (2) Spatial variation in migrants’ social integration can be decomposed into the rate at which Germans befriend their neighbors in general and the particular rate at which they befriend migrants versus other Germans. We follow the friending behavior of Germans that move across locations to show that both forces are more affected by local institutions and policies than by persistent individual characteristics or preferences of local natives. (3) Integration courses causally affect place-specific equilibrium integration levels by increasing the rate at which Germans befriend Syrian migrants. (4) Social integration helps migrants obtain help from natives across a range of settings such as finding jobs and housing. (5) Natives quasi-randomly exposed to a migrant in high school are more likely to befriend other migrants later in life.