Scale, context, and heterogeneity: the complexity of the social space | Scientific Reports – Nature.com

Posted: June 3, 2022 at 12:38 pm

Results are organized in two sub-sections: (4.1) Community detection from the Twitter dataset, and (4.2) Spatial patterns based on income composition data extracted from the US Census Bureau dataset.

Global communication patterns can be inferred from Twitter. Figure 1 shows Twitter activity on a global scale. Nodes correspond to the physical locations from which tweets were sent. These nodes are displayed in a heatmap where yellow to red regions concentrate more activity. The map closely resembles a global population map, in which the most densely populated regions are highlighted. However, some particularities are evident. China, India, and other countries in southeastern Asia show much lower node densities than their actual demographic weight would suggest. One potential explanation is that Internet access is restricted for people with limited resources; another is that these countries have their own national microblogging services and social networks. The African continent is also clearly underrepresented relative to its actual demographic weight.

Global heatmap showing tweet locations. Main hotspots are shown in red. This figure was generated using ArcGIS Desktop and Mapbox.

The interconnectivity between nodes is estimated from mentions and retweets and is represented by flows (Fig. 2a). The spatial networks show different configurations depending on the regions where tweets originate and where they are directed. A very dense global network emerges between the United States, Europe, and a few other hotspots. Additionally, we observe other densely connected local networks in smaller regions such as Japan, East Asia, India, South Africa, and some particular regions in Latin America. Zooming into the United States mainland (Fig. 2b), the interconnectivity between both coasts and the entire eastern half of the country becomes more evident.

Flows of interconnectivity between users via mentions or retweets: (a) global scale, and (b) the United States mainland. To make the display clearer, we apply a thinning algorithm for reducing line densities. We reduce the total number of links by 90% in (a) and 95% in (b). This figure was generated using ArcGIS Desktop and Mapbox.

These maps reflect the spatial heterogeneity of global human dynamics. The traditional dominance of some western countries is also reflected in Twitter. Nodes summarize regions where wealth is concentrated. Thus, the topology of the global network is a proxy for the dominant mobility flows traced by migrations, commercial relationships, and preferred trade routes across the globe. This interconnectivity also reflects cultural dominance: most western countries share the same entertainment industry and knowledge of a common language. Local networks represent regions with enough cultural, geopolitical, or economic affinity to dominate communication within their areas of influence.

Collective identity is the common structure of beliefs, values, symbols, and behaviors that results from our association in communities. Axelrod35 argues that collective behavior is mostly determined by an evolving and complex process of human interactions and information accumulation over time. We learn by imitation and therefore we are prone to become similar to those we are exposed to and frequently interact with. Initial differences between communities' behaviors are reinforced over time, which leads to their eventual divergence and the emergence of multiple cultures.

Fragmentation and clustering of the social space make it possible to detect communities in which people preferentially interact with each other, and which help define the predominant trade routes. Hedayatifar et al.32 found a significant correlation between the level of communication and the topology shown by international trade networks. Geographical distances and neighborhood relationships are two relevant factors36, but not the only ones. Historical past, geopolitical relationships, and cultural influence between countries are equally important for understanding the map of global interactions on Twitter.

For community detection, we map the Twitter dataset onto a lattice composed of a regular grid with 100 km wide cells. We then run the Louvain algorithm to partition the whole network into regions and to identify the clusters with the highest interconnectivity. On a global scale, we identify 14 major communities (Fig. 3a) and 86 minor communities or sub-communities that result from the subsequent fragmentation of the major ones (Fig. 3b). Clusters imply specific cross-cultural, cross-national, and/or cross-linguistic associations. In the Americas, three large communities are differentiated by language, but once we run the algorithm in successive iterations, minor communities emerge in Latin America. Notably, throughout the fragmentation process some clusters are equivalent to nations (Brazil, India, and several European countries), whereas the United States is internally partitioned into different sub-communities, showing a rich cultural diversity.
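The lattice-mapping step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the sinusoidal projection used to obtain kilometre coordinates, the function names `cell_of` and `cell_network`, and the input format (pairs of sender and receiver coordinates) are all assumptions made for the example.

```python
import math
from collections import Counter

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed spherical Earth)
CELL_KM = 100.0           # cell width used at the global scale in the text


def cell_of(lat, lon, cell_km=CELL_KM):
    """Return the (row, col) index of the grid cell containing a point.

    Assumption: coordinates are projected with a sinusoidal projection,
    so that cells have a roughly constant width in kilometres.
    """
    lat_r, lon_r = math.radians(lat), math.radians(lon)
    x = EARTH_RADIUS_KM * lon_r * math.cos(lat_r)
    y = EARTH_RADIUS_KM * lat_r
    return (math.floor(y / cell_km), math.floor(x / cell_km))


def cell_network(interactions, cell_km=CELL_KM):
    """Aggregate point-to-point interactions into weighted cell-to-cell links.

    `interactions` is an iterable of ((lat, lon), (lat, lon)) pairs, one per
    mention or retweet. Links are treated as undirected, and interactions
    inside a single cell are kept as self-loops.
    """
    weights = Counter()
    for (lat1, lon1), (lat2, lon2) in interactions:
        a = cell_of(lat1, lon1, cell_km)
        b = cell_of(lat2, lon2, cell_km)
        weights[tuple(sorted((a, b)))] += 1
    return weights


if __name__ == "__main__":
    sample = [
        ((0.1, 0.1), (1.0, 0.1)),  # two different cells
        ((1.0, 0.1), (0.1, 0.1)),  # same link, opposite direction
        ((0.1, 0.1), (0.2, 0.2)),  # both ends in the same cell
    ]
    for link, weight in cell_network(sample).items():
        print(link, weight)
```

The resulting weighted links could then be passed to any Louvain implementation (for example, `louvain_communities` in NetworkX) to obtain the partition. Changing `cell_km` is enough to repeat the mapping at a finer scale.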

Node clustering and detection of communities/sub-communities at a global scale from the Twitter dataset. We detect (a) 14 major communities and (b) 86 sub-communities. Consistent partitions were obtained in over 85% of realizations. To make the display clearer, we apply a thinning algorithm for reducing line densities. We reduce the total number of links by 90%. This figure was generated using ArcGIS Desktop and Mapbox.

Further zooming into the dataset and running the algorithm for a particular region allows us to refine the results. In Fig. 4, we show communities and sub-communities across the mainland United States. In this case, we overlay a grid of 10 km wide cells (see Hedayatifar et al.33). As is evident, the internal clustering differs according to the scale, showing relevant changes in both the number and size of the partitions. The reduced number of communities detected corresponds mostly to vast regions surrounding the most populated cities in the US central states (Fig. 4a). Most communities are far more extensive than individual states, as expected given that the number of communities is lower than the number of states. This effect is particularly clear in the integration of North and South Carolina into a single cluster, and also in New England. On the other hand, the state of California is internally partitioned into two different communities due to the influence of San Francisco in the north and Los Angeles in the south. At this scale, the number of sub-communities increases substantially, up to 216, as shown in Fig. 4b. Again, some states show clear homogeneity with a single dominant cluster (Maine, Montana, and Wyoming), whereas the great majority contain a clear diversity of sub-communities.

Node clustering and detection of communities/sub-communities from the Twitter dataset in the United States mainland. We detect (a) 39 major communities and (b) over 216 sub-communities. Consistent partitions were obtained in over 85% of realizations. To make the display clearer, we apply a thinning algorithm for reducing line densities. We reduce the total number of links by 40%. This figure was generated using ArcGIS Desktop and Mapbox.

The Louvain algorithm for community detection dynamically fragments the territory, showing its spatial heterogeneity across different scales. Thus, to properly understand this complex reality, we must first understand the spatial context where the algorithm is applied. For instance, the human interactions captured by the Twitter dataset transcend traditional administrative boundaries. Zooming into multiple scales allows us to understand such interactions much better, along with their effect on markets, commercial agreements, and business opportunities, and may even help to avoid conflicts. The scalable structure of communities was recently used for implementing adaptive responses to COVID-19 restrictions in the United States. Buchel et al.37 proposed considering multiscale social bubbles for lifting shelter-at-home and mobility restrictions. Dwellers created social bubbles to minimize infection rates locally, while the different US states proposed travel zones to minimize transmissibility between remote areas. The analysis of mobility patterns has helped define the limits of human interactions and assess the effects of the policies adopted by authorities, providing valuable information to policy-makers for adopting more effective travel restrictions, as well as quarantine policies that minimize the disruption of socio-economic activities.

Some of the most influential factors behind the complexity of the social space are related to household income. From a social and behavioral perspective, income determines our lifestyle and perception of the world. Eagle et al.38 demonstrated that wealthy people travel more frequently and to more places. In some cities, commuting farther also pays off in the form of better jobs while keeping better housing conditions39,40. Other studies analyze the correlation between social diversity and economic prosperity. Yong41 showed how the wealthiest regions develop much more complex and heterogeneous social networks in which labor opportunities can emerge more easily.

Depending on the spatial scale and level of data aggregation, income composition makes it possible to differentiate between an urban world, increasingly dynamic and wealthy, and a rural world in crisis. However, at local scales, we can also observe that some well-known urban regions are relatively poor, and some rural regions are relatively wealthy. The spatial scale is therefore essential for properly understanding the complexity behind income-related human dynamics.

In the last few decades, the ideological and political division between rural and urban regions has escalated in the United States and other western countries42,43. Many policy experts attribute the spread of reactionary movements against globalization to the increasing confrontation between rural and urban voters. Just a few years ago, Brexit and the Trump victory in the 2016 US Presidential election were the most notable examples of these reactionary movements. The traditional division between American voters shows an evident spatial pattern that is frequently mentioned in the media: while the Democratic Party concentrates most of its votes in the urban regions on the two coasts, the Republican Party receives the most votes in the central states. However, this spatial pattern is more complex than a simple division between rural and urban America, especially in the face of a very polarized electoral scenario44,45.

At the time of the 2016 US Presidential election, rural people accounted for only about 15% of the national population. Although rural voters preferred Trump and certainly contributed to the Republican victory, they were not enough to swing the election result on their own, nor to support the media rhetoric of a rural revolt46. Instead, Trump combined rural and small-city over-performance in the industrial Midwest. In other words, Trump voters were not so rural. In fact, the majority of Trump voters came from suburban areas where dwellers commute to work in a medium or large city. However, this spatial pattern diverges depending on the context and other additional factors. For example, the Latino vote in Florida differs from that in other US states. This pattern is largely explained by the importance of Latin American voters, some of whom are residents with medium-high incomes living in the most important cities. Politically, Blue America is not so blue, nor is Red America so red.

Figure 5 shows the income composition in the United States, considering the influence of the mesh size used for data aggregation. Each individual node corresponds to a census tract, whose area is roughly equivalent to a neighborhood with 2500–8000 people. Data aggregation is conducted by applying a circular buffer whose radius ranges from 2 to 1000 km. Each mesh size takes into account all the nodes within the buffer, so the aggregated value reflects the combined effect of neighboring nodes. The larger the buffer size, the higher the computational cost.
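As a rough illustration of the buffer-based aggregation, the sketch below averages income over all nodes within a given radius of each node and expresses the result relative to the national average. It is not the authors' implementation: the unweighted mean, the great-circle (haversine) distance, and the names `haversine_km`, `buffered_income`, and `relative_to_national` are assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed spherical Earth)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def buffered_income(nodes, radius_km):
    """Mean income of all nodes within `radius_km` of each node.

    `nodes` is a list of (lat, lon, income) tuples, one per census tract.
    Assumption: a plain unweighted mean; the source does not specify
    whether tracts are weighted (e.g. by population).
    """
    result = []
    for lat_i, lon_i, _ in nodes:
        inside = [
            income
            for lat_j, lon_j, income in nodes
            if haversine_km(lat_i, lon_i, lat_j, lon_j) <= radius_km
        ]
        result.append(sum(inside) / len(inside))  # a node always contains itself
    return result


def relative_to_national(nodes, radius_km):
    """Buffered income minus the national average (positive means wealthier)."""
    national = sum(income for _, _, income in nodes) / len(nodes)
    return [value - national for value in buffered_income(nodes, radius_km)]


if __name__ == "__main__":
    tracts = [
        (40.00, -74.0, 100.0),  # wealthy tract
        (40.01, -74.0, 20.0),   # poor tract about 1 km away
        (35.00, -100.0, 30.0),  # distant tract
    ]
    for radius in (2, 5000):
        print(radius, relative_to_national(tracts, radius))
```

With a small radius the wealthy and poor neighbors are averaged together while the distant tract stands alone; with a very large radius every node converges to the national average. This brute-force version compares every pair of nodes, so a spatial index would be needed for the full set of census tracts.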

Income composition in the United States mainland by considering different aggregation levels, from 2 to 1000 km. Regions below national average income are shown in blue, while regions above national average income are shown in red. White-colored regions that emerge as gaps show regions with similar values to the national average. This figure was generated in Python using the library Cartopy.

Different income compositions emerge according to the level of data aggregation. With smaller buffers, a very granular pattern shows high entropy and spatial diversity. As buffer size increases, entire cities emerge as wealthy areas in contrast to poorer and extensive rural areas, showing an evident polarization of urban versus rural regions. Significant differences in wealth between cities emerge at an aggregation distance of 100 km. Larger distances draw the eastern and western sectors as the only wealthy regions, whereas the central United States is shown as a large economically deprived region.

On the other hand, zooming into New York City (Fig. 6) gives a much better understanding of the income composition across intra-urban scales. At short distances, neighborhoods in blue have a low average income. However, these fade with buffers larger than 20 km, showing the whole city as a wealthy region.

Income composition in New York City by considering different aggregation levels, from 2 to 1000 km. Regions below national average income are shown in blue, while regions above national average income are shown in red. White-colored regions that emerge as gaps show regions with similar values to the national average. This figure was generated in Python using the library Cartopy.

A similar approach is used to analyze income composition across the United States over time. Figure 7 shows the evolution from 1969 to 2017, considering six individual years and two aggregation levels: 100 and 1000 km. The methodology is the same as before, but instead of census tracts, we estimate spatial patterns using counties due to data limitations. This exercise enables us to validate the previous results obtained with the census tracts, and also to substantially reduce computational costs due to the lower number of nodes.

Income composition in the United States mainland over time by considering different aggregation levels: (a) 100 km and (b) 1000 km. Six years are represented: 1969, 1979, 1989, 2000, 2010, and 2017. Regions below national average income are shown in blue, while regions above national average income are shown in red. White-colored regions that emerge as gaps show regions with similar values to the national average. This figure was generated in Python using the library Cartopy.

At an aggregation distance of 100 km, we can observe high spatial diversity between rich and poor regions. Wealth is concentrated mostly in the metropolitan areas, showing the division between rural and urban regions. Just a few cities concentrate most of the national wealth47. In general, poverty and wealth present a consistent structure over time. The poorest regions, located in the southeastern sector, remain poor, whereas the wealthiest regions, located in the northeast coast corridor and along the California coastline, remain wealthy. However, the boundary between poverty and wealth has been shifting. In particular, certain regions in the central states have fluctuated between wealth and poverty. Additionally, some cities have collapsed at some point, leading to an impoverishment of the surrounding regions due to their high dependence on those cities. In network science terms, this illustrates the high risk of collapse in hyper-connected systems driven by cascading effects. This is particularly significant in the Detroit region, which was wealthy in the past but has become increasingly poor in recent years.

At larger aggregation distances, the income composition is enormously simplified, showing 2–3 vast regions whose borders have shifted over time. Wealth is mostly concentrated along the East and West coasts, whereas the central region is mostly distressed. In the most recent decades, industrial relocation processes and the strong attraction of the most populated cities explain the decline of vast inland regions.
