Building social cohesion between Christians and Muslims through soccer in post-ISIS Iraq

Salma Mousa

On 10 June 2014, the Islamic State of Iraq and Syria (ISIS) captured the Iraqi city of Mosul. ISIS’s offensive culminated in a genocide against Yezidis, Christians, Shi’a, and other minorities, displacing ~100,000 Christians to Iraqi Kurdistan overnight (1). Many Christians believe that their Muslim neighbors were complicit in these raids. These suspicions have discouraged Christians from returning to liberated areas, fueled support for self-defense militias, and heightened the potential for reprisal killings and future conflict (2). At the same time, Muslim communities from neighboring villages have been migrating into Christian enclaves, leading Iraq’s Christians to fear the dilution of their culture and identity (3). Christian-Muslim relations in northern Iraq continue to be marked by mutual distrust and social segregation.

How can social cohesion between groups be rebuilt after war? Intergroup social cohesion, patterns of cooperation among individuals from different social groups who live and work in close proximity (4), is considered key for good governance (5) and economic development (6). However, countries recovering from war often backslide into violence and instability despite heavy international investment in state-building and peacekeeping (7). Sustainable peace requires a combination of policy interventions, such as power-sharing arrangements, and grassroots initiatives that aim to improve interactions between individuals (8). Meaningful intergroup contact represents one such grassroots approach.

Here, I provide causal evidence on whether meaningful contact between groups can build social cohesion after war. Using a field experiment among Iraqis displaced by ISIS, I randomly assigned amateur Christian soccer players to an all-Christian team or to a team mixed with three Muslims for a 2-month league. The leagues largely met the conditions considered key for activating successful intergroup contact: Teammates had to cooperate to achieve their shared goal, players were subject to the equalizing effect of team sports, and local Christian leaders and organizations endorsed the leagues. This study thus serves as a proof of concept that near-optimal contact can build tolerant behaviors after violent conflict—at least toward those encountered in the intervention. The positive effects of contact among Christian study participants did not, however, generalize to Muslim strangers, highlighting a potentially important limitation of contact after war.

The “contact hypothesis” proposes that interpersonal contact across group lines can reduce prejudice if it is cooperative, places participants on equal footing, is endorsed by communal authorities, and is characterized by a common goal (9). Causal evidence shows that such contact reduces prejudice in several nonconflict settings by highlighting commonalities, forging friendships, lowering intergroup anxiety, and inducing empathy (10–12). On the other hand, wordless physical exposure has been found to exacerbate prejudice (13–16), and competitive contact has a similarly negative effect (17). These findings suggest that meaningful (positive and cooperative) contact might hold the potential to rebuild tolerance, at least in times of peace.

Should we expect contact to be similarly effective in conflict settings? Only a handful of contact studies involve groups in conflict (18, 19), in part because contact is more likely to be negative in these settings, which disproportionately shapes prejudice (20). The evidence we do have indicates that studies of ethnic prejudice generate “substantially weaker effects” relative to interventions aimed at reducing prejudice toward other stigmatized groups such as the elderly or the disabled, suggesting that the cleavages common to war are particularly rigid. Relatedly, ethnic violence solidifies group identities, ethnic prejudices, and anxieties around being physically proximate to the outgroup, further tempering expectations around the impacts of contact after war (21–27).

Methodological constraints also limit our knowledge of intergroup contact. Contact is most effective if its effects can be generalized to an entire outgroup rather than just to individuals encountered in an intervention (28). However, most contact studies determine the generalization of contact effects using self-reported attitudes measured immediately after the intervention (29). Policy-makers have subsequently questioned whether contact can change actual behaviors toward the outgroup in lasting ways (19). In response to this concern, I tested the generalization hypothesis using real-world behaviors.

Despite the differences between Christians and Muslims in northern Iraq, amateur soccer is popular among both groups. Scholars and policy-makers consider cross-cutting civic associations such as amateur sports clubs to be engines for social capital (30–33). Intergroup sports in particular exemplify the “positive, energetic, community events…centered on nonpolitical issues” (34) that facilitate the “sustained, meaningful interaction with members of different groups” recommended by policy-makers to integrate communities affected by ISIS (35).

Leveraging the social potential of team sports, the experiment comprised four soccer leagues spread across two waves and study sites (table S2 and fig. S5). Research staff invited Christian teams in two northern Iraqi cities to participate. Forty-two of the ~45 teams in the area were recruited on a first come, first served basis, resulting in a sample size that varied between 183 and 459 Christian players depending on the outcome (see the supplementary materials and methods). Captains were told that a local Christian community organization was working with a United States–based university to offer a soccer league for displaced people and to research their experiences. Participants were told that community-building was one of the leagues’ aims and, as such, each team would be allocated an additional three players who may or may not be Christian in an effort to include diverse groups. Treated teams received additional Muslim players drawn from local Muslim teams, whereas control teams received fellow Christians. Christian and Muslim added players were indistinguishable in baseline skill (table S4), and league guidelines ensured that they played roughly the same number of minutes per game (see the supplementary text). A total of 91.8% of contacted participants were retained until the end of the study, whereas the remainder dropped out before treatment assignments were made or because of injuries sustained during games. Because Muslims were only present in the treated group by design, they were excluded from the main analysis, although I measured changes in their attitudes over time.

Christians comprised the majority group in the study sites but remain an at-risk minority in Iraq, which has important implications for the dynamics of intergroup contact. To mitigate the power differentials between Muslims and Christians, and to increase the comfort of Christian participants, discussions between the research team and coaches concluded that Muslim players should remain a numeric minority (~25%) of each squad. This ethnic composition likely lowered perceived intergroup threat while preserving strong team identities (36). Further tempering intergroup anxieties, all Muslim participants had been displaced by ISIS regardless of denomination (45% Sunni Arab and 55% Shi’ite Shabak). This team composition likely biased the intervention in favor of a positive contact experience.

Limiting the study to internally displaced people also ensured that Christians would not come into contact with possible ISIS collaborators. It is important to note that Christians are marginalized (or persecuted outright) in many parts of the Arab world, where prejudice among some segments of the Muslim majority poses the larger challenge. The research team’s access to the Christian community, however, and the anticipated benefits of building social cohesion between displaced communities motivated the focus on Christian regard toward Muslims in this study. Other ethical considerations, and the steps taken to address them, are detailed in the supplementary text.

Participants were offered a baseline and end-line survey described as helping researchers understand the attitudes and experiences of displaced Iraqis. I used a baseline survey item capturing perceived commonalities with Muslims to conduct a block randomization. The item asked respondents to rate how much they had in common with Sunni Arabs on a four-point scale with no neutral option. I ranked each team based on their average response to this item and randomized within closely ranked pairs. Figure S1 and table S5 show the balance on baseline demographics and attitudes between the treatment and control groups.
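The pairing step described above can be sketched as follows. This is a minimal illustration, not the study's actual code: the function name `pair_randomize`, the toy team data, and the coin flip for an unpaired team are all assumptions made for the example.

```python
import random
from statistics import mean


def pair_randomize(team_scores, seed=0):
    """Assign teams to arms within closely ranked pairs.

    team_scores: dict mapping team id -> list of player responses to the
    baseline "commonalities" item (four-point scale).
    Returns a dict mapping team id -> (block index, "treated" or "control").
    """
    rng = random.Random(seed)
    # Rank teams by their average baseline response.
    ranked = sorted(team_scores, key=lambda t: mean(team_scores[t]))
    arms = ["treated", "control"]
    assignment = {}
    # Walk down the ranking two teams at a time; each pair is one block.
    for block, start in enumerate(range(0, len(ranked), 2)):
        pair = ranked[start:start + 2]
        if len(pair) == 1:
            # Assumption: an unpaired team is assigned by a fair coin flip.
            assignment[pair[0]] = (block, rng.choice(arms))
            continue
        rng.shuffle(pair)  # random order decides who gets which arm
        for team, arm in zip(pair, arms):
            assignment[team] = (block, arm)
    return assignment


# Hypothetical baseline responses for six teams.
teams = {
    "A": [1, 2, 2], "B": [3, 3, 4], "C": [2, 2, 1],
    "D": [4, 3, 3], "E": [1, 1, 2], "F": [2, 3, 3],
}
result = pair_randomize(teams, seed=42)
```

Because each block contains one treated and one control team with similar baseline attitudes, the block index can later enter the analysis as a control.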

In addition to these experimental leagues, I also created a comparison league without any Muslim players to explore the effects of no intergroup exposure at all. Assignment to this league was nonrandom, however, and these groups were not eligible to receive Muslim players. I used data from the comparison group for exploratory analyses.

This study investigated intergroup social cohesion as opposed to intragroup cohesion, which can be strengthened by war (4, 37). I focused on two core components of social cohesion: interactions with outgroup peers (what I label “on-the-field outcomes”) and interactions with outgroup strangers (“off-the-field outcomes”) (38, 39).

On-the-field outcomes captured tolerant behaviors toward teammates or league-mates. First, at the end of the season, participants voted for an added player to receive a “best newcomer” award based on sportsmanship. This player could not be on the respondent’s team. A positive treatment effect on this outcome signifies reduced ingroup bias (17). Second, the end-line survey asked players if they agreed to register for a mixed team next season. Third, the research staff contacted players 6 months after the league’s end to record whether they regularly trained with Muslims.

Off-the-field outcomes focused on behaviors toward Muslims outside of the intervention, capturing the intervention’s ability to overcome social segregation and build broader social cohesion. First, all players were invited to attend a neighborhood social event consisting of traditional dancing and dinner that took place up to 4 months after the intervention ended. Players were encouraged to bring their families and friends, meaning that Christians were confronted with the possibility of socializing not only with Muslim players, but also with Muslim players’ family members and friends. The outcome of interest was whether a player attended and, conditional on attending, whether he brought his female family members. Second, I instituted a voucher system to track whether treated players were more likely to patronize businesses in Muslim neighborhoods. All players received an $8 (USD) voucher for a restaurant in Muslim-majority Mosul, a 40-min car ride away. Each voucher was stamped with the player’s individual identification number, was valid for 4 months after the intervention ended, and was stored by the restaurant manager when redeemed. Third, I recorded whether participants chose to donate their $1 survey compensation to a Christian organization (most commonly, one’s local church) or to a neutral nongovernmental organization that benefits both Muslims and Christians (e.g., a cancer ward or orphanage).

I also recorded a set of outcomes that focused on intergroup attitudes rather than behaviors. I combined similar survey items into an index to reduce measurement error. I did this using an unsupervised hierarchical clustering algorithm, a data-driven method for identifying latent clusters in survey data (described in the supplementary materials and methods). I then conducted a factor analysis on these clusters. The Cronbach’s alpha for each index was between 0.5 and 0.7, indicating strong internal consistency given the low number of items in each index. The resulting indices—now dependent variables—cover national unity, comfort with Muslims as neighbors, and blaming Muslim civilians for Christian suffering (Table 1).
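For readers unfamiliar with the reliability statistic reported above, the standard formula for Cronbach's alpha is alpha = k/(k − 1) × (1 − sum of item variances / variance of the summed score), where k is the number of items. The sketch below computes it on hypothetical responses; the function name and the data are illustrative assumptions and are not drawn from the study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a multi-item index.

    items: list of k lists, each holding one survey item's responses from
    the same respondents in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Each respondent's summed score across the k items.
    totals = [sum(responses) for responses in zip(*items)]
    item_variance = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))


# Hypothetical two-item index answered by four respondents.
alpha = cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]])
```

With only two or three items per index, alpha is mechanically pulled downward, which is why a short index is usually judged against a lower threshold than a long scale.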

For the main analysis, I estimated the average treatment effect on each behavioral outcome and attitudinal index, controlling for randomization block and other baseline covariates while clustering standard errors at the team level. Details about the statistical analysis are provided in the supplementary materials and methods. I prespecified all outcomes and analyses, except when otherwise noted as exploratory, in a preregistered analysis plan made available at the American Economic Association website under study ID AEARCTR-0003540.
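One plausible way to write the estimating equation implied by this description is shown below. The notation is an assumption introduced here for illustration; the exact specification is the one given in the supplementary materials and methods.

```latex
Y_{ij} = \alpha + \tau\, T_{j} + \gamma_{b(j)} + X_{ij}^{\top}\beta + \varepsilon_{ij}
```

Here $Y_{ij}$ is the outcome for player $i$ on team $j$, $T_{j}$ indicates assignment to a mixed team, $\gamma_{b(j)}$ is a fixed effect for the randomization block (pair) containing team $j$, $X_{ij}$ collects baseline covariates, and $\tau$ is the average treatment effect. Standard errors are clustered on $j$ because treatment is assigned at the team level.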

Figure 1 summarizes the behavioral results. Looking first at tolerance on the field, treated players were 13 percentage points more likely to report that they would not mind being assigned to a mixed team next season (P = 0.044), 26 percentage points more likely to vote for a Muslim player (not on their team) to receive a sportsmanship prize (P = 0.003), and 49 percentage points more likely to train with Muslims 6 months after the intervention ended (P < 0.001). The training outcome does not merely capture the inertia of continuing to play with teammates: 15% of treated teams recruited Muslim players from other teams in the league or from the neighborhood. Qualitative evidence described in the supplementary text provides further evidence of the interventions’ positive effects on mixed teams, including newly forged friendships.

Fig. 1 Behavioral results. The intervention consistently improved on-the-field behavioral outcomes, with no detectable effects on off-the-field outcomes. The left panel shows covariate-adjusted mean outcomes for treated and control players, with covariates held at median or modal values. The right panel shows the difference between treated and control players, with 95% confidence intervals.

Moving to generalized tolerance off the field, estimated effects were smaller and not statistically distinguishable from zero. Treated players were not detectably more likely to attend a mixed social event or to patronize a Muslim-owned restaurant in Mosul up to 4 months after the intervention ended. Conditional on attending the social event, treated players brought their wives at almost the same rate as control players (fig. S6). Moreover, neither self-reported comfort in mixed neighborhoods nor trust in Muslims to receive a cash transfer on one’s behalf improved, further suggesting little change in regard toward Muslim strangers (fig. S6).

Exploratory analyses revealed two factors that may have amplified the behavioral effects of contact. First, the treatment improved generalizable behaviors relative to the comparison league, suggesting a beneficial effect to simply having Muslim players in the league itself (figs. S7 to S9). Second, treatment effects were strongest among the most successful teams, operationalized by reaching the final (table S7). Success alone had little effect on tolerance, but playing on a successful, mixed team built tolerance toward Muslim strangers (table S7).

Like behaviors toward Muslim strangers, personal beliefs also proved difficult to change. I found no effect of the treatment on Christians’ reported comfort with Muslim neighbors or blame directed at Muslim civilians for Christian suffering. I did observe a positive treatment effect for the national unity index of 0.43 standard deviations (P < 0.001), driven by an item reflecting the view that ethnic and religious divisions are arbitrary (fig. S6). Relative to the other indices, the national unity index mostly captures abstract attitudes rather than beliefs about specific outgroups. I found a similar pattern in an exploratory analysis of local residents (n = 121): Exposure to the leagues correlated with a stronger belief in the arbitrariness of group-based divisions, pointing to potential spillover effects among fans (table S11). Among all main analyses, the effects on on-the-field behaviors and the national unity index survived the Benjamini-Hochberg multiple-comparisons correction at the 0.10 level (40) and remained broadly consistent under permutation tests (fig. S10), across study waves (fig. S11), and with block-bootstrapped standard errors (table S1).
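The Benjamini-Hochberg step-up procedure mentioned above can be sketched in a few lines. The p-values in the example are illustrative placeholders, not the study's results, and the function name is an assumption.

```python
def benjamini_hochberg(pvalues, q=0.10):
    """Return a list of booleans, True where the hypothesis is rejected
    while controlling the false discovery rate at level q."""
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k such that p_(k) <= (k / m) * q.
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank / m * q:
            cutoff = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for i in order[:cutoff]:
        rejected[i] = True
    return rejected


# Illustrative p-values only (not taken from the study).
flags = benjamini_hochberg([0.001, 0.004, 0.03, 0.2, 0.5, 0.9], q=0.10)
```

The procedure is less conservative than a Bonferroni correction because the threshold grows with the rank of each p-value rather than staying fixed at q/m.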

As an exploratory analysis, I tested directly whether treatment effects were stronger for on-the-field behaviors relative to off-the-field behaviors. To do this, I estimated the average treatment effect on engaging in at least one, at least two, and all three on-the-field behaviors, and repeated this exercise for off-the-field behaviors. I then tested these treatment effects against each other. As shown in Fig. 2, treated players were consistently more likely to engage in on-the-field behaviors (however measured) relative to control players, whereas the two groups engaged in off-the-field behaviors at similar rates. The intervention’s added boost for on-the-field outcomes relative to off-the-field outcomes is estimated at ~30 percentage points or more (P ≤ 0.05).
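The logic of this comparison can be sketched on toy data. The example below is a simplified illustration under stated assumptions: it uses raw differences in means rather than covariate-adjusted estimates, resamples whole teams within each arm as a stand-in for the block bootstrap described in the supplementary materials, and relies on hypothetical data and function names.

```python
import random
from statistics import mean


def at_least(behaviors, k):
    """1 if a player engaged in at least k of the listed binary behaviors."""
    return int(sum(behaviors) >= k)


def effect(players, key, k):
    """Raw treated-minus-control difference in the 'at least k' rate."""
    treated = [at_least(p[key], k) for p in players if p["treated"]]
    control = [at_least(p[key], k) for p in players if not p["treated"]]
    return mean(treated) - mean(control)


def on_minus_off(players, k):
    """On-the-field effect minus off-the-field effect."""
    return effect(players, "on", k) - effect(players, "off", k)


def bootstrap_ci(players, k, draws=500, seed=0):
    """Percentile 95% interval, resampling whole teams within each arm."""
    rng = random.Random(seed)
    by_team = {}
    for p in players:
        by_team.setdefault((p["treated"], p["team"]), []).append(p)
    arms = {True: [], False: []}
    for (treated, _), members in by_team.items():
        arms[treated].append(members)
    stats = []
    for _ in range(draws):
        sample = []
        for teams in arms.values():
            # Draw teams with replacement, keeping the arm size fixed.
            for members in rng.choices(teams, k=len(teams)):
                sample.extend(members)
        stats.append(on_minus_off(sample, k))
    stats.sort()
    return stats[int(0.025 * draws)], stats[int(0.975 * draws) - 1]


# Hypothetical data: 8 teams of 5 players, three binary behaviors per class.
players = []
for team in range(8):
    treated = team % 2 == 0
    for n in range(5):
        on = [1, 1, int(n < 3)] if treated else [int(n < 1 + team % 3), 0, 0]
        off = [int(n < 1), 0, 0]
        players.append({"team": team, "treated": treated, "on": on, "off": off})

point = on_minus_off(players, k=1)
low, high = bootstrap_ci(players, k=1)
```

Resampling teams rather than individual players respects the fact that treatment was assigned at the team level, so outcomes within a team are not independent.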

Fig. 2 On-the-field versus off-the-field behaviors. The intervention shifted the probability of engaging in at least one on-the-field behavior more than it shifted the probability of engaging in at least one off-the-field behavior. The same was true of engaging in at least two behaviors or all three. The left panel shows covariate-adjusted means for the control and treatment groups separately for on-the-field and off-the-field outcomes. The right panel shows differences between on-the-field treatment effects and off-the-field treatment effects (i.e., difference-in-differences), with 95% confidence intervals generated using a block-bootstrapping approach (see the supplementary materials and methods).

Naturalistic studies of contact, especially those that involve competition or take place in the aftermath of war, are likely to involve some amount of negative contact experiences that could negatively affect outcomes (17, 19). Proxying for aggression using yellow and red cards, I did not find evidence of increased hostility among those on all-Christian teams. Table S12 demonstrates that the prevalence of cards did not differ across match types: Matches that brought together all-Christian teams with mixed teams were not more hostile than matches between two treated or two control teams. Moreover, I did not detect differences in the number of cards based on the referee’s religious identity (table S13). I also ruled out backlash effects among control participants and Muslim players. Analyzing changes in attitudes over time, neither control participants nor Muslim players became more prejudiced (figs. S12 and S13). These analyses suggest that competitive contact does not worsen prejudiced attitudes but does not alleviate them either.

Ongoing civil wars in the Middle East and Africa, persistent sectarianism across the Arab world, and a dearth of policies aimed at reintegrating communities hit by ISIS in particular have reinvigorated the question of how to build social cohesion in the wake of violence. Despite the potential of intergroup contact, we know little about whether it can build lasting, real-world behavioral change, especially after war. This study provides causal evidence regarding both of these questions. I found that Christians assigned to compete on a soccer team with Muslim teammates were more likely to engage in tolerant behaviors toward Muslim peers encountered in the intervention up to 6 months after the intervention ended. These improvements did not come at the cost of exacerbating prejudice among the control group, as has been found in other studies of Muslim-Christian contact (21).

Contact was systematically weaker, however, at shifting generalized tolerance toward Muslim strangers. Several factors help to explain why contact effects did not generalize. This pattern could be symptomatic of how contact operates more broadly, compelling researchers to measure long-term, actual behaviors to understand the conditions under which contact effects extend to an entire outgroup [(41); for exceptions from nonconflict zones, see (42) and (43)]. The distrust, hostility, and trauma ingrained by war likely make generalization all the more challenging. Groups recovering from violent conflict often continue to feel that their well-being, resources, and identity are under threat, conditions that run in stark contrast to the ideals of positive, cooperative contact (44). The quality of contact is particularly important in these settings (45, 46). In this vein, the data suggest that those on successful teams were able to unlock improved behaviors toward outgroup strangers, further indicating that an exceptionally positive experience may be needed to overturn the negative experiences instilled by war and pointing to a fruitful avenue for future work (41). Postconflict settings could also exacerbate the role of minority status, which is known to dilute contact effects (47, 48). Christians remain a targeted minority in Iraq, potentially making generalization of positive effects more difficult relative to members of advantaged groups.

Behaviors toward known contacts versus strangers also differ in costliness, shedding light on this pattern of results. Driving to Mosul entails higher psychological (e.g., intergroup anxiety) and economic (e.g., time and fuel) costs relative to measures such as voting for a peer to receive an award, for instance, possibly decreasing the sensitivity of this class of outcomes. Relatedly, some behaviors may be easier to shift relative to self-reported attitudes, a pattern echoed in other prejudice reduction studies, especially among victims of conflict (21, 45, 49).

Even if contact effects do not generalize to the entire outgroup, strengthening ties between peers could still build resilience and prevent future conflict. I found descriptive evidence of tolerant social norms among local residents most exposed to the leagues, pointing to the potential for spillover effects. Future work should explore the extent to which localized cohesion can shield these communities from future shocks to tolerance, such as a resurgence in ethnic violence or prejudicial rhetoric by political entrepreneurs. Providing causal answers to these questions can inform the hundreds of millions of dollars allotted by the U.S. Agency for International Development (USAID) for civil society, conflict mitigation, and peace stabilization activities in 2020 (50) and the billions of dollars spent globally on peacebuilding programs (51).