Throughout the evolution of the internet, digital social technology companies have addressed problems on their platforms with a variety of product interventions and policies. Most platforms have adopted Community Standards — the “rules” for participation — that administrators use to determine what counts as appropriate or inappropriate content, behavior, or actors [13, 28, 66]. When a user violates these rules, platforms typically take an action against the user as punishment or to encourage correction. From educational warnings, to feature rate-limiting, to temporary suspensions, to permanent bans, issuing consequences for violating behavior is a core tenet of the moderation process [4].
Consequences have become a standard practice against violating actors, behavior, and content online because of their perceived effectiveness [29]. Within the academic literature on content moderation, many studies have examined the effectiveness (or ineffectiveness) of different types of consequences, and their findings sometimes diverge from expectations set by offline research in criminology and related disciplines [53]. For example, permanent bans have causally resulted in significant changes to user behavior and to the variety of content in a platform’s ecosystem [17, 20]; similarly, warnings have causally reduced problematic user behavior [74]. However, field experiments that compare the relative effectiveness of different implementations of consequences in digital spaces are rare.
More specifically, few studies look at the impact of the timing of consequences — such as their frequency, duration, or sequence. While there may be key differences between a “softer” intervention like an ephemeral warning and a “harder” consequence like a permanent ban, there are still limited empirical insights into the relative effects of different consequence types. For example, is a warning more effective when sent twice rather than only once? Is a suspension for a week more impactful than a suspension for a day?
In addition to the specifics of an intervention’s impact, there is the related question of how its effect on behavior evolves over time. Many studies have tried to measure whether the effects of such interventions persist or wane over long periods after the intervention occurs [26, 36, 47]. But just as there have been limited studies on the relative effects of different consequences, there are correspondingly limited studies on how long the effects of a given consequence might last.
Finally, there is also the question of heterogeneous effects of duration across different user populations. For example, does the length of a consequence affect first-time violators differently than repeat violators? Understanding these differential impacts directly informs how platforms try not only to reform individual users, but also to address violators who might be more dedicated in their repeat behaviors.
Investigating the relative impact of consequences is important because we need to know what types of consequences are most effective at combating problematic actors, behavior, or content. In the field of Trust & Safety [33], many product and policy teams at platforms have designed and implemented consequence frameworks [5]. These teams usually aim to reduce the prevalence of problematic actors, behavior, or content in general. Perhaps more importantly, they strive to prevent repeat offenses by people who are allowed to remain on the platform. These goals are particularly important as platforms move away from outright banning participants and adopt newer perspectives around restorative justice, ones that encourage behavioral reform [18, 49, 73]. These policy-to-enforcement frameworks, however, may be grounded more in intuition or hypotheses [6, 15] than in applied empirical evidence of consequence effectiveness. Therefore, creating evidence of consequence effectiveness can help policy and product teams determine best practices for establishing more successful systems to reduce initial and repeat violations.
This paper describes the results of two field experiments that test differences in consequence durations. Specifically, the experiments vary the duration of temporary suspensions — enforced in response to a range of violations on a popular social platform — in order to test the relative effectiveness of different consequence durations on key safety- and engagement-related user behavior outcomes. Measuring both of these outcomes reflects a trade-off between reducing future harmful behavior and excessively punishing users, especially users who have a strong potential to reform.
This study investigates these trade-offs through the following research questions:
RQ1: What is the effect of suspension duration on offending users’ subsequent offending and engagement?
RQ2: How does the effect of suspension duration on offending users’ subsequent offending evolve over time?
RQ3: What is the effect of suspension duration on first-time vs. repeat violators, in terms of both subsequent offending and engagement?
2 Related Work
2.1 Suspensions as Consequences
Consequences are one key piece of a spectrum of interventions for platform governance or “content moderation” processes [15, 56]. Broadly, a consequence acts as a punishment for an offense, within a general theory of incentives toward behavioral change [53]. While a platform may enable different types of interventions against violating actors, behavior, or content, a consequence specifically is an action taken by a platform in response to a violation, usually of the platform’s rules, policies, or Community Standards [66]. Interventions may be “proactive” (i.e., addressing the problem before it appears widely to other users across the platform; identified, for example, through machine learning classification) or “reactive” (i.e., addressing the problem after it appears to other users; identified, for example, through reporting mechanisms filled out by users) [35]. In both cases, if a problem is identified as violating the platform’s rules, a consequence may be issued against the user’s account.
For online platforms, consequences — sometimes referred to as “enforcement actions” [4] or “remedies” [32] — are an integral part of the Trust & Safety field in the contemporary technology industry, used to address a range of policy-violating issues. Internet sites, apps, and platforms have executed a wide range of consequences, from warnings to removals [32]. Suspending — the ability to “remove content temporarily [or] prevent users from accessing their accounts temporarily — from anywhere between minutes and forever” [32] — remains one of these key interventions.
Suspensions thus act as a temporary intervention either against an entire account or against use of a particular feature of the platform (e.g., a temporary inability to view, upload, or edit content, or to monetize). Suspensions have become so central that even the Santa Clara Principles on Transparency and Accountability in Content Moderation (2018) recommend that platforms “publish the numbers of posts removed and accounts permanently or temporarily suspended [authors’ emphasis] due to violations of their content guidelines” [1]. In practice, though, suspensions may be enacted in any particular form by the administrators of the technology platform.
2.2 Effects of Consequences
The effect of consequences on digital platforms in general is clear: a range of different consequences can reduce offenses, as well as reduce recidivism. Warnings work to reduce hateful language [74]. Restricting access reduces new user acquisition in hateful communities [16]. Comment deletion on Facebook decreased subsequent rule-breaking, while merely hiding comments did not have an effect [36]. Comment deletion on Reddit reduced immediate noncompliance rates [67]. In general, a variety of different interventions — especially permanent ones — result in positive outcomes.
Two types of “account access” consequences — bans and suspensions — can also be effective. Multiple studies support the same conclusion for bans: they can have a significant impact on violations and recidivism. For example, closing subreddits on Reddit resulted in many accounts stopping activity, and remaining accounts decreased hate speech usage by 80% [17]. In another study of Reddit, removing 2,000 communities led to 15.6% of those communities’ participants leaving Reddit; among participants who remained, 6.6% reduced their toxicity while 5% increased it [20]. Deplatforming on Twitter reduced not only conversation about the deplatformed users, but also the overall activity and toxicity of their supporters [39].
Suspensions, of course, are another type of consequence, stronger than a warning but less severe than a permanent ban. Suspensions allow platforms to take temporary enforcement actions against users, restricting access to all features of the platform for a predetermined amount of time [52]. Reactivation of the account may also require the completion of some action, such as verification, modifications, or acknowledgement.
Suspensions can also be effective, but there is limited work on these temporary consequences. While this paper has already referenced studies that demonstrate the impact of permanent bans on digital platforms, there is only one paper that specifically investigates differences in temporary suspensions (and only with observational data). This study finds that “all suspensions, including 24-h, 48-h and permanent suspensions … decreased the offense probability of suspended offenders after suspension and increased the latencies of their offenses after suspension” [75].
As one addendum, it is important to note that the digital environment poses a unique challenge to “access”-type consequences: the ability for someone to migrate. Users who are suspended or banned on one account may migrate to another account on the same platform, to other communities within the same platform (e.g., [14]), or to similar platforms entirely [37]. Multiple accounts are not uncommon on social technology platforms [42, 43, 55]. Migration is found to be common [31], and some studies find that user migration can lead to increased toxic behavior [38, 64].
2.3 Relative Effects of Suspensions and Different Suspension Durations
While suspensions can be effective, it is less clear how the duration of a suspension impacts recidivism. In offline criminology research, reviews of the literature suggest that differences in punishment severity do not appear to be effective at deterring crime [53]. Further, the impact of incarceration length on recidivism “appears too heterogeneous to draw universal conclusions,” even if some studies point to longer sentences potentially being more likely to deter crime [10].
While criminology is helpful for developing initial hypotheses, literature on smaller, temporary offline behavioral interventions suggests some additional directions. For example, a study of timeouts on children found that 15- and 30-minute timeout periods “produced a 35% decrease in deviant behavior,” far more than a “1 min [period which] resulted in an average increase of 12%” [71]. However, the study also found that there was “little difference between the effectiveness of 15 and 30 min” [71]. Similarly, longer suspension periods for driving offenses (91-180 days) led to lower offense ratios than shorter suspension periods (1-30 days) [27]. However, more severe school suspensions did not deter students from misbehaving in the future (and may have made reoffense worse for younger students) [45].
Given the limited number of papers on suspensions, evidence of lasting effects comes primarily from the broader literature on digital consequences and other online interventions. Warnings, for instance, have been shown to remain effective over some periods of time. One experiment on Twitter found that “the act of warning a user of the potential consequences of their behavior can significantly reduce their hateful language one week after receiving the warning” [74]. Other papers in the misinformation studies space point to similar lengths. For instance, one study of inoculation found intervention effects that lasted up to 3 months [47]. In another misinformation study, comparing eight different interventions, the authors found that “the interventions also differed in the duration of their effects...interventions focused on teaching new skills (inoculation and media literacy) showed less decay than interventions that labeled specific pieces of content as reliable or unreliable (preemptive fact checking, source credibility, warnings)” [26]. Deletion of comments was effective, and “the effect on...lowered rule-breaking behavior lasted longer than the effect on continued commenting behavior” [36].
Turning to first-time vs. repeat violators, there is little evidence in online environments about differences in behavior between the two groups, or about the potential differential impact of interventions on recidivism across them. Some descriptive research demonstrates that “a small number of individuals typically accounts for the vast majority of the behavior” [57]. However, research on reactions to content moderation processes has shown that users — and one may surmise this is especially the case for not-yet-violating users — often do not fully understand the reasons for removal [50], the consequence itself, or how to contest the system’s decision [69]. Therefore, it may be that first-time violators are more strongly impacted by consequences, and especially so by longer suspensions compared to shorter ones.
Roblox also provides a layer of safety features, such as content moderation, reporting tools, and — important to this work — interventions and consequences. Roblox’s moderation systems mirror those of many other social technology platforms. Roblox has extensive Community Standards that span safety, civility, integrity, and security, with the goal of “always ma[king] it a key priority to ensure...community members can connect, create, and come together in a space that is welcoming, safe, inclusive and respectful” [62, 63]. Violations of these policies can result in enforcement actions (i.e., consequences), such as warnings, content removal, account- or feature-level restrictions, or permanent deletions. Violations are identified and consequences are issued based on a combination of user-submitted reports [61], automatic classifiers [11], and human moderators. Roblox’s transparency report has more details on the platform’s moderation and enforcement processes [58].

Figure 1: Example of a message sent to a user following an enforcement action, after they were reported for Bullying or Harassment.
Some of Roblox’s Community Standards deal with extremely serious violations, which were excluded from these experiments. Further, the experiments described in this paper only include users aged 13 years or older. The platform’s Terms of Service governs participation in product experiments, and the data used in the study was de-identified and analyzed in aggregate. The benefits of the experiment greatly outweigh any possible harms, given the relatively short duration of the consequences issued (varying suspension times by definition changes the consequence length per user, but the consequences used in this study are neither permanent nor particularly lengthy). Field experiments are especially valuable for building a better understanding of online communities, which can help improve platforms across the internet as a whole, since platforms can use this evidence to inform their own product decisions and the effectiveness of their policy enforcement.

Figure 2: The left sub-figure depicts the randomization and analysis unit: a user, which represents a cluster of one or more platform accounts associated with a specific person. The right sub-figure depicts the analysis window, which begins at the time the user’s suspension starts on a given account.
Users were divided into mutually exclusive subsets, such that some users were only eligible for Experiment 1 and others were only eligible for Experiment 2. Participants were randomized and analyzed at the user-level, where a “user” represents a cluster of one or more platform accounts associated with an individual person, as defined by Roblox’s internal detection systems [59]. The experiment analysis window for each user begins at the time the user’s suspension starts on a given account, after a moderator files the enforcement action. Importantly, the analysis window does not begin at the time that the account’s suspension expires, because users with multiple accounts [43] could still conceivably access the platform on alternate accounts during the suspension period.
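To make the randomization unit concrete, the following is a minimal sketch of hash-based assignment at the user (account-cluster) level, under the assumption that each account has already been mapped to a `user_id` identifying the person behind the cluster; the function name, identifiers, and arm labels are illustrative assumptions, not the platform’s actual implementation.

```python
import hashlib

def assign_arm(user_id: str, experiment: str, arms=("1_hour", "1_day")) -> str:
    """Deterministically bucket a user cluster into an arm, so every account
    in the cluster receives the same suspension duration."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return arms[int(digest, 16) % len(arms)]

# Every account mapped to the same user_id lands in the same arm:
assert assign_arm("user-123", "exp1") == assign_arm("user-123", "exp1")
```

Because assignment in this sketch is a pure function of the cluster identifier, alternate accounts detected later can be folded into the cluster’s existing arm without re-randomizing.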
The experiment data was analyzed with CUPED (Controlled Experiment Using Pre-Experiment Data; [24]) to improve the sensitivity of the experiments. CUPED incorporates pre-experiment data to reduce variance in the outcome metric: variance is reduced by a factor of R², where R² is the proportion of variance explained by a regression of the outcome metric on pre-experiment covariates. Specifically, for each outcome, one covariate is used: the same metric over a 7-day pre-experiment window. This approach reduced variance substantially for count-based outcomes (i.e., by a factor of 0.830 for number of consequences in Experiment 1 and a factor of 0.985 for total time spent in Experiment 2). Using CUPED, both experiments were powered to detect a 1.5% change in reoffense rate, as well as a 1.2% change in total time spent in Experiment 1 and a 1.4% change in time spent in Experiment 2.
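For concreteness, below is a minimal sketch of the CUPED adjustment described above, assuming a hypothetical pandas DataFrame with one row per user, the post-period outcome in column `y`, and the same metric over the 7-day pre-experiment window in column `y_pre`; the column names and helper functions are illustrative, not the actual analysis code.

```python
import numpy as np
import pandas as pd

def cuped_adjust(df: pd.DataFrame, y: str, y_pre: str) -> pd.Series:
    """Return the CUPED-adjusted outcome Y - theta * (X - mean(X))."""
    x = df[y_pre].to_numpy(dtype=float)
    outcome = df[y].to_numpy(dtype=float)
    # theta is the OLS slope of the outcome on the pre-period covariate.
    theta = np.cov(x, outcome, ddof=1)[0, 1] / np.var(x, ddof=1)
    return pd.Series(outcome - theta * (x - x.mean()), index=df.index)

def variance_reduction(df: pd.DataFrame, y: str, y_pre: str) -> float:
    """Fraction of outcome variance removed by CUPED; with a single
    covariate this equals the squared pre/post correlation (R^2)."""
    r = np.corrcoef(df[y_pre].to_numpy(float), df[y].to_numpy(float))[0, 1]
    return r ** 2
```

The adjusted outcome can then be compared across arms with a standard difference-in-means test; with one covariate, the variance reduction factor is simply the squared correlation between the pre- and post-period metric.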
Spillover across alternate accounts is accounted for by cluster-randomizing at the user level. This correction is important because social network users often maintain multiple accounts, especially in online gaming settings [7, 43]. Furthermore, the treatment (suspension duration) is likely to strongly affect a user’s propensity to switch to a different account. However, it is important to note that this clustering does not incorporate any information about peer relationships, such as friend group structures or co-play communities. Thus, if a user’s potential outcomes are affected by the treatments of their friends, we have a SUTVA violation. For example, there may be peer comparison effects: a 1-day suspension may be less of a deterrent if Roblox issued a friend a 1-hour suspension. Alternatively, interference might spread through discussion of punishments on public forums, which is known to be a popular topic.
In order to assess the strength of evidence for interference across friends, we employ Aronow’s ex post test for interference in randomized experiments [2]. This test measures the dependence between the outcomes of a fixed subset of users and the treatment statuses of other users, and compares it to the null distribution that would occur if there were no indirect effects. Specifically, our fixed subset of users is a 50% sample of the 1-day suspension group, and our test statistic is the Spearman rank correlation between the reoffense outcomes of these users and their number of Roblox friends who received a 1-hour suspension. If having more friends who receive 1-hour suspensions reduces the deterrence of a 1-day suspension, the observed correlation should be greater than the vast majority of draws from the null distribution. However, we find that this is not the case: the p-value is 0.152. Thus, we do not see strong evidence for interference across friends in Experiment 1.
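A minimal sketch of the permutation logic behind such an ex post interference test (in the spirit of [2]) is given below, assuming hypothetical inputs: `focal`, a DataFrame for the fixed 50% sample of the 1-day group with a binary `reoffended` column; `others`, a DataFrame of the remaining experiment users with their realized `arm`; and `friend_edges`, a dict mapping each focal `user_id` to a list of friends’ ids. All names and data structures are illustrative assumptions, not the actual analysis pipeline.

```python
import numpy as np
import pandas as pd
from scipy.stats import spearmanr

def n_short_suspended_friends(focal, others, friend_edges, arm_col="arm"):
    """Count, for each focal user, friends whose realized arm is 1-hour."""
    short = set(others.loc[others[arm_col] == "1_hour", "user_id"])
    return focal["user_id"].map(
        lambda u: sum(f in short for f in friend_edges.get(u, []))
    )

def interference_pvalue(focal, others, friend_edges, n_perm=2000, seed=0):
    """Hold the focal users' outcomes fixed, re-randomize the other users'
    arms, and compare Spearman statistics against the null distribution."""
    rng = np.random.default_rng(seed)
    observed, _ = spearmanr(
        focal["reoffended"],
        n_short_suspended_friends(focal, others, friend_edges),
    )
    null_stats = []
    for _ in range(n_perm):
        permuted = others.copy()
        # Simplified stand-in for re-drawing the design: permute the
        # realized assignments among the non-focal users.
        permuted["arm"] = rng.permutation(others["arm"].to_numpy())
        stat, _ = spearmanr(
            focal["reoffended"],
            n_short_suspended_friends(focal, permuted, friend_edges),
        )
        null_stats.append(stat)
    # One-sided p-value: fraction of null draws at least as large as observed.
    return float(np.mean(np.array(null_stats) >= observed))
```

Note that a full implementation would re-draw assignments according to the actual experimental design rather than permuting realized arms; the permutation here is a simplification for illustration.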
Outcome Variable | Mean | Standard Deviation |
Reoffense rate | 0.27 | 0.44 |
Number of consequences | 0.67 | 7.29 |
Number of reports against | 4.57 | 98.33 |
Time-to-reoffense (hours) | 137.15 | 134.65 |
Days active | 11.24 | 7.32 |
Total time spent (hours) | 134.37 | 3005.28 |
Table 1: Descriptive statistics for experiment 1 outcome variables (1-hour suspension group).
Outcome Variable | Mean | Standard Deviation |
Reoffense rate | 0.47 | 0.50 |
Number of consequences | 1.64 | 20.12 |
Number of reports against | 10.70 | 201.34 |
Time-to-reoffense (hours) | 125.61 | 120.77 |
Days active | 12.39 | 7.29 |
Total time spent (hours) | 333.83 | 8065.01 |
Table 2: Descriptive statistics for experiment 2 outcome variables (1-day suspension group).

Figure 3: Effects of 1-day suspension vs. 1-hour suspension (baseline) after first policy violation.

Figure 4: Effects of 3-day suspension vs. 1-day suspension (baseline) after second policy violation.

Figure 5: Top: Reoffense rate over time in experiment 1. The x-axis shows days from the start of a user’s suspension. Each point references users who were observed for at least that many days. The red region represents the length of the longer suspension. Bottom: Relative effect on reoffense rate over time (baseline = 1-hour).

Figure 6: Top: Reoffense rate over time in experiment 2. The x-axis shows days from the start of a user’s suspension. Each point references users who were observed for at least that many days. The red region represents the length of the longer suspension. Bottom: Relative effect on reoffense rate over time (baseline = 1-day).

Figure 7: Users segmented by number of historical violations (of any policy). Effects of 1-day suspension vs. 1-hour suspension (baseline) after first policy violation.

Figure 8: Users segmented by number of historical violations (of any policy). Effects of 3-day suspension vs. 1-day suspension (baseline) after second policy violation.
Suspensions are a common moderation action across almost all social platforms today. The results demonstrated here may apply to other social technology platforms as well, and platform operators, designers, and policymakers should consider how these results might affect their own platforms’ enforcement decisions. For example, would user behavior on a given platform, in the context of that platform’s use, be substantially affected by shorter or longer durations of consequences (e.g., if it impacted the ability to write a chat message vs. stream a video)? How would changing the length of consequences impact the larger user community on a given platform? And how does differing consequence length intersect with other enforcement mechanisms to produce a strategic spectrum of interventions to address problematic user behavior? These are all questions that platform designers, engineers, and researchers should consider both theoretically and practically in their approaches to designing stronger platform consequence models.
While both experiments include a large sample of participants, these results may be culturally specific to Roblox’s platform. For example, the ease of creating alternate accounts is an important platform-specific factor that may influence the effects of suspension duration. Roblox is also an interactive environment whose affordances may differ from those of, say, a text-based social media app, resulting in different user behaviors, expectations, and, in turn, different effectiveness of some interventions. The demographics of the Roblox community compared to the audiences participating on other platforms may also impact potential behavior change. Further, implementation and effects may depend on the processes (such as reporting flows, moderation systems, delivery of interventions, etc.) that Roblox uses to identify and act on platform-specific violations. Platform designers can think critically about which aspects of Roblox’s system features, user behaviors, behavioral incentives, and consequence and educational messaging might be parallel to another platform, and where they might diverge. Ultimately, generalizability is a question of empirical theory testing, and systems designers should be encouraged to test what could work best in their own context and for their own users. In each of the sections below, we give specific recommendations on how platforms might approach the main takeaways from this paper.
For first-time violators in particular, it may be important for platform designers and policymakers to emphasize education and transparency [48] or pursue community-based approaches [41, 72] that might help adapt restorative justice approaches for users new to violative behaviors. On the other hand, the smaller effects on frequent violators imply that stronger approaches may need to be taken after a certain point. However, researchers are split on whether stronger punishments for repeat violators are more effective or even defensible [19, 30], and some research into punishment suggests that “penalty escalation” may be misguided [22]. Platform operators should think carefully about how the thresholds used to define user cohorts and intervention durations might have significant implications for consequence effectiveness. Researchers should investigate this area more deeply in future work.
Outcome Variable | 1-Hour Suspension | 1-Day Suspension | Absolute Diff. 95% CI | Relative Diff. (%) 95% CI | Adjusted P-value
Reoffense rate | 0.270 | 0.252 | (-0.021, -0.016) | (-7.584, -5.897) | <0.0001 |
Number of consequences | 0.667 | 0.625 | (-0.057, -0.028) | (-8.395, -4.36) | <0.0001 |
Number of reports against | 4.462 | 4.257 | (-0.329, -0.082) | (-7.271, -1.941) | 0.00171 |
Time-to-reoffense (hours) | 150.000 | 160.572 | (9.121, 12.022) | (6.048, 8.048) | <0.0001 |
Days active | 11.238 | 11.216 | (-0.06, 0.016) | (-0.532, 0.143) | 0.25916 |
Total time spent (hours) | 131.951 | 130.953 | (-2.664, 0.667) | (-2.012, 0.498) | 0.25916 |
Table 3: Experiment 1 results. Variant values are CUPED-adjusted means.
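The relative difference intervals appear to be percent changes relative to the baseline arm’s adjusted mean; for example, the reoffense rate point estimate implied by the Table 3 means is (0.252 - 0.270) / 0.270 × 100 ≈ -6.7%, which lies near the center of the reported relative 95% CI of (-7.584, -5.897).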
Outcome Variable | 1-Day Suspension | 3-Day Suspension | Absolute Diff. 95% CI | Relative Diff. (%) 95% CI | Adjusted P-value
Reoffense rate | 0.472 | 0.434 | (-0.042, -0.035) | (-8.897, -7.362) | <0.0001 |
Number of consequences | 1.624 | 1.569 | (-0.095, -0.015) | (-5.804, -0.969) | 0.01117 |
Number of reports against | 10.501 | 10.327 | (-0.504, 0.157) | (-4.764, 1.458) | 0.30376 |
Time-to-reoffense (hours) | 128.439 | 140.432 | (10.598, 13.387) | (8.203, 10.471) | <0.0001 |
Days active | 12.384 | 12.168 | (-0.271, -0.162) | (-2.187, -1.309) | <0.0001 |
Total time spent (hours) | 327.031 | 331.804 | (-2.481, 12.028) | (-0.781, 3.7) | 0.23659 |
Table 4: Experiment 2 results. Variant values are CUPED-adjusted means.
Table 5: Experiment 1 reoffense rate after 7, 14, and 21 days.
Table 6: Experiment 2 reoffense rate after 7, 14, and 21 days.
Outcome Variable | Cohort | 1-Hour Suspension | 1-Day Suspension | Absolute Diff. 95% CI | Relative Diff. (%) 95% CI | Adjusted P-value
Reoffense rate | 0 violations | 0.128 | 0.112 | (-0.02, -0.013) | (-15.103, -10.182) | <0.0001 |
Reoffense rate | 1–4 violations | 0.209 | 0.190 | (-0.023, -0.015) | (-10.773, -7.199) | <0.0001 |
Reoffense rate | 5+ violations | 0.415 | 0.397 | (-0.022, -0.014) | (-5.369, -3.376) | <0.0001 |
Total time (hrs) | 0 violations | 29.107 | 27.630 | (-1.802, -1.151) | (-6.163, -3.984) | <0.0001 |
Total time (hrs) | 1–4 violations | 49.221 | 47.486 | (-2.183, -1.286) | (-4.42, -2.629) | <0.0001 |
Total time (hrs) | 5+ violations | 267.155 | 266.688 | (-4.495, 3.56) | (-1.681, 1.331) | 0.82011 |
Table 7: Experiment 1 sub-group analysis. Variant values are CUPED-adjusted means.
Outcome Variable | Cohort | 1-Day Suspension | 3-Day Suspension | Absolute Diff. 95% CI | Relative Diff. (%) 95% CI | Adjusted P-value
Reoffense rate | 1–4 violations | 0.315 | 0.271 | (-0.051, -0.038) | (-16.01, -12.104) | <0.0001 |
Reoffense rate | 5+ violations | 0.532 | 0.496 | (-0.041, -0.032) | (-7.597, -5.966) | <0.0001 |
Total time (hrs) | 1–4 violations | 47.706 | 43.724 | (-4.642, -3.323) | (-9.667, -7.03) | <0.0001 |
Total time (hrs) | 5+ violations | 433.354 | 441.513 | (-1.833, 18.151) | (-0.453, 4.218) | 0.1095 |
Table 8: Experiment 2 sub-group analysis. Variant values are CUPED-adjusted means. Note that users with 0 historical violations were not eligible for this experiment, by definition.