Feelings about Bodies: Emotions on Diet and Fitness Forums Reveal Gendered Stereotypes and Body Image Concerns

Cinthia Sánchez, Department of Computer Science, University of Chile, Chile; Minh Duc Chu, Information Sciences Institute, University of Southern California, USA; Zihao He, Information Sciences Institute, University of Southern California, USA; Rebecca Dorn, Information Sciences Institute, University of Southern California, USA; Stuart Murray, Department of Psychiatry and Behavioral Sciences, University of Southern California, USA; and Kristina Lerman, Information Sciences Institute, University of Southern California, USA
(2024)
Abstract.

[Warning: This paper discusses eating disorders and body dysmorphia, which some may find distressing.]

The gendered expectations about ideal body types can lead to body image concerns, dissatisfaction, and in extreme cases, disordered eating and other psychopathologies across the gender spectrum. While research has focused on pro-anorexia online communities that glorify the ‘thin ideal’, less attention has been given to the broader spectrum of body image concerns or how emerging disorders like muscle dysmorphia (‘bigorexia’) present in online discussions. To address these gaps, we analyze 46 Reddit discussion forums related to diet, fitness, and associated mental health challenges. Using membership structure analysis and transformer-based language models, we project these communities along gender and body ideal axes, revealing complex interactions between gender, body ideals, and emotional expression. Our findings show that feminine-oriented communities generally express more negative emotions, particularly in thinness-promoting forums. Conversely, communities focused on the muscular ideal exhibit less negativity, regardless of gender orientation. We also uncover a gendered pattern in emotional indicators of mental health challenges, with communities discussing serious issues aligning more closely with thinness-oriented, predominantly feminine-leaning communities. By revealing the gendered emotional dynamics of online communities, our findings can inform the development of more effective content moderation approaches that facilitate supportive interactions, while minimizing exposure to potentially harmful content.

reddit, eating disorders, mental health, fitness
Copyright: ACM licensed. Journal year: 2024. Conference: CSCW. CCS concepts: Human-centered computing; Information systems; Applied computing → Psychology; Human-centered computing → Empirical studies in collaborative and social computing.

1. Introduction

Social norms shape perceptions of the ideal body type and how people feel about their own bodies. In Western culture, these norms often dictate different ideals for men and women, influencing what is considered attractive or healthy (Calogero and Thompson, 2010; Murnen and Don, 2012a). Women are expected to aspire to the ‘thin ideal’, which emphasizes low body weight and thin shape. In contrast, men are expected to be strong and visibly muscular, as embodied by the ‘muscular ideal’. These gendered expectations can lead to body image concerns, body dissatisfaction, and in more extreme cases, disordered eating and other psychopathologies across the gender spectrum (Peat et al., 2008).

How do people talk about their body image concerns online? Understanding discussions on such sensitive topics is crucial for developing moderation approaches that foster safe and supportive online environments while protecting vulnerable individuals from harmful or triggering content. Most previous work in this space has focused on eating disorders (ED), specifically pro-anorexia communities that glorify the ‘thin ideal’ (Juarascio et al., 2010; Oksanen et al., 2016). Although such online communities provide emotional support and a safe space to vent for individuals who may feel stigmatized in mainstream society (Oksanen et al., 2016; Yeshua-Katz and Martins, 2013), they can also harm individuals by normalizing obsessive thoughts around body size and shape (Lerman et al., 2023) and allowing them to share tips on dangerous weight loss practices (Ging and Garvey, 2018; Juarascio et al., 2010; Oksanen et al., 2016; Yeshua-Katz and Martins, 2013).

Recently, ED phenotypes that are more prevalent in men have been identified, characterized by muscle dysmorphia and strict eating rules for greater muscularity (aka ‘bigorexia’) (Murray et al., 2012). This syndrome is associated with suicidality, elevated risk of illicit muscle-building substance use, and heightened risk for psychiatric morbidity. However, we know little about how this condition presents online or how body dissatisfaction is generally expressed and interacts with mental health in online communities.

Emotions and toxicity are important dimensions of social interaction and a critical consideration for creating supportive and safe online environments (Prescott et al., 2019). In diet and fitness online communities, emotions and toxicity significantly influence body image self-perception (Brytek-Matera and Schiltz, 2011) and body dissatisfaction (Kast, 2018).

To better understand the emotional dimensions of online discussions about the body and their interaction with body image ideals, gender and mental health, we collect data from Reddit, a popular platform hosting discussion forums on a variety of topics. We identify a representative set of 46 discussion forums, or subreddits, about body image concerns, including diet and fitness communities and those focusing on psychopathologies like anorexia, bigorexia, and related mental health conditions. We address the following research questions.

  • RQ1: Are there systematic differences in the structure, content, and membership of diet and fitness communities? Where do these communities fall on the body ideal spectrum?

  • RQ2: Are there systematic differences in the emotions and toxicity expressed in different communities? What does emotional expression reveal about body ideals and gender stereotypes?

  • RQ3: How do responses to members’ posts differ in emotions and toxicity?

  • RQ4: Is toxicity associated with harmful content?

We find that diet and fitness communities on Reddit discuss a wide range of body image concerns, from the thin ideal to the muscular ideal. Using the method described by Waller and Anderson (2021), we analyze the membership structure of the subreddits and project them along the axes of gender and body ideal. This allows us to quantify communities along the body ideal spectrum. Not surprisingly, the body ideal dimension is aligned with the gender dimension, although there are feminine communities focused on building muscle (e.g., r/xxfitness) and mixed-gender forums dedicated to weight loss (r/intermittentfasting).

We also use transformer-based language models to analyze emotions and toxicity in Reddit forums, revealing that these online communities tend to amplify gendered body stereotypes. We find significant differences in emotional expression across the body ideal spectrum: feminine-oriented communities generally express more negative emotions, with thinness-promoting communities exhibiting the highest levels of negativity. Conversely, communities focused on the muscular ideal show less negativity, regardless of gender orientation, including feminine communities like r/FlexinLesbians and r/xxfitness. Surprisingly, even communities promoting potentially pathological behaviors associated with muscle dysmorphia (bigorexia) display far less negativity than thinness-oriented communities. Our analysis also reveals a gendered pattern in how emotions indicate mental health challenges, with communities discussing serious issues like suicide, self-harm, and body dysmorphia exhibiting emotional expression patterns more closely aligned with thinness-oriented, predominantly feminine-leaning communities. By revealing the interplay between gender, body ideals, and emotional dynamics of online communities, our findings can inform the development of more effective content moderation systems that facilitate supportive interactions, while minimizing exposure to potentially harmful or triggering content.

2. Related Works

2.1. Gender Stereotypes and the Body: Thin Ideal vs Muscular Ideal

Body image, the feelings and attitudes an individual holds towards their own body, is largely shaped by social forces. The concept of the ‘ideal body’, as communicated by typical Western media, is split by traditional conceptualizations of gender, where the ideal feminine individual is presented with a thin body (the ‘thin ideal’) and the ideal masculine person with a muscular body (the ‘muscular ideal’) (Murnen and Don, 2012a). These ideals are taught from a young age, with popular dolls like Barbie or Bratz, who have noticeably thin bodies, and action figures with exaggerated muscular physiques (Boyd and Murnen, 2017). Jáuregui-Lobera et al. (2013) explore gender differences in weight misperception, self-reported physical fitness, and dieting among adolescents, revealing significant associations with self-esteem, body appreciation, mental health, and risk of EDs. However, the popular associations between masculinity and muscularity are being challenged, through women bodybuilders (Grogan et al., 2004), men pining for thinner bodies (Jones and Morgan, 2010), and more. In this work, we explore the online relationship among gendered expression (i.e., masculinity, femininity), body image ideals (thin, muscular), and emotions.

2.2. Eating Disorders

Eating disorders (EDs) are characterized by distorted body image, obsessive thoughts about body size and shape, and core disturbances in eating and feeding behaviors. Individuals with EDs often fail to self-identify their condition and fewer than 20% ever receive treatment. This makes peer support provided by online communities an important factor for recovery. Until recently, EDs were mainly associated with anorexia and bulimia, which disproportionately affect women and girls. These disorders are associated with an intense fear of gaining weight, extreme dieting, and overexercising as compensatory behavior for eating.

There is a growing recognition of ED phenotypes in males, characterized by muscle dysmorphia and strict eating rules for greater muscularity (aka ‘bigorexia’). This syndrome is associated with an elevated risk of illicit muscle-building substance use and a heightened risk for medical and psychiatric morbidity, including increased suicidality.

Eating disorders have become more prevalent in recent years, leading researchers to link them to the spread of idealized body images in the media, particularly on social media platforms like Twitter, Instagram, and TikTok (Kao et al., 2024). Exposure to these images fuels body image concerns, a key risk factor for developing depression and EDs (Choukas-Bradley et al., 2022). Studies have shown that people compare themselves to idealized body images and as a result, feel worse about their appearance (Saiphoo and Vahedi, 2019; Fardouly and Vartanian, 2016; Choukas-Bradley et al., 2022).

2.3. ‘Thin Ideal’: Online Weight Loss and Anorexia Communities

Anorexia communities have long existed online, on blogs and message boards, and more recently on social media platforms such as Reddit. Researchers argue these communities provide both benefits and harms for individuals struggling with eating disorders. Such communities offer social support (Juarascio et al., 2010) and a sense of belonging to individuals who often feel stigmatized and misunderstood (Oksanen et al., 2016; Yeshua-Katz and Martins, 2013) and help individuals better understand and manage their illness (McCormack, 2010). However, anorexia communities can also increase psychological distress and exacerbate anorexia by promoting unhealthy behaviors like extreme calorie restriction and overexercising.

Studies have shown that users often start with mainstream content that revolves around dieting and fitness but are ultimately led to more harmful content (Marks et al., 2020). Chu et al. (2024) characterized this phenomenon as a feedback loop similar to online radicalization, which drives the increase in content glorifying eating disorders and self-harm by trapping individuals in echo chambers that reinforce extreme behaviors. They showed that both the Pro- and Anti-ED communities on Twitter are strongly connected to the Keto & Diet, Body Image, and Weight Loss communities.

2.4. ‘Muscular Ideal’: Online Bigorexia and Bodybuilding Communities

With the increasing exposure to muscular male images in Western culture, there has been an upsurge in online content characterized by (i) the mass overvaluation of a muscular body ideal and (ii) the broad dissemination of methods to optimize muscularity. However, in contrast to the wealth of research about the thin ideal, online content focusing on the muscular ideal has not been well characterized.

Empirical efforts to assess the clinical impact of engaging in online pro-muscularity communities have suggested that disordered eating practices may be a central feature of these online communities. Preliminary content analyses illustrated that (i) compulsive exercise, (ii) binge eating, (iii) strict restrictive eating practices, and (iv) illicit anabolic steroid use commonly feature (Murray et al., 2016). Follow-up analyses demonstrated that engagement in pro-muscularity online content was linearly associated with clinical eating disorder symptomatology (Quiniones and Oster, 2019). Importantly, more recent evidence suggests that the pursuit of a more muscular ideal, and engagement in online pro-muscularity communities, extends to women, and in turn, is associated with an array of negative psychiatric outcomes (Cunningham, 2023).

2.5. Community Moderation on Reddit

Community moderation is meant to foster safe and supportive environments in online communities while limiting toxicity and harmful content. This task is especially challenging in sensitive health discussions where overly lax or overly strict moderation can have lasting effects. Moderators face the challenge of identifying harmful or potentially triggering content, while still allowing for nuanced discussions of personal experiences. Reddit is based on a decentralized model, where human moderators use a range of tools from simple regex-based methods like Automoderator (Jhaver et al., 2019) to machine learning-based methods (He et al., 2023) to manage online communities or subreddits. Tools like Perspective and Detoxify provide automated toxicity detection, but the risk of over-blocking makes them less suitable for nuanced community moderation tasks (Chandrasekharan et al., 2019).

Identifying harmful content in online communities remains a challenge. While automated tools to detect specific types of harmful content such as toxicity and hate speech (Davidson et al., 2017) have been developed, they lack nuance and risk overblocking (Chandrasekharan et al., 2019), which can have disparate impacts on vulnerable or marginalized groups (Dorn et al., 2023). In addition, evolving community jargon quickly makes such tools obsolete (Chancellor et al., 2016). We fill a gap in existing literature by directly comparing thin ideal and muscular ideal communities and how they discuss body image concerns.

3. Data

Reddit is a vibrant social media platform hosting discussion forums (subreddits) on a wide range of topics (Hofmann et al., 2022; Chen et al., 2024). We retrieve Reddit data from Academic Torrents (https://academictorrents.com/), which collects Reddit submissions and comments using the Pushshift API (Baumgartner et al., 2020). Our dataset spans from January 2019 to November 2023 and includes data from 46 subreddits focused on fitness, diet, eating disorders, and related mental health conditions.

To identify these subreddits, we rely on our experience and existing literature to generate relevant keywords for Reddit searches. Additionally, we use Reddit’s search recommendation system to find and include similarly relevant subreddits until the search space is exhausted. This yields 26 subreddits, whose activity and relevance we manually verify against our query criteria. Using this initial data, we construct a subreddit mention network based on the text of submissions and comments. From this network, we identify subreddits that are not part of the initial set but are frequently mentioned by subreddits in it. To make our data more comprehensive, including two-way connections from active nodes, we select new subreddits with the highest in-degree and dense connections to other nodes within each community cluster identified by the Louvain algorithm (more in §4.1). As a result, we expand our dataset with submissions and comments from 28 additional subreddits, for a total of 54 subreddits.

For each of the 54 subreddits, we remove submissions and comments from AutoModerator, a bot that allows subreddit moderators to automate certain moderation tasks, including automatic posts and replies. Additionally, we discard content generated by specific subreddit bots such as steroidsBot, bodybuildingbot, Anabotlics, WeightroomBot and EDAnonymous_Bot as these bots generated a large proportion of the content. Finally, we exclude deleted and duplicate content in both submissions and comments from our data. We then filter out subreddits with fewer than 500 submissions, leading to a final list of 46 subreddits (Table 1).

To counteract the imbalance in the number of submissions and comments across subreddits, we randomly sample at most 5,000 submissions and 5,000 comments from each subreddit, yielding a total of 178,272 submissions, 218,139 comments, and 212,529 unique users. See Table 2 in Appendix A.1 for more detailed statistics of each subreddit.
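The cleaning and sampling steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the record layout (dicts with `author`, `subreddit`, and `text` keys), the deleted-content placeholders, and the definition of a duplicate (same subreddit and same text) are all assumptions made for the example.

```python
import random
from collections import defaultdict

# Bot accounts named in the text; the record layout used below is hypothetical.
BOT_AUTHORS = {"AutoModerator", "steroidsBot", "bodybuildingbot",
               "Anabotlics", "WeightroomBot", "EDAnonymous_Bot"}
# Assumed placeholders that mark deleted content.
DELETED_MARKERS = {"[deleted]", "[removed]"}


def clean(items):
    """Drop bot-authored, deleted, and duplicate items."""
    seen, kept = set(), []
    for item in items:
        key = (item["subreddit"], item["text"])  # assumed duplicate criterion
        if (item["author"] in BOT_AUTHORS
                or item["text"] in DELETED_MARKERS
                or key in seen):
            continue
        seen.add(key)
        kept.append(item)
    return kept


def filter_and_sample(submissions, min_submissions=500, cap=5000, seed=0):
    """Keep subreddits with at least `min_submissions` cleaned submissions,
    then randomly sample at most `cap` submissions from each."""
    by_sub = defaultdict(list)
    for item in clean(submissions):
        by_sub[item["subreddit"]].append(item)
    rng = random.Random(seed)
    return {sub: rng.sample(rows, min(cap, len(rows)))
            for sub, rows in by_sub.items()
            if len(rows) >= min_submissions}
```

Comments could be capped the same way once the set of retained subreddits is fixed; the fixed seed is only there to make the sketch reproducible.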

Table 1. Subreddits collected in our data. The prefix “r/” was removed for clarity.
Subreddit (r/)
steroids, Brogress, BulkOrCut, GettingShredded, weightroom, nattyorjuice, powerbuilding, bodybuilding, gainit, bodyweightfitness, Instagramreality, ketorecipes, ShittyRestrictionFood, progresspics, goodrestrictionfood, EDanonymemes, Volumeeating, amiugly, safe_food, FlexinLesbians, xxfitness, xxketo, EDAnonymous, BingeEatingDisorder, ARFID, EdAnonymousAdults, EatingDisorders, fuckeatingdisorders, eating_disorders, bulimia, AnorexiaNervosa, BodyDysmorphia, BDDvent, fit, ketogains, fasting, omad, 1200isplenty, CICO, intermittentfasting, loseit, Fitness, keto, MadeOfStyrofoam, drunkorexia, SuicideWatch

4. Methods

4.1. Constructing the Network of Subreddit Mentions

To analyze the structure of online communities discussing diet, fitness, and body image concerns, we construct a network of subreddit mentions. In this directed network, each node represents a subreddit, and an edge between two nodes signifies an interaction where a user mentions another subreddit in their posts (in the title or content) or comments at least once. The weight of the edge represents the frequency of such interactions. We use regular expressions (Aho, 1991) to extract mentions of subreddits in the text by identifying strings that match the pattern r/subreddit_name. Each mention of a subreddit within a unit of text (whether a post or a comment) is counted once. Subreddits included in the analysis must be mentioned at least ten times, encompassing open, banned, and quarantined subreddits. We identify higher-level clusters (groups of subreddits) of densely linked nodes in the subreddit mention network using the Louvain modularity method (Blondel et al., 2008).
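A minimal sketch of the mention extraction and edge weighting is given below. The exact regular expression is not stated in the text, so the pattern here is an assumption, as are the record layout, the lowercasing of names, and the exclusion of self-mentions (read from "mentions another subreddit"). Community detection with the Louvain method is not shown.

```python
import re
from collections import Counter

# Assumed pattern: "r/" not preceded by a word character, followed by a
# subreddit-style name (letters, digits, underscores).
MENTION_RE = re.compile(r"(?<![A-Za-z0-9_])r/([A-Za-z0-9_]{2,21})")


def mention_edges(items):
    """Count directed (source, mentioned) edges.

    Each mentioned subreddit counts once per unit of text (post or comment),
    so the edge weight is the number of text units containing the mention.
    """
    edges = Counter()
    for item in items:
        source = item["subreddit"].lower()
        targets = {m.lower() for m in MENTION_RE.findall(item["text"])}
        for target in targets - {source}:  # assumption: ignore self-mentions
            edges[(source, target)] += 1
    return edges


def frequent_nodes(edges, min_mentions=10):
    """Subreddits mentioned at least `min_mentions` times across all sources."""
    in_weight = Counter()
    for (_, target), weight in edges.items():
        in_weight[target] += weight
    return {t for t, w in in_weight.items() if w >= min_mentions}
```

The resulting weighted edge list could then be loaded into any graph library that offers Louvain community detection.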

4.2. Mapping Subreddits along the Dimensions of Gender and Body Ideal

To quantify online communities along a dimension of a social construct like gender or body ideal, we use the community embedding method of Waller and Anderson (2021). Specifically, we create a user-community bipartite graph $G=(U,C,E)$, where $U$ are the users and $C$ are the subreddits. A link $e$ exists between a user $u$ and a community $c$ if the user has commented or posted in the subreddit. The weight of the link represents the frequency of the user activities. We then create community embeddings using node2vec (Grover and Leskovec, 2016). As a result, communities with similar user bases are clustered together in the high-dimensional embedding space.

Next, we identify several seed pairs $P=\{(c^{a}_{i},c^{b}_{i})|_{i=1}^{n}\}$ of communities that differ in the target construct but are similar in other respects. For example, [r/AskMen, r/AskWomen] and [r/daddit, r/Mommit] are pairs representing the masculine and feminine directions of the gender dimension. Similarly, [r/loseit, r/gainit] and [r/AnorexiaNervosa, r/GettingShredded] represent the thin and muscular directions of the body ideal spectrum. Note that the seed pair communities only need to differ primarily in the social construct; they do not have to lie at the poles of the target dimension. The axis $x$ is a single vector robustly representing the desired social dimension. It is defined as the average, over seed pairs, of the difference between the embedding of the community on one side of the pair and that of the community on the other side: $x=\frac{1}{n}\sum_{i=1}^{n}(e(c^{a}_{i})-e(c^{b}_{i}))$, where $e(\cdot)$ represents the embedding of the community. We project each community onto the axis $x$ by computing the cosine similarity between the community embedding and the axis.
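The axis construction and projection can be illustrated with a short sketch. It assumes community embeddings are already available as a dict of equal-length numeric vectors; the function names are hypothetical and the node2vec step is not shown.

```python
import math


def mean_difference_axis(embeddings, seed_pairs):
    """x = (1/n) * sum_i (e(c_i^a) - e(c_i^b)) over the n seed pairs."""
    dim = len(next(iter(embeddings.values())))
    axis = [0.0] * dim
    for a, b in seed_pairs:
        for k in range(dim):
            axis[k] += (embeddings[a][k] - embeddings[b][k]) / len(seed_pairs)
    return axis


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(p * q for p, q in zip(u, v))
    norm_u = math.sqrt(sum(p * p for p in u))
    norm_v = math.sqrt(sum(q * q for q in v))
    return dot / (norm_u * norm_v)


def project(embeddings, axis):
    """Score every community by its cosine similarity with the axis."""
    return {name: cosine(vec, axis) for name, vec in embeddings.items()}
```

With seed pairs ordered as (thin, muscular), communities closer to the thin side receive higher scores, which matches the sign convention used for the body ideal axis.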

4.2.1. Thin–Muscular Ideal Axis

We identify four pairs of seed subreddits to define the thin-muscular ideal axis: [r/intermittentfasting, r/steroids], [r/AnorexiaNervosa, r/GettingShredded], [r/loseit, r/gainit], and [r/Instagramreality, r/nattyorjuice]. Similar to the approach by Waller and Anderson (2021), we select pairs of subreddits that are similar in nature but differ primarily along the thin-muscular ideal axis. For example, both r/Instagramreality and r/nattyorjuice serve as communities that critique people’s physical appearances, questioning their authenticity. However, r/Instagramreality focuses on exposing edited images on Instagram, often aimed at making people look thinner, while r/nattyorjuice centers on discussions about whether individuals with muscular physiques use anabolic supplements.

After projecting the subreddits onto the axis, the resulting scores represent the community’s position along the body ideal spectrum. Communities with higher (resp. lower) scores are more strongly associated with the thin (resp. muscular) ideal. It is important to note that a community’s position only reflects its association with the target construct, but not the identity of individual community members.

4.2.2. Masculine–Feminine Axis

The gender axis was defined by Waller and Anderson (2021) using ten pairs: [r/AskMen, r/AskWomen], [r/TrollYChromosome, r/CraftyTrolls], [r/AskMenOver30, r/AskWomenOver30], [r/OneY, r/women], [r/TallMeetTall, r/bigboobproblems], [r/daddit, r/Mommit], [r/ROTC, r/USMilitarySO], [r/FierceFlow, r/HaircareScience], [r/malelivingspace, r/InteriorDesign], [r/predaddit, r/BabyBumps]. Of the 46 subreddits we study, 26 have been assigned a score along the gender axis by Waller and Anderson (2021). Therefore, we use these scores to represent the communities along the gender axis.

4.3. Measuring Textual Similarity Among Subreddits

The language of online communities carries their mindset and beliefs (Jiang et al., 2022; He et al., 2024c, a). We encode the submissions of each subreddit using DistilBERT (Sanh et al., 2019) finetuned on Reddit data (https://huggingface.co/mwkby/distilbert-base-uncased-sentiment-reddit-crypto) into 768-d vectors, which represent their semantics. We then measure the textual similarity between subreddits using the cosine similarity between the centroids of corpus embeddings (Kour et al., 2022). Based on this metric, we identify groups of subreddits with similar content using the agglomerative hierarchical clustering algorithm. We choose the number of clusters that maximizes the silhouette metric.
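As a rough illustration of the centroid-based similarity, the sketch below assumes each subreddit's submissions have already been encoded into equal-length vectors (here plain Python lists; in the paper these would be the 768-d DistilBERT embeddings). The function names are hypothetical, and the clustering and silhouette selection steps are not shown.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(p * q for p, q in zip(u, v))
    return dot / (math.sqrt(sum(p * p for p in u)) *
                  math.sqrt(sum(q * q for q in v)))


def centroid_similarity(doc_vectors_by_subreddit):
    """Pairwise cosine similarity between subreddit corpus centroids."""
    cents = {s: centroid(v) for s, v in doc_vectors_by_subreddit.items()}
    names = sorted(cents)
    return {(a, b): cosine_similarity(cents[a], cents[b])
            for a in names for b in names}
```

One minus this similarity gives a distance that could be fed to an agglomerative clustering routine.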

4.4. Measuring Emotions and Toxicity

Language carries cues to affect, including emotions and toxicity. We measure the toxicity and emotions expressed in the language of posts and comments within the communities. To detect toxicity, we utilize the Detoxify library (Hanu and Unitary team, 2020), which provides a real-valued toxicity score ranging from 0 to 1, with 1 indicating high toxicity. Additionally, Detoxify returns scores for various specific types of toxicity, such as obscene, threat, insult, and identity attack. To avoid noise, we consider only submissions and comments with toxicity scores equal to or greater than 0.01.

For emotion analysis, we employ a model (https://huggingface.co/SamLowe/roberta-base-go_emotions) derived from RoBERTa (Liu et al., 2019), trained on the GoEmotions dataset (Demszky et al., 2020) for multilabel classification. The GoEmotions dataset comprises Reddit comments labeled with 28 categories (27 emotion categories and a neutral category). This model outputs a score between 0 and 1 representing the confidence of each emotion category, including the neutral category.

Figure 1. Network of subreddit mentions. Node colors indicate detected higher-level clusters, links share the same colors as the source subreddit, and node sizes are proportional to their degrees. The cluster in light blue discusses mental health issues, the dark pink cluster is related to the keto diet, the dark green one is about body image concerns, the purple cluster covers a variety of topics related to extreme dieting and eating disorders, the clusters in orange and light green discuss a range of topics, including bodybuilding, fitness, and physique goals.

5. Results

5.1. Analysis of Subreddits

5.1.1. Subreddit Mention Network

To analyze the structure of online communities discussing diet, fitness, and body image concerns, we construct a network of subreddit mentions, where each node represents a subreddit and edges link subreddits that are mentioned in posts of another subreddit. This network provides insights into the interconnectedness of various body-focused communities on Reddit and reveals potential gateways between eating disorder communities and mainstream spaces like diet and fitness communities.

To better capture the interactions between different subreddits, we construct the mention network from the original 54 unfiltered subreddits, using their submissions and comments before sampling. Figure 1 shows the network of 1,950 subreddits with 18,202 occurrences of mentions among them, where the subreddits in the same higher-level cluster share the same color to aid visualization. Of these subreddits, 71 have been banned, 18 are gated, and 3 are quarantined. Six distinct clusters represent forums on related topics like diet and weight loss, fitness and healthy living, nutrition and recipes, mental health, etc.

Figure 2. Hierarchical agglomerative clustering of diet and fitness subreddits by text similarity in submissions, based on the cosine similarity between corpora centroids. Cells with dark colors indicate more similarity (less distance) between the subreddits.

Overall, the mention network reveals distinct subreddit clusters, with no isolated clusters and significant interaction among them. Subreddits discussing mental health issues like suicide (r/SuicideWatch), self-harm (r/MadeOfStyrofoam), or body dysmorphia (r/BodyDysmorphia, r/BDDvent) are closely linked at the top and colored in light blue. These communities are linked to the keto cluster (dark pink) and the body image cluster (dark green). The lower section of the network reveals a dense structure with three interlinked clusters. These clusters discuss a range of topics, from weight loss (e.g., r/loseit) and bodybuilding (e.g., r/bodybuilding) to eating disorders like anorexia (e.g., r/AnorexiaNervosa) and bigorexia (e.g., r/GettingShredded).

This network allows us to expand our original set of subreddits by identifying other relevant but undiscovered forums that are frequently mentioned by a given community. Furthermore, the structure reveals that weight management, fitness, and restrictive diet forums are strongly connected to mental health spaces. Notably, subreddits that promote body positivity and acceptance, such as r/PlusSize, are rarely mentioned and are disconnected from this space. This raises concerns that vulnerable individuals seeking diet and weight loss advice may inadvertently encounter more forums discussing problematic topics such as severe food restriction. Isolated from healthy body image spaces, these online users can be exposed to and trapped in harmful content. Although we lack data from all subreddits mentioned by users, the observed connectivity patterns suggest that they correctly reflect the structure of attention in these discussions.

5.1.2. Subreddit Content Similarity

Figure 2 shows the matrix of pairwise similarity between subreddits. The color in the figure gives the similarity metric, with darker colors denoting more similar subreddits. Subreddits are sorted by similarity using a hierarchical agglomerative clustering algorithm. There are three main content clusters. The EDs and mental health communities (top and left of the matrix) have very similar content and are grouped together. The middle cluster is composed of a mix of communities discussing dieting and fitness. The last cluster (bottom and right) is largely composed of fitness communities, with a small number of diet communities. These results suggest that diet and fitness communities largely share similar content.

Figure 3. Distribution of subreddits along the muscular-thin ideal dimension. The subreddits are evenly spaced so that their positions represent their ranking on the body ideal spectrum.

5.1.3. Quantifying Subreddits along the Body Ideal Spectrum

Using the method described in Waller and Anderson (2021), we embed the subreddits in a high-dimensional space based on user co-activity (Appendix B.1). We create an axis in the embedding space to represent the body ideal dimension, defined by the concepts of the ‘thin ideal’ and the ‘muscular ideal’, and compute the score of each community by projecting it onto this axis. Figure 3 shows where the 46 subreddits in our set fall on this axis. Note that the subreddits are evenly spaced in the figure to represent their ranking on the body ideal spectrum. The subreddits related to eating disorders, like r/EDAnonymous, r/EDanonymemes, r/safe_food, etc., fall at the thin end of the thin/muscular spectrum. In contrast, r/powerbuilding and r/bodybuilding are on the muscular end of the spectrum.

We find that 26 of the subreddits in our dataset also appear in Waller and Anderson (2021); therefore, we use their gender scores to represent these communities along the masculine/feminine dimension (Appendix B.1). We find r/xxfitness and r/xxketo at the feminine end of the gender spectrum, followed by many of the ‘thin ideal’ subreddits like r/EatingDisorders, r/loseit and r/fasting. At the masculine end of the gender spectrum are the ‘muscular ideal’ subreddits like r/powerbuilding, r/Brogress and r/BulkOrCut. Overall, there is a high correlation ($r=0.84$) between the gender and body ideal scores for these 26 communities.

5.2. Emotions and Gender Stereotypes

Figure 4. Distribution of emotions in subreddits. The bars show the median confidence values of the neutral emotion (i.e., that the post expresses no emotion) in submissions (left), comments (middle), and their difference (right), in different subreddits. Subreddits are sorted in descending order by the median confidence values of the neutral emotion in submissions.

We plot the distributions of emotions in different subreddits in Appendix B.2. Previous studies have identified a gendered divide in emotional online language: women tend to express warmer and more compassionate language, while men often use colder and more impersonal language (Park et al., 2016). Do men and women exhibit the same behaviors when talking about their bodies? To answer this, we first look at emotions expressed in submissions made to the subreddits. The neutral emotion implies that the text does not express any emotion. Figure 4(left) shows the median confidence values of the neutral emotion in the submissions to each subreddit, ranked by their median values. Larger values imply less emotionality.
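The per-subreddit summary behind this ranking can be sketched in a few lines. The example below is illustrative only: the forum names and confidence values are made up, and the sign of the difference (comments minus submissions) is an assumption.

```python
from statistics import median


def neutral_summary(submissions, comments):
    """Return (forum, submission median, comment median, difference)
    rows sorted by the submission median in descending order.

    Both arguments map a forum name to a list of neutral-emotion
    confidence values, one per post.
    """
    rows = []
    for forum in submissions:
        sub_med = median(submissions[forum])
        com_med = median(comments[forum])
        # Assumed convention: positive means comments are more neutral.
        rows.append((forum, sub_med, com_med, com_med - sub_med))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


# Hypothetical confidence values for two forums.
submissions = {"forum_a": [0.2, 0.3, 0.4], "forum_b": [0.6, 0.7, 0.8]}
comments = {"forum_a": [0.5, 0.6, 0.7], "forum_b": [0.5, 0.6, 0.7]}
rows = neutral_summary(submissions, comments)
```

Under these toy values, `forum_b` is listed first because its submissions are the least emotional, while `forum_a` shows the larger positive gap between comments and submissions.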

We observe a striking separation by the body ideal. Subreddits with submissions that express less emotion include those focusing on the muscular ideal, like r/steroids, r/Brogress, and r/weightroom. At the other end of the spectrum, with fewer non-emotional posts, are the thin ideal forums discussing anorexia and other eating disorders. Mental health-related forums (r/SuicideWatch, r/BDDVent) are understandably also more emotional.

Figure 5. Spearman’s correlation coefficient between (left) the body ideal scores and toxicity/emotion scores of different communities in submissions, and (right) the gender scores and toxicity/emotion scores of different communities in submissions. Emotions include three positive ones (approval, admiration, and joy), three negative ones (annoyance, disappointment, and sadness), and neutral. Confidence intervals were obtained by 1000 bootstrap iterations.

5.2.1. The Thin Ideal is Associated with Negative Emotions and Toxicity

To further explore how emotions are aligned with body image norms, we calculate confidence scores of the emotions and toxicity expressed in a subreddit’s submissions (Fig. 11 and 12 in the Appendix) and correlate them with its body ideal score (§5.1.3). Figure 5 shows the Spearman correlation coefficient between the median emotion and toxicity confidence score in the submissions and the (a) muscular-thin ideal and (b) masculine-feminine scores.
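The correlation analysis can be sketched with the standard library alone. The example below computes Spearman's coefficient with average ranks for ties and a bootstrap confidence interval over communities. The percentile method for the interval is an assumption (only the number of bootstrap iterations is stated), and the scores are toy values.

```python
import random


def rank(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined for constant input
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho is Pearson's r computed on the ranks."""
    return pearson(rank(x), rank(y))


def bootstrap_ci(x, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile interval from resampling communities with replacement."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        rho = spearman([x[i] for i in idx], [y[i] for i in idx])
        if rho == rho:  # drop NaN from degenerate resamples
            stats.append(rho)
    stats.sort()
    lo = stats[int((alpha / 2) * len(stats))]
    hi = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lo, hi


# Toy data: one body ideal score and one median sadness score per community.
body_ideal = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
sadness = [0.01, 0.03, 0.02, 0.05, 0.06, 0.09]
rho = spearman(body_ideal, sadness)
ci_low, ci_high = bootstrap_ci(body_ideal, sadness)
```

Resampling whole communities, rather than individual posts, keeps each community's pair of scores together, which matches the unit of analysis in Figure 5.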

Thin ideal forums express more negative emotions like annoyance, disappointment, and sadness, and also more toxicity. Among the positive emotions, only joy, and not approval or admiration, is associated with the thin ideal forums. This suggests that negative emotions are correlated with body ideals, whereas positive emotions largely are not. These trends can be partly explained by the gender divide in emotional expression (Park et al., 2016). As illustrated in Figure 5(right), communities with more feminine membership express more positive and negative emotions as well as more toxicity, while communities with more masculine membership are less emotional. These observations are consistent with findings from the survey and psychology literature that women tend to be more dissatisfied with their bodies than men and also have more body dysmorphia (Murnen and Don, 2012a).

Figure 6. TSNE embeddings of communities by their emotions and toxicity. Each community is represented by a 7d vector consisting of its 75th percentile emotion (approval, admiration, joy, annoyance, disappointment, sadness) and toxicity score of the submissions. Communities are colored by their positions along the thin ideal-muscular ideal dimension. Communities cluster into two groups based on the affect of the submissions.

5.2.2. Emotional Landscape of Reddit

To succinctly visualize the emotions expressed by different Reddit communities, we represent each subreddit as a vector of its emotion (approval, admiration, joy, annoyance, disappointment, sadness) and toxicity confidence scores. We use the 75th percentile score of the submissions made to each forum. Figure 6 shows the TSNE plot of the embeddings of these vectors. Reddit communities cluster into two groups that share a similar emotional tone. Many communities (though not all) on the thin ideal spectrum share emotions similar to those shared on suicide support and body dysmorphia forums.
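The construction of the 7-dimensional community vectors can be sketched as follows. This is an illustrative example with made-up confidence scores; the interpolation rule for the 75th percentile (the "inclusive" method of Python's `statistics.quantiles`) is an assumption.

```python
import statistics

# Six emotions plus toxicity, in a fixed order.
LABELS = [
    "approval", "admiration", "joy",
    "annoyance", "disappointment", "sadness",
    "toxicity",
]


def percentile_75(values):
    """Third quartile with linear interpolation between data points."""
    return statistics.quantiles(values, n=4, method="inclusive")[2]


def community_vector(submission_scores):
    """Build one 7-d vector for a community.

    submission_scores is a list of dicts, one per submission, mapping
    each label in LABELS to a classifier confidence score.
    """
    return [
        percentile_75([scores[label] for scores in submission_scores])
        for label in LABELS
    ]


# Hypothetical scores for five submissions in one forum.
posts = [{label: v for label in LABELS} for v in (0.1, 0.2, 0.3, 0.4, 0.5)]
vector = community_vector(posts)
```

Stacking one such vector per subreddit gives the input matrix that a t-SNE implementation reduces to two dimensions for plotting.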

5.3. Emotions, Toxicity, and Community Engagement

How do members of online communities in this space respond to submissions? What emotions are expressed in the comments on the forums in our sample? Figure 4(right) in Appendix B.3 compares the distribution of the neutral emotion confidence scores in the comments to the confidence score in the submissions. Overall, comments are more neutral, i.e., less emotional, than submissions. The exceptions are the muscular ideal communities r/steroids, r/Brogress, and r/weightroom, where members respond with more emotional language than submissions. While the dominant emotion in comments across most subreddits in our study is neutral, Figure 13(b) in Appendix B.3 indicates that the dominant emotion in comments in r/progresspics, r/amiugly, and r/Brogress is admiration. This suggests that users often respond positively to posts that showcase their peers’ fitness journeys and request community feedback on their appearance. Additionally, comments in mental health and eating disorder communities, such as r/EatingDisorders and r/SuicideWatch, predominantly exhibit a caring tone, suggesting that communities provide emotional support for their peers.

We also calculate toxicity scores for submissions and comments. Figure 8 shows the distribution of toxicity scores across all subreddits in our dataset, highlighting the differences in their median scores. Generally, there is little difference in toxicity scores between submissions and comments for most subreddits. However, notable deviations from this trend exist. Specifically, members of mental health and ED communities typically respond to peers’ posts with less toxicity. This is likely because submissions in these forums often express distress, prompting supportive responses. In contrast, comments in muscular-focused communities like r/steroids, r/Brogress, r/BulkOrCut, and r/bodybuilding tend to be more toxic than submissions. This trend is also evident in appearance-focused subreddits such as r/amiugly and r/progresspics, where users invite peers to comment on their appearance. Interestingly, comments in these communities often exhibit a mix of toxicity and admiration. In muscular ideal forums, individuals’ body self-disclosures receive validation, often expressed in explicit language. The following examples show how toxicity serves to amplify admiration and support:

r/progresspics: “You look fucking phenomenal”
r/Brogress: “Fucking amazing bro. Keep it up. Looking big man”
r/amiugly: “You’re so handsome holy shit”

Figure 7. Spearman’s correlation coefficient between (left) the body ideal scores and toxicity/emotion scores in comments made on different communities, and (right) the gender scores and toxicity/emotion scores of comments. Confidence intervals were obtained by 1000 bootstrap iterations.

To quantify differences in the emotions of community responses, we calculate confidence scores of the emotions and toxicity of comments (Fig. 14 and 15 in Appendix B.3) and correlate them with each community’s body ideal score. Figure 7 shows the Spearman correlation coefficient between the median emotion and toxicity confidence score of comments and the (a) muscular-thin ideal and (b) masculine-feminine scores. Compared to the emotions of submissions (Fig. 5), toxicity in the thin ideal communities is much reduced. Comments in the more masculine communities are now more toxic than in the feminine communities, in contrast to submissions. Masculine communities are also more emotional. Positive emotions are now more aligned with the gender axis. These results support our conclusion that feminine, and to a lesser degree thin ideal, communities provide more emotional support and validation in the form of positive emotions to their members.

5.4. Toxicity and Online Harms

Toxic language on social media is often associated with harm, and many studies have utilized computational models to detect toxicity in natural language to understand this phenomenon (Pascual-Ferrá et al., 2021; Arora et al., 2023; Jiang and Vosoughi, 2020). However, this association breaks down in online spaces serving marginalized identities (Dorn et al., 2023) or discussing sensitive topics, including body image concerns. As shown in Figure 8, forums dedicated to mental health and eating disorders exhibit high levels of toxicity in both submissions and comments. Closer examination reveals that, despite being flagged as highly toxic, these subreddits rely on moderation to maintain a safe environment without spreading harm. For example, r/SuicideWatch (https://www.reddit.com/r/SuicideWatch/) provides a safe space for peer support for individuals struggling with suicidal thoughts. To minimize harm, moderators enforce strict rules to ensure safety and civility, such as avoiding inciting language and judgmental comments. Users in these forums often seek emotional support through self-disclosure, which may include explicit language (e.g., obscenities). Toxicity detection algorithms are prone to incorrectly flagging such texts as toxic (Garg et al., 2023). Below is an original post from r/SuicideWatch illustrating this point:

r/SuicideWatch: Fuck everyone who has hurt me
Fuck you all. Fuck you piece of shit medical legal assholes. I hope you all live a long and ugly life full of suffering. I hope you go to bed every night crying because of the evil you emit. Fuck my “family”. Fuck all you former friends who left when it got hard. Fuck “doctors”, “judges”, and whatever other little bitches out there who are just stuck up on that high horse of theirs. FUCK YOU ALL, GO KILL YOURSELVES. But do it away from me so I can die in peace.

Conversely, forums focusing on the muscular ideal, such as r/steroids and r/GettingShredded, present a higher risk of amplifying harm despite having lower toxicity scores compared to other communities.

For example, r/steroids features discussions that promote the use of illegal muscle-building substances with significant health risks (Pärssinen and Seppälä, 2002). The information shared on these forums can lead to the dissemination of misinformation and advice that contradicts medical guidelines, posing serious physical harm to participating users.

r/steroids: [Compounds] Experience threads for common stacks
I think a great idea for new compound experience threads could be really common stacks like nandrolone/dbol or tren/mast etc. It could give people good insight in to what the compounds can do differently when combined with other complimentary compounds; it could help inform people of the specific synergies present between certain compounds. Just an idea.

Figure 8. Distribution of toxicity scores in subreddits. The bars show the median confidence values of toxicity in submissions (left), comments (middle), and their difference (right), in different subreddits. Subreddits are sorted in descending order by the median confidence values of toxicity in submissions.

6. Discussion

Despite rising awareness about the gender spectrum, as well as the efforts to destigmatize mental health conditions, body image concerns remain stubbornly gendered. Our analysis of online discussions on 46 Reddit forums dedicated to diet, fitness, and related topics reveals a range of body ideals spanning the spectrum from the thin ideal, which values thin and light bodies, to the muscular ideal, which values strong and visibly muscular bodies. Where on the spectrum a particular community falls is highly biased by gender: Reddit communities that are more aligned with the thin ideal are generally more feminine, while those aligned with the muscular ideal are decidedly more masculine. While it is not surprising to find EDs-related communities on the thin ideal and feminine end of the body and gender spectra, other mental health communities discussing suicide and body dysmorphia are similarly feminine and on the thin ideal end of the body spectrum. This supports findings from psychology literature that women feel worse about their bodies (Murnen and Don, 2012a) — and this distress translates into poor mental health. Surprisingly, the muscularity-oriented communities are not close to the mental health communities either on the body spectrum or in the interconnections between forums as manifested by the subreddit mention network.

We examine emotions and toxicity on Reddit diet and fitness forums and find another stark gender divide. Posts in thin ideal communities are more emotional, often expressing sadness and disappointment, while muscular ideal communities show more positive emotions like approval and admiration. This aligns with previous research indicating women discuss body image more negatively than men (Murnen and Don, 2012b; Muth and Cash, 1997). Gendered societal norms about emotional expression contribute to this divide: women tend to converse more passionately and personally. However, both communities generally respond positively to posts about appearance, with feminine-oriented subreddits offering emotional support and masculine spaces using aggressive compliments. Overall, our findings suggest that online communities express and amplify, rather than challenge, gendered norms about bodies.

These results show that masculine communities have different expectations for their forums. They are less inclined to provide self-disclosures, such as sharing emotional posts or selfies, and rely less on peers for support. In contrast, feminine communities are more open to discussing distress and body image-related mental health issues. Conversely, members in muscular spaces avoid expressing emotional concerns about their bodies due to rigid masculinity stereotypes against showing vulnerability (Räisänen and Hunt, 2014). These ingrained norms discourage the creation of safe venues for the open expression of such feelings in online spaces. Without community support and recognition, members of muscular ideal communities may lack self-awareness of potential psychopathology. Alternatively, it could be that members in masculine communities express body dissatisfaction in ways that existing linguistic tools cannot detect. Further research is needed to investigate these implicit behaviors.

Our results point to the complex nature of online communities and the challenges in moderating discussions to create a safe space to discuss topics that may be distressing or stigmatized in mainstream society. Prior research has recognized the tensions in moderating online communities to limit harmful, potentially triggering content while allowing for disclosures of emotional experiences. Our work identifies an additional consideration for moderation: approaches need to be sensitive to differences in emotional tone without propagating stereotypes about gender and body image norms.

6.1. Limitations

Selection of subreddits.

This work relies on the semi-manual identification of relevant online forums. While we tried to compensate for the ad hoc nature of manual selection by identifying additional forums using the subreddit mention network, the list of subreddits may still be incomplete, limiting our conclusions.

Selection of seed pair of communities for the body ideal dimension.

Waller and Anderson (2021) identify one seed pair of communities for the gender dimension and algorithmically augment it with additional similar pairs of communities, to ensure that the dimension is not overly tied to idiosyncrasies of the manually selected seed communities. However, we manually select the four seed pairs for the body ideal dimension without augmenting them, because the number of subreddits in our study (46) is relatively small compared to Waller and Anderson (2021)’s (10,006), making the augmentation algorithm impractical.

Algorithmic bias in emotion and toxicity classifiers.

Our results could potentially be colored by biases in emotion and toxicity classifiers or their training data (He et al., 2024b). As a result of these biases, our emotion and toxicity recognition could vary systematically across genders and affect our judgments about emotions from people of different genders talking about their bodies. We suspect that this is not the case, as muscular ideal forums with feminine membership look more like other muscular forums rather than the thin ideal forums. Furthermore, the results of emotion and toxicity run by our algorithms on Waller and Anderson (2021)’s gendered seed subreddits do not show a clear pattern differentiating between the language usage of men and women-dominated subreddits (Appendix B.3). This suggests that our classifiers are robust against potentially gendered linguistic bias.

We do not distinguish between sources of negative emotions: whether from the author’s body dissatisfaction or chafing about society’s restrictive body norms. However, the range of negative emotions and consistent expressions across a variety of online forums suggest that these negative emotions are connected to body dissatisfaction.

Ethics Statement

This research touches on highly sensitive topics related to mental health, which calls for extra precautions to minimize risks to study subjects as well as researchers. All data used for this study is public and collected following Reddit’s terms of service. The study protocol was reviewed by the authors’ IRB. To minimize privacy risks, identifiable information was removed, and analysis was carried out on aggregated data. As a result, we believe that the risks of negative outcomes due to the use of these data are trivial. To minimize risks to researchers, we did not collect images and regularly met with the team to identify potential sources of distress.

7. Conclusion

Our analysis of diet and fitness communities on Reddit reveals a complex landscape of body image concerns and emotional expressions across the spectrum from the thin ideal to the muscular ideal. By mapping these communities along gender and body ideal axes, we have uncovered patterns in how different groups engage with and express body-related issues online. We find that thinness-oriented subreddits, which tend to align more with feminine communities, exhibit higher levels of negative emotions and toxicity compared to muscularity-oriented subreddits. This emotional divide extends to communities discussing serious mental health issues, which show patterns more closely aligned with thinness-oriented spaces.

Community moderation needs to consider gender dynamics and body ideals when developing strategies to foster healthier online communities, particularly for populations that may be at higher risk for negative emotional experiences and potential mental health challenges. Future work should focus on developing interventions that can mitigate the amplification of gender stereotypes and provide better support for users across the entire spectrum of body image concerns.

Acknowledgements.
This project is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Agreement No. HR00112290021. Cinthia Sánchez was supported by ANID/Scholarship Program/DOCTORADO BECAS CHILE/2022 - 21222229. Additionally, Cinthia Sánchez acknowledges the support of the Millennium Institute for Foundational Research on Data and the National Center for Artificial Intelligence, Chile.

References

  • Aho (1991) Alfred V Aho. 1991. Algorithms for finding patterns in strings, Handbook of theoretical computer science (vol. A): algorithms and complexity.
  • Arora et al. (2023) Arnav Arora, Preslav Nakov, Momchil Hardalov, Sheikh Muhammad Sarwar, Vibha Nayak, Yoan Dinkov, Dimitrina Zlatkova, Kyle Dent, Ameya Bhatawdekar, Guillaume Bouchard, and Isabelle Augenstein. 2023. Detecting Harmful Content on Online Platforms: What Platforms Need vs. Where Research Efforts Go. ACM Comput. Surv. 56, 3, Article 72 (oct 2023), 17 pages. https://doi.org/10.1145/3603399
  • Baumgartner et al. (2020) Jason Baumgartner, Savvas Zannettou, Brian Keegan, Megan Squire, and Jeremy Blackburn. 2020. The Pushshift Reddit Dataset. Proceedings of the International AAAI Conference on Web and Social Media 14, 1 (May 2020), 830–839. https://doi.org/10.1609/icwsm.v14i1.7347
  • Blondel et al. (2008) Vincent D Blondel, Jean-Loup Guillaume, Renaud Lambiotte, and Etienne Lefebvre. 2008. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment 2008, 10 (Oct. 2008), P10008. https://doi.org/10.1088/1742-5468/2008/10/p10008
  • Boyd and Murnen (2017) Hope Boyd and Sarah K Murnen. 2017. Thin and sexy vs. muscular and dominant: Prevalence of gendered body ideals in popular dolls and action figures. Body image 21 (2017), 90–96.
  • Brytek-Matera and Schiltz (2011) Anna Brytek-Matera and Lony Schiltz. 2011. Association between attitudes towards body image, negative emotions about one’s own body and self-state representations in a clinical sample of eating disordered women. Archives of Psychiatry and Psychotherapy 2 (2011), 37–43.
  • Calogero and Thompson (2010) Rachel M Calogero and J Kevin Thompson. 2010. Gender and body image. Handbook of gender research in psychology: Volume 2: Gender research in social and applied psychology (2010), 153–184.
  • Chancellor et al. (2016) Stevie Chancellor, Jessica Annette Pater, Trustin Clear, Eric Gilbert, and Munmun De Choudhury. 2016. #thyghgapp: Instagram content moderation and lexical variation in pro-eating disorder communities. In Proceedings of the 19th ACM conference on computer-supported cooperative work & social computing. 1201–1213.
  • Chandrasekharan et al. (2019) Eshwar Chandrasekharan, Chaitrali Gandhi, Matthew Wortley Mustelier, and Eric Gilbert. 2019. Crossmod: A cross-community learning-based system to assist reddit moderators. Proceedings of the ACM on human-computer interaction 3, CSCW (2019), 1–30.
  • Chen et al. (2024) Kai Chen, Zihao He, Keith Burghardt, Jingxin Zhang, and Kristina Lerman. 2024. IsamasRed: A Public Dataset Tracking Reddit Discussions on Israel-Hamas Conflict. arXiv preprint arXiv:2401.08202 (2024).
  • Choukas-Bradley et al. (2022) Sophia Choukas-Bradley, Savannah R Roberts, Anne J Maheux, and Jacqueline Nesi. 2022. The perfect storm: A developmental–sociocultural framework for the role of social media in adolescent girls’ body image concerns and mental health. Clinical Child and Family Psychology Review 25, 4 (2022), 681–701.
  • Chu et al. (2024) Minh Duc Chu, Zihao He, Rebecca Dorn, and Kristina Lerman. 2024. Large Language Models Help Reveal Unhealthy Diet and Body Concerns in Online Eating Disorders Communities. arXiv:2401.09647
  • Cunningham (2023) Mitchell Lee Cunningham. 2023. Disentangling the relations of drive for toned muscularity with eating, exercise and body image psychopathology in women. Ph. D. Dissertation.
  • Davidson et al. (2017) Thomas Davidson, Dana Warmsley, Michael Macy, and Ingmar Weber. 2017. Automated hate speech detection and the problem of offensive language. In Proceedings of the international AAAI conference on web and social media, Vol. 11. 512–515.
  • Demszky et al. (2020) Dorottya Demszky, Dana Movshovitz-Attias, Jeongwoo Ko, Alan Cowen, Gaurav Nemade, and Sujith Ravi. 2020. GoEmotions: A Dataset of Fine-Grained Emotions. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Dan Jurafsky, Joyce Chai, Natalie Schluter, and Joel Tetreault (Eds.). Association for Computational Linguistics, Online, 4040–4054. https://doi.org/10.18653/v1/2020.acl-main.372
  • Dorn et al. (2023) Rebecca Dorn, Negar Mokhberian, Julie Jiang, Jeremy Abramson, Fred Morstatter, and Kristina Lerman. 2023. Non-Binary Gender Expression in Online Interactions. arXiv preprint arXiv:2303.04837 (2023).
  • Fardouly and Vartanian (2016) Jasmine Fardouly and Lenny R Vartanian. 2016. Social media and body image concerns: Current research and future directions. Current opinion in psychology 9 (2016), 1–5.
  • Garg et al. (2023) Tanmay Garg, Sarah Masud, Tharun Suresh, and Tanmoy Chakraborty. 2023. Handling bias in toxic speech detection: A survey. Comput. Surveys 55, 13s (2023), 1–32.
  • Ging and Garvey (2018) Debbie Ging and Sarah Garvey. 2018. ‘Written in these scars are the stories I can’t explain’: A content analysis of pro-ana and thinspiration image sharing on Instagram. New Media & Society 20 (3 2018), 1181–1200. Issue 3.
  • Grogan et al. (2004) Sarah Grogan, Ruth Evans, Sam Wright, and Geoff Hunter. 2004. Femininity and muscularity: Accounts of seven women body builders. Journal of gender studies 13, 1 (2004), 49–61.
  • Grover and Leskovec (2016) Aditya Grover and Jure Leskovec. 2016. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining. 855–864.
  • Hanu and Unitary team (2020) Laura Hanu and Unitary team. 2020. Detoxify. Github. https://github.com/unitaryai/detoxify.
  • He et al. (2024a) Zihao He, Rebecca Dorn, Siyi Guo, Minh Duc Chu, and Kristina Lerman. 2024a. COMMUNITY-CROSS-INSTRUCT: Unsupervised Instruction Generation for Aligning Large Language Models to Online Communities. arXiv preprint arXiv:2406.12074 (2024).
  • He et al. (2024b) Zihao He, Siyi Guo, Ashwin Rao, and Kristina Lerman. 2024b. Whose Emotions and Moral Sentiments Do Language Models Reflect? arXiv preprint arXiv:2402.11114 (2024).
  • He et al. (2023) Zihao He, Jonathan May, and Kristina Lerman. 2023. CPL-NoViD: Context-Aware Prompt-based Learning for Norm Violation Detection in Online Communities. arXiv preprint arXiv:2305.09846 (2023).
  • He et al. (2024c) Zihao He, Ashwin Rao, Siyi Guo, Negar Mokhberian, and Kristina Lerman. 2024c. Reading Between the Tweets: Deciphering Ideological Stances of Interconnected Mixed-Ideology Communities. In Findings of the Association for Computational Linguistics: EACL 2024. 1523–1536.
  • Hofmann et al. (2022) Valentin Hofmann, Hinrich Schütze, and Janet B Pierrehumbert. 2022. The Reddit Politosphere: A Large-Scale Text and Network Resource of Online Political Discourse. In Proceedings of the International AAAI Conference on Web and Social Media, Vol. 16. 1259–1267.
  • Jáuregui-Lobera et al. (2013) Ignacio Jáuregui-Lobera, Mercedes Ezquerra-Cabrera, Rocío Carbonero-Carreño, and Inmaculada Ruiz-Prieto. 2013. Weight misperception, self-reported physical fitness, dieting and some psychological variables as risk factors for eating disorders. Nutrients 5, 11 (2013), 4486–4502.
  • Jhaver et al. (2019) Shagun Jhaver, Iris Birman, Eric Gilbert, and Amy Bruckman. 2019. Human-machine collaboration for content regulation: The case of reddit automoderator. ACM Transactions on Computer-Human Interaction (TOCHI) 26, 5 (2019), 1–35.
  • Jiang et al. (2022) Hang Jiang, Doug Beeferman, Brandon Roy, and Deb Roy. 2022. CommunityLM: Probing Partisan Worldviews from Language Models. In Proceedings of the 29th International Conference on Computational Linguistics. 6818–6826.
  • Jiang and Vosoughi (2020) Jiachen Jiang and Soroush Vosoughi. 2020. Not Judging a User by Their Cover: Understanding Harm in Multi-Modal Processing within Social Media Research. In Proceedings of the 2nd International Workshop on Fairness, Accountability, Transparency and Ethics in Multimedia (Seattle, WA, USA) (FATE/MM ’20). Association for Computing Machinery, New York, NY, USA, 6–12. https://doi.org/10.1145/3422841.3423534
  • Jones and Morgan (2010) William Jones and John Morgan. 2010. Eating disorders in men: A review of the literature. Journal of public mental health 9, 2 (2010), 23–31.
  • Juarascio et al. (2010) Adrienne S Juarascio, Amber Shoaib, and C Alix Timko. 2010. Pro-eating disorder communities on social networking sites: a content analysis. Eating disorders 18, 5 (2010), 393–407.
  • Kao et al. (2024) Hsien-Te Kao, Isabel Erickson, Minh Duc Hoang Chu, Zihao He, Kristina Lerman, and Svitlana Volkova. 2024. Machine Learning Insights Into Eating Disorder Twitter Communities. In Extended Abstracts of the CHI Conference on Human Factors in Computing Systems. 1–8.
  • Kast (2018) Hinde Kast. 2018. The Unspoken Power of Toxic Words on Body Image. Master’s thesis. University of Southern California.
  • Kour et al. (2022) George Kour, Samuel Ackerman, Eitan Daniel Farchi, Orna Raz, Boaz Carmeli, and Ateret Anaby Tavor. 2022. Measuring the Measuring Tools: An Automatic Evaluation of Semantic Metrics for Text Corpora. In Proceedings of the 2nd Workshop on Natural Language Generation, Evaluation, and Metrics (GEM), Antoine Bosselut, Khyathi Chandu, Kaustubh Dhole, Varun Gangal, Sebastian Gehrmann, Yacine Jernite, Jekaterina Novikova, and Laura Perez-Beltrachini (Eds.). Association for Computational Linguistics, Abu Dhabi, United Arab Emirates (Hybrid), 405–416. https://doi.org/10.18653/v1/2022.gem-1.35
  • Lerman et al. (2023) Kristina Lerman, Aryan Karnati, Shuchan Zhou, Siyi Chen, Sudesh Kumar, Zihao He, Joanna Yau, and Abigail Horn. 2023. Radicalized by Thinness: Using a Model of Radicalization to Understand Pro-Anorexia Communities on Twitter. arXiv preprint arXiv:2305.11316 (2023).
  • Liu et al. (2019) Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. RoBERTa: A Robustly Optimized BERT Pretraining Approach. CoRR abs/1907.11692 (2019). arXiv:1907.11692 http://arxiv.org/abs/1907.11692
  • Marks et al. (2020) Rosie Jean Marks, Alexander De Foe, and James Collett. 2020. The pursuit of wellness: Social media, body image and eating disorders. Children and Youth Services Review 119 (2020), 105659. https://doi.org/10.1016/j.childyouth.2020.105659
  • McCormack (2010) Abby McCormack. 2010. Individuals With Eating Disorders and the Use of Online Support Groups as a Form of Social Support. CIN: Computers, Informatics, Nursing 28 (1 2010), 12–19. Issue 1.
  • Murnen and Don (2012a) SK Murnen and BP Don. 2012a. Body image and gender roles. Encyclopedia of body image and human appearance 1 (2012), 128–134.
  • Murnen and Don (2012b) Sarah Murnen and Brian Don. 2012b. Body Image and Gender Roles. Vol. 1. pp. 128–134. https://doi.org/10.1016/B978-0-12-384925-0.00019-5
  • Murray et al. (2016) Stuart B Murray, Scott Griffiths, Leila Hazery, Tori Shen, Tom Wooldridge, and Jonathan M Mond. 2016. Go big or go home: A thematic content analysis of pro-muscularity websites. Body image 16 (2016), 17–20.
  • Murray et al. (2012) Stuart B Murray, Elizabeth Rieger, Tom Hildebrandt, Lisa Karlov, Janice Russell, Evelyn Boon, Robert T Dawson, and Stephen W Touyz. 2012. A comparison of eating, exercise, shape, and weight related symptomatology in males with muscle dysmorphia and anorexia nervosa. Body Image 9, 2 (2012), 193–200.
  • Muth and Cash (1997) Jennifer L Muth and Thomas F Cash. 1997. Body-Image Attitudes: What Difference Does Gender Make? Journal of applied social psychology 27, 16 (1997), 1438–1452.
  • Oksanen et al. (2016) Atte Oksanen, David Garcia, and Pekka Räsänen. 2016. Proanorexia communities on social media. Pediatrics 137, 1 (2016).
  • Park et al. (2016) Gregory Park, David Bryce Yaden, H Andrew Schwartz, Margaret L Kern, Johannes C Eichstaedt, Michael Kosinski, David Stillwell, Lyle H Ungar, and Martin EP Seligman. 2016. Women are warmer but no less assertive than men: Gender and language on Facebook. PloS one 11, 5 (2016), e0155885.
  • Pärssinen and Seppälä (2002) Miia Pärssinen and Timo Seppälä. 2002. Steroid use and long-term health risks in former athletes. Sports medicine 32 (2002), 83–94.
  • Pascual-Ferrá et al. (2021) Paola Pascual-Ferrá, Neil Alperstein, Daniel J Barnett, and Rajiv N Rimal. 2021. Toxicity and verbal aggression on social media: Polarized discourse on wearing face masks during the COVID-19 pandemic. Big Data & Society 8, 1 (2021), 20539517211023533. https://doi.org/10.1177/20539517211023533
  • Peat et al. (2008) Christine M Peat, Naomi L Peyerl, and Jennifer J Muehlenkamp. 2008. Body image and eating disorders in older adults: a review. The Journal of general psychology 135, 4 (2008), 343–358.
  • Prescott et al. (2019) Julie Prescott, Terry Hanley, and Katalin Ujhelyi Gomez. 2019. Why do young people use online forums for mental health and emotional support? Benefits and challenges. British Journal of Guidance & Counselling 47, 3 (2019), 317–327.
  • Quiniones and Oster (2019) Cherry Quiniones and Candice Oster. 2019. Embracing or resisting masculinity: Male participation in the proeating disorders (proana) online community. Psychology of Men & Masculinities 20, 3 (2019), 368.
  • Räisänen and Hunt (2014) Ulla Räisänen and Kate Hunt. 2014. The role of gendered constructions of eating disorders in delayed help-seeking in men: a qualitative interview study. BMJ open 4, 4 (2014), e004342.
  • Saiphoo and Vahedi (2019) Alyssa N. Saiphoo and Zahra Vahedi. 2019. A meta-analytic review of the relationship between social media use and body image disturbance. Computers in Human Behavior 101 (2019), 259–275.
  • Sanh et al. (2019) Victor Sanh, Lysandre Debut, Julien Chaumond, and Thomas Wolf. 2019. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019).
  • Waller and Anderson (2021) Isaac Waller and Ashton Anderson. 2021. Quantifying social organization and political polarization in online platforms. Nature 600, 7888 (2021), 264–268.
  • Yeshua-Katz and Martins (2013) Daphna Yeshua-Katz and Nicole Martins. 2013. Communicating Stigma: The Pro-Ana Paradox. Health Communication 28, 5 (2013), 499–508.

Appendix A Data

Table 2. Number of submissions and comments sampled per subreddit. From the largest subreddits, 5,000 examples were randomly subsampled. The comments belong to the selected submissions.
Subreddit Submissions Comments Unique authors
1200isplenty 5000 5000 7218
gainit 5000 5000 6804
Volumeeating 5000 5000 5602
amiugly 5000 5000 9102
ARFID 5000 5000 5023
bulimia 5000 5000 5394
fasting 5000 5000 7554
fuckeatingdisorders 5000 5000 4058
intermittentfasting 5000 5000 7875
MadeOfStyrofoam 5000 5000 5577
keto 5000 5000 7780
ketorecipes 5000 5000 6310
loseit 5000 5000 8373
nattyorjuice 5000 5000 6051
omad 5000 5000 5877
xxfitness 5000 5000 7703
SuicideWatch 5000 5000 8593
bodyweightfitness 5000 5000 7566
CICO 5000 5000 6750
EatingDisorders 5000 5000 5568
AnorexiaNervosa 5000 5000 5559
BingeEatingDisorder 5000 5000 6426
BodyDysmorphia 5000 5000 5828
EDAnonymous 5000 5000 6608
xxketo 5000 5000 5527
EdAnonymousAdults 5000 5000 3921
GettingShredded 5000 5000 6649
Fitness 5000 5000 8286
eating_disorders 4504 5000 5193
ketogains 4313 5000 5460
powerbuilding 3347 5000 4034
goodrestrictionfood 2868 2786 1816
BDDvent 2856 2435 1797
Instagramreality 2488 5000 6158
drunkorexia 2382 5000 2312
bodybuilding 2351 5000 4603
BulkOrCut 2247 4835 3802
progresspics 1928 5000 5071
ShittyRestrictionFood 1609 5000 3037
EDanonymemes 1346 5000 3298
fit 1257 1431 1769
safe_food 1121 4979 3176
Brogress 1053 4651 4039
steroids 1013 5000 3391
weightroom 1012 5000 2584
FlexinLesbians 577 2022 1484
Total 178272 218139 246606

A.1. Reddit Data

Table 2 reports the number of sampled submissions, comments, and unique authors for each subreddit.
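The per-subreddit cap described in the caption of Table 2 can be illustrated with a minimal sketch. The record schema (a dict with a "subreddit" key), the function name, and the fixed seed are assumptions made for illustration; the original sampling code is not given here.

```python
import random
from collections import defaultdict

MAX_PER_SUBREDDIT = 5000  # cap stated in the Table 2 caption


def subsample_by_subreddit(posts, cap=MAX_PER_SUBREDDIT, seed=0):
    """Keep at most `cap` randomly chosen posts per subreddit.

    `posts` is an iterable of dicts with a "subreddit" key
    (hypothetical schema, not taken from the paper).
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    by_sub = defaultdict(list)
    for post in posts:
        by_sub[post["subreddit"]].append(post)

    sampled = {}
    for name, items in by_sub.items():
        # Small subreddits are kept whole; large ones are sampled
        # uniformly without replacement.
        sampled[name] = items if len(items) <= cap else rng.sample(items, cap)
    return sampled
```

Under this sketch, subreddits with fewer posts than the cap keep all of their posts, which is consistent with the rows of Table 2 that fall below 5,000.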

Appendix B Analysis of Subreddits

B.1. Subreddits along the Dimensions of Gender and Body Ideal

Figure 9 shows t-SNE embeddings of communities from the bipartite user co-activity (posting and commenting) network. Figure 10 shows the distribution of communities along the masculine-feminine dimension.

Figure 9. t-SNE embeddings of communities from the user co-activity network. For the body ideal spectrum, the thin-ideal and muscular-ideal communities in the seed pairs used to construct the axis are marked by upper and lower triangles, respectively, and the remaining communities are marked by circles. For the gender spectrum, feminine communities are colored red and masculine communities blue; communities not studied by Waller and Anderson (2021) are colored gray.
Figure 10. Distribution of communities along the masculine-feminine dimension. Communities are spaced evenly along the axis according to their relative ranking, so positions reflect ordinal rank within the spectrum rather than raw scores.
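The rank-based placement described in the caption of Figure 10 can be sketched as follows. This is an illustrative sketch only: the function name, the [0, 1] output range, and the example scores are assumptions, not values or code from the paper.

```python
def rank_positions(scores):
    """Map raw axis scores to evenly spaced positions in [0, 1] by rank.

    `scores` maps a community name to its raw score on the axis
    (hypothetical values). Only the ordering of the scores is used.
    """
    ordered = sorted(scores, key=scores.get)  # ascending by raw score
    n = len(ordered)
    if n == 1:
        return {ordered[0]: 0.5}  # a single community sits mid-axis
    # Consecutive ranks are separated by the same gap, 1 / (n - 1).
    return {name: i / (n - 1) for i, name in enumerate(ordered)}


# Hypothetical scores: "y" and "z" are close in raw score but
# still receive equally spaced positions.
positions = rank_positions({"x": -0.9, "y": 0.10, "z": 0.12})
```

Spacing by rank keeps labels legible when many communities have similar raw scores, at the cost of hiding how large the gaps between them actually are.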

B.2. Emotions in Subreddits

Figures 11, 12, 14, and 15 show the distributions of positive and negative emotion scores in submissions and comments across subreddits. Figure 13 shows the dominant emotion in each subreddit for submissions and comments.

Figure 11. Distribution of positive emotion scores in submissions in different subreddits.
Figure 12. Distribution of negative emotion scores in submissions in different subreddits.
Figure 13. The dominant emotion in (a) submissions and (b) comments in different subreddits.
Figure 14. Distribution of positive emotion scores in comments in different subreddits.
Figure 15. Distribution of negative emotion scores in comments in different subreddits.
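One way to derive a per-subreddit dominant emotion of the kind shown in Figure 13 is sketched below. It assumes that the dominant emotion is the label with the highest mean score across a subreddit's posts; this definition, the record schema, and the function name are assumptions for illustration and may differ from the procedure actually used.

```python
from collections import defaultdict


def dominant_emotion(records):
    """Return the emotion with the highest mean score for each subreddit.

    `records` is an iterable of (subreddit, {emotion: score}) pairs
    (hypothetical schema; scores would come from an emotion classifier).
    An emotion missing from a record counts as a score of 0 for it.
    """
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for subreddit, scores in records:
        counts[subreddit] += 1
        for emotion, score in scores.items():
            totals[subreddit][emotion] += score

    result = {}
    for subreddit, emotion_totals in totals.items():
        n = counts[subreddit]
        means = {e: t / n for e, t in emotion_totals.items()}
        result[subreddit] = max(means, key=means.get)
    return result
```

Running the same function separately on submissions and on comments would yield the two panels' worth of labels.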

B.3. Emotion and Toxicity Analysis of Gendered Subreddits

Figures 16 and 17 show the distributions of neutral emotion scores and toxicity scores, respectively, in the seed-pair communities identified by Waller and Anderson (2021) for the gender axis.

Figure 16. Distribution of neutral emotion scores in gendered seed subreddits.
Figure 17. Distribution of toxicity scores in gendered seed subreddits.