Open-Source-AI-Generated Images: Characterizing the Civitai Ecosystem.

1. Introduction

The commodification of AI Generated Content (AIGC) has had a significant impact on online creative communities (doi:10.1126/science.adh4451; 10.1145/3475799). For example, the Generative Diffusion Model (GDM) (diffusion) has achieved state-of-the-art outcomes in the realm of image generation, with open-source implementations like Stable Diffusion (stable-diffusion) easily accessible. Their open-source nature further enables fine-tuning and extension of the models.

This has driven the emergence of AIGC social platforms such as Civitai, PixAI, and Tensor.art. These are online platforms for sharing models, images and discussing open-source generative AI. They are designed akin to social media services, allowing users to showcase their creations, participate in discussions, and receive feedback, thereby creating a sense of community. Uniquely, they also allow users to develop and share their own generative AI models. For instance, bespoke models can be developed for generating particular types of images (e.g., containing particular people or artistic styles) and, subsequently, other users can then share the outputs (images) from these models for further social discussion. These unique features have attracted a significant number of creators sharing numerous novel models and artworks, catalyzing new trends in AI content creation (lloyd2023there; cao2023comprehensive).

However, the unrestricted proliferation of diverse models represents a double-edged sword: while they can help unleash creativity, they also pose challenges and risks that require careful consideration. Numerous issues concerning the abuse of generative AI have already been reported, including flooding online communities with not-safe-for-work (NSFW) images (ungless2023stereotypes), disseminating deceptive deepfakes (yadav2019deepfake), and infringing upon copyright (franceschelli2022copyright). Anecdotally these platforms have often been the origin of the generative AI models that produce the aforementioned abusive content, and also where the abusive content is initially shared (gorwa2023moderating; 10.1145/3514094.3534167). Thus, the proliferation of abusive content from these platforms can exert a broader influence, permeating other social media communities.

As a result, there is an arguable need to somehow moderate the use of these models on such platforms. However, to date, there have been no prior studies that could inform the debate. With this in-mind, we conduct the first large-scale empirical study of an emerging AIGC social platform, focusing on the Civitai — the largest social platform for image models (civitai-stat-2). As of November, 2023, it has attracted 10 million unique visitors each month. We compile a dataset comprising all metadata (for both images and models) shared on Civitai until 15^th, December, 2023, containing 87,042 generative models and 2,740,149 AI-generated images. Using a range of techniques, we then label each model and image with information about its themes and the presence of NSFW concepts. We explore the following research questions:

•

RQ1: As each model can be highly bespoke, what are the key themes the models are designed to generate images for? Further, what are the subsequent themes of the images generated, and do they reflect a prevalence of abusive content?
•

RQ2: How popular are models that are designed for generating abusive images, and what types of image prompts do users utilize to generate such content?
•

RQ3: Are users more active in engaging with abusive models and images, as measured by social metrics such as comments and favorites?
•

RQ4: Do the creators of abusive models and images exhibit distinct positions within the wider social network (i.e. centrality), as compared to creators who do not?

We offer the first characterization of the themes of models and images on Civitai and reveal a prevalence of abusive content. Our main findings include:

(1)

We find a range of models (and subsequently generated images) each geared towards a particular theme. 16.97% models and 72.05% images contain tags related to NSFW content; 23.54% models and 32.98% images are deepfakes. Moreover, deepfakes in Civitai tend to be associated with NSFW content (e.g., naked deepfakes), with a positive correlation between tags for NSFW content and deepfakes (model: $\phi=0.17$ ; image: $\phi=0.10$ ). We also find that over half of the deepfake victims are celebrities.
(2)

Models that are designed for NSFW content are more popular than non-NSFW models. On average, NSFW models have generated 36.36 images (per model) vs. 24.20 for non-NSFW models. However, we also find that non-NSFW models are frequently re-purposed to generate NSFW content, via prompting. 37.05% of the NSFW images are generated by prompting non-NSFW models to contain NSFW concepts. Additionally, we find frequent references to real person names in the textual description of deepfake models. The most common victims are social media celebrities, such as Instagram influencers or OnlyFans stars.
(3)

Civitai users are more active in engaging with NSFW models and images, as measured by common social network metrics. Compared with their non-NSFW counterparts, NSFW models and images receive significantly more downloads/views (models: 3.32x; images: 1.18x), favorites (models: 3.22x; images: 1.63x), and financial “tips” (models: 1.92x; images: 1.53x).
(4)

Creators sharing abusive models and images are have higher centrality in the social follower network. For example, creators who have shared at least 3 NSFW or deepfake models/images hold higher median centrality like betweenness (models: 2.59x; images: 1.35 $e^{-06}$ vs. 0), in-degrees (models: 1.50x; images: 6.00x), and PageRank (models: 1.003x; images: 1.005x), compared with those who haven’t. Therefore, these creators tend to have more follower links, hold bridge positions and befriend more influential users.

2. Primer on Civitai

As a social platform, Civitai enables users to share their AI models and generated images, as well as receive feedback, comments and even tips from other users. In this section, we introduce the necessary pre-knowledge about Civitai.

Models and images. Civiati hosts diffusion models and AI-generated images, uploaded by creators. Every model/image is associated with a unique ID and a preview web page public to any users. Various social metadata is visible as well, involving tags (assigned by users or Amazon Rekognition (Civitai_tag; Civitai_real_people)), statistics (e.g., number of downloads/views, likes, and rating scores), and text comments by other registered users. Creators can also attach descriptive information to their models and images, i.e. textual descriptions of models’ usage and images’ configurations, resources, and prompts used for generation.

Users. Similar to common social platforms, Civitai users have profile pages displaying their self-reported information and all their models/images. Users can also attach external links to their profile page, as promotion for their accounts on other social platforms (e.g., Instagram and X) or profitable platforms (e.g., Ko-fi and Patreon). Furthermore, users can follow each other and leave rating scores to the profile pages.

3. Methodology

3.1. Data collection

We compile a dataset containing the metadata of all models, images, and creators in Civitai. To accomplish this, we utilize the Civitai REST API.¹¹1https://github.com/civitai/civitai/wiki/REST-API-Reference In addition, we employ selenium webdrivers to crawl the relevant Civitai webpages, enabling us to gather the volume of tips to each of models, images and creators.

Model data. We collect 87,042 models’ metadata. The metadata contain publish date, statistics (e.g., number of downloads, likes, comments, and rating score (range from 0 to 5)), flag for real-human deepfake, flag for NSFW, amount of tips, tags and description of models’ content. Of these models, 8.0% are checkpoint models (base models), 84.4% are LoRA (hu2022lora; lora-sd) (or LyCORIS (yeh2024navigating)) models (fine-tune models), 5.8% are embeddings and 1.8% are other models. Figure 1a shows the daily count of the uploaded models.

Image and prompt data. We collect 2,740,149 images’ metadata (including the preview images of models). The metadata contains publish date, size (e.g., height and width), statistics (e.g., number of consumers’ five reactions - cry, laugh, like, dislike, and heart; number of comments, views), amount of tips, tags of images’ content, and used models. In all, these images are shared by 56,502 creators. Additionally, the metadata may also include the prompts used to generate the images. In total, we have gathered 1,534,922 prompts.

Creator data. There are 3 types of creators in Civitai, model-only creators (M), image-only creators (I), model-and-image creators (MI). We find 56,779 creators, with 4,233 (7.5%) model-only creators who create 11% of models, 45,147 (79.5%) image-only creators who create 57% of images, and 7,399 (13%) model-and-image creators who create 89% of models and 43% of images (CDF of the number of creators’ creations plotted in Figure 1b). We collect metadata of these creators. The metadata contains the joined date, statistics (e.g., number of received likes, followers, downloads, and rating score), profile description, external links, and list of followers.

Refer to caption — Figure 1. (a) Daily count of the uploaded models; (b) CDF of the number of creations for each type of the creators.

3.2. GPT-assisted Qualitative Analysis

Our study involves two qualitative analyses – (i) extracting the themes of models and images (§LABEL:sec:theme), and (ii) identifying person names as well as occupations from models’ descriptions (§4.1). Considering that our studies cover a large-scale dataset, we leverage ChatGPT, as relevant literature have highlighted its potential in facilitating open coding (xiao2023supporting; gao2023collabcoder) and named entity recognition (wei2023zeroshot).

Model implementation. We use the gpt-3.5-turbo-0125 model as it can return its responses as a JSON object in a desired format, where we can easily parse and extract labels. We access the model through OpenAI’s API with parameter temperature set to $0$ to make the response focused and deterministic.

Extracting tags’ themes. We first select the top 500 most popular tags associated with models and images respectively. To aid in our later analysis, we then utilize ChatGPT to mine and summarize potential thematic categories. Our prompts are constituted by a “system” message making ChatGPT respond in a desired JSON format and a “user” message in JSON syntax to improve ChatGPT’s annotation performance and efficiency (zhu2024apt):

Following this, two authors manually review ChatGPT’s responses to correct mis-classified tags and consolidate duplicate categories (e.g., merging tags in the “Gender and Body Attributes” and “Human Characteristics” categories into a new category named “Human attributes”).

Person name recognition. To inspect who are the victims targeted by deepfake models, we leverage ChatGPT to recognize real people’s names in each model’s description. For this, we organize corresponding prompts as:

To validate the results, we manually label person names from a 100 randomly sampled models. We treat ChatGPT as a third-party annotator and compare its annotation against our human labels. We find that ChatGPT reports the correct occupations of all its person names. These results validates ChatGPT’s capability in recognizing person names and their occupations.

3.3. Prompt comparison with mainstream AIGC platforms.

Our study also contains a comparative analysis of usage of NSFW content in prompts between Civitai and two mainstream AIGC platforms, Stable Diffusion and Midjourney.

For this, we collect two prompt datasets, DiffusionDB (1,528,512 distinct prompts from Stable Diffusion Discord) (wang2022diffusiondb) and JourneyDB (1,466,884 distinct prompts from Midjourney) (sun2023journeydb). Each of the two datasets we selected pertains a large volume of user-generated prompts rooted on one specific platforms, providing a comprehensive lens for us to understand how NSFW content distribute in prompts on the corresponding platform.

Afterwards, we employ OpenAI’s moderation API, configured with the text-moderation-006 model, to quantify the degree of NSFW content exposed in each prompt’s text in our Civitai datasets plus DiffusionDB and JourneyDB. OpenAI’s moderation API takes a prompt’s text as a input and then reports the value of the degree of NSFW content (ranging from 0 to 1), as well as a flag defining whether the prompt is NSFW. We choose this moderation model because it is effective in detecting NSFW content (Nekoul_Lee_Adler_Jiang_Weng_2023) and has shown its ability to process prompt text by being practically employed to moderate ChatGPT’s prompt input (Civitai_moderation).

4. RQ3: User activities with abusive AIGC

In this section, we delve deeper into users’ creation and consumption on abusive AIGC. We aim to profile users’ creation from their usage of abusive models, NSFW prompts and reference to person names in real-human deepfakes, which are three crucial angles to analyze abusive AI creation. We then inspect their influence on users’ consuming behaviors.

4.1. Profiling creation of abusive AIGC

Usage of deepfake and NSFW models. We first take a close view on users’ usage patterns on real-human deepfake and NSFW models, as this can offer moderators a vital lens to the popularity and productivity of abusive models . We utilize the labels reported by Civitai API to annotate the models as real-human deepfakes or NSFW independently. In all, 13,516 models (15.53%) are classified as real-human deepfakes, and 7,614 models (8.75%) are associated with NSFW content. These models have been used to produce 149,227 (5.46%) and 261,432 (9.54%) unique images, respectively. This implies that abusive models are not a minor class and playing a role in AI creation, where a notable portion of images are produced by these models. Additionally, deepfake and NSFW content tend to co-appear on themes of AI-generated images (Figure LABEL:fig:theme_phi_image), an opposite trend is observed in the usage patterns of the models. A phi coefficient of -0.10 suggests that models for real-human deepfakes and NSFW content are less likely to be used together in image creation. We note that this is relevant to the prevalent usage of NSFW prompts, which will be detailed in following prompt analysis.

Moreover, by comparing the image base generated by abusive models, we also highlight that the real-human deepfake and NSFW models have different influence on creators’ usage patterns. Figure 2 presents the comparison of the distribution of productivity among distinct types of models, measured as the number of images per model. Compared with non-NSFW models ( $min=0,mid=14,max=133,\mu=24.20$ ), NSFW models are used to generate more images ( $min=0,mid=22,max=100,\mu=36.36$ ). In contrast, real-human deepfake models are used to generate less images ( $min=0,mid=7,max=133,\mu=11.08$ ) than the non-real-human deepfake models ( $min=0,mid=16,max=100,\mu=27.88$ ). Such a significant difference implies that the types of abusive models act a crucial part in creators’ productivity as well. In the case of Civitai, creators have a much stronger propensity to generate images with NSFW rather than real-human deepfake models. This insight could guide moderators in pinpointing the models that are likely to spur the creation of abusive images by creators.

Usage of NSFW prompts. Our analysis indicates a link between deepfakes and NSFW content, yet usage patterns suggest they’re not often used together. We suspect this is due to a preference for NSFW prompts instead. This motivates us to examine how much NSFW content appears in prompts.

First, we explore the distribution of NSFW content in prompts’ text. Figure 3 presents the distribution of the degree of NSFW content exposed in prompts text reported by OpenAI’s moderation API. The threshold of NSFW degree for the API to raise a NSFW flag is set to 0.53 by default. Generally, NSFW prompt is not a minor class, where 404,330 (27.24%) prompts are reported as NSFW. Additionally, we notice the distribution appears to be bimodal with the main peak at around degree $=0$ and a lower peak around degree $=1$ . Notably, 39.12% of NSFW prompts contain a very high degree of NSFW content ( $>0.9$ ). Moreover, Figure 4 illustrates the distribution of NSFW content degree in prompts within Civitai, Stable Diffusion Discord, and Midjourney. Regarding to all prompts, Civitai possesses an overall higher distribution of NSFW content degree in prompts than other two platforms (Figure 4(a)). When it comes to only NSFW prompts, promopts in Civitai and Midjourney possess more NSFW content than those in Stable Diffusion (Figure 4(b)). Nonetheless, a one-sided two-sample Kolmogorov–Smirnov test reports that Civitai still holds a significantly ( $p<0.001$ ) higher distribution of NSFW content in prompts than Stable Diffusion Discord ( $D=0.150$ ) and Midjourney ( $D=0.023$ ). These findings indicate that emerging AIGC platforms lacking rigorous moderation may face a considerable influx of NSFW prompts, alongside a pronounced inclination among creators to engage with NSFW content in more extreme manifestations.

Occupation	#Models	#Images (NSFW%)	Representatives (#Images)
Actress	3,916	50,843
(9.73%)	Emma Watson (648), Natalie Portman (542), Ana De Armas (500), Alexandra Daddario (445), Scarlett Johansson (398)
Model	1,663	19,323
(13.31%)	Emily Bloom (187), Cara Delevingne (150), Kendall Jenner (125), Jenna Ortega (110), Nicola Cavanis (100)
Actor	717	7,877
(3.44%)	Henry Cavill (271), Fares Fares (135), Nicolas Cage (107), Arnold Schwarzenegger (88), Harrison Ford (82)
Singer	757	7,296
(10.36%)	Billie Eilish (253), Dua Lipa (230), Taylor Swift (221), Avril Lavigne (214), Britney Spears (170)
Internet influencer	296	3,323
(13.12%)	Belle Delphine (164), Brooke Monk (100), Ricardo Milos (74), Dasha Taran (71), Kris H Collins (67)
Character	225	2,698
(10.08%)	Hermione Granger (99), Jill Valentine (87), Sabine Wren (69), 2B (61), El Chavo del Ocho (60)
Pornstar	210	2,376
(18.56%)	Katja Kean (76), Simone Peach (60), Teagan Presley (59), Alex Coal (58), Anita Blond (50)
Adult Model	275	2,239
(8.35%)	Lucid Lavender (64), Matthew Rush (39), Sean Cody (39) Hailey Leigh (36), Bunny Colby (34)
Streamer	145	1,745
(15.70%)	Valkyrae/Rachell Hofstetter (166), Alexandra Botez (80), Sasha Grey (78), Andrea Botez (62)
Idol	166	1,239
(7.75%)	Akina Nakamori (52), Cherprang Areekul (38), Song Yi (37), Yuino Mashu (36) Kim Ji-Woo (35)

Table 1. Top-10 occupations of celebrities involved in creation of deepfake models, ranked by their counts of derivative images. “#Models/#Images” presents the number of deepfake models/derivative images containing person names within corresponding occupation. “NSFW%” shows the percentage of images labeled as NSFW by Civitai API.

Reference to person names. Implied by our themes analysis, real-world celebrities has potentially been involved in abusive creations on Civitai (§LABEL:sec:theme). In this part, we take a closer look on creators’ reference to person names to inspect who are the main victims encountering deepfakes. Leveraging ChatGPT’s intelligence, we extract person names with their occupations from the textual usage descriptions of real-human deepfake models (§3.2). Additionally, it is also important to reflect the main targeted industries to understand the trend of deepfake attacks. Thus, we then group these celebrities by occupations and rank the groups by their number of derivative images. We manually review dominant groups (top-100) and consolidate duplicate groups by standardizing their occupation names (e.g., merging all groups containing “actress” in the names into a general group named “actress”).

In all, within deepfake models, ChatGPT recognizes 8,297 distinct person names from 10,170 (75.24%) models, as well as 116,994 (78.40%) images generated by these models. These results suggests a prevalence among creators to target at celebrities when creating real-human deepfakes. Moreover, Table 1 summarizes the topic-10 occupations and statistics of corresponding models and images. Generally, we find that celebrities from three industries are the main targets of deepfakes on Civitai – entertainment (e.g., actress/actor, model, and singer), adult (e.g., pornstar and adult model), and social media (Internet influencer and streamer). Interestingly, regarding NSFW deepfakes, models associated with celebrities from social media industries are more likely to be employed to create NSFW images (14.01% labeled as NSFW) than those associated with celebrities from entertainment (10.01%), or even adult (13.61%) industries. By further exploring the affiliations of these online celebrities with social platforms, we find that most of them are either closely associated with subscription platforms (e.g., Belle Delphine with OnlyFans and Andrea Botez with Fanhouse) or well-known as Instagram models (e.g., Kris H Collins and Brooke Monk). Diverging from the traditional focus on the entertainment industry and politicians (10.1145/3583780.3614729), our findings highlight social media industry as a new domain suffering significant deepfake creation, notably online celebrities exposed to bodily and sexual content, who are more susceptible to being targeted for NSFW deepfake model training (van2020verifying; maddocks2020deepfake).

4.2. Profiling consumption on abusive content

Existing literature have underscored the importance of moderating communities’ active consumption, as it can potentially encourage creators to produce more abusive AIGC . Inspired by this, we here inspect the influence by abusive AIGC on users’ consumption and the association between such consumption and abusive creation.

Metrics to quantify consumption. We examine several metrics to quantify users’ consumption on AIGC:

•

Number of downloads/views: The total number of times that a model has been downloaded or a image has been viewed.
•

Number of favorites: The total number of favorites that a model or image has received.²²2While a model has direct statistics of favourites, an image’s favorites are represented by two emoji-based reactions, “like” and “heart”, left by viewers under the image.
•

Number of comments: The total number of comments that a model/image has received.
•

Rating score: The overall rating score the model (not supported for images) possesses.
•

Buzz: The volume of Buzz a model/image accumulates by receiving tips from users. Here the Buzz is the in-site digital currency on Civitai (civitai_buzz).

Real-human deepfake

NSFW

Mean diff

(True vs. False)

p-value

Mean diff

(True vs. False)

p-value

Number of downloads

0

582.42<1539.32

***

3842.83>1155.68

***

Number of favorites

0

66.55<237.61

***

569.23>176.72

***

Number of comments

2.32<3.24

***

4.53>2.31

***

Rating score

3.27<3.43

***

3.58>3.28

***

Buzz

0

6.64<45.24

***

70.33>36.54

***

(a) Consumption on models

Real-human deepfake

NSFW

Mean diff

(True vs. False)

p-value

Mean diff

(True vs. False)

p-value

Number of views

747.25<958.80

***

1030.18>866.55

0

***

Number of favorites

1.51<2.54

***

3.09>1.89

***

Number of comments

0.017<0.032

***

0.029<0.033

***

Rating score

-

Buzz

0.15<0.47

***

0.55>0.36

***

(b) Consumption on images

Table 2. Comparison on metrics of content consumption by the Mann-Whitney U test between models/images groups categorized by their label as real-human deepfakes or NSFW. Note, as Civitai API doesn’t label out deepfake images, we annotate an image as real-human deepfake if it is generated by a model labeled as real-human deepfake by Civitai API. “Mean diff” column shows the comparison results of the mean value of corresponding metrics between two groups. *** denotes that

p<0.001

.

Consumption on deepfake and NSFW content. Using the above metrics, we next inspect whether abusive content would influence users’ consumption. For this, we first group models and images by their labels as real-human deepfakes or NSFW independently. We then perform the Mann-Whitney U test to assess in-group difference on each of aforementioned metrics. Table 2 summarize the comparison results. Generally, all comparisons possess statistical significance ( $p<0.001$ ), which evidences that users’ consuming behaviors have been significantly influenced by abusive AIGC.

Moreover, we observe the communities presenting opposite attitudes between real-human deepfakes and NSFW AIGC. According to Table 2(a) and 2(b), NSFW AIGC pertain higher volume of almost all consumption metrics on average. Compared with not NSFW ones, NSFW models and images seems more popular and more likely to be downloaded or viewed. Meanwhile, creators can gain more favorites by sharing NSFW models and images. Additionally, NSFW models not only are assigned with higher ranking score, but also can prompt viewers to leave more comments. However, these trends get reversed when regarding to real-human deefakes. Either models or images belong to real-human deepfakes pertain lower volume of all consumption metrics on average.

Reminding that a same reversion is observed on abusive models’ productivity (Figure 2), we hypothesize above result is caused by the fact that more productive abusive models and their generated images are more likely to be consumed. Thus, we conduct the Pearson correlation analysis between models’ productivity and users’ consumption on these models and their generated images. Table 3 presents the results of correlation analysis on models and images grouped as real-human deepfakes or NSFW. Aligning to our assumption, except images’ Buzz, all consumption metrics pertain a significant ( $p<0.001$ ) positive correlation with models’ productivity ( $r>0$ ). Meanwhile, such a relation mainly present on both real-human deepfake and NSFW models’ downloads, favorites, comments and rating scores ( $r>0.2$ ), while the connection on all metrics of images is minor ( $r<0.2$ ). Our results reveal the crucial role of users’ consumption in promoting image creation with abusive models, particularly noting that more productive real-deepfake and NSFW models attract greater consumption. Consequently, managing user consumption emerges as a potent strategy to mitigate the image generation by abusive models.

4.3. Answers and implications for RQ3

	Real-human deepfake		NSFW
	Pearson’s $r$	p-value	Pearson’s $r$	p-value
Number of downloads	0.344	***	0.287	***
Number of favorites	0.313	***	0.349	***
Number of comments	0.288	***	0.223	***
Rating score	0.281	***	0.408	***
Buzz	0.032	***	0.113	***

(a) Correlation between a model’s productivity and users’ consumption on the model.

	Real-human deepfake		NSFW
	Pearson’s $r$	p-value	Pearson’s $r$	p-value
Number of views	0.099	***	0.170	***
Number of favorites	0.134	***	0.173	***
Number of comments	0.032	***	0.056	***
Rating score	-	-	-	-
Buzz	0.003	0.193	0.017	***

(b) Correlation between the average productivity among used models in a image and users’ consumption on the image.

Table 3. Correlation analysis by measuring Pearson’s

r

between models’ productivity and users’ consumption on abusive AIGC. For a model, the tested productivity is its number of generated images. For an image, the tested productivity is the average number of generated images by the used models in this image. *** denotes that

p<0.001

.

5. RQ4: Creators of abusive content

6. Related Work

Platforms for AI models. Previous studies have looked at online platforms for AI models, with a particular emphasis on traditional platforms like GitHub and Huggingface. These investigations cover a wide range of perspectives, including machine learning (taraghi2024deep; matsubara2023torchdistill), software engineering (taraghi2024deep; jiang2023exploring), and social computing (AIT2024103079; wei2024understanding). Additionally, there are also studies that put forward innovative designs for these platforms (kumar2020marketplace; hosny2019modelhubai). In contrast, Civitai and other AIGC social platforms also serve as a hub to showcase AIGC, and an online community for AI creators, attracting a diverse user base that extends beyond programmers and computer scientists. To the best of our knowledge, this is the first large-scale empirical study of an emerging AIGC social platform.

Abuse of generative AI. Several studies have examined the abuse of generative AI. There are two perspectives closely related to our work. The first issue concerns the spread of misinformation through deepfakes (10.1145/3581783.3612704). Multiple studies have looked into the prevalence of deepfakes on social media and their potential impact on security and safety (10.1145/3442381.3449978; 10.1145/3583780.3614729; yang2024characteristics; 10.1145/3491102.3517446; lu2023seeing; 279946). The second issue involves the creation of NSFW content more generally. Numerous studies have highlighted the significant increase in AI-generated NSFW content on the Internet, particularly on social media platforms. Concerns have been raised about the lack of regulation and moderation of this content, and the potential impact it may have on the online environment and community building (chen2023twigma; 10132120; wei2024understanding; gorwa2023moderating; ungless2023stereotypes). In contrast, Civitai and other AIGC social platforms offer more than just AI-generated images — they include generative AI models that produce the abusive images. Overall, our research complements prior studies by providing insights not only from the image angle, but also from the model and creator perspective. We argue this can help in better regulating and moderating potentially abusive models and images.

JZ: Bellows are old content for backups GT: Let’s maybe move these backups out into a different file

7. Profiling model creation and consumption

GT: It’d be good to fire in some breadcrumb text for the intros. Let’s embed that now, so that we can start to polish up the overall storyline

7.1. Prevalent theme of model creation

Commercial use permission. Since Civitai is a model marketplace, another interesting angle to explore is its model creation environment, i.e. the creators’ commercial use permission assigned to their models. Figure 5 presents the distribution of models’ commercial use permission from three prominent perspectives – (i) Credit requirement. This aspect represents whether using a model would require crediting its creator. Generally, credit requirements is not popular among model creators on Civitai. 69,724 (80.10%) models can be used by other users without mentioning or crediting their creators. (ii) Derivatives allowance. This represents whether a model can be utilized for creating derivatives (e.g., fine-tuned versions or variants of original models). Overall, derivatives creation is acceptable among model creators: 74,258 (85.31%) models allow others to generate derivatives. (iii) Permission level. This aspect represents to what extent a model is permitted for commercial use. We observe that the majority of models on Civitai are open for commercial activities, with only 33,468 (38.45%) models not allowing commercial activities. That said, 21,780 (25.02%) models are allowed for providing generating service on third-party platforms and Civitai, and up to 31,794 (36.53%) models are sellable for themselves or their generated images. GT: Sorry, I can’t really parse exactly what you mean from this sentence

In a nutshell, models’ commercial use permission on Civitai is featured by its overall unpopular requirements on creator crediting and broad allowance for derivatives creation. Meanwhile, most models are available for selling or renting with commercial intents. GT: The above is very boilerplate descriptive analysis. Are there any implications you can derive here?

7.2. Model consumption

Real-human deepfake

NSFW

Credit requirement

Derivatives allowance

Mean diff

(Yes vs. No)

p-value

Mean diff

(Yes vs. No)

p-value

Mean diff

(Yes vs. No)

p-value

Mean diff

(Yes vs. No)

p-value

Number of downloads

0

582.42<1539.32

***

3842.83>1155.68

***

1649.23>1326.53

0.334

1409.22>1283.36

***

Number of favorites

0

66.55<237.61

***

569.23>176.72

***

226.27>207.27

***

214.75>189.61

***

Number of comments

2.32<3.24

***

4.53>2.31

***

3.24>2.32

***

2.49<2.57

***

Rating score

3.27<3.43

***

3.58>3.28

***

3.43>3.27

***

3.28<3.43

***

Table 4. Comparison on the number of downloads, favorites, comments and rating score by the Mann-Whitney U test between models groups categorized based on models’ flag of real-human deepfakes, NSFW, credit requirement, and derivatives allowance. “Mean diff” column shows the comparison results of the mean value of corresponding metrics between two groups. *** denotes that

p<0.001

.

	KW H	Mean diff (post-hoc)	p (post-hoc)
Number of downloads	472.64 (***)	Sellable ( $1307.23$ ) $<$ Rentable ( $1559.56$ )	***
		Sellable ( $1307.23$ ) $<$ None ( $1360.20$ )	***
		Rentable ( $1559.56$ ) $>$ None ( $1360.20$ )	***
Number of favorites	269.30 (***)	Sellable ( $198.82$ ) $<$ Rentable ( $217.61$ )	***
		Sellable ( $198.82$ ) $<$ None ( $218.41$ )	***
		Rentable ( $217.61$ ) $<$ None ( $218.41$ )	***
Number of comments	169.75 (***)	Sellable ( $2.43$ ) $<$ Rentable ( $2.70$ )	***
		Sellable ( $2.43$ ) $<$ None ( $2.44$ )	***
		Rentable ( $2.70$ ) $>$ None ( $2.44$ )	***
Rating score	206.12 (***)	Sellable ( $3.18$ ) $<$ Rentable ( $3.38$ )	***
		Sellable ( $3.18$ ) $<$ None ( $3.37$ )	***
		Rentable ( $3.38$ ) $>$ None ( $3.37$ )	1.0

Table 5. Pair-wise comparison on the number of downloads, favorites, comments and rating score the Kruskal Wallis test with Dunn’s post-hoc test among models groups categorized based on models’ permission levels of commercial use. “Mean diff” column shows the comparison results of the mean value of corresponding metrics between two groups. *** denotes that

p<0.001

.

We next inspect how communities respond towards models on Civitai. We focus on analyzing following metrics about social interaction generated by shared models:

GT: I’d suggest you break the below into headed paragraphs that discuss each metric one by one (no need to have this bullet point list). Instead, at this point, you should make explicit how they fit together into the overall analysis of model consumption.

•

Number of downloads: The total number of times that a model has been downloaded by other users. This metrics can serve as a proxy to how popular a model is . More downloads represent the model is spread more broadly and gain higher popularity in Civitai communities.
•

Number of favorites: The total Number of favorites that a model has received from other users. This metrics can serve as a proxy to communities’ reception towards the model . More likes indicates higher reception of the model.
•

Number of comments: The total number of comments that a model has received from other users. This metrics can serve as proxy to consumers’ engagement towards the model . Higher volume of comments indicates more users engaged in discussion under the model.
•

Rating score: The overall rating score the model possesses. This metric can serve as proxy to consumers’ evaluation of the model’s quality . A superior rating score suggests that the model is evaluated as with high quality.

We aim to investigate the potential impact of models’ creation involving abusive GT: this comes out of nowhere. You might want to split off the abusive stuff into a dedicated analysis section themes (such as real-human deepfakes and NSFW content) and the permissions for commercial use on social interactions among consumers. For binary model features, including official flags indicating real-human deepfakes/NSFW content, credit requirements, and derivatives allowances, we categorize models based on these features. Subsequently, we utilize the Mann-Whitney U test to examine whether there are significant differences in the metrics related to consumers’ social interactions among these model groups. The results of the Mann-Whitney U test comparing models grouped by the four binary features are summarized in Table 4. GT: Above needs a bit more motivational breadcrumb text to set the scene as to why we are doing this, and exactly what our task is

For models with multiple-value features, such as permission levels, we classify them into three classes based on their permitted commercial activities: “Sellable” (encompassing “Sell” and “Image”), “Rentable” (involving “Rent” and “RentCivit”), and “None” (related to “None”). We then employ the Kruskal-Wallis H test, followed by Dunn’s post-hoc test, to explore inter-group differences. The comparison results of the Kruskal-Wallis H test among the three groups (“Sellable”, “Rentable”, and “None”) are summarized in Table 5.

GT: This could benefit from some more structure. Why are we diving specifically into NSFW/deep fakes (out of all the possible things we could consider). It might be better to do a broadbrush characterisation here, and then have a separate session dedicated to NSFW/abuse/deepfakes?

In the realm of abusive model creation on themes, consumer attitudes exhibit a notable contrast between deepfake models and NSFW models. Specifically, deepfake models experience significantly ( $p<0.001$ ) fewer downloads, favorites, comments, and a lower rating score compared to non-deepfake models. These findings suggest that deepfake models tend to have lower popularity and are less likely to be disseminated within Civitai communities. Furthermore, consumers demonstrate reduced reception and engagement with deepfake models, reflecting a tendency to assign lower evaluations to their quality. In contrast, NSFW models garner considerable favor among consumers. They consistently receive significantly ( $p<0.001$ ) more downloads, favorites, comments, and a higher rating score than their non-NSFW counterparts. These results suggest that NSFW models are more prone to achieving heightened popularity and are more readily shared within Civitai communities. Additionally, consumers exhibit greater reception toward NSFW models, actively participating in relevant discussions under these models. Moreover, NSFW models tend to receive more favorable public evaluations of their quality. GT: There are 3 stakeholders here: model developers, model users, and content consumers. You might want to make these stakeholders more explicit in the narrative.

Concerning commercial use permissions, consumers exhibit varied responses to different aspects of model permissions. Models without credit requirement are not actively promoted among consumers, receiving significantly ( $p<0.001$ ) fewer favorites, comments, and a lower rating score. Models with derivatives allowance, despite generating significantly ( $p<0.001$ ) fewer comments and lower rating scores, appear to be more likely to be embraced by communities, evidenced by significantly ( $p<0.001$ ) higher download and favorite counts. In terms of permission levels, there is a notable preference among consumers for rentable models. Overall, “Rentable” models garner significantly ( $p<0.001$ ) more downloads, favorites, comments, and a higher rating score compared to both “Sellable” and “None” models. GT: Any explanations for the above patterns?

In summary, deepfake models face lower promotion in communities compared to NSFW models, which tend to encourage more social interaction among consumers, resulting in increased popularity, reception, engagement, and better evaluation. Regarding commercial use permission for models, consumers exhibit less active interaction with those lacking credit requirements and derivatives allowance. These findings indicate that the creation of NSFW content models has gained public promotion, emphasizing the necessity for content moderation in NSFW models. Additionally, it is recommended for model creators to implement credit requirements and open derivatives allowance to enhance the visibility and acceptance of their models within communities. Finally, creators are recommended to assign commercial permission for renting services to their models, as consumers respond more actively towards rentable models.

8. Profiling image creation and consumption

8.1. Image creation

	“Prompts”			“Negative prompts”
	Count	Top-10 entities with highest frequency	Exemplar prompt	Count	Top-10 entities with highest frequency	Exemplar prompt
Person	149,576	Jeremy Mann (3,676), Atey Ghailan (3,318), Greg Manchess (3,040), Antonio Moro (2,942), Greg Rutkowski (2,908), Ed Blinkey (2,853), Ilya Kuvshinov (2,114), Flora Borsi (1,957), Pino Daeni (1,254), Ruan Jia (1,247)	“Jeremy Mann, oil painting, broad brush strokes dark, colorful, portrait of man and woman kissing at futuristic cyberpunk bar”	1,717	Lowres (57), Greg Rutkowski (35), Finger (21), Pablo Picasso (20), Frida Kahlo (20), Junji Ito (20), Clive Barker (18), Picasso (18), Giuseppe Arcimboldo (17), John Wilhelm (17)	“deformed hands, worst quality, Lowres, error, text, low quality, normal quality, jpeg artifacts”
Standort	66,831	Hollywood (1,614), Caucasian (1,269), Tokyo (1,184), Sharp (712), Paris (634), London (453), New York (397), Egypt (389), China (368), Manhattan (368), Light (302)	“Hollywood star 1950. Portrait in the style of Paul Kenton, Jim Mahfood, Henry Asensio.”	2,428	Mari (51), Fu**nari (42), Less (41), Caucasian (41), Marijuana (37), Siam (34), White Fur (33), Hollywood (26), Siamese (24), High (17)	“anime, Caucasian, busty, lipstick, bruise, paint, headphones, yawn, umbrella”
Organization	58,868	Fuji (7,815), Kodak (4,743), CGSociety (3,296), Studio Ghibli (3,081), Sharp (2,617), ArtStation (2,426), Fujifilm (2,167), Sony (1,975), Nikon (1,951), Canon (1,396)	“hyper realistic photograph, film grain, Kodak portra 800, f1.8, a fluffy white woman with heterochromia hovering in zero gravity inside a space station”	1,072	Photos (449), NSFW (346), BadDream (217), Watermark (115), BadD (74), EasyNegative (55), EasyN (49), Worst Quality (49), Glasses (46), Fuji (45)	“low quality, low resolution, beautiful wet old skin, high detailed skin, 8k uhd, dslr, cinematic lighting, high quality, muted colors, Fujifilm XT3, low resolution, beautiful”

Table 6. A summary of distribution named entities about person, location, and organization in among “prompts” and “negative prompts”. The “Count” columns present the volume of prompts containing corresponding type of named entities.

GT: Seems odd that we dive straight into abusive image creation, considering the overall section is about image creation more generally. Perhaps we could start with an overview of all entities, and then deep dive into abuse? Abusive image creation. Previous research has suggested that creation of images can be highly related to the mainstream themes of shared models on AIGC platforms . Implied by popular abusive creation among models, we first inspect whether NSFW content and deepfakes are similarly prevalent among image creators. For this, we measure the distribution of NSFW levels and frequency of used deepfake models in images.

Usage of prompts. We next explore how image creators utilize models on Civitai to create derivative images by analyzing prompt used for image generation. We refer to the same strategy in (wang2022diffusiondb) and profile creators’ usage of prompts from three significant perspectives – prompt’s length, contained named entities, and topics.

First, for a prompt’s length, we measure it by the number of phrases divided by the commas in text. Figure 8 presents the distribution of the length of “prompts” and “negative prompts”. We observe that, first, both the length of “prompts” ( $\mu=17.55,\leavevmode\nobreak\ SD=21.11,\leavevmode\nobreak\ min=1,\leavevmode% \nobreak\ max=15,248$ ) and “negative prompts” ( $\mu=24.68,\leavevmode\nobreak\ SD=39.62,\leavevmode\nobreak\ min=1,\leavevmode% \nobreak\ max=3,158$ ) follow long-tail distributions. This suggests using prompts with few phrases (e.g., 80% prompts contain less than 28 phrases) are more prevalent among image creators. Additionally, according to the cdf-lines in Figure 8, we find that “negative prompts” tend to contain more phrases than “prompts”. A one-sided two-sample Kolmogorov-Smirnov test evidences this with statistical significance ( $D=0.040,\leavevmode\nobreak\ p<0.001$ ) by implying that “negative prompts” has a higher distribution of the number of phrases than “prompts”. Our analysis suggests that image creators tend to input more phrases on content removal and modification than its original generation when creating images on Civitai. GT: Anything interesting from the above? Could we constrast across different models? Or different model users? Any patterns of note?

GT: I’d suggest we turn the below into a dedicated section. Second, for contained named entities, we focus on three primary types of entities: person, location, and organization. These entities have been identified as crucial elements influencing the generated content of images (dehouche2023s). We employ a pre-trained named entity recognition model introduced by (tedeschi-etal-2021-wikineural-combined) to extract these entities. Subsequently, we refine the set of named entities by eliminating those with names that contain “#” symbols, any digits, or are fewer than three characters in length. In general, our analysis reveals that 247,017 (19.44%) “prompts” and 5,104 (1.87%) “negative prompts” include named entities. GT: define negative here This indicates that while named entities are frequently employed for content generation, they are infrequently utilized for content elimination and modification.

We evaluate the distribution of three types of named entities across prompts and present the summarized results in Table 6. GT: Could we have a few plots here, e.g. CDF of num per image etc The most frequently used named entities fall under the category of Person, accounting for 11.77% in “prompts” and 0.63% in “negative prompts”. The top-10 person entities with the highest frequency predominantly consist of names of professional artists (e.g., Jeremy Mann, Antonio Moro, and Greg Rutkowski) and illustrators (e.g., Atey Ghailan, Greg Manchess, and Ilya Kuvshinov). GT: Would be great to get these into generic categories, e.g. celebs, artists, politicians etc Creators primarily employ these artists’ names to fine-tune the artistic styles of generated images. For instance, as depicted in Figure 9, a prompt such as ”Selfie of Super Mario, laughing and drinking at a poolside party, surrounded by girls, art by Jeremy Mann, art by Agnes Cecile” utilizes the named entities “Jeremy Mann” and “Agnes Cecile” to blend the image’s style into a mix of these two artists’.

The second most frequently used named entities belong to the Location category, comprising 5.26% in “prompts” and 0.89% in “negative prompts”. The top-10 location entities predominantly include names of famous tourist countries (e.g., Tokyo, Egypt, and China) and cities (e.g., Hollywood, Paris, and London). Creators leverage these location entities to craft scenes set in these specific locations. For example, a prompt like “Dystopian futuristic Paris, photorealistic, great shot, year 2300, cyberpunk” involves the entity “Paris” to create a future Cyberpunk scene depicting the city. GT: Apart from person, location and organisation, what other things could be included?

The third most frequently used named entities fall under the Organization category, representing 4.63% in “prompts” and 0.39% in “negative prompts”. The top-10 organization entities predominantly include brands of photography (e.g., Fuji, Kodak, and Sharp) and animation (e.g., CGSociety, Studio Ghibli, and ArtStation) companies. Creators employ these organization entities to incorporate filming or CG effects into the generated images. For instance, a prompt such as “Kodak Portra 400, 8k, highly detailed, Britt Marling style, color studio - portrait of a handsome 8-year-old ((sks man)) holding a sandwich, muted colors, up face with 1920s hair style and cloth style, asymmetrical, Hasselblad” aims to add Kodak Portra 400’s film effects to enhance the image’s performance. GT: So, this seems a bit different to ’organisation’. Instead, it looks like they’re using camera brands/features to shape the images. Could this be fleshed out.

JZ: I’m not sure whether this part of named entities should go like this. since most of above analyses are just simple content analysis and I found it hard to go further on them. and also I feel this part a bit out of track. above for model we are talking about NSFW and deepfake. would it be better to relate the named entity analysis to deepfake topic? also following with analysis of prompts’ NSFW degree would be better choice rather than randomly topic analysis? GT: I think the NSFW thing is the strongest angle, and could be split off into a separate section. We could start with a big correlation analysis to show co-location of entities within prompts/creators. We could then use that to motivate a focus on NSFW.

8.2. Image consumption

JZ: distribution of social properties, correlation analysis between social properties and above creation features, content analysis on comment

9. Profiling creators’ activities

9.1. Creator Types

^{inline,color=gray!50}^{inline,color=gray!50}todo: inline,color=gray!50Yiluo: distribution of only-model, only-image, both creators.

9.2. Creation Pattern

Productivity. ^{inline,color=gray!50}^{inline,color=gray!50}todo: inline,color=gray!50Yiluo: How many models/images? in how many days? and upload pattern

Consistency/Diversity. ^{inline,color=gray!50}^{inline,color=gray!50}todo: inline,color=gray!50Yiluo: Do they focus on a single type of model, or a few themes of model/image?

9.3. Social Behaviour

Following and Interaction. ^{inline,color=gray!50}^{inline,color=gray!50}todo: inline,color=gray!50Yiluo: Guess image creators are likely to follow model creators, but model creators may just follow other model creators.

Monetization. ^{inline,color=gray!50}^{inline,color=gray!50}todo: inline,color=gray!50Yiluo: External links to Patreon kofi etc. ^{inline,color=gray!50}^{inline,color=gray!50}todo: inline,color=gray!50Yiluo: Maybe buzz

Open-Source-AI-Generated Images: Characterizing the Civitai Ecosystem.

Abstract.

1. Introduction

2. Primer on Civitai

3. Methodology

3.1. Data collection

3.2. GPT-assisted Qualitative Analysis

3.3. Prompt comparison with mainstream AIGC platforms.

4. RQ3: User activities with abusive AIGC

4.1. Profiling creation of abusive AIGC

4.2. Profiling consumption on abusive content

4.3. Answers and implications for RQ3

5. RQ4: Creators of abusive content

6. Related Work

7. Profiling model creation and consumption

7.1. Prevalent theme of model creation

7.2. Model consumption

8. Profiling image creation and consumption

8.1. Image creation

8.2. Image consumption

9. Profiling creators’ activities

9.1. Creator Types

9.2. Creation Pattern

9.3. Social Behaviour

Open-Source-AI-Generated Images:
Characterizing the Civitai Ecosystem.