3D4ALL: Toward an Inclusive Pipeline to Classify 3D Contents
Nahyun Kwon, Chen Liang, and Jeeeun Kim, HCIED Lab, Texas A&M University
Abstract
Algorithmic content moderation manages the explosive number of user-created contents shared online every day. Despite a massive number of 3D designs that are free to be downloaded, shared, and 3D printed by users, detecting sensitivity with transparency and fairness has been controversial. Although sensitive 3D content might have a greater impact than other media due to its possible reproducibility and replicability without restriction, prevailing unawareness has resulted in a proliferation of sensitive 3D models online and a lack of discussion on transparent and fair 3D content moderation. As 3D content exists as a document on the web, mainly consisting of text and images, we first study the existing algorithmic efforts based on text and images, and the prior endeavors to encompass transparency and fairness in moderation, which can also be useful in the 3D printing domain. At the same time, we identify 3D-specific features that should be addressed to advance a 3D-specialized algorithmic moderation. As a potential solution, we suggest a human-in-the-loop pipeline using augmented learning, powered by various stakeholders with different backgrounds and perspectives in understanding the content. Our pipeline aims to minimize personal biases by enabling diverse stakeholders to be vocal in reflecting the various factors needed to interpret the content. We add our initial proposal for redesigning the metadata of open 3D repositories, to invoke users' responsible actions of obtaining consent from the subject upon sharing contents for free in public spaces.
Keywords
3D printing, sensitive contents, content moderation
1. Introduction
To date, many social media platforms have observed an explosive number of user-created content posted every day, from Twitter to YouTube to Instagram and more. Following the acceleration of online content, which has become even faster partly due to COVID-19, it has also become easier for people to access sensitive content that may not be appropriate for general purposes. Owing to the scale of these contents and users' abilities to share and repost them in a flash, it becomes extremely costly to detect sensitive content solely by manual work. Current social media platforms have adopted various (semi-)automated content moderation methods, including deep learning-based classification (e.g., Microsoft Azure Content Moderator [1], DeepAI's Nudity Detection API [2], Amazon Rekognition Content Moderation [3]). Meanwhile, since desktop 3D printers have flooded into the consumer market, 3D printing specific social platforms such as Thingiverse [4] have also
Joint Proceedings of the ACM IUI 2021 Workshops, April 13-17, 2021, College Station, USA. [email protected] (N. Kwon); [email protected] (C. Liang); [email protected] (J. Kim). © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
gained popularity, contributing to the proliferation of shared 3D contents that are easily downloadable and replicable among community users. Despite a massive number of 3D contents shared for free to date (as of Q2 2020, nearly 1.8 million 3D models are available for download, excluding empty entries due to post deletion), there has been relatively little attention to sensitive 3D contents. This might result in not only a lack of a dataset to be used as a benchmark, but also a lack of discussion on fair rationales to be utilized in building an algorithmic 3D content moderation that integrates the perspectives of everyone with different backgrounds. Along with significant advances in the technology of machine mechanisms and materials (e.g., 3D printing in metals), the 3D printing community may present an even greater impact from the spread of content due to its limitless potential for replication and reproduction. In view of various stakeholders who have different perspectives in consuming and interpreting contents, from K-12 teachers who may seek 3D files online to design curricula to artists who depict their creativity in digitized 3D sculptures, moderating 3D content with fairness becomes more challenging. 3D contents online often consist of images and text, which makes it possible to adopt existing moderation schemes, including text-based (e.g., [5, 6, 7, 8]) or image-based (e.g., [9, 10, 11]) approaches. However, there exist 3D-printing-specific features (e.g., print supports to avoid overhangs, uni-colored outcomes, models segmented into parts, etc.) that may prevent direct adoption of those schemes, requiring further consideration in implementing advanced 3D content moderation techniques.

In this work, we first study the existing content moderation efforts that have the potential to be used in 3D content moderation, and we discuss shared concerns in examining transparency and fairness issues in algorithmic content moderation. As a potential solution, we propose a semi-automated human-in-the-loop validation pipeline using augmented learning that incrementally trains the model with input from the human workforce. We highlight potential biases that are likely to be propagated from the different perspectives of human moderators who provide final decisions and labeling for re-training a classification model. To mitigate those biases, we propose an image annotation interface to develop an explainable dataset and a system that reflects various stakeholders' perspectives in understanding the 3D content. We conclude with initial recommendations for metadata design to (1) require consent and (2) inform previously unaware users of the need for consent when publicizing content that might invade copyright or privacy.
2. Algorithmic Content Moderation
Manual moderation relying on a few trusted human workers and voluntary reports has been a common solution to review shared contents. Unfortunately, it has become increasingly difficult to meet the demands of growing volumes of users and user-created content [12]. Algorithmic content moderation has taken an important place in popular social media platforms to prevent various kinds of sensitive content in real time, including graphic violence, sexual abuse, harassment, and more. As with other media posts, 3D contents available online appear as web documents that consist of images and text. For example, to attract audiences and help others understand the design project, creators on Thingiverse voluntarily include various information such as written descriptions of the model and tags, as well as photos of a 3D printed design; thus, 3D content provides an ample opportunity to employ the existing text and image based moderation schemes.

Among various text-based solutions, sentiment analysis is one traditionally popular approach that categorizes input text into either two or more categories: positive and negative, or more detailed n-point scales (e.g., highly positive, positive, neutral, negative, highly negative) [5, 6]. Moderators can consider categorization results in deciding whether the content is offensive or discriminatory [13]. Various classifiers, such as logistic regression, support vector machines, and random forests, are actively used in detecting misogynistic posts on Twitter (e.g., [7, 8]). The Perspective API [14], suggested by Jigsaw and Google's Counter Abuse Technology, provides a score on how toxic (i.e., rude, disrespectful, or unreasonable) a text comment is, using a machine learning (ML) model trained on people's ratings of internet comments.

With the rapid improvement of Computer Vision (CV) technologies with machine learning, several image datasets (e.g., the NudeNet Classifier dataset [15]) and moderation APIs enable developers to apply these ready-to-use mechanisms to their applications. For example, Microsoft Azure Content Moderator [1] classifies adult images into several categories, such as explicitly sexual in nature, sexually suggestive, or gory. DeepAI's Nudity Detection API [2] enables automatic detection of adult images and videos. Amazon Rekognition Content Moderation [3] detects inappropriate or offensive features in images and provides detected labels and prediction probabilities. However, many off-the-shelf services and APIs are often obscure, because it is hard for users to expect that the models were trained with fair ground truths that can offer reliable results to various stakeholders with different cultural or social backgrounds without any biases, which we will discuss in more detail in the following section.

As we noted earlier, 3D contents appear as web documents that consist of text descriptions, auto-generated preview images, and user-uploaded images to help others comprehend the content at a glance. Although it is technically possible to utilize existing text and image based moderation schemes, 3D models have unique features that make it hard to directly adopt the existing CV techniques to their rendered images or photos.
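As a concrete illustration of the text-based schemes above, the following is a minimal sketch of a logistic-regression classifier over TF-IDF features, in the spirit of the cited approaches. The training data here is a hypothetical placeholder, not a dataset from this work:

```python
# Minimal sketch of a text-based moderation classifier. The descriptions
# and labels are hypothetical placeholders, not data from this paper.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

descriptions = [
    "articulated dragon toy, prints without supports",
    "realistic pistol frame, fully functional after assembly",
]
labels = [0, 1]  # 0 = benign, 1 = sensitive (hypothetical labels)

# TF-IDF features feed a linear classifier; predict_proba yields a
# toxicity-style score moderators can weigh, rather than a hard label.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(descriptions, labels)
print(model.predict_proba(["tactical knife handle, metal print"])[0, 1])
```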
We identified four characteristics that make sensitiveelements undetectable by the existing algorithms.
Challenge 1. Difficulties in Locating Features from Images of the Current Placement.
Thingiverse automatically generates rendered images of the 3D model when a 3D file is uploaded, and these are used as representative images if the designer does not provide any photos of real 3D prints. In many cases, these files are placed in the best orientation that guarantees print success on FDM (Fused Deposition Modeling) printers, aligning the design to minimize overhangs. As the preview is taken from a fixed angle, it might not be a perfect angle that shows the main part of the model thoroughly (e.g., Fig 1(a)). This hinders the existing image-based APIs from accurately detecting sensitivity in the preview images, because sensitive parts might not be visible.

Figure 1: Example images of the four main characteristics that make it hard to use the existing CV techniques: (a) rotated model, (b) support structure, (c) texture on surface, (d) divided into parts. Each thing is reachable using its unique ID through the URL https://thingiverse.com/thing:ID.
Challenge 2. Support Structure that Occludes the Features.
Following the model alignment strategy of FDM printing, designers often include a custom support structure to prevent overhangs, avoiding both printing failures and the deteriorated surface textures that come with auto-generated supports from slicers (i.e., 3D model compilers) such as Cura [16]. These special structures easily occlude the design's significant features (e.g., Fig 1(b)). Since the model is partly or completely occluded, the existing CV techniques can barely detect the sensitivity of the design.
Challenge 3. Texture and Colors.
Current 3D printing technologies enable users to apply various print settings and other post-processing techniques. Accordingly, the printed model may present a unique appearance compared to general real-world entities. Often the model is single-colored and can have a unique texture, such as linear lines on the surface (e.g., Fig 1(c)), due to the nature of 3D printing mechanisms that accumulate material layer by layer, which might lead the existing CV algorithms to overlook the features.
Challenge 4. Models Separated into Parts for Printing.
As one common 3D printing strategy to minimize printing failures with complex 3D designs such as a human body, many designers divide their models into several parts to ease the printing process and let users assemble them afterwards, as shown in Fig 1(d). In this case, it is hard for the existing CV techniques to see the whole assembled model, resulting in a failure to recognize its sensitivity.
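One conceivable mitigation for the fixed-angle preview problem (Challenge 1) is to render each uploaded mesh from several viewpoints before moderation, rather than relying on the single auto-generated preview. Below is a minimal sketch assuming the trimesh library and a hypothetical file name; it illustrates the idea only and is not part of the pipeline proposed in this paper:

```python
import numpy as np
import trimesh
from trimesh.transformations import rotation_matrix

mesh = trimesh.load("thing_12345.stl")  # hypothetical file name

views = []
for angle in np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False):
    # Rotate a copy of the mesh around the vertical (z) axis so that the
    # rendered view is not tied to the print-friendly upload orientation.
    rotated = mesh.copy()
    rotated.apply_transform(rotation_matrix(angle, [0, 0, 1]))
    # save_image requires an offscreen rendering backend (e.g., pyglet).
    views.append(rotated.scene().save_image(resolution=(512, 512)))

# Each PNG in `views` can then be sent to an image moderation API,
# increasing the chance that a sensitive part is actually visible.
```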
3. Transparency and Fairness Issues in Content Moderation
Content moderation has long been controversial due to its non-transparent and secretive process [17], resulting from a lack of explanations for community members about how the algorithm works. To meet the growing demands for transparent and accountable moderation practice, as well as to elevate public trust, popular social media platforms have recently begun to dedicate their efforts to making their moderation processes more obvious and candid [17, 18, 19, 20]. As a reasonable starting point, those services provide detailed terms and policies (e.g., Facebook's Community Standards [21]) describing the bounds of acceptable behaviors on the platform [17]. In 2018, as a collective effort, researchers and practitioners proposed the Santa Clara Principles on Transparency and Accountability in Content Moderation (SCP) [22]. SCP suggests that social media platforms should provide detailed guidance to their members about which content and behaviors are discouraged, including examples of permissible and impermissible content, as well as an explanation of how automated tools are used across each category of content. It also recommends that content moderators give users a rationale for content removal, to reassure them about what happens behind the content moderation.

Making the moderation process transparent and explainable is crucial to the success of the community [23], not only to maintain its current scale but also to invite new users, because it may affect users' subsequent behaviors. For example, given no explanation about content removal, users are less likely to upload new posts in the future, or they leave the community, because they may believe that their content was treated unfairly and thus become frustrated owing to an absence of communication [24]. Reddit [25], one of the most popular social media platforms, has equipped volunteer-based moderation schemes, resulting in the removal of almost one fifth of all posts every day [26] due to violations of its community policy [27] (e.g., Rule 4: "Do not post or encourage the posting of sexual or suggestive content involving minors.") or individual rules of the subreddits (i.e., subcommunities of Reddit, each with a specific topic) according to their own objectives (e.g., one of the rules in the 3D printing subreddit: "Any device/design/instructions which are intended to injure people or damage property will be removed."). Users who are aware of community guidelines or receive explanations for content removal are more likely to perceive that the removal was fair [24] and to exhibit more positive behaviors in the future. As many social platforms, including 3D open communities such as Thingiverse, rely heavily on voluntary posting of user-created content [28], the role of a transparent system in content moderation becomes more significant in maintaining the communities themselves.

Even though many existing social media platforms are in full gear to implement artificial intelligence (AI) in content moderation, it has long been a black box [23] and thus not understandable for users, due to the complexity of the ML models. To address the issue of uninterpretable models that hinder users from understanding how they work, researchers have shed light on this blind spot by studying various techniques to make models explainable (e.g., [29, 30, 31]). Explainability has been on the rise as an effective way of enhancing the transparency of ML models [32]. To secure explainability, a system must enable stakeholders to understand the high-level concepts of the model, the reasoning used by the model, and the model's resulting behavior [33]. For example, as shown in the Fairness, Accountability, and Transparency (FAT) model, supporting users in knowing which variables are important in the prediction and how they will be combined is one powerful way to enable them to understand and ultimately trust the decision made by the model [34].
People often overlook the fairness of moderation algorithms and tend to believe that the systems automatically make unbiased decisions [35]. In fact, the human adjudication of user-generated content has occurred in secret and for relatively low wages by unidentified moderators [36]. On some platforms, users are even unable to know the presence of moderators or who they are [37], and thus it is hard for them to know what potential bias, owing to different reasoning processes, has been injected into the moderation procedure. For example, there have been worldwide actions that strongly criticize the sexualization of women's bodies without inclusive inference (e.g., the 'My Breasts Are Not Obscene' protest by the global feminist group Femen [38] to denounce a museum's censorship of nudity). Similarly, Facebook's automatic removal of postings and selfies that include women's topless photos, by tagging them as Sexual/Porn, ignited the 'My Body is not Porn' movement [39, 40]. The different points of view in perceiving and reasoning about the same piece of work make it hard to decide absolute sensitivity. It is nearly impossible for a sole group of users to represent all; therefore, it is difficult for users to expect a ground truth in the decision-making process and to trust the result while believing that experts made the final decisions based on thoughtful consideration with an unbiased rationale.

Subsequently, many studies (e.g., [41, 42]) have explored the potential risks of algorithmic decision-making that is biased and discriminatory toward certain groups of people, such as underrepresented groups of gender, race, and disability. Classifiers have been one common approach in content moderation, but developing a perfectly fair set of classifiers for content moderation is complex compared to those in common recommendation or ranking systems, as classifiers tend to inevitably embed a preference for a certain group over others when deciding whether content is offensive or not [17].
Through a text-feature-based classification, we identified three main categories of sensitive 3D content: (1) sexual/suggestive, (2) dangerous weaponry, and (3) drug/smoke. Due to the capability of unlimited replication and reproduction in 3D printing, unawareness of these 3D contents could be crucial. We noticed that Thingiverse limits access to some of the sensitive things that are currently labeled as NSFW (Not Safe for Work) by replacing their thumbnail images with black warning images. It is a secretive process because no clear rationale or explanation is offered to users behind it. Therefore, users cannot know whether Thingiverse operates based on an unbiased and fair set of rules.

While the steep acceleration of 3D model uploads [43] is making automatic detection of sensitive 3D content imperative, moderating 3D content also faces fairness issues, and users suffer from a lack of explanations. We need to take into account the various stakeholders' points of view that affect their decisions on potentially sensitive 3D content, as well as further discussions to mitigate bias and discrimination in the algorithmic decision-making system. Here we propose an explainable human-in-the-loop 3D content moderation system that enables various users who have distinct rules to participate in calibrating algorithmic decisions, decreasing the bias or discrimination of the algorithm itself. Although we focus on specific issues in shared 3D content online, our proposed pipeline generally applies to advancing a semi-automatic process toward explainable and fair content moderation for all.
4. Towards Explainable 3D Moderation System
A potential solution to examine 3D contents' sensitivity with fairness is employing a human workforce with ample experience in observing and perceiving from various perspectives. We suggest a human-in-the-loop pipeline, based on the idea of incremental learning [44], in which the human workforce collaborates with an intelligent system, concurrently classifying data input and annotating features with explanations for the decision.
Making decisions on the sensitivity of a 3D model can be subjective due to various factors such as cultural differences, the nature of the community, and the purpose of navigating 3D models. To reflect different angles in discerning the nature and intention of contents, we need to deliberate on various interpretations taken from various groups of people. For example, there are many 3D printable replicas of artistic statues or Greek sculptures that are reconstructed by 3D scanning of the originals in museums [45]. Consider K-12 teachers designing their STEAM education curricula using 3D models: they are not likely to want any NSFW designs revealed in their search results. On the other hand, there are many activists and artists who may want to investigate the limitless potential of the technology, sharing a 3D scanned copy of the naked body of herself [46] or digitizing nude sculptures available in museums to make these intellectual assets accessible to everyone. The nude sculpture has been one popular form of artistic creation throughout history, and it is not simple to stigmatize these works as 'sensitive'. Everyone has their own right to 'leave the memory of self' in a digital form. Forcing a preset threshold of sensitivity and filtering this wide array of user-created contents could unfairly constrain one's creative freedom. As the extent to which various stakeholders perceive sensitivity could be distinct, our objective is to design an inclusive process in accepting and adopting the sensitivity.
Automated content moderation could help review a vast amount of data and provide filtered cases for humans to support a decision-making process [24], if we echo well the diverse perspectives in understanding contents. In our proposal of the human-in-the-loop pipeline (Fig 2(a)), an input image dataset of 3D models is used for the initial model training, then the results are reviewed by multiple human moderators step by step. We trained the model with 1,077 things that are already labeled as NSFW by Thingiverse and 1,077 randomly selected non-NSFW things. All input images are simply categorized as NSFW or not, with no annotation of specific image features to provide the reasoning. Human moderators recruited from various groups of people then review the classification results and indicate whether they agree. They are asked to annotate the image segments they referred to in making the final decision, using a bounding box with a category. At the same time, they provide a rough level of how much the part affected the entire sensitivity and a written rationale for the decision. These features enhance the data quality used to fine-tune the model with the weighted score; thus the model becomes able to recognize previously unknown sensitive models based on similarity and can now explain sensitive features.

When two different groups of people with different standards do not agree on the same model's classification results, the model uses their decisions, annotated features, and levels of sensitivity to differentiate the extent of perceived sensitivity and reflect it in different thresholds. For example, if one moderator thinks that the model is sensitive while the other does not, the model will have a higher threshold in categorizing the content. Different decisions on the same model could finally be brought to the table for further discussion if needed, for example, to regulate policy guidelines, or be used as search criteria for other community users who have similar goals in viewing and unlocking analogous 3D contents. To summarize, one iteration contains the following steps:

1. The pre-trained model presents prediction results.
2. The human moderator can enter disagreement/agreement with the results and annotate sensitive parts with a sensitivity level and a decision rationale.
3. The annotated image is used to fine-tune the model.
4. If the decision for the image differs from other moderators', the annotations and sensitivity levels are used to set a different threshold.

Figure 2: (a) Overview of the human-in-the-loop pipeline powered by human moderators to acknowledge various perceptions of sensitivity and (b) a user interface mockup for the moderators to validate prediction results and provide annotations regarding their rationale, thus augmenting the model.

We elaborate more on feedback from the moderators by showing three possible scenarios: (1) the moderator's agreement with the prediction results, (2) sensitive parts not detected, and (3) false classification of insensitive features as sensitive.
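The iteration above can be summarized in a short Python sketch. The model methods (`predict`, `fine_tune`) and the moderator `review` call are hypothetical stand-ins for the incrementally trained classifier and the annotation interface; this illustrates the loop structure, not the authors' actual implementation:

```python
# Sketch of one pipeline iteration (steps 1-4 above). All method names
# are hypothetical placeholders for the components described in the text.
def moderation_iteration(model, image, moderators, thresholds):
    prediction = model.predict(image["pixels"])                    # step 1
    feedback = [m.review(image, prediction) for m in moderators]   # step 2
    annotations = [a for f in feedback for a in f["annotations"]]
    model.fine_tune(image["pixels"], annotations)                  # step 3
    verdicts = {f["is_sensitive"] for f in feedback}
    if len(verdicts) > 1:                                          # step 4
        # Disagreement between moderator groups widens the threshold
        # for this content instead of forcing a single verdict.
        thresholds[image["id"]] = thresholds.get(image["id"], 0.5) + 0.1
    return thresholds
```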
Case 1. Agreement with the Prediction Result
In case the moderators agree with the decision, they can either finalize it or reject the classification by selecting the provided top-level categories (e.g., sexual/suggestive, weaponry, drug/smoke) and second-level categories (e.g., under sexual/suggestive: explicit nudity, adult toys, sexual activity, etc.). We currently refer to the two-level hierarchical taxonomy of Amazon Rekognition to label categories of inappropriate or offensive content.
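Such a two-level taxonomy can be represented as a simple mapping. The sketch below lists only the subcategories named above; the actual Amazon Rekognition taxonomy contains more:

```python
# Two-level category taxonomy following the structure described above.
TAXONOMY = {
    "sexual/suggestive": ["explicit nudity", "adult toys", "sexual activity"],
    "weaponry": [],    # second-level labels not enumerated in the text
    "drug/smoke": [],  # second-level labels not enumerated in the text
}
```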
Case 2. Sensitive Parts Ignored by the Algorithm
Another possible case is that a specific feature in the image that the moderator perceives as sensitive is missing from the detection results. In this case, human moderators can label that part and provide a rationale, entering a level of sensitivity from 1 (slightly sensitive) to 5 (highly sensitive) to indicate how the specific part affects the entire sensitivity of the model.
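The annotation a moderator submits in this case can be captured as a small record. The field names below are hypothetical; the fields themselves (bounding box, category, 1-5 sensitivity level, written rationale) come directly from the pipeline description:

```python
from dataclasses import dataclass

@dataclass
class ModeratorAnnotation:
    box: tuple[int, int, int, int]  # (x, y, width, height) of the labeled part
    category: str                   # e.g., "sexual/suggestive"
    sensitivity: int                # 1 (slightly) to 5 (highly sensitive)
    rationale: str                  # written explanation for the decision

example = ModeratorAnnotation(
    box=(120, 80, 64, 48),
    category="sexual/suggestive",
    sensitivity=4,
    rationale="Feature is explicit once the support structure is ignored.",
)
```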
Case 3. False Positive
It is also possible that some parts detected by the model are not sensitive to the moderator, due to a higher tolerance for sensitivity. The moderator can either submit the disagreement or provide more detailed feedback by excluding specific results.

Different degrees of sensitivity perception from various stakeholders can reflect distinct points of view, which may manifest fairness in algorithmic moderation through multiple iterations of this process. In our interface for the end-users that assists in searching 3D designs, we let users set their desired threshold. For those who might find it difficult to decide on a threshold that perfectly fits their needs, we show several random example images that have detected sensitive labels at the corresponding threshold. This pipeline also helps obtain an explainable moderation algorithm. Our model can help users understand its rationales by locating detected features and prediction probabilities in the image and by providing the written descriptions that the moderators entered during data classification.
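As a minimal sketch of the end-user threshold described above, assume each search result carries the per-label prediction probabilities the pipeline would produce (the data structure and names here are hypothetical):

```python
# Hide search results whose predicted sensitivity exceeds the user's
# chosen threshold. `results` is a hypothetical list of search hits.
def filter_results(results, user_threshold):
    visible = []
    for thing in results:
        score = max(thing["label_probabilities"].values(), default=0.0)
        if score <= user_threshold:
            visible.append(thing)
    return visible

results = [
    {"id": 101, "label_probabilities": {"sexual/suggestive": 0.91}},
    {"id": 102, "label_probabilities": {}},  # no sensitive labels detected
]
print([t["id"] for t in filter_results(results, user_threshold=0.5)])  # [102]
```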
4.3. Solution 2: New Metadata Design to Avoid Auto-Filtering

Another potential problem in open 3D communities is copyright- or privacy-invasive content that is immediately marked as NSFW by Thingiverse, indicating it is inappropriate. Currently, Thingiverse lacks notification and explanation for content removal, while a majority of removed items might invade copyrights. This obscurity has a negative impact on users' future behaviors. For example, creators frustrated at the un-notified removal of their content have decided to quit their membership (e.g., [47]), which might not happen if they saw an informative alert when posting the content. Along with advanced 3D scanning technologies [48], many creators are actively sharing 3D scanned models (e.g., as of December 2020, Thingiverse has 1,150 things tagged with '3D_scan' and 308 things with the tag '3D_scanning'). With rising concerns over possible privacy invasion in sensitive 3D designs, what caught our attention is 3D scanned replicas of human bodies. Many of them do not include an explicit description of whether the creator received consent from the subject (e.g., [49, 50]). Some designers quote the subject's permission; for example, one creator describes that the subject, Nova, has agreed to share her scanned body on Thingiverse [51]. Still, this process relies on the users' voluntary action given no official guidelines, resulting in a lack of awareness that users must be granted consent to upload possibly privacy-invasive contents at the time of posting them in public spaces, regardless of commercial purpose. Without explicit consent, the content is very likely to be auto-filtered by Thingiverse, which decreases fairness by hampering artistic/creative freedom. To iron out a better content-sharing environment in these open communities, a redesign of metadata must be considered and adopted by system admins to invoke responsible actions. For example, providing a checkbox that asks "If the design is made of a 3D scanned human subject, I got an agreement from the subject" can inform previously unaware users about the need for permission to post potentially privacy-breaching contents. Including the subject's consent can also protect creative freedom from auto-filtering, by attesting that the content is not breaching copyright or privacy and can be shared in public spaces. In addition, it can enable users to understand that an absence of consent could be the reason for filtering.
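A minimal sketch of what such redesigned upload metadata might look like, expressed as a Python dict; all field names are hypothetical illustrations of the proposal, not an existing Thingiverse schema:

```python
# Hypothetical upload metadata with an explicit consent field, as the
# proposal suggests. Field names are illustrative, not Thingiverse's.
thing_metadata = {
    "title": "3D scanned bust",
    "tags": ["3D_scan", "sculpture"],
    "contains_scanned_human_subject": True,
    # The proposed checkbox: the uploader attests that the scanned
    # subject agreed to public sharing of the model.
    "subject_consent_granted": True,
    "consent_note": "Subject agreed to share this scan publicly.",
}

# A moderation rule could then explain filtering instead of silently
# removing the content when consent is missing.
if (thing_metadata["contains_scanned_human_subject"]
        and not thing_metadata["subject_consent_granted"]):
    print("Filtered: no subject consent recorded for a human 3D scan.")
```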
5. Conclusion
As an inclusive process to develop a transparent and fair moderation procedure in 3D printing communities, our study proposes to build an explainable human-in-the-loop pipeline. We aim to employ a diverse group of human moderators to collect their rationales, which can be used to enhance the model's incremental learning. Our objective is not to censor 3D content but to build a pleasant 3D printing community for all, safeguarding search as well as guaranteeing creative freedom, through the pipeline and a new metadata design that has the potential to minimize issues related to privacy or copyright.
References

[13] M. Taboada, J. Brooke, M. Tofiloski, K. Voll, M. Stede, Lexicon-based methods for sentiment analysis, Comput. Linguist. 37 (2011) 267-307. URL: https://doi.org/10.1162/COLI_a_00049. doi:10.1162/COLI_a_00049.