A survey of comics research in computer science
Olivier Augereau · Motoi Iwata · Koichi Kise
Abstract
Graphic novels such as comics and mangas are well known all over the world. The digital transition has started to change the way people read comics: more and more on smartphones and tablets, and less and less on paper. In recent years, a wide variety of research about comics has been proposed that may change the way comics are created, distributed, and read in the coming years. Early work focused on low-level document image analysis; indeed, comic books are complex documents containing text, drawings, balloons, panels, onomatopoeias, etc. Different fields of computer science, such as multimedia, artificial intelligence, and human-computer interaction, have covered research about user interaction and content generation, each with different sets of values. In this paper, we review previous research about comics in computer science, state what has been done, and give some insights about the main outlooks.
Keywords
Comics · Multimedia · Analysis · User interaction · Content generation
1 Introduction

Research on comics has been done independently in several research fields such as document image analysis, multimedia, and human-computer interaction, each with different sets of values. We propose to review the research of all of these fields and to organize it in order to
understand what is possible to do with comics using state-of-the-art methods. We also give some ideas about future possibilities for comics research.

We introduced a brief overview of comics research in computer science [5] during the second edition of the international workshop on coMics ANalysis, Processing and Understanding (MANPU). The first edition of the MANPU workshop took place during ICPR 2016 (International Conference on Pattern Recognition) and the second during ICDAR 2017 (International Conference on Document Analysis and Recognition). This shows that comics can interest a large variety of researchers, from pattern recognition to document analysis. We think that the multimedia and interface communities could be interested too, so we propose to present the research about comics analysis from a broader point of view.

In the next part of the introduction we explain the importance of comics and their impact on society, with a brief overview of the open problems.

1.1 Comics and society

Comics in the USA, mangas in Japan, and bandes dessinées in France and Belgium are graphic novels with a worldwide audience. They are respectively an important part of American, Japanese, and Francophone culture. They are often considered a soft power of these countries, especially mangas for Japan [39,27]. In France, the bande dessinée is considered an art, commonly referred to as the “ninth art” [67] (cinema being the seventh art).

However, this was not always the case. Comics used to be considered “children's literature” or “sub-literature” because they contain a mixture of images and text. But more recently, comics gained a great deal of interest when people recognized them as a complex form of graphic expression that can convey deep ideas and profound aesthetics [12].

The market for comics is large. According to a report published in February 2017 by “The All Japan Magazine and Book Publisher's and Editor's Association” (AJPEA), the sales of mangas in Japan represented 445.4 billion yen (around 4 billion dollars) in 2016. In this report, we can see that the market was stable between 2014 and 2015, but a large progression of the digital market can be observed: it almost doubled from 2014 to 2016. The digital format has several advantages for readers: comics can be displayed on smartphones or tablets and read anytime, anywhere. For the editors, the cost of publication and distribution is much lower than for the printed version.

However, even if the format changed from paper to screen, no added value has been offered to the customer. We think that the democratization of the digital format is a good opportunity for researchers from all computer science fields to propose new services such as augmented comics, recommendation systems, etc.

Fig. 1
We arranged the comics research into three inter-dependent categories: 1) content analysis, 2) content generation, and 3) user interaction.

1.2 Research and open problems

Research about comics is quite challenging because of the nature of the medium. Comics contain a mixture of drawings and text. To fully analyze and understand the content of comics, we need natural language processing to understand the story and the dialogues, and computer vision to understand the line drawings, characters, locations, actions, etc. A high-level analysis is also necessary to understand events, emotions, storytelling, the relations between the characters, etc. A lot of related research covering similar aspects has been done for natural images (i.e. photographic imagery) and videos in classic computer vision.
However, the high variety of drawing styles and the low availability of labeled datasets make the task harder than for natural images.

We organized the research about comics into the three following main categories, as illustrated in Fig. 1:

1. content analysis: processing raw images and extracting structured descriptions, from low level to high level;
2. content generation: comics can be used as an input or output to generate new content; content conversion and augmentation are possible from comics to comics, from comics to other media, and from other media to comics;
3. user interaction: analyzing human reading behavior and internal states (emotions, interests) based on comics content and, reciprocally, analyzing comics content based on human behavior and interactions.

Research about comics in computer science covers several aspects but is still an emerging field. Much research has been done by the DIA (Document Image Analysis) and AI (Artificial Intelligence) communities and focuses on content analysis, understanding, and segmentation. Another part of the research is addressed by the graphics and multimedia communities and consists in generating new content or enriching existing content, such as adding colors to black-and-white pages, creating animations, etc. The last aspect concerns the interaction between users and comics, which is mainly addressed by HCI (Human-Computer Interaction) researchers. These three parts are inter-dependent: segmenting an area of a comic page is important if we want to manipulate and modify it, or if we want to know which area the user is interacting with; analyzing the user's behavior can be used to drive content changes or to measure the impact of these changes on the user.

In Section 3 we describe in more detail the current state of the art and discuss the open problems. Large datasets with ground-truth information such as layout, characters, speech balloons, text, etc.
are not available, so deep learning is hardly applicable in such conditions, and until very recently most researchers proposed handcrafted features or knowledge-driven approaches. The availability of tools and datasets that can be accessed and shared by the research communities is another very important factor for advancing research about comics; we discuss the major existing tools and datasets in Section 4.

In the remainder of the paper, research applied to comics, mangas, bandes dessinées, or any other graphic novels will be referred to as research on “comics” in order to simplify the reading. We start the next section of the paper with general information about comics.

Fig. 2
Example of two double comic pages. The largest black rectangle encloses two pages, which are represented by blue rectangles. The red rectangles are examples of panel frames: they can be small, large, or overlapping, and have different shapes. Some panels have no frame, and others can be drawn over more than one page. The green rectangles are examples of gutters, the white areas used for separating two panels. The yellow ellipses are examples of dialogue balloons; they can have different shapes to represent the feelings of the speaker. The purple triangles are examples of onomatopoeias, which represent sounds made by people (such as footsteps) or objects (water falling, metal blades striking each other), etc. Source: images extracted from the Manga109 dataset [22], © Sasaki Atsushi.
2 General information about comics

The term comics (as a singular uncountable noun) refers to the comics medium: like television or radio, comics is a way to transfer information. We can also refer to a comic (as a countable noun); in this case, we refer to an instance of the medium, such as a comic book or a comic page.

As with any art, there are no strict rules for creating comics. The authors are free to draw whatever and however they want. Still, some classic layouts and patterns are usually used by authors who want to tell a story, transmit feelings and emotions, and drive the attention of the readers [31]. The author needs experience and knowledge to drive the attention of the readers smoothly through the comics [8]. Furthermore, the layout of comics is evolving over time [52], moving away from conventional grids toward a more decorative and dynamic style.

Usually, comics are printed in books and can be seen as single or double pages. When the book is opened, the reader can see both pages, so some authors use this physical layout as part of the story: a drawing can be spread over two pages, and when the reader turns a page something might happen on the next page. Figure 2 illustrates classic comics content. A page is usually composed of a set of panels, each defining a specific action or situation. The panels can be enclosed in a frame and separated by a white space named the gutter. The reading order of the panels depends on the language. For example, in Japanese (see Fig. 4), the reading order is usually from right to left and top to bottom. Speech balloons and captions are included in the panels to describe conversations or the narration of the story. The dialogue balloons also have a specified reading order, usually the same as the reading order of the panels. Sound effects or onomatopoeias are often included to give more sensations to the reader, such as smell or sound. Japanese comics often contain “manpu” (see Fig. 3), which are symbols used to visualize the feelings and sensations of the characters, such as sweat marks on the head of a character to show that he feels uncomfortable even if he is not actually sweating.

The authors are free to draw the characters as they want, so characters can be deformed or disproportionate, as illustrated in Fig. 7. In some genres such as fantasy, the characters can also be non-human, which makes the segmentation and recognition tasks challenging. There are also many drawing effects such as speed lines, focus lines, etc. For example, in Fig. 2, the texture surrounding the female character in the lower-right panel represents her warm atmosphere as contrasted with the cold weather.
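The right-to-left, top-to-bottom reading convention described above can be captured by a simple geometric sort over panel bounding boxes. The following is a minimal sketch, not a published algorithm; the function name, the `(x, y, w, h)` panel representation, and the row-grouping tolerance are illustrative assumptions:

```python
# Sketch: ordering panel bounding boxes for right-to-left manga reading.
# Panels are (x, y, w, h) tuples in pixel coordinates.

def manga_reading_order(panels, row_tol=20):
    """Sort panels top-to-bottom, then right-to-left within each row."""
    # Group panels whose top edges are within row_tol pixels into one row.
    rows = []
    for p in sorted(panels, key=lambda p: p[1]):
        if rows and abs(p[1] - rows[-1][0][1]) <= row_tol:
            rows[-1].append(p)
        else:
            rows.append([p])
    order = []
    for row in rows:
        # Right-to-left: the panel with the rightmost edge comes first.
        order.extend(sorted(row, key=lambda p: -(p[0] + p[2])))
    return order

panels = [(0, 0, 100, 80), (110, 0, 100, 80),   # top row: two panels
          (0, 100, 210, 80)]                    # bottom full-width panel
print(manga_reading_order(panels))  # top-right panel first
```

A real system would of course have to cope with overlapping and borderless panels, which is exactly why the segmentation work discussed later remains hard.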
Fig. 3
Examples of “manpu”: marks used to intensify the emotions of the characters, such as concentration, anger, surprise, embarrassment, confidence, etc. The original images are extracted from the Manga109 dataset [22], © Yoshi Masako, © Kobayashi Yuki, © Arai Satoshi, © Okuda Momoko, © Yagami Ken.
Fig. 4
Example of the reading order of a Japanese manga. Image under GNU Free Documentation License: https://commons.wikimedia.org/wiki/File:Manga_reading_direction.svg
Fig. 5
The digital comic “Protanopia” created by Andre Bergs. The reader can control the camera angle by tilting the screen. The panels are animated in continuous loops. Image extracted from the video at: http://andrebergs.com/protanopia
Even if more and more digitized versions of printed comics are available, few comics are produced digitally and take advantage of the new technology. Figure 5 illustrates an example of a digital comic taking advantage of tablet functions: the images are animated continuously and the user can tilt the tablet to control the camera angle. This comic was created by Andre Bergs and is freely available on the App Store and Google Play. We imagine that in the future, it could be possible to create such interactive comics automatically.

3 Comics research

We organized the studies about comics in computer science into three main categories, which we present in this section. One of the main research fields focuses on analyzing the content of comics images: extracting the text and the characters, segmenting the panels, etc. Another category is about generating new content from or for comics. The last category is about analyzing the reader's behavior and interaction with comics.

3.1 Content analysis

In order to understand the content of comics and to provide services such as retrieval or recommender systems, it is necessary to extract the content of comics. The DIA community started to cover this problem with classic approaches. Images can be analyzed from low levels, such as screentones [29] or text [3], to high levels, such as style [13] or genre [19] recognition.

Some elements are interdependent: for example, text and speech balloons, as one can contain the other. Their positions can also be relative to each other, as a speech balloon usually comes from the mouth of a character. These elements are usually, but not necessarily, grouped inside a panel. As the authors are free to draw whatever and however they want, there is a wide disparity among comics, which makes the analysis complex.
For example, some authors exaggerate the deformation of a character's face to make him look angrier or more surprised.

We present the related work from low-level to high-level analysis as follows.

Textures, screentones, and structural lines
Black-and-white textures are often used to enrich the visual experience of non-colored comics, especially to create an illusion of shades or colors. However, the identification and segmentation of textures is challenging, as they can have various forms and are sometimes mixed with other parts of the drawing. Ito et al. proposed a method for separating screentones and line drawings [29]. More recently, Liu et al. [43] proposed a method for segmenting textures in comics.

Extracting the structural lines of comics is another challenging problem, related to the analysis of texture. The result of such an analysis is displayed in Fig. 6. The difference between structural lines and arbitrary ones must be considered carefully. Li et al. [41] recently proposed a deep network model to handle this problem. Finding textures and structural lines is an important analysis step toward generating colorized and vectorized comics.
Fig. 6
Structural line extraction. For each pair of images, the one on the left is the original image and the one on the right is obtained after removing the textures and detecting the structural lines with the algorithm of Li et al. [41]. Downloaded from: http://exhibition.cintec.cuhk.edu.hk/exhibition/project-item/manga-line-extraction/

Text
The extraction of text characters (such as Latin or Chinese) has been investigated by several researchers but is still a difficult problem, as many authors write the text by hand.

Arai and Tolle [3] proposed a method to extract frames, balloons, and text based on connected components and fixed thresholds on their sizes. This is a simple approach which works well for “flat” comics, i.e. conventional comics where each panel is defined by a black rectangle and has no overlapping parts.

Rigaud et al. also proposed a method to recognize panels and text based on connected components [63]. By adding other features, such as topological and spatial relations, they improved on the performance of [3].

More recently, Aramaki et al. combined connected-component and region-based classifications to build a better text detection system [4]. A recent method also addresses the problem of speech text recognition [58].

To simplify the problem, Hiroe and Hotta proposed to detect and count exclamation marks, in order to represent a comic book by its distribution of exclamation marks or to find scene changes [28].
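The connected-component-with-size-thresholds idea behind several of the methods above can be sketched in a few lines. This is a toy pure-Python illustration on a tiny binary image, not the authors' code; the 4-connectivity choice and the size threshold are illustrative assumptions:

```python
# Sketch: connected-component extraction with a size threshold.
from collections import deque

def connected_components(img):
    """Return lists of (row, col) pixels for each 4-connected foreground blob."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for r in range(h):
        for c in range(w):
            if img[r][c] and not seen[r][c]:
                comp, q = [], deque([(r, c)])
                seen[r][c] = True
                while q:  # breadth-first flood fill of one blob
                    y, x = q.popleft()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                comps.append(comp)
    return comps

page = [[1, 1, 0, 0, 1],
        [1, 1, 0, 0, 0],
        [0, 0, 0, 0, 0]]
blobs = connected_components(page)
# Keep only blobs large enough to be a candidate frame; the threshold is arbitrary.
panels = [b for b in blobs if len(b) >= 3]
print(len(blobs), len(panels))  # 2 blobs found, 1 passes the size threshold
```

On real pages the same filtering is applied at very different scales to separate letter-sized blobs (text) from page-sized blobs (frames).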
Faces and pose
One of the most important elements of comics is the characters (persons) of the story. However, identifying the characters is challenging because of postures, occlusions, and other drawing effects. Also, the characters can be humans, animals, robots, or anything else, with various drawing representations. Sun et al. [70] proposed to locate and identify the characters in comics pages by using local feature matching. New methods have recently been proposed to recognize faces and characters in comics based on deep neural networks [14,53,50].

Estimating the pose of a character is another challenge. As we can see in Fig. 8, if the characters have human proportions and are not too deformed, they can be recognized well by a popular approach such as Open
Fig. 7
Different examples of comics character faces. Some parts of the face, such as the nose, eyes, or mouth, can be deformed to emphasize the emotion of the character. The original images are extracted from the Manga109 dataset [22], © Kurita Riku, © Yamada Uduki, © Tenya.
Fig. 8
Example of the application of Open Pose to comics [10]. This model works well for comics as long as the drawings are realistic, so it fails in most cases. Source: images extracted from the Manga109 dataset [22], © Yoshi Masako, © Kanno Hiroshi.
Pose [10]. Knowing the pose of the characters could lead to activity recognition, but a method such as Open Pose will fail on almost all comics.
Balloons
The balloons are an important component of comics, where most of the information is conveyed by the discussions between the protagonists. One important step is thus to detect the balloons [18] and then to associate each balloon with its speaker [61].

The shape of a balloon also conveys information about the speaker's feelings [77]. For example, a balloon with a wavy shape represents anxiety, an explosion shape represents anger, a cloudy shape represents joy, etc.
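One crude geometric cue for balloon shape is the isoperimetric quotient 4πA/P², which is 1 for a circle and drops for spiky outlines. The sketch below is only an illustration of the idea, not a method from [77]; the test shapes and any threshold one would pick are assumptions:

```python
# Sketch: distinguishing a smooth (speech) balloon outline from a spiky
# (shout/explosion) one via the isoperimetric quotient 4*pi*A/P^2.
import math

def polygon_area_perimeter(pts):
    """Shoelace area and perimeter of a closed polygon given as (x, y) points."""
    a, p = 0.0, 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        a += x1 * y2 - x2 * y1
        p += math.hypot(x2 - x1, y2 - y1)
    return abs(a) / 2.0, p

def roundness(pts):
    a, p = polygon_area_perimeter(pts)
    return 4 * math.pi * a / (p * p)

def regular_star(n=8, r_out=1.0, r_in=0.4):
    """A star polygon standing in for an 'explosion' balloon contour."""
    return [((r_out if i % 2 == 0 else r_in) * math.cos(math.pi * i / n),
             (r_out if i % 2 == 0 else r_in) * math.sin(math.pi * i / n))
            for i in range(2 * n)]

smooth = [(math.cos(t * math.pi / 18), math.sin(t * math.pi / 18)) for t in range(36)]
spiky = regular_star()
print(roundness(smooth), roundness(spiky))  # near 1 vs. much lower
```

A real classifier would of course use richer contour features, but this shows how little geometry is needed to start separating balloon styles.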
Panel
The layout of a comics page is described by Tanaka et al. as a sequence of frames named panels [71]. Several methods have been proposed to segment the panels, mainly based on the analysis of connected components [2,63] or on the page background mask [51]. As these heuristic methods rely on white backgrounds and clean gutters, Iyyer et al. recently proposed an approach based on deep learning [30] to process eighty-year-old American comics.
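The white-gutter assumption behind these heuristic methods is well illustrated by the classic recursive XY-cut from document analysis: split the page along all-white rows or columns until no gutter remains. This toy sketch on a small binary page is illustrative, not any of the cited implementations:

```python
# Sketch: recursive XY-cut panel segmentation on a binary page
# (1 = ink, 0 = white gutter/background).

def xy_cut(img, top=0, left=0):
    """Return bounding boxes (top, left, bottom, right) of panel regions."""
    if not img or not any(any(row) for row in img):
        return []
    h, w = len(img), len(img[0])
    # Trim blank border rows and columns.
    r0 = next(i for i in range(h) if any(img[i]))
    r1 = next(i for i in range(h - 1, -1, -1) if any(img[i])) + 1
    c0 = min(row.index(1) for row in img[r0:r1] if any(row))
    c1 = max(len(row) - 1 - row[::-1].index(1) for row in img[r0:r1] if any(row)) + 1
    sub = [row[c0:c1] for row in img[r0:r1]]
    # Split on an internal all-white row (horizontal gutter)...
    for i in range(1, len(sub) - 1):
        if not any(sub[i]):
            return (xy_cut(sub[:i], top + r0, left + c0)
                    + xy_cut(sub[i + 1:], top + r0 + i + 1, left + c0))
    # ...then on an internal all-white column (vertical gutter).
    for j in range(1, len(sub[0]) - 1):
        if all(not row[j] for row in sub):
            return (xy_cut([row[:j] for row in sub], top + r0, left + c0)
                    + xy_cut([row[j + 1:] for row in sub], top + r0, left + c0 + j + 1))
    return [(top + r0, left + c0, top + r1, left + c1)]

page = [[1, 1, 0, 1, 1],
        [1, 1, 0, 1, 1],
        [0, 0, 0, 0, 0],
        [1, 1, 0, 1, 1]]
print(xy_cut(page))  # four panel boxes
```

The moment a panel crosses a gutter or the background is textured, this recursion finds no cut, which is precisely the failure mode that motivates learning-based approaches.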
Fig. 9
Example application of illustration2vec [64]. The model recognizes several attributes of the character, such as her haircut and clothes. The web demo used to generate this image is no longer online.
High level understanding
Rigaud et al. proposed a knowledge-driven system that understands the content of comics by segmenting all its sub-parts [59]. But understanding the narrative structure of comics requires much more than segmenting its different sub-parts: the reader makes inferences about what is happening from one frame to another by looking at all the graphical and textual elements [46].

Iyyer et al. introduced methods to explore how readers connect panels into a coherent story [30]. They show that both text and images are important for guessing what is happening in a panel given the previous ones.

Daiku et al. [19] proposed to analyze comics storytelling by analyzing the genre of each page. The story of a comic book is then represented as a sequence of genres such as: “11 pages of action”, “5 pages of romance”, “8 pages of comedy”, etc.

Analyzing the text of the dialogues and stories has not yet been investigated specifically for comics. Research such as sentiment analysis [48] could be applied to analyze the psychology of the characters or to analyze and compare the narrative structures of different comics.

From the cognitive point of view, Cohn proposed a theory of “Narrative Grammar”, based on linguistics and visual language, to describe the understanding process [16]. A lot of information is inferred by the reader, who constructs a representation of the depicted pictures in his mind. This is how we can recognize that two characters drawn in slightly different ways are the same, or that a character is performing an action by looking at a still image. These concepts must be inferred by the computer too, in order to obtain a high-level representation of comics.
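As a hint of what such text-based analysis could look like, here is a minimal lexicon-based sentiment scorer over dialogue lines. The tiny lexicon and the example lines are toy assumptions, not a resource from [48]:

```python
# Sketch: lexicon-based sentiment scoring of comic dialogue lines.

LEXICON = {"love": 1, "happy": 1, "great": 1,
           "hate": -1, "afraid": -1, "terrible": -1}

def sentiment(line):
    """Sum the polarity of known words; unknown words score 0."""
    words = [w.strip(".,!?").lower() for w in line.split()]
    return sum(LEXICON.get(w, 0) for w in words)

dialogue = ["I love this town!",
            "I am afraid of him...",
            "What a terrible, terrible plan."]
scores = [sentiment(l) for l in dialogue]
print(scores)  # [1, -1, -2]
```

Aggregating such scores per character or per chapter would give a first, very rough, profile of the psychology of the characters or of the narrative arc.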
Applications
From these analyses, retrieval systems can be built, and some have already been proposed in the literature, such as sketch-based [45,49] or graph-based [40] retrieval. The drawing style has also been studied [13]. Possible applications are artist retrieval, art movement retrieval, and artwork period analysis.

Saito and Matsui proposed a model named illustration2vec for building a feature vector for illustrations [64]. As shown in Fig. 9, this model can be used to predict the attributes of a character, such as hair or eye color, hair length, the clothes worn by the character, etc., and to search for specific illustrations. Vie et al. proposed a recommender system using illustrated comics covers, based on illustration2vec, in a cold-start scenario [72].
Conclusion (content analysis)
Segmenting the panels or reading the text of arbitrary comics is still challenging because of the complexity of some layouts and the diversity of the content. Figure 2 illustrates the difficulty of segmenting the panels. Most current methods rely on handcrafted features for segmentation and analysis and will fail on unusual layouts.

The segmentation of the faces and bodies of the characters is still an open problem, and a large amount of labeled data will be necessary to adapt deep learning approaches.

Even if the text contains very rich information, surprisingly few methods have been proposed to analyze the storyline or the content of comics based on the text. Also, some parts of comics have not been addressed at all, such as the detection of onomatopoeias.

Future research should give more consideration to high-level information, as it can be used to represent information that could interest the reader, such as style, genre, storytelling, etc.

3.2 Content generation

The aim of content generation or enrichment is to use comics to generate new content, based either on comics or on other media.
Vectorization
As most comics are not created digitally, vectorization is a way to transform scanned comics into a vector representation for real-time rendering at arbitrary resolutions [78]. Generating vectorized comics is necessary for visualizing them nicely in digital environments. It is also an important step for editing the content of comics and one of the basic steps of comics enrichment [80].
Fig. 10
Example of a colorization process based on style2paints. Image downloaded from https://github.com/lllyasviel/style2paints

Colorization
Several methods have been proposed for automatic colorization [54,66,15,23,79] and color reconstruction [37], as comics with colors can be more attractive for some readers. Colorization is quite a complex problem, as the different parts of a character (arms, hands, fingers, face, hair, clothes, etc.) must be retrieved to color each part in a correct way. Furthermore, the poses of a character can differ greatly from one panel to another: some parts can appear, disappear, or be deformed. An example of colorization is displayed in Fig. 10. Recently, deep-learning-based colorization has been used to create color versions of manga books, which are distributed by professional companies in Japan.

Comics and character generation
One problem in generating comics is to create the layout and to place the different components, such as the characters and text balloons, at correct positions to provide a fluid reading experience. Cao et al. proposed a method for automatically creating stylistic layouts [7], and then another for placing and organizing the elements in the panels according to high-level user specifications [8].

The relation between real-life environments or situations and those represented in comics can be used to generate or augment comics. Wu and Aizawa proposed a method to generate a comics image directly from a photograph [76].

At the end of 2017, Jin et al. [33] presented a method to automatically generate comics characters. An example of a character generated by their online demo (http://make.girls.moe/) is displayed in Fig. 11. The result of the generation is not

Fig. 11
Example of random character generation based on the method of Jin et al. [33]. In this example, we set some attributes such as green hair color, blue eyes, smile, and hat.
Fig. 12
Example of a smiling animation in the conceptual space [73]. Similar animations could be obtained for comics images. Image source: https://vusd.github.io/toposketch/

always visually perfect, but still, this is a powerful tool, as an unlimited number of characters can be generated.
Animation
As comics are still images, a way to enhance the visualization of comics is to generate animations. Recently, some researchers proposed ways of animating still comics images through camera movements [9,32]. Several animation movies and series have been adapted into paper comic books and vice versa. A possible outlook could be to generate an animated movie from a paper comic, or a paper comic from an animated movie.

For natural images, some methods have been proposed to animate the faces of people by using latent-space interpolations. As illustrated in Fig. 12, the latent vectors can be computed for a neutral and a smiling face to generate a smiling animation [73].

Another application is to extract facial keypoints and to use another source (text, speech, or face) to animate the mouth of a character. For example, this has been done to generate a photorealistic video of an Obama speech based on a text input [38].
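The latent-space interpolation behind such animations reduces to linear blending between two latent codes. The sketch below uses toy 4-dimensional vectors as stand-ins for real generator latents; feeding each interpolated vector to a trained generator would yield one animation frame:

```python
# Sketch: linear interpolation between two latent vectors,
# the mechanism behind latent-space face animation.

def lerp(z0, z1, t):
    """Pointwise linear blend: t = 0 gives z0, t = 1 gives z1."""
    return [a + t * (b - a) for a, b in zip(z0, z1)]

z_neutral = [0.0, 1.0, -0.5, 0.2]  # toy latent for a neutral face
z_smile = [1.0, 0.0, 0.5, 0.2]     # toy latent for a smiling face

# Ten in-between latents = ten frames of a smiling animation.
frames = [lerp(z_neutral, z_smile, i / 9) for i in range(10)]
print(frames[0], frames[-1])
```

Whether the in-between images look plausible depends entirely on how smooth the generator's latent space is, which is why this works well for GAN-style face models.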
Media conversion
More broadly, we can imagine converting text, videos, or any content into comics, and vice versa. This problem can be seen as media conversion. For example, Jing et al. proposed a system to convert videos into comics [34]. A successful conversion involves many challenges: summarizing the videos, stylizing the images,
generating the layout of the comics, and positioning the text balloons.

An application which has not been done for comics but for natural videos is adding generated sound to a video [81]. We could imagine a similar application for comics, generating sound effects (swords banging against each other, a roaring tailpipe, etc.) or atmosphere sounds (village, countryside, crowd, etc.).

Creating a descriptive text based on comics, or generating comics based on a descriptive text, could be possible in the future, as has been done for natural images. Reed et al. [56] proposed a method for the automatic synthesis of realistic natural images from text.

We can also imagine changing the content, adding or removing some parts, or changing the genre or style depending on the user's or author's preference.
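Video summarization, the first step of such a video-to-comics conversion, can be sketched with naive frame differencing: keep a frame only when it differs enough from the last kept one. The toy "frames" and the threshold below are illustrative assumptions, not the pipeline of [34]:

```python
# Sketch: keyframe selection by frame differencing.
# Frames are toy grayscale pixel lists; real systems would use
# histograms or learned features instead of raw pixels.

def keyframes(frames, threshold=10):
    """Keep the first frame and every frame differing enough from the last kept one."""
    kept = [0]
    for i in range(1, len(frames)):
        diff = sum(abs(a - b) for a, b in zip(frames[i], frames[kept[-1]]))
        if diff > threshold:
            kept.append(i)
    return kept

video = [[0, 0, 0, 0],      # scene A
         [1, 0, 0, 0],      # almost identical, skipped
         [90, 90, 90, 90],  # cut to scene B, kept
         [91, 90, 90, 90]]  # almost identical, skipped
print(keyframes(video))  # [0, 2]
```

Each kept frame would then be stylized and placed into a generated panel layout.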
Conclusion (content generation)
In order to generate content, models or labeled data are necessary. To automatically generate characters, Jin et al. used around 42,000 images. Deep learning approaches such as Generative Adversarial Networks (GANs) [24] have been widely used for natural-image applications such as style transfer [83], reconstructing 3D models of objects from images [75], generating images from text [56], editing pictures [82], etc. These applications could be developed for comics too.

Another possibility for enhancing comics is to add other modes such as sound, vibration, etc. Adding sound should be easily possible by using the soundtracks of animation movies. But in order to produce these effects at the correct timing, information about the user's interactions is necessary. This is possible by using an eye tracker or by detecting in real time when the user turns a specific page.

3.3 User interaction

Apart from content analysis and generation, we have identified another category of research based on the interaction between users and comics. One part consists of analyzing the user himself instead of analyzing comics: for example, understanding or predicting what the user feels or how he behaves while reading comics. Another part consists in creating new interfaces and interactions between readers and comics. New technology can also be used to improve access for impaired people.
Fig. 13
On the left: eye gaze fixations (blue circles) and saccades (segments between circles) of one reader. On the right: a heat map accumulated over several readers; the red color corresponds to longer fixation times.
Eye gaze and reading behavior
In order to know where and when a user is looking at specific parts of a comic, researchers use eye tracking systems. With an eye tracker, it is possible to measure how long a user spends reading a specific part of a comic page.

The user's reading behavior and interest are important information that can be used by authors or editors as feedback. They can also be used to provide other services to readers, such as giving more details about the story of a character that a specific user likes, or removing battle scenes if he does not like violence.

Carroll et al. [11] showed that readers tend to look at the artwork before reading the text. Rigaud et al. found that, in France, readers spend most of their time reading the text and looking at the faces of the characters [60]. The same experiment repeated in Japan led to the same conclusion, as illustrated in Fig. 13.

Another way to analyze how readers understand the content of comics is to ask them to manually order the panels. Cohn presented different kinds of layouts with empty panels and showed that various manipulations of the arrangement of panels push readers to navigate panels along alternate routes [17]. Some cognitive tricks can ensure that most readers will follow the same reading path.

In order to augment comics with new multimedia content such as sound, vibration, etc., it is important to trigger these effects at the right timing. In this case, detecting when the user turns a page, or estimating which position he is looking at, will be useful.
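Raw gaze samples are usually turned into the fixations and saccades of Fig. 13 with a dispersion-threshold (I-DT-style) algorithm: a fixation is a run of samples that stays inside a small spatial window. The thresholds and the synthetic gaze trace below are illustrative assumptions:

```python
# Sketch: dispersion-threshold fixation detection on gaze samples.

def fixations(gaze, max_dispersion=20, min_samples=3):
    """Return (start, end) index ranges (end exclusive) of detected fixations."""
    out, i = [], 0
    while i < len(gaze):
        j = i + 1
        while j < len(gaze):
            xs = [p[0] for p in gaze[i:j + 1]]
            ys = [p[1] for p in gaze[i:j + 1]]
            # Dispersion = x-extent + y-extent of the candidate window.
            if (max(xs) - min(xs)) + (max(ys) - min(ys)) > max_dispersion:
                break
            j += 1
        if j - i >= min_samples:
            out.append((i, j))
            i = j
        else:
            i += 1
    return out

trace = [(100, 100), (102, 101), (99, 103), (101, 100),   # fixation on a face
         (250, 180),                                       # saccade sample
         (401, 251), (399, 250), (402, 252)]               # fixation on a balloon
print(fixations(trace))  # [(0, 4), (5, 8)]
```

Summing the sample counts (times the sampling period) per region of the page gives exactly the per-region reading times discussed above.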
Emotion
Comics contain exciting content. Many different genres of comics exist, such as comedy, romance, horror, etc., and they trigger different kinds of emotions in the readers.

Fig. 14 The user wears the E4 wristband, which measures physiological signals such as heartbeat, skin conductance and skin temperature.

Much research has been done on emotion detection based on face images and physiological signals such as the electroencephalogram (EEG) while watching videos [36,69,68]. However, such research has not been conducted while reading comics. We think that analyzing emotions while reading might be more challenging, as movies contain animations and sounds that may stimulate the emotions of the user more strongly.

By recording and analyzing the physiological signals of the readers, as illustrated in Fig. 14, Lima Sanches et al. showed that it is possible to estimate whether the user is reading a comedy, a romance or a horror comic, based on the emotions felt by the readers [42]. For example, when reading a horror comic book, the user feels stressed and his skin temperature decreases.

Emotions are usually represented along two axes: arousal and valence, where arousal represents the strength of the emotion and valence indicates whether the emotion is positive or negative. Matsubara et al. showed that by analyzing the physiological signals of the reader, it is possible to estimate the reader's arousal [44].

Both experiments use the E4 wristband, which contains a photoplethysmogram sensor (to analyze the blood volume pulse), an electrodermal activity sensor (to analyze the amount of sweat), an infrared thermopile sensor (to read the peripheral skin temperature), and a 3-axis accelerometer (to capture motion-based activity). Such a device is commonly used for stress detection [35,25].

Still, each reader has his own preferences and feels emotions in a different way while reading, so these analyses are quite challenging. Depending on his state of mind or mood, a user might prefer to read content that elicits a specific kind of emotion. Emotion detection could be used by authors or editors to analyze which content stimulates the readers most.

Fig. 15 The Tanvas tablet enables the user to feel different textures. This could be used to enhance the interaction with comics. Source: https://youtu.be/ohL_B-6Vy6o?t=19s
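The feature analysis described above can be sketched in a few lines. This minimal example extracts two of the kinds of features such studies rely on: an EDA peak count (a rough proxy for arousal) and the linear trend of skin temperature (which decreases under stress, as reported for horror comics in [42]). The peak heuristic and the choice of features are our own illustrative assumptions, not the exact pipeline of [42] or [44].

```python
import statistics

# Illustrative feature extraction for E4-style signals. The peak heuristic
# and feature choices are assumptions made for this sketch, not the exact
# features used in [42] or [44].
def window_features(eda, skin_temp):
    """Summarize one reading window of electrodermal activity (EDA) and
    peripheral skin temperature samples."""
    # Count simple EDA "peaks": samples strictly above both neighbors.
    peaks = sum(1 for i in range(1, len(eda) - 1)
                if eda[i] > eda[i - 1] and eda[i] > eda[i + 1])
    # Least-squares slope of skin temperature over the window:
    # a negative trend may indicate stress.
    n = len(skin_temp)
    xbar = (n - 1) / 2
    ybar = statistics.fmean(skin_temp)
    slope = (sum((i - xbar) * (t - ybar) for i, t in enumerate(skin_temp))
             / sum((i - xbar) ** 2 for i in range(n)))
    return {"eda_mean": statistics.fmean(eda), "eda_peaks": peaks,
            "temp_slope": slope}

feats = window_features([0.1, 0.3, 0.2, 0.5, 0.4],
                        [33.2, 33.1, 33.0, 32.8, 32.7])
print(feats["eda_peaks"])       # 2
print(feats["temp_slope"] < 0)  # True (temperature is decreasing)
```

A real system would compute many such features per window and feed them to a classifier or regressor trained against self-reported arousal and valence.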
Visualization and interaction
Comics can be read in books, on tablets, smartphones or any other devices. Visualization and interaction on smartphones can be difficult, especially if the screen is small [6]. The user needs to zoom and perform many operations, which can be inconvenient. Some researchers are also trying to use more interactive devices, such as multi-touch tables, to attract users [1].

Another important challenge is to make comics accessible to impaired people. Rayar [55] explained that a multidisciplinary collaboration between Human-Computer Interaction, Cognitive Science, and Education Research is necessary to fulfill such a goal. Up to now, the three main ways for visually impaired people to access images are: audio description, printed Braille description and printed tactile pictures (in relief). Such aids could be generated automatically thanks to new research and technology.

New haptic feedback tablets, such as the one proposed by Meyer et al. [47] illustrated in Fig. 15, could help visually impaired people to access comics. Other applications, such as detecting and magnifying the text or moving the comics automatically, could also be helpful for impaired people.
Education
It has been proven that representing knowledge as comics can be a good way to attract students to read [21] or to learn a language [65]. It could be interesting to measure the impact of this representation of knowledge.

Comics could be, for some students, a more interesting way to learn, so using comics in education might be a way to increase their attention level and memory, if the comics are nicely designed. A challenge related to media conversion is then to transform normal textbooks into comics and to compare the interactions of the students with both kinds of books.
Conclusion (user interaction)
The interactions between the user and comics have not been analyzed deeply yet. Many sensors can be used to analyze the user with respect to brain activity, muscle activity, body movement and posture, heart rate, sweating, breathing, pupil dilation, eye movement, etc. Collecting such information can reveal more about both the readers and the comics.
In this section, we present some tools and datasets which are publicly available for research on comics.

4.1 Tools

Several tools for comics image segmentation and analysis are available on the Internet and can be freely used by anybody, such as:
– Speech balloon segmentation [57],
– Speech text recognition [62],
– Automatic text extraction (cbrTekStraktor, https://sourceforge.net/projects/cbrtekstraktor/),
– Annotation tool to create ground truth labels,
– Semi-automatic manga colorization [23] (https://github.com/DwangoMediaVillage/Comicolorization),
– Deep learning library for estimating a set of tags and extracting semantic feature vectors from illustrations [64] (https://github.com/rezoo/illustration2vec).

The speech balloon [57] and text segmentation [62] algorithms are available on the author's GitHub (https://github.com/crigaud).

As we can see, even if many papers have been published about comics segmentation and understanding, few tools are available on the Internet. To improve the algorithms significantly and to be able to compare them, making the code available is an important step for the community.

4.2 Datasets

Few datasets have been made publicly available because of copyright issues. Indeed, it is not possible for researchers to use and share large datasets of copyrighted materials.
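Most of these tools build on low-level image operations. As a toy illustration of one such building block, the sketch below labels the connected components of bright pixels in a binary image, a common first step for finding candidate balloons or text regions; real tools such as the balloon segmentation of [57] use far more elaborate, contour- and knowledge-based methods.

```python
# Toy 4-connected component labeling on a binary grid (list of lists of 0/1).
# Bright (1) regions are candidate balloons; this is only a didactic sketch.
def connected_components(grid):
    """Return (number of components, label grid) for the 1-pixels of grid."""
    h, w = len(grid), len(grid[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if grid[y][x] and not labels[y][x]:
                count += 1
                stack = [(y, x)]          # flood fill the new component
                labels[y][x] = count
                while stack:
                    cy, cx = stack.pop()
                    for ny, nx in ((cy-1, cx), (cy+1, cx),
                                   (cy, cx-1), (cy, cx+1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and grid[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            stack.append((ny, nx))
    return count, labels

# Two bright regions (candidate balloons) separated by dark background.
img = [[1, 1, 0, 0],
       [1, 0, 0, 1],
       [0, 0, 1, 1]]
n, _ = connected_components(img)
print(n)  # 2
```

In a real pipeline, each component would then be filtered by size, shape and the presence of text before being accepted as a speech balloon.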
Fig. 16 Example of a Japanese manga cover page contained in the Manga109 dataset [22].

So running competitions and doing reproducible research is not easy. Fortunately, several datasets have recently been made available.

The Graphic Narrative Corpus (GNC) [20] provides metadata for 207 titles, such as the authors, number of pages, illustrators, genres, etc. Unfortunately, the corresponding images are not available because of copyright protection, so the usefulness of this dataset is very limited. Still, the authors are willing to share segmentation ground truth and eye gaze data; however, such data has not been released yet.

eBDtheque [26] contains 100 comic pages, mainly in the French language (http://ebdtheque.univ-lr.fr/registration/). The following elements have been labeled in the dataset: 850 panels, 1092 balloons, 1550 characters and 4691 text lines. Even if the number of images is limited, creating such detailed labeled data is time-consuming and very useful for the community.

Manga109 [22], illustrated in Fig. 16, contains 109 manga volumes from 93 different authors. On average, a volume contains 194 pages. These mangas were published between the 1970's and 2010's and are categorized into 12 different genres such as fantasy, humor, sports, etc. Only limited labeled data is available for now, such as the text of a few volumes. The strong point of this dataset is that it provides all pages of each volume, which allows analyzing sequences of pages.

COMICS [30] contains 1,229,664 panels paired with automatic textbox transcriptions from 3,948 American comic books published between 1938 and 1954 (https://obj.umiacs.umd.edu/comics/index.html). The dataset includes ground truth labels such as the rectangular bounding boxes of panels on 500 pages and 1,500 textboxes.
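Working with such datasets usually means parsing per-page annotation files. The sketch below reads panel bounding boxes from a small XML fragment; the element and attribute names form a simplified, hypothetical schema for illustration, not the exact annotation format of Manga109 or eBDtheque.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified annotation schema (not the real Manga109 or
# eBDtheque format): one <page> with <panel> and <balloon> children.
ANNOTATION = """
<page index="3" width="1654" height="1170">
  <panel xmin="0" ymin="0" xmax="820" ymax="580"/>
  <panel xmin="830" ymin="0" xmax="1654" ymax="580"/>
  <balloon xmin="80" ymin="40" xmax="300" ymax="200" text="Hello!"/>
</page>
"""

def load_boxes(xml_text, element):
    """Return bounding boxes (xmin, ymin, xmax, ymax) for one element type."""
    root = ET.fromstring(xml_text)
    return [tuple(int(e.get(k)) for k in ("xmin", "ymin", "xmax", "ymax"))
            for e in root.iter(element)]

panels = load_boxes(ANNOTATION, "panel")
print(len(panels))  # 2
print(panels[0])    # (0, 0, 820, 580)
```

Ground truth in this form makes it straightforward to evaluate a detector by intersection-over-union against the labeled boxes.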
Fig. 17 Example of comics contained in the BAM! dataset [74]. From left to right, we selected two images containing each of the following labels: bicycles, birds, buildings, cars, dogs, and flowers.
BAM! [74] contains around 2.5 million artistic images such as 3D computer graphics, comics, oil painting, pen ink, pencil sketches, vector art, and watercolor (https://bam-dataset.org). The images carry emotion labels (peaceful, happy, gloomy, and scary) and object labels (bicycles, birds, buildings, cars, cats, dogs, flowers, people, and trees). Figure 17 shows a sample of the dataset containing comics. The dataset is interesting due to its labels and the large variety of content and languages. However, the images are just examples provided by the authors and cannot always be understood without the previous or following pages.

BAM!, COMICS, Manga109, and eBDtheque are the four main comics datasets that have been made available with the corresponding images. Building such datasets is a time- and money-consuming task, especially building the ground truth and labeled data.

The main problem in creating such datasets comes from legal and copyright protection, which prevents researchers from making image datasets publicly available. The content of a dataset is also important, depending on the research to be carried out. For example, it is interesting to have a variety of comics from different countries, with different languages and genres. It is also interesting to have several continuous pages from the same volumes and several volumes from the same series, in order to analyze the evolution of the style of an author, the mentality of a character, or the storyline.

Conclusion

The research about comics in computer science has addressed several aspects. We organized the research into three inter-dependent categories: content analysis, content generation, and user interaction. A mutual analysis of the reader and comics is necessary to understand more about how we can augment comics.

A large part of previous work focuses on low-level image analysis using handcrafted features and knowledge-driven approaches. Recent research focuses more on deep learning and high-level image understanding.
Still, many applications have been developed for natural images, and research about artworks and comics has gained attention only very recently [74].

Many unexplored fields remain, especially content generation and augmentation. Only a few companies have started to use research results, for automatic colorization for example, but it is clear that authors could be helped with automatic (or semi-automatic) generation of content or animation.

The analysis of the behavior and emotions of the readers has been covered only superficially. However, using the opportunities given by new technologies and sensors could help create the next age of comics. It could also be a way to improve the access of impaired people to comics.

For now, few tools and datasets have been made available. Making copyrighted images publicly available is a problem, but it would greatly contribute to the improvement of comics research.
Acknowledgements
The authors would like to thank the students of the Intelligent Media Processing Group of Osaka Prefecture University who made some of the presented research and illustrations: Yuki Daiku, Mizuki Matsubara, Charles Lima Sanches, Seiichiro Hara, and Yusuke Maeda. This work is in part supported by JST CREST (JPMJCR16E1), JSPS Grant-in-Aid for Scientific Research (15K12172), JSPS KAKENHI Grant Number (16K16089) and the Key Project Grant Program of Osaka Prefecture University.
References
1. Andrews, D., Baber, C., Efremov, S., Komarov, M.: Creating and using interactive narratives: reading and writing branching comics. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 1703–1712. ACM (2012)
2. Arai, K., Tolle, H.: Automatic e-comic content adaptation. International Journal of Ubiquitous Computing (1), 1–11 (2010)
3. Arai, K., Tolle, H.: Method for real time text extraction of digital manga comic. International Journal of Image Processing (IJIP) (6), 669–676 (2011)
4. Aramaki, Y., Matsui, Y., Yamasaki, T., Aizawa, K.: Text detection in manga by combining connected-component-based and region-based classifications. In: Image Processing (ICIP), 2016 IEEE International Conference on, pp. 2901–2905. IEEE (2016)
5. Augereau, O., Iwata, M., Kise, K.: An overview of comics research in computer science. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), pp. 54–59. IEEE (2017)
6. Augereau, O., Matsubara, M., Kise, K.: Comic visualization on smartphones based on eye tracking. In: Proceedings of the 1st International Workshop on coMics ANalysis, Processing and Understanding, p. 4. ACM (2016)
7. Cao, Y., Chan, A.B., Lau, R.W.: Automatic stylistic manga layout. ACM Transactions on Graphics (TOG) (6), 141 (2012)
8. Cao, Y., Lau, R.W., Chan, A.B.: Look over here: Attention-directing composition of manga elements. ACM Transactions on Graphics (TOG) (4), 94 (2014)
9. Cao, Y., Pang, X., Chan, A.B., Lau, R.W.: Dynamic manga: Animating still manga via camera movement. IEEE Transactions on Multimedia (1), 160–172 (2017)
10. Cao, Z., Simon, T., Wei, S.E., Sheikh, Y.: Realtime multi-person 2D pose estimation using part affinity fields. In: CVPR, vol. 1, p. 7 (2017)
11. Carroll, P.J., Young, J.R., Guertin, M.S.: Visual analysis of cartoons: A view from the far side. In: Eye Movements and Visual Cognition, pp. 444–461. Springer (1992)
12.
Christiansen, H.C.: Comics & culture: analytical and theoretical approaches to comics. Museum Tusculanum Press (2000)
13. Chu, W.T., Cheng, W.C.: Manga-specific features and latent style model for manga style analysis. In: Acoustics, Speech and Signal Processing (ICASSP), 2016 IEEE International Conference on, pp. 1332–1336. IEEE (2016)
14. Chu, W.T., Li, W.W.: Manga FaceNet: Face detection in manga based on deep neural network. In: Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, pp. 412–415. ACM (2017)
15. Cinarel, C., Zhang, B.: Into the colorful world of webtoons: Through the lens of neural networks. In: 2nd International Workshop on coMics ANalysis, Processing, and Understanding, 14th IAPR International Conference on Document Analysis and Recognition, ICDAR 2017, Kyoto, Japan, November 9-15, 2017, pp. 35–40 (2017)
16. Cohn, N.: Visual narrative structure. Cognitive Science (3), 413–452 (2013)
17. Cohn, N., Campbell, H.: Navigating comics II: Constraints on the reading order of comic page layouts. Applied Cognitive Psychology (2), 193–199 (2015)
18. Correia, J.M., Gomes, A.J.: Balloon extraction from complex comic books using edge detection and histogram scoring. Multimedia Tools and Applications (18), 11367–11390 (2016)
19. Daiku, Y., Augereau, O., Iwata, M., Kise, K.: Comic story analysis based on genre classification. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), pp. 60–65. IEEE (2017)
20. Dunst, A., Hartel, R., Laubrock, J.: The graphic narrative corpus (GNC): Design, annotation, and analysis for the digital humanities. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), pp. 15–20. IEEE (2017)
21. Eneh, A., Eneh, O.: Enhancing pupils' reading achievement by use of comics and cartoons in teaching reading. Journal of Applied Science (3), 8058–62 (2008)
22.
Fujimoto, A., Ogawa, T., Yamamoto, K., Matsui, Y., Yamasaki, T., Aizawa, K.: Manga109 dataset and creation of metadata. In: Proceedings of the 1st International Workshop on coMics ANalysis, Processing and Understanding, p. 2. ACM (2016)
23. Furusawa, C., Hiroshiba, K., Ogaki, K., Odagiri, Y.: Comicolorization: semi-automatic manga colorization. In: SIGGRAPH Asia 2017 Technical Briefs, p. 12. ACM (2017)
24. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. In: Advances in Neural Information Processing Systems, pp. 2672–2680 (2014)
25. Greene, S., Thapliyal, H., Caban-Holt, A.: A survey of affective computing for stress detection: Evaluating technologies in stress detection for better health. IEEE Consumer Electronics Magazine (4), 44–56 (2016)
26. Guérin, C., Rigaud, C., Mercier, A., Ammar-Boudjelal, F., Bertet, K., Bouju, A., Burie, J.C., Louis, G., Ogier, J.M., Revel, A.: eBDtheque: a representative database of comics. In: Document Analysis and Recognition (ICDAR), 2013 12th International Conference on, pp. 1145–1149. IEEE (2013)
27. Hall, I., Smith, F.: The struggle for soft power in Asia: Public diplomacy and regional competition. Asian Security (1), 1–18 (2013)
28. Hiroe, S., Hotta, S.: Histogram of exclamation marks and its application for comics analysis. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), pp. 66–71. IEEE (2017)
29. Ito, K., Matsui, Y., Yamasaki, T., Aizawa, K.: Separation of manga line drawings and screentones. In: Eurographics (Short Papers), pp. 73–76 (2015)
30. Iyyer, M., Manjunatha, V., Guha, A., Vyas, Y., Boyd-Graber, J., Daumé III, H., Davis, L.: The amazing mysteries of the gutter: Drawing inferences between panels in comic book narratives. In: IEEE Conference on Computer Vision and Pattern Recognition (2017)
31. Jain, E., Sheikh, Y., Hodgins, J.: Inferring artistic intention in comic art through viewer gaze.
In: Proceedings of the ACM Symposium on Applied Perception, pp. 55–62. ACM (2012)
32. Jain, E., Sheikh, Y., Hodgins, J.: Predicting moves-on-stills for comic art using viewer gaze data. IEEE Computer Graphics and Applications (4), 34–45 (2016)
33. Jin, Y., Zhang, J., Li, M., Tian, Y., Zhu, H., Fang, Z.: Towards the automatic anime characters creation with generative adversarial networks. arXiv preprint arXiv:1708.05509 (2017)
34. Jing, G., Hu, Y., Guo, Y., Yu, Y., Wang, W.: Content-aware video2comics with manga-style layout. IEEE Transactions on Multimedia (12), 2122–2133 (2015)
35. Kalimeri, K., Saitis, C.: Exploring multimodal biosignal features for stress detection during indoor mobility. In: Proceedings of the 18th ACM International Conference on Multimodal Interaction, pp. 53–60. ACM (2016)
36. Koelstra, S., Muhl, C., Soleymani, M., Lee, J.S., Yazdani, A., Ebrahimi, T., Pun, T., Nijholt, A., Patras, I.: DEAP: A database for emotion analysis using physiological signals. IEEE Transactions on Affective Computing (1), 18–31 (2012)
37. Kopf, J., Lischinski, D.: Digital reconstruction of halftoned color comics. ACM Transactions on Graphics (TOG) (6), 140 (2012)
38. Kumar, R., Sotelo, J., Kumar, K., de Brebisson, A., Bengio, Y.: ObamaNet: Photo-realistic lip-sync from text. arXiv preprint arXiv:1801.01442 (2017)
39. Lam, P.E.: Japan's quest for soft power: attraction and limitation. East Asia (4), 349–363 (2007)
40. Le, T.N., Luqman, M.M., Burie, J.C., Ogier, J.M.: Retrieval of comic book images using context relevance information. In: Proceedings of the 1st International Workshop on coMics ANalysis, Processing and Understanding, p. 12. ACM (2016)
41. Li, C., Liu, X., Wong, T.T.: Deep extraction of manga structural lines. ACM Transactions on Graphics (TOG) (4), 117 (2017)
42. Lima Sanches, C., Augereau, O., Kise, K.: Manga content analysis using physiological signals.
In: Proceedings of the 1st International Workshop on coMics ANalysis, Processing and Understanding, p. 6. ACM (2016)
43. Liu, X., Li, C., Wong, T.T.: Boundary-aware texture region segmentation from manga. Computational Visual Media (1), 61–71 (2017)
44. Matsubara, M., Augereau, O., Sanches, C.L., Kise, K.: Emotional arousal estimation while reading comics based on physiological signal analysis. In: Proceedings of the 1st International Workshop on coMics ANalysis, Processing and Understanding, MANPU '16, pp. 7:1–7:4. ACM, New York, NY, USA (2016)
45. Matsui, Y., Ito, K., Aramaki, Y., Fujimoto, A., Ogawa, T., Yamasaki, T., Aizawa, K.: Sketch-based manga retrieval using Manga109 dataset. Multimedia Tools and Applications, pp. 1–28 (2016)
46. McCloud, S.: Understanding comics: The invisible art. Northampton, Mass (1993)
47. Meyer, D.J., Wiertlewski, M., Peshkin, M.A., Colgate, J.E.: Dynamics of ultrasonic and electrostatic friction modulation for rendering texture on haptic surfaces. In: Haptics Symposium (HAPTICS), 2014 IEEE, pp. 63–67. IEEE (2014)
48. Mohammad, S.M.: Sentiment analysis: Detecting valence, emotions, and other affectual states from text. In: Emotion Measurement, pp. 201–237. Elsevier (2016)
49. Narita, R., Tsubota, K., Yamasaki, T., Aizawa, K.: Sketch-based manga retrieval using deep features. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), pp. 49–53. IEEE (2017)
50. Nguyen, N.V., Rigaud, C., Burie, J.C.: Comic characters detection using deep learning. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), pp. 41–46. IEEE (2017)
51. Pang, X., Cao, Y., Lau, R.W., Chan, A.B.: A robust panel extraction method for manga. In: Proceedings of the 22nd ACM International Conference on Multimedia, pp. 1125–1128. ACM (2014)
52. Pederson, K., Cohn, N.: The changing pages of comics: Page layouts across eight decades of American superhero comics.
Studies in Comics (1), 7–28 (2016)
53. Qin, X., Zhou, Y., He, Z., Wang, Y., Tang, Z.: A faster R-CNN based method for comic characters face detection. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), pp. 1074–1080. IEEE (2017)
54. Qu, Y., Wong, T.T., Heng, P.A.: Manga colorization. In: ACM Transactions on Graphics (TOG), vol. 25, pp. 1214–1220. ACM (2006)
55. Rayar, F.: Accessible comics for visually impaired people: Challenges and opportunities. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 03, pp. 9–14 (2017). DOI 10.1109/ICDAR.2017.285
56. Reed, S., Akata, Z., Yan, X., Logeswaran, L., Schiele, B., Lee, H.: Generative adversarial text to image synthesis. arXiv preprint arXiv:1605.05396 (2016)
57. Rigaud, C., Burie, J.C., Ogier, J.M.: Text-Independent Speech Balloon Segmentation for Comics and Manga, pp. 133–147. Springer International Publishing, Cham (2015)
58. Rigaud, C., Burie, J.C., Ogier, J.M.: Segmentation-free speech text recognition for comic books. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), pp. 29–34. IEEE (2017)
59. Rigaud, C., Guérin, C., Karatzas, D., Burie, J.C., Ogier, J.M.: Knowledge-driven understanding of images in comic books. International Journal on Document Analysis and Recognition (IJDAR) (3), 199–221 (2015)
60. Rigaud, C., Le, T.N., Burie, J.C., Ogier, J.M., Ishimaru, S., Iwata, M., Kise, K.: Semi-automatic text and graphics extraction of manga using eye tracking information. In: Document Analysis Systems (DAS), 2016 12th IAPR Workshop on, pp. 120–125. IEEE (2016)
61. Rigaud, C., Le Thanh, N., Burie, J.C., Ogier, J.M., Iwata, M., Imazu, E., Kise, K.: Speech balloon and speaker association for comics and manga understanding. In: Document Analysis and Recognition (ICDAR), 2015 13th International Conference on, pp. 351–355. IEEE (2015)
62.
Rigaud, C., Pal, S., Burie, J.C., Ogier, J.M.: Toward speech text recognition for comic books. In: Proceedings of the 1st International Workshop on coMics ANalysis, Processing and Understanding, MANPU '16, pp. 8:1–8:6. ACM, New York, NY, USA (2016). DOI 10.1145/3011549.3011557. URL http://doi.acm.org/10.1145/3011549.3011557
63. Rigaud, C., Tsopze, N., Burie, J.C., Ogier, J.M.: Robust frame and text extraction from comic books. In: Graphics Recognition. New Trends and Challenges, pp. 129–138. Springer (2013)
64. Saito, M., Matsui, Y.: Illustration2Vec: a semantic vector representation of illustrations. In: SIGGRAPH Asia 2015 Technical Briefs, p. 5. ACM (2015)
65. Sarada, P.: Comics as a powerful tool to enhance English language usage. IUP Journal of English Studies (1), 60 (2016)
66. Sato, K., Matsui, Y., Yamasaki, T., Aizawa, K.: Reference-based manga colorization by graph correspondence using quadratic programming. In: SIGGRAPH Asia 2014 Technical Briefs, p. 15. ACM (2014)
67. Screech, M.: Masters of the ninth art: bandes dessinées and Franco-Belgian identity, vol. 3. Liverpool University Press (2005)
68. Soleymani, M., Asghari-Esfeden, S., Fu, Y., Pantic, M.: Analysis of EEG signals and facial expressions for continuous emotion detection. IEEE Transactions on Affective Computing (1), 17–28 (2016)
69. Soleymani, M., Lichtenauer, J., Pun, T., Pantic, M.: A multimodal database for affect recognition and implicit tagging. IEEE Transactions on Affective Computing (1), 42–55 (2012)
70. Sun, W., Burie, J.C., Ogier, J.M., Kise, K.: Specific comic character detection using local feature matching. In: Document Analysis and Recognition (ICDAR), 2013 12th International Conference on, pp. 275–279. IEEE (2013)
71. Tanaka, T., Shoji, K., Toyama, F., Miyamichi, J.: Layout analysis of tree-structured scene frames in comic images. In: IJCAI, vol. 7, pp. 2885–2890 (2007)
72. Vie, J.J., Yger, F., Lahfa, R., Clement, B., Cocchi, K., Chalumeau, T., Kashima, H.: Using posters to recommend anime and mangas in a cold-start scenario. arXiv preprint arXiv:1709.01584 (2017)
73. White, T., Loh, I.: Generating animations by sketching in conceptual space. In: Eighth International Conference on Computational Creativity, ICCC, Atlanta (2017)
74. Wilber, M.J., Fang, C., Jin, H., Hertzmann, A., Collomosse, J., Belongie, S.: BAM!
The Behance Artistic Media dataset for recognition beyond photography. In: Proc. ICCV, vol. 1, p. 4 (2017)
75. Wu, J., Zhang, C., Xue, T., Freeman, B., Tenenbaum, J.: Learning a probabilistic latent space of object shapes via 3D generative-adversarial modeling. In: Advances in Neural Information Processing Systems, pp. 82–90 (2016)
76. Wu, Z., Aizawa, K.: MangaWall: Generating manga pages for real-time applications. In: Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, pp. 679–683. IEEE (2014)
77. Yamanishi, R., Tanaka, H., Nishihara, Y., Fukumoto, J.: Speech-balloon shapes estimation for emotional text communication. Information Engineering Express (2), 1–10 (2017)
78. Yao, C.Y., Hung, S.H., Li, G.W., Chen, I.Y., Adhitya, R., Lai, Y.C.: Manga vectorization and manipulation with procedural simple screentone. IEEE Transactions on Visualization and Computer Graphics (2), 1070–1084 (2017)
79. Zhang, L., Ji, Y., Lin, X.: Style transfer for anime sketches with enhanced residual U-net and auxiliary classifier GAN. arXiv preprint arXiv:1706.03319 (2017)
80. Zhang, S.H., Chen, T., Zhang, Y.F., Hu, S.M., Martin, R.R.: Vectorizing cartoon animations. IEEE Transactions on Visualization and Computer Graphics 15