Neural Style Difference Transfer and Its Application to Font Generation
Gantugs Atarsaikhan, Brian Kenji Iwana, and Seiichi Uchida
Kyushu University, Fukuoka, Japan
{gantugs.atarsaikhan, brian, uchida}@human.ait.kyushu-u.ac.jp

Abstract.
Designing fonts requires a great deal of time and effort. It requires professional skills, such as sketching, vectorizing, and image editing. Additionally, each letter has to be designed individually. In this paper, we introduce a method to create fonts automatically. In the proposed method, the difference of font styles between two different fonts is transferred to another font using neural style transfer. Neural style transfer is a method of stylizing the contents of an image with the styles of another image. We propose a novel neural style difference loss and content difference loss for neural style transfer. With these losses, new fonts can be generated by adding or removing font styles from a font. We provide experimental results with various combinations of input fonts and discuss limitations and future development of the proposed method.
Keywords: Convolutional neural network · Style transfer · Style difference.
1 Introduction

Digital font designing is a highly time-consuming task. It requires professional skills, such as sketching ideas on paper and drawing with complicated software. Individual characters or letters have many attributes to design, such as line width, angles, stripes, serifs, and more. Moreover, a designer has to design all letters character-by-character, in addition to any special characters. For example, the Japanese writing system has thousands of characters that need to be designed individually. Therefore, it is beneficial to create a method of designing fonts automatically for people who have no experience in designing fonts. It is also beneficial to create a way to assist font designers by automatically drawing fonts.

On the other hand, there are a large number of fonts that have already been designed. Many of them have different font styles, such as bold, italic, serif, and sans serif. Many works create new fonts by using already designed fonts [36,27]. In this paper, we chose an approach that finds the difference between two fonts and transfers it onto a third font in order to create a new font. For example, a font with serifs and a font without serifs can be used to transfer the serif difference to a third font that originally lacked serifs, as shown in Fig. 1.
Fig. 1. An example result of the proposed method. The style difference between style image 1 and style image 2 is transferred onto the content image by equating it to the style difference between the newly generated image and the content image.
Fig. 2. An example of NST. Features of the style image are blended into the structure of the content image in the generated result image.
In recent years, the style transfer field has progressed rapidly with the help of Convolutional Neural Networks (CNN) [20]. Gatys et al. [12] first used a CNN to synthesize an image using the style of one image and the content of another image with Neural Style Transfer (NST). In NST, the content is represented by the feature maps of a CNN and the style is determined by the correlations of feature maps in a Gram matrix. The Gram matrix captures how correlated the feature maps are to each other. An example of NST is shown in Fig. 2. The content image and style image are mixed by using features from the CNN to create a newly generated image that has content (buildings, sky, trees, etc.) from the content image and style (swirls, paint strokes, etc.) from the style image. There are also other style transfer methods, such as a ConvDeconv network for real-time style transfer [17] and methods that utilize Generative Adversarial Networks (GAN) [15].

The purpose of this paper is to explore and propose a new method to create novel fonts automatically. Using NST, the contents and styles of two different fonts are found and their difference is transferred to another font to generate a new synthesized font. Fig. 1 shows an example result of our proposed method. We provide experimental results and inspect the performance of our method with various combinations of content and style images.

The main contributions of this paper are as follows.
1. This is the first attempt at transferring the difference between neural styles onto a content image.
2. We propose a new method to generate fonts automatically to assist font designers or non-professionals.

The remainder of this paper is arranged as follows. In Section 2, we discuss previous works on font generation, style transfer, and font generation using CNNs. The proposed method is explained in detail in Section 3. Then, Section 4 examines the experiments and the results. Lastly, we conclude in Section 5 with concluding remarks and discussions for improvements.
2 Related Work

Various attempts have been made to create fonts automatically. One approach is to generate fonts using example fonts. Devroye and McDougall [10] created handwritten fonts from handwritten examples. Tenenbaum and Freeman [34] clustered fonts by their font styles and generated new fonts by mixing font styles with other fonts. Suveeranont and Igarashi [32,33] generated new fonts from user-defined examples. Tsuchiya et al. [35] also used example fonts to determine features for new fonts. Miyazaki et al. [27] extracted strokes from fonts and generated typographic fonts.

Another approach to generating fonts is to use transformations or interpolations of fonts. Wada and Hagiwara [38] created new fonts by modifying some attributes of fonts, such as slope angle, thickness, and corner angle. Wang et al. [39] transformed strokes of Chinese characters to generate more characters. Campbell and Kautz [7] created new fonts by mapping non-linear interpolations between multiple existing fonts. Uchida et al. [36] generated new fonts by finding fonts that are simultaneously similar to existing fonts.

Lake et al. [19] generated handwritten fonts by capturing example patterns with the Bayesian program learning method. Baluja [6] learned styles from four characters to generate other characters using a CNN-like architecture. Recently, many studies use machine learning for font design. Atarsaikhan et al. [3] used NST to synthesize a style font and a content font to generate a new font. Also, GANs have been used to generate new fonts [1,24,31]. Lastly, fonts have been stylized with patch-based statistical methods [40], NST [4], and GAN methods [5,43,41].
The first example-based style transfer method was introduced in "Image Analogies" by Hertzmann et al. [14]. Recently, Gatys et al. developed NST [12] by utilizing a CNN. There are two types of NST methods: image optimization-based and network optimization-based. NST is the most popular image optimization-based method. However, the original NST [12] was introduced for artistic style transfer, and further works address photorealistic style transfer [23,26]. New losses were introduced for stable results and for preserving low-level features [28,21]. Additionally, improvements such as semantic-aware style transfer [25,8] and controlled content features [11,13] were proposed. Mechrez et al. [25] and Yin et al. [42] also achieved semantic-aware style transfer.

In network optimization-based style transfer, a neural network is first trained for a specific style image, and then this trained network is used to stylize a content image. A ConvDeconv neural network [17] and a generative network [37] were trained for style transfer. They were improved for photorealistic style transfer [22] and semantic style transfer [8,9].

There are many applications of NST due to its non-heuristic style transfer qualities. It has been used for video style transfer [2,18], portraits [29], fashion [16], and creating doodles [8]. Also, character stylizing techniques have been proposed by improving NST [4] or by using NST as part of a bigger network [5].

(Fig. 3 diagram: a pre-trained VGGNet processes style image-1, style image-2, the content image, and the generated image; feature maps and Gram matrices are extracted from each, and the losses are computed from their differences.)

Fig. 3. The overview of font generation with neural style difference transfer. Yellow blocks show feature maps from one layer and purple blocks show Gram matrices calculated using the feature maps.
3 Proposed Method

The overview of the proposed method is shown in Fig. 3. A pre-trained CNN from the Visual Geometry Group (VGGNet) [30] is used to process the input images and extract their feature maps on various layers. VGGNet is trained for natural scene object recognition, making it extremely useful for capturing various features from various images. The feature maps from higher layers show global arrangements of the input image, and the feature maps from lower layers express fine details of the input image. Feature maps from specified content layers are regarded as content representations, and correlations of feature maps on specified style layers are regarded as style representations.

There are three input images: style image-1 S_1, style image-2 S_2, and the content image C. X is the generated image, which is initialized as the content image or as a random image. The differences of content representation and style representation between the style images are transferred onto the content image by optimizing the generated image. First, the content difference and style difference of the style images are calculated and stored. The same differences are also calculated between the generated image and the content image. The content difference loss is the sum of the layer-wise mean squared errors between the differences of the feature maps in the content representation. The style difference loss is calculated in the same way, but between the differences of the Gram matrices in the style representation. Then, the content difference loss and style difference loss are accumulated into the total loss. Lastly, the generated image is optimized through back-propagation to minimize the total loss. By repeatedly optimizing the generated image in this way, the style difference between the style images is transferred to the content image.
Before explaining the proposed method in detail, we briefly review NST. In NST, there are two inputs: a content image C and a style image S. The image to be optimized is the generated image X. NST also uses a CNN, namely VGGNet, to capture features of the input images and to create a Gram matrix for the style representation from the captured feature maps. The Gram matrix is given in Eq. 1, where D_l is a matrix that consists of the flattened feature maps of layer l, as shown in Eq. 2. The Gram matrix stores the correlation of each pair of feature maps from one layer:

G_l = D_l (D_l)^\top,   (1)

where

D_l = \{ F_1, \dots, F_{n_l}, \dots, F_{N_l} \}.   (2)

First, the style image S is input to the VGGNet. Its feature maps F^{style} on the given style layers are extracted, and their Gram matrices G^{style} are calculated. Next, the content image C is input to the VGGNet, and its feature maps F^{content} on the given content layers are extracted and stored. Lastly, the generated image X is input to the network. Its Gram matrices G^{generated} on the style layers and feature maps F^{generated} on the content layers are found.
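As a concrete illustration of Eqs. 1 and 2, a Gram matrix can be computed directly from a single layer's feature maps. The following is a minimal PyTorch sketch; the function name and tensor layout are our own assumptions for illustration, not code from the paper.

```python
import torch

def gram_matrix(features: torch.Tensor) -> torch.Tensor:
    """Eq. 1: G_l = D_l (D_l)^T, where D_l stacks the flattened feature maps (Eq. 2).

    `features` holds one layer's output with shape (N_l, H, W),
    i.e. N_l feature maps with M_l = H * W elements each.
    """
    n_l = features.shape[0]
    d_l = features.reshape(n_l, -1)  # each row is one flattened feature map
    return d_l @ d_l.t()             # (N_l, N_l) matrix of feature-map correlations
```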
Then, by using the feature maps and Gram matrices, the content loss and the style loss are calculated as

\mathcal{L}_{content} = \sum_{l}^{L_c} \frac{w_l^{content}}{N_l M_l} \sum_{n_l}^{N_l} \sum_{m_l}^{M_l} \left( F_{n_l,m_l}^{generated} - F_{n_l,m_l}^{content} \right)^2,   (3)

and

\mathcal{L}_{style} = \sum_{l}^{L_s} \frac{w_l^{style}}{N_l M_l} \sum_{i}^{N_l} \sum_{j}^{N_l} \left( G_{l,i,j}^{generated} - G_{l,i,j}^{style} \right)^2,   (4)

where L_c and L_s are the numbers of content and style layers, N_l is the number of feature maps, M_l is the number of elements in one feature map, and w_l^{content} and w_l^{style} are the weights for layer l. Lastly, the content loss L_content and the style loss L_style are accumulated into the total loss L_total with weighting factors α and β:

\mathcal{L}_{total} = \alpha \mathcal{L}_{content} + \beta \mathcal{L}_{style}.   (5)

Once the total loss L_total is calculated, the gradients of the content layers, style layers, and the generated image X are determined by back-propagation. Then, to minimize the total loss L_total, only the generated image X is optimized. By repeating these steps multiple times, the style of the style image is transferred to the content image in the form of the generated image.

In NST, the goal of the optimization process is to match the styles of the generated image to those of the style image, and the feature maps of the generated image to those of the content image. However, in the proposed method, the goal of the optimization process is to match the style differences between the generated image and the content image to those between the style images, as well as to match the content differences between the generated image and the content image to those between the style images.
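For reference, Eqs. 3-5 can be written as layer-weighted mean squared errors. Below is a hedged sketch that reuses the gram_matrix helper above; the per-layer weight dictionaries and the α, β defaults are placeholders of ours, not the paper's settings.

```python
import torch

def nst_total_loss(feats_gen, feats_content, feats_style,
                   content_layers, style_layers, alpha=1.0, beta=1.0):
    """Eqs. 3-5: content loss, style loss, and their weighted sum.

    Each feats_* dict maps a layer name to that layer's feature maps of shape
    (N_l, H, W).  content_layers / style_layers map layer names to weights w_l.
    """
    l_content = torch.zeros(())
    for layer, w_l in content_layers.items():
        f_gen, f_con = feats_gen[layer], feats_content[layer]
        n_l, m_l = f_gen.shape[0], f_gen[0].numel()
        l_content = l_content + w_l / (n_l * m_l) * torch.sum((f_gen - f_con) ** 2)  # Eq. 3

    l_style = torch.zeros(())
    for layer, w_l in style_layers.items():
        f_gen, f_sty = feats_gen[layer], feats_style[layer]
        n_l, m_l = f_gen.shape[0], f_gen[0].numel()
        g_diff = gram_matrix(f_gen) - gram_matrix(f_sty)
        l_style = l_style + w_l / (n_l * m_l) * torch.sum(g_diff ** 2)                # Eq. 4

    return alpha * l_content + beta * l_style                                         # Eq. 5
```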
Let G_l^{style1} and G_l^{style2} be the Gram matrices of the feature maps on layer l when style image-1 S_1 and style image-2 S_2 are input, respectively. Then, the style difference between the style images on layer l is defined as

\Delta G_l^{style} = G_l^{style1} - G_l^{style2}.   (6)

Similarly, the style difference between the generated image X and the content image C is defined as

\Delta G_l^{generated} = G_l^{generated} - G_l^{content}.   (7)

Consequently, the style difference loss is the mean squared error between the style differences:

\mathcal{L}_{style\,diff} = \sum_{l}^{L} \frac{w_l^{style}}{N_l M_l} \sum_{i}^{N_l} \sum_{j}^{N_l} \left( \Delta G_{l,i,j}^{generated} - \Delta G_{l,i,j}^{style} \right)^2,   (8)

where w_l^{style} is a weighting factor for an individual layer l. Note that it can be set to zero to ignore a specific layer. The style difference loss means that the difference of the correlations of feature maps (Gram matrices) of the generated and content images (X and C) is forced to match that of the style images (S_1 and S_2) through optimization of the generated image, so that the style difference (e.g., bold versus light font styles, or italic versus regular font styles) is transferred.

Extracting the feature maps on a layer l as F_l^{style1} and F_l^{style2} for the style images, the content difference between the style images on layer l is defined as

\Delta F_l^{style} = F_l^{style1} - F_l^{style2}.   (9)

Using the same rule, the content difference between the generated and content images on layer l is defined as

\Delta F_l^{generated} = F_l^{generated} - F_l^{content},   (10)

where F_l^{generated} is the feature maps on layer l when the generated image X is input, and F_l^{content} is the feature maps when the content image C is input. By using the content differences of the two style images and of the generated and content images, the content difference loss is calculated as

\mathcal{L}_{content\,diff} = \sum_{l}^{L} \frac{w_l^{content}}{N_l M_l} \sum_{n_l}^{N_l} \sum_{m_l}^{M_l} \left( \Delta F_{l,n_l,m_l}^{generated} - \Delta F_{l,n_l,m_l}^{style} \right)^2,   (11)

where w_l^{content} is the weighting factor for layer l. Layers can also be ignored by setting w_l^{content} to zero. The content difference loss captures the difference in global features of the style images, e.g., a difference in serifs.
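Putting Eqs. 6-11 together, the proposed losses compare differences of Gram matrices and differences of feature maps rather than the quantities themselves. The following is a minimal sketch under the same assumptions as above (placeholder layer-weight dictionaries, gram_matrix as defined earlier); it is an illustration, not the authors' implementation.

```python
import torch

def style_difference_loss(feats_gen, feats_content, feats_s1, feats_s2, style_layers):
    """Eq. 8: MSE between the Gram-matrix differences of Eq. 7 and Eq. 6."""
    loss = torch.zeros(())
    for layer, w_l in style_layers.items():
        n_l, m_l = feats_gen[layer].shape[0], feats_gen[layer][0].numel()
        d_g_style = gram_matrix(feats_s1[layer]) - gram_matrix(feats_s2[layer])      # Eq. 6
        d_g_gen = gram_matrix(feats_gen[layer]) - gram_matrix(feats_content[layer])  # Eq. 7
        loss = loss + w_l / (n_l * m_l) * torch.sum((d_g_gen - d_g_style) ** 2)      # Eq. 8
    return loss


def content_difference_loss(feats_gen, feats_content, feats_s1, feats_s2, content_layers):
    """Eq. 11: MSE between the feature-map differences of Eq. 10 and Eq. 9."""
    loss = torch.zeros(())
    for layer, w_l in content_layers.items():
        n_l, m_l = feats_gen[layer].shape[0], feats_gen[layer][0].numel()
        d_f_style = feats_s1[layer] - feats_s2[layer]                                # Eq. 9
        d_f_gen = feats_gen[layer] - feats_content[layer]                            # Eq. 10
        loss = loss + w_l / (n_l * m_l) * torch.sum((d_f_gen - d_f_style) ** 2)      # Eq. 11
    return loss
```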
For style transfer, the generated image X is optimized to simultaneously match the style difference on the style layers and the content difference on the content layers. Thus, the total loss L_total, which is the accumulation of the content difference loss L_content diff and the style difference loss L_style diff, is created and minimized:

\mathcal{L}_{total} = \mathcal{L}_{content\,diff} + \mathcal{L}_{style\,diff}.   (12)

With the total loss, the gradients of the pixels of the generated image are calculated using back-propagation and used by an optimization method, such as L-BFGS or Adam. We found from experience that L-BFGS requires fewer iterations and produces better results. Also, we use the same image size for each input image so that the feature sizes are comparable. Due to the plain subtraction operations in Eqs. 6, 7, 9, and 10, the input images have to be chosen carefully. The font style of S_2 must be similar to that of C. Because of the subtraction process, the generated image X is most likely optimized to have a font style similar to that of style image-1, S_1. Moreover, contrary to NST, we do not use weighting factors for the content loss and style loss; instead, individual layers are weighted.

Fig. 4. Various weights for content and style layers: (a) individual content layers, (b) individual style layers.

4 Experimental Results

In the experiments below, unless specified otherwise, the feature maps for the style difference loss are taken from five style layers of VGGNet, and the feature maps for the content difference loss are taken from a single content layer with its layer weight set to 10. The generated image X is initialized with the content image C. The optimization process is stopped at the 1,000th step, which gives more than enough convergence. Also, because black pixels have a zero value, the input images are inverted before being input to the VGGNet and inverted back for visualization.
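To make the whole procedure of Fig. 3 concrete, the sketch below follows the setup just described: features from a pre-trained VGGNet, the generated image initialized with the content image, inverted input images, the total loss of Eq. 12 built from the two difference losses sketched earlier, and L-BFGS optimization. The specific layer indices, weights, and preprocessing details are our assumptions for illustration; this is not the authors' released code.

```python
import torch
from torchvision import models

device = "cuda" if torch.cuda.is_available() else "cpu"
vgg = models.vgg19(weights="IMAGENET1K_V1").features.to(device).eval()
for p in vgg.parameters():
    p.requires_grad_(False)

# Hypothetical layer indices into vgg19.features and per-layer weights w_l.
STYLE_LAYERS = {0: 1.0, 5: 1.0, 10: 1.0, 19: 1.0, 28: 1.0}   # five style layers
CONTENT_LAYERS = {21: 10.0}                                   # one content layer, weight 10

def extract(img):
    """Return {layer index: feature maps of shape (N_l, H, W)} for the layers used above."""
    feats, x = {}, img
    for i, module in enumerate(vgg):
        x = module(x)
        if i in STYLE_LAYERS or i in CONTENT_LAYERS:
            feats[i] = x.squeeze(0)
    return feats

def transfer_style_difference(content, style1, style2, steps=1000):
    """content, style1, style2: (1, 3, H, W) tensors in [0, 1] with the same size.

    ImageNet normalization is omitted here for simplicity (our assumption).
    """
    # Black glyph pixels are zero-valued, so invert the images before feature extraction.
    content = (1 - content).to(device)
    style1, style2 = (1 - style1).to(device), (1 - style2).to(device)
    f_con, f_s1, f_s2 = extract(content), extract(style1), extract(style2)

    x = content.clone().requires_grad_(True)           # initialize with the content image
    optimizer = torch.optim.LBFGS([x], max_iter=steps)

    def closure():
        optimizer.zero_grad()
        f_gen = extract(x)
        loss = (content_difference_loss(f_gen, f_con, f_s1, f_s2, CONTENT_LAYERS)
                + style_difference_loss(f_gen, f_con, f_s1, f_s2, STYLE_LAYERS))  # Eq. 12
        loss.backward()
        return loss

    optimizer.step(closure)                             # L-BFGS runs up to `steps` iterations
    return (1 - x.detach()).clamp(0, 1)                 # invert back for visualization
```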
Fig. 5. Results of transferring missing parts to another font. In the style difference images, red shows parts that exist in style image-1 but do not exist in style image-2; blue shows the reverse. Moreover, parts that are transferred onto the content image are visualized in red, and parts that are erased from the content image are shown in blue in the generated-content difference images. As a clarification, the style difference and generated-content difference images are mere visualizations computed from the input and generated images; they are not input or result images themselves.
Fig. 4 shows results of using various content and style layers individually. We used a sans-serif font for the content image and tried to transfer the horizontal-line style difference between the style fonts. In Fig. 4a, we experimented with each content layer while the weights for the style layers were fixed. Using two of the content layers resulted in the style difference appearing in random places, and results using two other content layers had too firm a style difference. On the other hand, the remaining content layer resulted in a style difference that was neither too firm nor randomly placed. In Fig. 4b, we experimented with individual style layers while fixing the content layer and its weight. Each of the results shows an incomplete but non-overlapping style difference on the content image. Thus, we used all of the style layers to capture the complete style difference of the style images.

Next, we experimented on transferring missing parts of a font. As visualized in red in the style difference images of Fig. 5, style image-2 lacks some parts compared to style image-1. We will try to transfer this difference of parts to the content image.
By using the above parameter settings, the proposed method was able to transfer the missing parts onto the content image, as shown in the figure. The transferred parts are visualized in red in the generated-content difference images of Fig. 5. The proposed method transfers the style difference while trying to match the style of the content font. Thus, most of the appended parts are connected to the content font. The best results were achieved when the missing parts are relatively small or narrower than the content font; using wider fonts works most of the time. However, the proposed method struggled when the difference part is too large or separated. Moreover, missing parts are not simply copied onto the content image: the style of each missing part is changed to match the style of the content image.

Fig. 6. Examples of generating serifs on the content font using the difference between a serif font and a sans-serif font from the same font family.
Generating Serifs. Fig. 6 shows the experiments on generating serifs on the content image. Both style images are taken from the same font family: style image-1 contains the serif font and style image-2 contains the sans-serif font, while the content image contains a sans-serif font from a different font family than the style fonts. As shown in the figure, serifs are generated on the content image successfully. Moreover, the content font is not only extended by the serifs; parts of it are morphed to include the missing serifs, as shown in the lower right corner of the figure.
Fig. 7. Experiments for removing serifs from the content image by style difference.
Fig. 7 shows the experiments on removing serifs from the content image fonts. Style image-1 contains a font without serifs, and style image-2 contains a font with serifs. By using this difference in serifs, we experimented on removing serifs from the content font. As shown in the figure, serifs have been removed from the content image. However, the font style of the generated image became different from that of the content image.
Fig. 8 shows transferring the difference in font line width between the style images to the content image, in order to change the font line width from narrow to wide or from wide to narrow. The proposed method was able to change a content font with narrow lines to a font with wider lines in most cases, and vice versa. However, it struggled to change a wide font line to a narrow font line when the content font line is wider than the font line in style image-2.
Fig. 8. Transferring font line width difference. The first two columns show experiments for widening the font line width of the content font. The third column shows eroding the content font by style difference.
Fig. 9. Failure cases. The first experiment shows a failure case when the styles of the content image and style image-2 are not similar. The second experiment shows a failure case when the input characters are different.
Fig. 9 shows failure cases of font generation. On the left side of Fig. 9, the fonts in the content image and style image-2 do not have similar font styles: the font in style image-2 has wide lines, whereas the font in the content image has narrow lines. The style difference between the style fonts is the serifs at the feet of the font. Consequently, the proposed method tried to transfer both the wideness and the serifs, which resulted in an incomplete font. Moreover, the right side of Fig. 9 shows input images that contain different characters. Although the font styles of the content image and style image-2 match, the content image is not suitable to receive the font difference between the style images.
5 Conclusion

In this paper, we introduced the idea of transferring the style difference between two fonts to another font. Using the proposed neural font style difference, we showed that it is possible to transfer the differences between styles to create new fonts. Moreover, we showed experimentally that style difference transfer can be used both for adding to and for erasing from the font in the content image. However, the input font images must be chosen carefully in order to achieve plausible results. Due to the simple subtraction operation in the content and style difference calculations, the font style of the content image must be similar to the font style of style image-2, and the font style of the generated image will become similar to that of style image-1. Another limitation of the proposed method is the processing time due to the back-propagation in each stylization step. These issues could be addressed by utilizing an encoder-decoder style transfer method [17] or an adversarial method [5]. However, these methods would have to be trained for the style transfer process first, contrary to the proposed method.
Acknowledgment
This work was supported by JSPS KAKENHI Grant Number JP17H06100.
References
1. Abe, K., Iwana, B.K., Holmér, V.G., Uchida, S.: Font creation using class discriminative deep convolutional generative adversarial networks. In: Asian Conf. Pattern Recognition. pp. 232–237 (2017). https://doi.org/10.1109/acpr.2017.99
2. Anderson, A.G., Berg, C.P., Mossing, D.P., Olshausen, B.A.: DeepMovie: Using optical flow and deep neural networks to stylize movies. CoRR (2016), http://arxiv.org/abs/1605.08153
3. Atarsaikhan, G., Iwana, B.K., Narusawa, A., Yanai, K., Uchida, S.: Neural font style transfer. In: Int. Conf. Document Anal. and Recognition. pp. 51–56 (2017). https://doi.org/10.1109/icdar.2017.328
4. Atarsaikhan, G., Iwana, B.K., Uchida, S.: Contained neural style transfer for decorated logo generation. In: Int. Workshop Document Anal. Syst. pp. 317–322 (2018). https://doi.org/10.1109/das.2018.78
5. Azadi, S., Fisher, M., Kim, V.G., Wang, Z., Shechtman, E., Darrell, T.: Multi-content GAN for few-shot font style transfer. In: IEEE Conf. Comput. Vision and Pattern Recognition. pp. 7564–7573 (June 2018). https://doi.org/10.1109/cvpr.2018.00789
6. Baluja, S.: Learning typographic style: from discrimination to synthesis. Machine Vision and Applications (5-6), 551–568 (2017)
7. Campbell, N.D.F., Kautz, J.: Learning a manifold of fonts. ACM Trans. Graphics (4), 1–11 (2014). https://doi.org/10.1145/2601097.2601212
8. Champandard, A.J.: Semantic style transfer and turning two-bit doodles into fine artworks. CoRR (2016), http://arxiv.org/abs/1603.01768
9. Chen, Y.L., Hsu, C.T.: Towards deep style transfer: A content-aware perspective. In: British Mach. Vision Conf. pp. 8.1–8.11 (2016). https://doi.org/10.5244/c.30.8
10. Devroye, L., McDougall, M.: Random fonts for the simulation of handwriting. Electronic Publishing (EPODD), 281–294 (1995)
11. Gatys, L.A., Bethge, M., Hertzmann, A., Shechtman, E.: Preserving color in neural artistic style transfer. CoRR (2016), http://arxiv.org/abs/1606.05897
12. Gatys, L.A., Ecker, A.S., Bethge, M.: Image style transfer using convolutional neural networks. In: IEEE Conf. Comput. Vision and Pattern Recognition. pp. 2414–2423 (2016). https://doi.org/10.1109/cvpr.2016.265
13. Gatys, L.A., Ecker, A.S., Bethge, M., Hertzmann, A., Shechtman, E.: Controlling perceptual factors in neural style transfer. In: IEEE Conf. Comput. Vision and Pattern Recognition. pp. 3730–3738 (2017). https://doi.org/10.1109/cvpr.2017.397
14. Hertzmann, A., Jacobs, C.E., Oliver, N., Curless, B., Salesin, D.H.: Image analogies. In: Proceedings of the 28th annual conference on Computer graphics and interactive techniques. pp. 327–340. ACM (2001)
15. Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: IEEE Conf. Comput. Vision and Pattern Recognition. pp. 1125–1134 (2017). https://doi.org/10.1109/cvpr.2017.632
16. Jiang, S., Fu, Y.: Fashion style generator. In: Int. Joint Conf. Artificial Intell. pp. 3721–3727 (Aug 2017). https://doi.org/10.24963/ijcai.2017/520
17. Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and super-resolution. In: European Conf. Comput. Vision. pp. 694–711 (2016). https://doi.org/10.1007/978-3-319-46475-6_43
18. Joshi, B., Stewart, K., Shapiro, D.: Bringing impressionism to life with neural style transfer in Come Swim. In: ACM SIGGRAPH Digital Production Symp. p. 5 (2017). https://doi.org/10.1145/3105692.3105697
19. Lake, B.M., Salakhutdinov, R., Tenenbaum, J.B.: Human-level concept learning through probabilistic program induction. Science (6266), 1332–1338 (2015)
20. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE (11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
21. Li, S., Xu, X., Nie, L., Chua, T.S.: Laplacian-steered neural style transfer. In: ACM Int. Conf. Multimedia. pp. 1716–1724 (2017). https://doi.org/10.1145/3123266.3123425
22. Li, Y., Liu, M.Y., Li, X., Yang, M.H., Kautz, J.: A closed-form solution to photorealistic image stylization. In: Proceedings of the European Conf. on Comput. Vision (ECCV). pp. 453–468 (2018)
23. Luan, F., Paris, S., Shechtman, E., Bala, K.: Deep photo style transfer. In: IEEE Conf. Comput. Vision and Pattern Recognition. pp. 4990–4998 (2017). https://doi.org/10.1109/cvpr.2017.740
24. Lyu, P., Bai, X., Yao, C., Zhu, Z., Huang, T., Liu, W.: Auto-encoder guided GAN for Chinese calligraphy synthesis. In: Int. Conf. Document Anal. and Recognition. pp. 1095–1100 (2017). https://doi.org/10.1109/icdar.2017.181
25. Mechrez, R., Talmi, I., Zelnik-Manor, L.: The contextual loss for image transformation with non-aligned data. CoRR (2018), http://arxiv.org/abs/1803.02077
26. Mechrez, R., Shechtman, E., Zelnik-Manor, L.: Photorealistic style transfer with screened Poisson equation. British Mach. Vision Conf. (2017)
27. Miyazaki, T., Tsuchiya, T., Sugaya, Y., Omachi, S., Iwamura, M., Uchida, S., Kise, K.: Automatic generation of typographic font from a small font subset. CoRR abs/1701.05703 (2017), http://arxiv.org/abs/1701.05703
28. Risser, E., Wilmot, P., Barnes, C.: Stable and controllable neural texture synthesis and style transfer using histogram losses. CoRR (2017), http://arxiv.org/abs/1701.08893
29. Selim, A., Elgharib, M., Doyle, L.: Painting style transfer for head portraits using convolutional neural networks. ACM Trans. Graphics (4), 129 (2016). https://doi.org/10.1145/2897824.2925968
30. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: Int. Conf. Learning Representations (2015)
31. Sun, D., Zhang, Q., Yang, J.: Pyramid embedded generative adversarial network for automated font generation. CoRR (2018), http://arxiv.org/abs/1811.08106
32. Suveeranont, R., Igarashi, T.: Feature-preserving morphable model for automatic font generation. In: ACM SIGGRAPH ASIA 2009 Sketches. p. 7. ACM (2009)
33. Suveeranont, R., Igarashi, T.: Example-based automatic font generation. In: Int. Symp. Smart Graphics. pp. 127–138 (2010). https://doi.org/10.1007/978-3-642-13544-6_12
34. Tenenbaum, J.B., Freeman, W.T.: Separating style and content with bilinear models. Neural Computation (6), 1247–1283 (2000)
35. Tsuchiya, T., Miyazaki, T., Sugaya, Y., Omachi, S.: Automatic generation of kanji fonts from sample designs. In: Tohoku-Section Joint Conv. Inst. Elect. and Inform. Engineers. p. 36 (2014)
36. Uchida, S., Egashira, Y., Sato, K.: Exploring the world of fonts for discovering the most standard fonts and the missing fonts. In: Int. Conf. on Document Anal. and Recognition. pp. 441–445. IEEE (2015). https://doi.org/10.1109/icdar.2015.7333800
37. Ulyanov, D., Lebedev, V., Vedaldi, A., Lempitsky, V.: Texture networks: Feed-forward synthesis of textures and stylized images. In: Int. Conf. Mach. Learning. pp. 1349–1357 (2016)
38. Wada, A., Hagiwara, M.: Japanese font automatic creating system reflecting user's kansei. In: IEEE Int. Conf. on Systems, Man and Cybernetics. Conf. Theme - System Security and Assurance. vol. 4, pp. 3804–3809. IEEE (2003)
39. Wang, Y., Wang, H., Pan, C., Fang, L.: Style preserving Chinese character synthesis based on hierarchical representation of character. In: IEEE Int. Conf. on Acoustics, Speech and Signal Processing. pp. 1097–1100. IEEE (2008)
40. Yang, S., Liu, J., Lian, Z., Guo, Z.: Awesome typography: Statistics-based text effects transfer. In: IEEE Conf. Comput. Vision and Pattern Recognition. pp. 2886–2895 (2017). https://doi.org/10.1109/CVPR.2017.308
41. Yang, S., Liu, J., Wang, W., Guo, Z.: TET-GAN: Text effects transfer via stylization and destylization. CoRR (2018), http://arxiv.org/abs/1812.06384
42. Yin, R.: Content aware neural style transfer. CoRR (2016), http://arxiv.org/abs/1601.04568
43. Zhao, N., Cao, Y., Lau, R.W.: Modeling fonts in context: Font prediction on web designs. Comput. Graphics Forum 37