Learned Interpolation for 3D Generation
Austin Dill, Songwei Ge, Eunsu Kang, Chun-Liang Li, Barnabas Poczos
Carnegie Mellon University, Pittsburgh, PA, United States
{abdill, songweig, eunsuk, chunlial, bapoczos}@andrew.cmu.edu
Merging abstract concepts, also known as creative blending, is frequently seen as a fundamental component of creativity, as this merging can allow novel concepts to emerge from simple components [10]. One computational way to approach this is interpolation, a process that smoothly transitions from one instance to another.

In order to generate novel 3D shapes with machine learning, one must allow for such interpolations. The typical approach for incorporating this creative process is to interpolate in a learned latent space, so as to avoid generating unrealistic instances by exploiting the model's learned structure. For 2D images, this often utilizes a trained Generative Adversarial Network [11, 4] or Autoencoder [9, 13], which has shown promising results in creative generation [5]. As a basic requirement, the interpolation is supposed to form a semantically smooth morphing [2]. While this approach is sound for synthesizing realistic media such as lifelike portraits or new designs for everyday objects, it subjectively fails to directly model the unexpected, unrealistic, or creative [3, 8].

In this work, we present a method for learning how to interpolate point clouds. By encoding prior knowledge about real-world objects, the intermediate forms are both realistic and unlike any existing forms. We show not only how this method can be used to generate "creative" point clouds, but also how it can be leveraged to generate 3D models suitable for sculpture.
Figure 1: Learned interpolation between an airplane and a chair.
Consider two point clouds X_a = {p_i}_{i=1}^n and X_b = {p'_i}_{i=1}^n, where each p represents a 3-dimensional point. To generate a point cloud that semantically lies in between the inputs X_a and X_b, a straightforward method is to generate a point cloud X_ab as follows:

X_ab = {α p_i + (1 − α) p'_i}_{i=1}^n

Intuitively, this amounts to drawing a line between pairs of points in X_a and X_b and returning a point α percent of the distance between them. While this is simple to implement, it does not produce results that are semantically in between the objects the point clouds represent, as can be seen in Figure 2.

2 Learned Point Cloud Interpolation

While prior algorithms for generating creative point clouds have relied on a pretrained model, our method directly learns a transformation from point clouds to point clouds, using interpolation as the guiding framework. We parameterize interpolation from a start point cloud X_a to a goal point cloud X_b, where at each time step we apply a learned transformation to each point independently, given an encoding of X_b, denoted E(X_b):

p_i^{t+1} = p_i^t + h_t(p_i^t, E(X_b))

In each of the above transformations, the function h_t(·) is a multilayer perceptron [12]. All of the transformations are trained so that the set X̂_b produced after T transformations is close to X_b in Chamfer Distance, an error metric frequently used in set generation tasks [1]. The encoding network is parameterized as a Deep Sets model [14].

Figure 2: Naive interpolation fails to produce an interesting midpoint.

While this formulation allows us to visualize the trajectory of each point as it is transformed, and gives us enough expressivity to approximate the goal point clouds, it does not enforce the requirement that each intermediate point cloud is realistic.
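As a hypothetical illustration (not the authors' code), the naive linear blend and the Chamfer Distance objective described above can be sketched in NumPy:

```python
import numpy as np

def naive_interpolation(X_a, X_b, alpha):
    """Linearly blend two (n, 3) point clouds point-by-point.

    Assumes an arbitrary pairing (row i of X_a with row i of X_b),
    which is exactly why the midpoint is rarely semantic.
    """
    return alpha * X_a + (1.0 - alpha) * X_b

def chamfer_distance(X, Y):
    """Symmetric Chamfer Distance between two point sets [1]."""
    # Pairwise squared Euclidean distances, shape (n, m).
    d = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    # Each point is matched to its nearest neighbor in the other set.
    return d.min(axis=1).mean() + d.min(axis=0).mean()

rng = np.random.default_rng(0)
X_a = rng.normal(size=(1024, 3))   # stand-in for one shape
X_b = rng.normal(size=(1024, 3))   # stand-in for another
X_mid = naive_interpolation(X_a, X_b, alpha=0.5)
print(chamfer_distance(X_mid, X_b))
```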
For example, it could allow a mapping through a completely meaningless intermediate state that would not be recognized as a plausible (if unusual) 3D object. For this reason, we introduce an additional loss term motivated by computer graphics [6]:

L(X̂_b, X_b) = CD(X̂_b, X_b) + Σ_i Σ_{j ∈ N(i)} ‖(p_i − p_j) − (φ(p_i) − φ(p_j))‖²

where N(i) denotes the neighborhood of point i in the source point cloud and φ denotes the learned mapping. With this added term, we are able to maintain the topology of the beginning object, causing the network to find the most plausible correspondence between the source object and the target object. This loss function only penalizes the output of the algorithm, but it has the side effect of ensuring each intermediate step is topologically consistent as well.
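A hypothetical NumPy sketch of this combined objective follows; the choice of neighborhood N(i) as k-nearest neighbors in the source cloud is our assumption, not specified in the text:

```python
import numpy as np

def pairwise_sq_dists(X, Y):
    """(n, m) matrix of squared Euclidean distances."""
    return np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)

def chamfer_distance(X, Y):
    d = pairwise_sq_dists(X, Y)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def knn_indices(X, k):
    """Indices of each point's k nearest neighbors (excluding itself)."""
    d = pairwise_sq_dists(X, X)
    np.fill_diagonal(d, np.inf)
    return np.argsort(d, axis=1)[:, :k]

def topology_loss(X_src, X_out, nbrs):
    """Penalize change in local edge vectors:
    sum_i sum_{j in N(i)} ||(p_i - p_j) - (phi(p_i) - phi(p_j))||^2,
    where X_out plays the role of phi applied to X_src."""
    src_edges = X_src[:, None, :] - X_src[nbrs]   # (n, k, 3)
    out_edges = X_out[:, None, :] - X_out[nbrs]   # (n, k, 3)
    return np.sum((src_edges - out_edges) ** 2)

def total_loss(X_src, X_out, X_goal, nbrs):
    return chamfer_distance(X_out, X_goal) + topology_loss(X_src, X_out, nbrs)
```

Note that a rigid translation of the whole cloud leaves the topology term at zero, since every local edge vector is preserved; only deformations that stretch or tear the local neighborhood structure are penalized.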
Figure 3: Generated meshes from interpolating between pairs of objects (toilet and plant; airplane and person; piano and bowl; person and plant).

This technique allows one to use the vertices of a mesh as input, providing us with the correspondence needed to mesh the output. This removes the problem of meshing for creative sculpture generation, the motivating factor for creative sculpture generating algorithms [7]. In the authors' opinion, the conflict between the representational advantage of point clouds for machine learning tools and the artist's frequent need for solid 3D shapes has limited the adoption of generative models for sculptural art. Our method therefore represents an important step forward for creative AI.

References

[1] P. Achlioptas, O. Diamanti, I. Mitliagkas, and L. Guibas. Learning representations and generative models for 3D point clouds. arXiv preprint arXiv:1707.02392, 2017.
[2] D. Berthelot, C. Raffel, A. Roy, and I. Goodfellow. Understanding and improving interpolation in autoencoders via an adversarial regularizer. arXiv preprint arXiv:1807.07543, 2018.
[3] A. Bidgoli and P. Veloso. DeepCloud: The application of a data-driven, generative model in design. 01 2018.
[4] A. Brock, J. Donahue, and K. Simonyan. Large scale GAN training for high fidelity natural image synthesis. arXiv preprint arXiv:1809.11096, 2018.
[5] S. Carter and M. Nielsen. Using artificial intelligence to augment human intelligence. Distill, 2(12):e9, 2017.
[6] D. Ezuz, J. Solomon, and M. Ben-Chen. Reversible harmonic maps between discrete surfaces. arXiv preprint arXiv:1801.02453, 2018.
[7] S. Ge, A. Dill, E. Kang, C.-L. Li, L. Zhang, M. Zaheer, and B. Poczos. Developing creative AI to generate sculptural objects, 2019.
[8] D. P. Kingma and P. Dhariwal. Glow: Generative flow with invertible 1x1 convolutions. In Advances in Neural Information Processing Systems, pages 10215–10224, 2018.
[9] D. P. Kingma and M. Welling. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114, 2013.
[10] F. C. Pereira. Creativity and Artificial Intelligence: A Conceptual Blending Approach, volume 4. Walter de Gruyter, 2007.
[11] A. Radford, L. Metz, and S. Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434, 2015.
[12] F. Rosenblatt. The perceptron: a probabilistic model for information storage and organization in the brain. Psychological Review, 65(6):386, 1958.
[13] I. Tolstikhin, O. Bousquet, S. Gelly, and B. Schoelkopf. Wasserstein auto-encoders. arXiv preprint arXiv:1711.01558, 2017.
[14] M. Zaheer, S. Kottur, S. Ravanbakhsh, B. Poczos, R. R. Salakhutdinov, and A. J. Smola. Deep sets. In Advances in Neural Information Processing Systems, pages 3391–3401, 2017.