Proceedings of the 29th ACM International Conference on Multimedia | 2021
MeronymNet: A Hierarchical Model for Unified and Controllable Multi-Category Object Generation
Abstract
We introduce MeronymNet, a novel hierarchical approach for controllable, part-based generation of multi-category objects using a single unified model. We adopt a guided coarse-to-fine strategy involving semantically conditioned generation of bounding box layouts, pixel-level part layouts and, ultimately, the object depictions themselves. We use Graph Convolutional Networks and Deep Recurrent Networks, along with custom-designed Conditional Variational Autoencoders, to enable flexible, diverse and category-aware generation of 2-D objects in a controlled manner. Quantitative scores for the generated objects show that MeronymNet outperforms multiple strong baselines and ablative variants. We also showcase MeronymNet's suitability for controllable object generation and interactive object editing at various levels of structural and semantic granularity.
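The coarse-to-fine data flow described above (category and part list, then bounding box layout, then pixel-level part layout, then object depiction) can be illustrated with a minimal sketch. This is not the MeronymNet implementation: the learned components named in the abstract are replaced here by seeded random and rule-based stand-ins, and every name (`Box`, `sample_box_layout`, `boxes_to_part_mask`, `render_object`, `generate`), the 16x16 canvas and the colour palette are hypothetical choices made only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned box for one part; x1 and y1 are exclusive."""
    part: str
    x0: int
    y0: int
    x1: int
    y1: int


def sample_box_layout(category, parts, size=16, seed=0):
    """Stage 1 stand-in: one box per requested part.

    A seeded random generator replaces the learned, category-conditioned
    layout model, so the output is only structurally representative.
    """
    rng = random.Random(f"{category}:{seed}")
    boxes = []
    for part in parts:
        x0 = rng.randrange(0, size - 4)
        y0 = rng.randrange(0, size - 4)
        w = rng.randrange(2, 5)
        h = rng.randrange(2, 5)
        boxes.append(Box(part, x0, y0, x0 + w, y0 + h))
    return boxes


def boxes_to_part_mask(boxes, parts, size=16):
    """Stage 2 stand-in: per-pixel part labels, with 0 as background.

    Each box is simply filled with its part label; later boxes overwrite
    earlier ones where they overlap.
    """
    mask = [[0] * size for _ in range(size)]
    for box in boxes:
        label = parts.index(box.part) + 1
        for y in range(box.y0, box.y1):
            for x in range(box.x0, box.x1):
                mask[y][x] = label
    return mask


def render_object(mask, palette):
    """Stage 3 stand-in: map every part label to a flat RGB colour."""
    return [[palette[label] for label in row] for row in mask]


def generate(category, parts, size=16, seed=0):
    """Run the three stages in order and return every intermediate result."""
    boxes = sample_box_layout(category, parts, size, seed)
    mask = boxes_to_part_mask(boxes, parts, size)
    palette = [(0, 0, 0)] + [
        ((40 * (i + 1)) % 256, 90, 200) for i in range(len(parts))
    ]
    image = render_object(mask, palette)
    return boxes, mask, image


if __name__ == "__main__":
    boxes, mask, image = generate("cow", ["head", "torso", "leg"])
    for row in mask:
        print("".join(str(label) for label in row))
```

Returning the intermediate box and mask representations alongside the final image mirrors the idea that each level of the hierarchy can be inspected or edited before the next stage runs; in this sketch, editing a `Box` and re-running the last two stages would stand in for that kind of interaction.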