DRAW YOUR NEURAL NETWORKS
Jatin Sharma
Microsoft Research, Redmond, WA 98052, [email protected]
Shobha Lata
Microsoft Corporation, Redmond, WA 98052, [email protected]

ABSTRACT
Deep Neural Networks are the basic building blocks of modern Artificial Intelligence. They are increasingly replacing or augmenting existing software systems due to their ability to learn directly from data and their superior accuracy on a variety of tasks. Existing Software Development Life Cycle (SDLC) methodologies fall short of representing the unique capabilities and requirements of AI development and must be replaced with Artificial Intelligence Development Life Cycle (AIDLC) methodologies. In this paper, we discuss an alternative and more natural approach to developing neural networks that uses intuitive GUI elements such as blocks and lines to draw networks instead of complex computer programming. We present the Sketch framework, which uses this GUI-based approach to design and modify neural networks and provides interoperability with traditional frameworks. The system provides popular layers and operations out-of-the-box and can import any supported pre-trained model, making it a faster method to design and train complex neural networks and ultimately democratizing AI by removing the learning curve.

Keywords Graphical User Interfaces · Deep Neural Networks · AI-Development Life Cycle
1 Introduction

Deep Neural Networks have high expressibility and learn directly from the data without any feature engineering. This makes them highly versatile and the foundational blocks of modern Artificial Intelligence. These networks comprise dozens, and sometimes hundreds, of self-learning layers stacked one after another. Each of these layers has weights, biases, and several other learnable parameters that are adjusted through the back-propagation algorithm [1][2] during the network's training phase to learn useful patterns. Creating these networks is generally complex and requires expertise in computer programming, mathematics, and the application-specific domain. For example, deep neural networks for computer vision generally deploy convolutional layers [3], which look at spatial patterns, whereas networks for time-series analysis or language modeling require recurrent layers (e.g., RNN [1][4][5], LSTM [6][5]) or models such as BERT [7], which capture temporal patterns and long-term dependencies. A network for video processing may use both convolutional and recurrent layers. Some systems may deploy attention-based layers [8] that look for correlations between the input and the desired output. Therefore, it requires a steep learning curve, an in-depth technical background, and years of experience before making any meaningful contribution to designing novel architectures. There exist numerous deep learning frameworks such as PyTorch [9], TensorFlow [10], Keras [11], Caffe [12], Torch [13], MXNet [14], and so on that provide extensive capabilities to design and train neural networks, but they are programming-based and generally not interoperable. Recently, the ONNX [15] project has initiated a formal discussion and collaboration across the industry to encourage interoperability between these platforms.
There is also a lack of a change management system that could track modifications to existing networks and provide version control, such that it becomes possible to go back and forth between generations of trained neural networks.

This paper proposes the Sketch framework, which tries to bridge this gap in the technology and provides an alternative approach to AI development. Instead of computer programming, which requires language-specific syntax and structure, the system uses intuitive GUI elements such as blocks and lines to represent layers and their interconnections. Conceptually, it is similar to flow-charts: a user can just draw a neural network instead of programming it, and the resulting neural network is a visual sketch instead of a long monolithic piece of code. This provides an intuitive and much faster approach to neural network development. The system can also import and export models from major deep learning frameworks (e.g., PyTorch, Torch, ONNX), providing a standard interface for interoperability. Most applications require some standard layers (e.g., convolution layer, fully-connected layer, pooling, non-linearity), loss functions (e.g., mean square error, absolute loss), and other operations (e.g., batch normalization, dropout, identity). The proposed system comes with these standard layers and operations out-of-the-box and thus provides a much faster method to design and train complex neural networks without the technical complexities.

2 Related Work

Computer-programming-based neural network development has been the prominent approach in AI development so far. There are several publicly available deep learning frameworks, including PyTorch [9], TensorFlow [10], Keras [11], Caffe [12], Torch [13], and MXNet [14]. PyTorch and TensorFlow have been the most popular for their ease of use and large community support.
Another leading reason for their popularity is their support for Python as the primary development language, which is widely used in the research community due to its user-friendly syntax and gentler learning curve. These frameworks have their own custom implementations and generally lack interoperability, which presents a huge roadblock when moving from the prototyping to the production phase. In this direction, the Open Neural Network Exchange (ONNX) [15] is the first multi-organization attempt to encourage framework interoperability and shared optimization. Today more than 40 organizations support the ONNX initiative, which highlights the necessity of interoperability in AI.

However, there has not been significant development in GUI-based AI development and interoperability by design. There exist some well-supported graphical interfaces like Caffe GUI Tool [16], Nvidia Digits [17], Sony Neural Network Console [18], Mathworks nnstart [19], and Expresso [20], but they lack the capabilities of a comprehensive framework. The authors in [21] developed Barista as an open-source tool for designing and training neural networks. However, it is based on Caffe and therefore limited in its user base. Researchers at MIT developed the Elegant Neural Network User Interface (ENNUI) [22], a browser-based neural network editor that presents a drag-and-drop interface to create models and export them to Python code. However, it also lacks general usability, since it cannot import pre-existing and pre-trained models or export to interoperable formats. Recently, another tool, Deep Cognition [23], was launched as the first commercially available GUI-based editor for neural network development; it provides a more comprehensive set of functionality, including support for AutoML and multi-GPU training, but it is not open source and is limited to PyTorch and Keras only.
We believe that all these tools are paving a path towards a comprehensive graphical-interface-based neural network editor that could cater to the needs of the AI-Development Life Cycle.
3 Sketch Framework

Software Development Life Cycle (SDLC) is a well-established methodology in the field of computer science. It recommends clearly defined processes for creating high-quality software. With the rise of Artificial Intelligence (AI) and its increasing applications in software development, progress towards an AI-Development Life Cycle (AIDLC) is natural and roughly follows the same underlying principles as SDLC. However, it also poses unique challenges in 1) method of development, 2) interoperability, 3) explainability and debugging, 4) collaborative development, and 5) version control and maintenance. A comprehensive AI-development framework should provide basic functionality to support these areas. In our current work, we focus on method of development, interoperability, and maintenance. We develop Sketch, a graphical-interface-based AI development framework where a user can draw neural networks instead of writing complex computer code. The system supports importing existing model architectures along with their pre-trained weights to further improve them by training or benchmarking against new datasets. The system also supports a plug-and-play kernel-based approach where the user-drawn model is stored as an abstract graph representation and can be converted to any framework of choice, including PyTorch, Torch (Lua), and ONNX, thus providing interoperability by design. The proposed system has five major components, as depicted in figure 1. Next, we will discuss each of these components in detail.
3.1 Graphical Editor

This component provides the primary interaction between the user and the system. It encompasses all the UI elements that help in drawing neural networks and compiling them. Figure 2 demonstrates a screenshot of the system's graphical editor. Please visit https://github.com/jatinsha/sketch for further details and code related to the Sketch project.
3.2 Graph Abstraction

A user draws a neural network on a canvas in the Graphical Editor using various layers from the Toolbox and linking them together through inter-connections. To create an inter-connection, the user can click anywhere on the border of a layer and drag it to another layer. As the user draws the network, an abstract graph representation of it is additively created by the Graph Abstraction component. To accomplish this, an adjacency-list representation is maintained where the layers and their inter-connections are used as graph nodes and edges. Each node contains its nodeId and lists for its prior and next connections. An extensive set of processing happens in the background for various events, such as key bindings for shortcuts, selection and grouping of multiple elements, and their positioning and deletion, making it intuitive for the user to work with the system. Figure 3 illustrates an example adjacency list for the network shown in figure 2.

Figure 3: An example of the adjacency list representation used in Sketch's Graph Abstraction
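The adjacency-list abstraction described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not Sketch's actual implementation: the class names and the `params` field are assumptions, while `nodeId` and the prior/next connection lists come from the description in the text.

```python
class LayerNode:
    """A drawn layer, stored as a graph node with its inter-connections."""
    def __init__(self, node_id, layer_type, params=None):
        self.node_id = node_id          # the nodeId from the paper's description
        self.layer_type = layer_type    # e.g. "Convolution", "ReLU"
        self.params = params or {}      # layer hyperparameters (illustrative)
        self.prior = []                 # node_ids of incoming connections
        self.next = []                  # node_ids of outgoing connections

class GraphAbstraction:
    """Adjacency list built additively as the user draws on the canvas."""
    def __init__(self):
        self.nodes = {}

    def add_layer(self, node_id, layer_type, params=None):
        self.nodes[node_id] = LayerNode(node_id, layer_type, params)

    def connect(self, src_id, dst_id):
        # An inter-connection dragged from one layer's border to another
        # becomes a directed edge recorded on both endpoints.
        self.nodes[src_id].next.append(dst_id)
        self.nodes[dst_id].prior.append(src_id)
```

For example, drawing a Convolution layer followed by a ReLU and linking them would call `add_layer` twice and `connect` once, leaving the Convolution node with one entry in its next list and the ReLU node with one entry in its prior list.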
3.3 Binder

The Binder is the most crucial component of the Sketch system. It acts similarly to a compiler and converts the high-level abstract graph representation of the deep neural network into a framework-specific model architecture. The choice of target framework is defined through a kernel, which can be changed by the user at any time. For instance, if we draw a sequential neural network containing Convolution → BatchNormalization → ReLU → MaxPool → Convolution layers and pick PyTorch as the kernel, then compiling the canvas would generate a PyTorch model as a *.pth file along with a textual PyTorch representation in the text editor. If we switch the kernel to ONNX and re-compile the same canvas, then it would output an *.onnx model and an ONNX representation of the underlying neural network.

The Binder interface exposes two methods: exportModel and importModel. The exportModel method provides the basic capability to convert the Graph Abstraction into a framework-specific representation. On the other hand, importModel enables loading any pre-existing model from the supported frameworks and converting it into the system's Graph Abstraction and a sketch on the canvas. This gives the tool the tremendous capability not only to create new model architectures but also to use and modify the huge number of pre-trained models available in the research community. The Binder interface is implemented by various kernels, which provide the actual low-level conversion logic and weight transfer between the frameworks. The Sketch system currently supports PyTorch, Torch (Lua), and ONNX through corresponding kernels. Similar kernels could be implemented for TensorFlow, Caffe, or any other framework quite easily in a plug-and-play fashion.
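The plug-and-play kernel design can be illustrated with a minimal sketch. The exportModel/importModel names come from the Binder interface described above, but the abstract base class, the kernel registry, and the placeholder return values are assumptions for illustration only, not Sketch's actual code.

```python
from abc import ABC, abstractmethod

class Kernel(ABC):
    """A framework-specific backend implementing the Binder interface."""
    @abstractmethod
    def exportModel(self, graph):
        """Convert the abstract graph into a framework-specific model."""

    @abstractmethod
    def importModel(self, path):
        """Load a pre-existing model and convert it back into a graph."""

class PyTorchKernel(Kernel):
    def exportModel(self, graph):
        # A real kernel would walk the adjacency list, emit the layers,
        # and write a *.pth file; here we return a placeholder summary.
        return f"pytorch model with {len(graph)} layers"

    def importModel(self, path):
        # A real kernel would parse the model file into a graph.
        return {}

# New frameworks (ONNX, TensorFlow, Caffe, ...) register their kernels here.
KERNELS = {"pytorch": PyTorchKernel}

def compile_canvas(graph, kernel_name):
    # Switching the kernel changes the output format without redrawing.
    return KERNELS[kernel_name]().exportModel(graph)
```

The key design point is that the canvas and graph abstraction never change; only the kernel chosen at compile time determines whether the output is a *.pth file, an *.onnx file, or something else.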
3.4 State Manager

The State Manager is responsible for tracking the system state. At all times, crucial state information such as the system version, current working directory path, list of open tabs, canvas state, and so on is tracked. When the editor is closed, this state information is saved to disk and reloaded whenever the editor is reopened. Using this mechanism, the Sketch editor tries to restore the open files and canvases which were being worked on in the previous session. Every modification on the canvas is recorded by a checkpointing mechanism under the State Manager and is used to provide the undo and redo functionalities. Figure 4 illustrates the key steps in Sketch-based AI development.

Figure 4: A flowchart diagram illustrating the key steps in Sketch-based AI development
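The checkpoint-based undo/redo mechanism can be sketched with two stacks. This is a common design for such a mechanism, shown here as a hypothetical illustration; the paper does not specify how Sketch's State Manager stores its checkpoints.

```python
class StateManager:
    """Records a checkpoint per canvas modification to support undo/redo."""
    def __init__(self, initial_state):
        self.undo_stack = [initial_state]  # every modification is checkpointed
        self.redo_stack = []

    @property
    def current(self):
        return self.undo_stack[-1]

    def checkpoint(self, new_state):
        self.undo_stack.append(new_state)
        self.redo_stack.clear()            # a new edit invalidates redo history

    def undo(self):
        if len(self.undo_stack) > 1:       # keep at least the initial state
            self.redo_stack.append(self.undo_stack.pop())
        return self.current

    def redo(self):
        if self.redo_stack:
            self.undo_stack.append(self.redo_stack.pop())
        return self.current
```

Persisting the session would then amount to serializing the current state (open tabs, working directory, canvas) to disk on close and reloading it on startup.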
3.5 Logging

To provide a debugging and maintenance mechanism, crucial programmatic events (e.g., loading/saving files, changing the kernel, compiling the canvas) are logged. During an exception, the appropriate stack trace is logged as well. This provides a sequential trace of the various actions and system states, which can be used to debug and enhance the system.
4 Conclusion

In the present work, we have discussed the need for research towards an AI-Development Life Cycle, as existing Software Development Life Cycle strategies may not completely represent the unique challenges faced by AI development. From computer-coding-based methods to framework-specific implementations, and from trace-based debugging to version control, there exist several areas that need to be reworked in accordance with the unique characteristics of AI. We developed the Sketch framework, which (1) provides an alternative and natural approach to developing neural networks, and (2) brings interoperability by design by providing import and export capabilities for major frameworks in a plug-and-play fashion. This removes the dependency upon extensive knowledge of the underlying deep learning framework and programming language syntax, and provides lego-like functionality to build over existing operators. We believe such an approach could democratize AI development. In the future, we intend to work towards implementing in-built training, version control, collaborative development, and visual debugging, towards a comprehensive, one-stop framework for AI development.

References

[1] David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams. Learning representations by back-propagating errors. Nature, 323(6088):533–536, Oct 1986.
[2] Yann LeCun. A theoretical framework for back-propagation. In D. Touretzky, G. Hinton, and T. Sejnowski, editors,
Proceedings of the 1988 Connectionist Models Summer School, CMU, Pittsburgh, PA, pages 21–28. Morgan Kaufmann, 1988.
[3] Yann LeCun and Yoshua Bengio. Convolutional Networks for Images, Speech, and Time Series, pages 255–258. MIT Press, Cambridge, MA, USA, 1998.
[4] M. I. Jordan. Serial order: A parallel distributed processing approach. Technical report, June 1985–March 1986.
[5] Alex Sherstinsky. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Physica D: Nonlinear Phenomena, 404:132306, Mar 2020.
[6] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 9:1735–1780, 12 1997.
[7] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota, June 2019. Association for Computational Linguistics.
[8] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS'17, pages 6000–6010, Red Hook, NY, USA, 2017. Curran Associates Inc.
[9] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. PyTorch: An imperative style, high-performance deep learning library. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, editors, Advances in Neural Information Processing Systems 32, pages 8024–8035. Curran Associates, Inc., 2019.
[10] Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. TensorFlow: A system for large-scale machine learning. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation, OSDI'16, pages 265–283, USA, 2016. USENIX Association.
[11] François Chollet and others. Keras: The Python Deep Learning library, June 2018.
[12] Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, and Trevor Darrell. Caffe: Convolutional architecture for fast feature embedding. In Proceedings of the 22nd ACM International Conference on Multimedia, MM '14, pages 675–678, New York, NY, USA, 2014. Association for Computing Machinery.
[13] R. Collobert, K. Kavukcuoglu, and C. Farabet. Torch7: A Matlab-like environment for machine learning. In BigLearn, NIPS Workshop, 2011.
[14] Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang. MXNet: A flexible and efficient machine learning library for heterogeneous distributed systems. CoRR, abs/1512.01274, 2015.
[15] Junjie Bai, Fang Lu, Ke Zhang, et al. ONNX: Open Neural Network Exchange. https://github.com/onnx/onnx, 2019.
[16] Caffe GUI Tool. https://github.com/Chasvortex/caffe-gui-tool. Accessed: 2020-12-03.
[17] Nvidia Digits. https://developer.nvidia.com/digits. Accessed: 2020-12-03.
[18] Sony Neural Network Console. https://dl.sony.com/. Accessed: 2020-12-03.
[19] Mathworks nnstart. Accessed: 2020-12-03.
[20] Ravi Kiran Sarvadevabhatla and R. Venkatesh Babu. Expresso: A user-friendly GUI for designing, training and exploring convolutional neural networks, 2015.
[21] Soeren Klemm, Aaron Scherzinger, Dominik Drees, and Xiaoyi Jiang. Barista - a graphical tool for designing and training deep neural networks, 2018.
[22] Jesse Michel, Zack Holbrook, Stefan Grosser, and Rikhav Shah. Elegant neural network user interface. https://github.com/martinjm97/ENNUI.
[23] Deep Cognition. https://deepcognition.ai/