Abstract
As deep learning technology continues to evolve, deep neural network (DNN) models have found their way into numerous modern software applications and systems, serving as crucial components. Despite the widespread adoption of DNN models in software, their development process still largely adheres to a craft production model [1]. This craft production approach yields unique, highly specialized DNN models that may excel within their target software but prove difficult to standardize or adapt for compatibility with other software systems. In addition, because these crafted DNN models are trained holistically, they cannot be easily disassembled or reassembled to accommodate new software requirements. Consequently, the reuse of DNN models remains a significant challenge in software engineering, hindering the potential for greater efficiency and adaptability in the development process.
At present, the primary approach to reusing DNN models is to retrain them, either by fine-tuning or by training from scratch, in the target domain. This retraining necessitates a new dataset and incurs substantial training costs. Moreover, acquiring a new dataset entails additional data collection and labeling efforts, even when the target domain differs only marginally from the original one. In some cases, it becomes essential to devise distinct DNN structures tailored to the data characteristics or to specific software requirements. These factors underscore the craft production nature of DNN model development and its lack of scalability and adaptability.
In conventional software engineering, software architecture [2] serves as a blueprint for complex software systems and development projects, as proposed and developed by software engineering researchers. This architectural perspective envisions software as a collection of computational components, connectors, and constraints [3], which dictate the interactions between these components. When DNN models are incorporated into a software architecture, the primary objective of DNN model integration is therefore to establish components, connectors, and constraints for DNN model design. A DNN model, naturally constructed from multiple layers, functions as a DNN component. However, directly stitching together different DNN components presents several challenges: 1) the output generated by each DNN component is difficult to comprehend; 2) directly establishing a connection between DNN components usually incurs the expensive cost of retraining or fine-tuning; 3) the constraints currently present in DNN models are primarily structural rather than explicitly semantic. Given that developed DNN components are not easily modified, our primary focus is on establishing connections between DNN components that alter the software's functionality without retraining the components. In this way, deep neural networks, functioning as components, can operate cohesively within the software system. As a result, any change to software requirements affects only the connections between components, eliminating the need for developers to retrain the models.
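The components/connectors/constraints view can be made concrete with a minimal sketch. The class names and toy components below are illustrative assumptions for exposition, not part of any described implementation:

```python
# Minimal sketch of software as components wired by connectors under
# constraints. All names here are hypothetical, chosen for illustration.
from typing import Callable

class Component:
    """A computational unit exposing a single callable interface."""
    def __init__(self, name: str, fn: Callable):
        self.name = name
        self.fn = fn

    def __call__(self, x):
        return self.fn(x)

class Connector:
    """Routes one component's output into another, checking a constraint."""
    def __init__(self, source: Component, target: Component,
                 constraint: Callable[[object], bool]):
        self.source, self.target, self.constraint = source, target, constraint

    def __call__(self, x):
        intermediate = self.source(x)
        assert self.constraint(intermediate), "connector constraint violated"
        return self.target(intermediate)

# Toy example: a tokenizer component feeding a counter component.
tokenize = Component("tokenizer", lambda s: s.split())
count = Component("counter", lambda tokens: len(tokens))
pipeline = Connector(tokenize, count, constraint=lambda v: isinstance(v, list))
print(pipeline("components connectors constraints"))  # → 3
```

Under this view, swapping the constraint or the wiring changes system behavior while both components stay untouched, which is the property the paper seeks for DNN components.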
In this paper, we propose a novel method, NeuralNector, to solve the problem of DNN component integration. In NeuralNector, we design a programmable semantic connector. The connector can 1) program a clear semantic output from the DNN component's raw output and 2) program a logical rule component that satisfies the semantic constraints and connects DNN components through the programmed outputs. As shown in Figure 1, by developing an easy-to-establish programmable semantic connector, effective and adaptable DNN model integration can be achieved, allowing seamless integration of DNN models into software systems without DNN model retraining. The proposed NeuralNector significantly enhances the efficiency of software development involving deep learning models as integral components. We design comprehensive experiments to evaluate our DNN component integration approach. The evaluation primarily focuses on the classification of transportation and animals from the PASCAL VOC dataset. The concepts from PASCAL VOC are listed in Table 1.
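The two functions of the connector can be sketched as follows. The concept names, thresholds, and rules below are invented for illustration; in the actual method, concept extractors would be small classifiers trained on a component's representations:

```python
# Hedged sketch of a programmable semantic connector's two pieces:
# (1) concept extractors turning a component's raw output vector into
#     human-readable boolean concepts, and (2) a logical rule component
#     mapping those concepts to a target label. Everything concrete here
#     (concepts, thresholds, rules) is a made-up stand-in.

def extract_concepts(raw_output):
    """Map a raw feature/score vector to named semantic concepts."""
    return {
        "has_wings": raw_output[0] > 0.5,
        "has_wheels": raw_output[1] > 0.5,
    }

def logical_rule_component(concepts):
    """Combine concepts with explicit, inspectable logical rules."""
    if concepts["has_wings"] and not concepts["has_wheels"]:
        return "bird"
    if concepts["has_wheels"]:
        return "vehicle"
    return "unknown"

raw = [0.9, 0.1]  # pretend raw output of an upstream DNN component
print(logical_rule_component(extract_concepts(raw)))  # → bird
```

Because the rules are ordinary code rather than trained weights, a requirement change only edits `logical_rule_component`, leaving the upstream DNN component frozen.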
Programmable Semantic Connector
The training accuracy of each semantic concept extractor is listed in Table 1. Several selected concepts and their observations are depicted in Figure 2. The accuracy of the logical rule component reaches 97.6%, demonstrating that our choice of concepts is reasonable and that a logical relationship exists between the concepts and the original labels.
DNN Component Integration
We use three classic DNN structures to construct the components MA and MB, and build the corresponding programmable semantic connector for each component pair. The results for each DNN structure are shown in Table 2. They show that our approach can be used for DNN component integration without excessive attention to the specific DNN architecture, indicating the compatibility of our approach.
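The integration of two frozen components through a connector can be illustrated with a toy sketch; the stand-in functions below are not trained networks, and the data and concept names are invented:

```python
# Illustrative integration of two frozen components M_A and M_B through a
# programmable connector: changing requirements only changes the glue.
# All values below are fabricated stand-ins for real component outputs.

def M_A(image_id):
    """Frozen component A: emits a raw score vector for a sample."""
    scores = {"img1": [0.8, 0.2], "img2": [0.1, 0.9]}
    return scores[image_id]

def M_B(concept_flags):
    """Frozen component B: consumes semantic concepts, not raw scores."""
    return "animal" if concept_flags["is_living"] else "transport"

def connector(raw):
    """Only this glue is reprogrammed when requirements change."""
    return {"is_living": raw[0] > 0.5}

print(M_B(connector(M_A("img1"))))  # → animal
print(M_B(connector(M_A("img2"))))  # → transport
```

The point of the sketch is architectural: neither `M_A` nor `M_B` is modified when the pipeline is rewired, mirroring integration without retraining.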
Data Requirements and Transferability of Concepts
To make the programmable semantic connector practical, we evaluate the performance of concept extractors trained on a smaller dataset (10% of the training data). The results are listed in Table 3. Even with only one-tenth of the original data available, the average accuracy is not significantly affected (93.9% down to 92.0%). We also evaluate the transferability of the concept extractors; the process of this experiment is illustrated in Figure 3. The accuracy of the DNN component MC is 94%. After integrating it with MB, which has a training accuracy of 97.7%, the accuracy over the 12 categories reaches 85.8%. This result indicates that the representations from the DNN component effectively extract common, dataset-independent semantic information from the samples, and the proposed method capitalizes on this advantage to exhibit transferability.
In summary, we presented a novel approach for integrating DNN components via a programmable semantic connector. The extensive evaluation demonstrated the effectiveness and compatibility of our approach across various datasets, DNN architectures, and practical scenarios. The semantic concept extractors can be programmed with limited data and possess strong transferability to other DNN components. In this way, our approach opens new possibilities for efficient model integration and adaptation from a software engineering perspective, pushing the development of DNN components toward a mass production paradigm.