A software process is a combination of software engineering practices established to control development costs, improve software quality, and deliver on schedule. As software systems grow in scale and process data accumulates, software process management has become increasingly important. Our software process group has long conducted research on software process modeling, software process mining, software process improvement, software process quality measurement, software process data governance, and related topics. Combining theory and practice, our work on software process simulation modeling has been published for many years at leading international software process conferences and has been put into practice at well-known domestic IT companies.

Research Project 1: Hybrid simulation modeling of software process

We built a hybrid simulation model that combines the system dynamics (SD) and discrete-event (DE) paradigms for the Kanban development process at a partner company. Based on the experience gained from this case study, the project proposed a practice-oriented simulation modeling method. We also discussed the challenges and pain points of process simulation modeling: data traceability is hard to establish, model variables and their interrelationships are difficult to identify and quantify, and simulation models depend heavily on the modeler's experience and understanding of the real process. In response, we have continued to develop process data management and simulation modeling tools to lay a practical foundation for applying software process simulation modeling in real settings.
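The coupling of the two paradigms can be illustrated with a minimal sketch. It is not the model built in the case study: the `fatigue` stock, its coefficients, and every parameter name and value (`wip_limit`, `arrival_every`, `effort`, `base_rate`) are assumptions chosen for illustration. Work items arrive, are pulled, and complete as discrete events, while a continuous stock integrated with Euler steps sets the rate at which those items are worked on.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    id: int
    effort: float                 # remaining effort in person-days
    arrived: float                # arrival time in days
    finished: Optional[float] = None


def simulate(days=60.0, dt=0.25, wip_limit=3, arrival_every=2.0,
             effort=3.0, base_rate=1.0):
    """Toy hybrid SD+DE run of a one-column Kanban board (all values hypothetical)."""
    backlog, doing, done = deque(), [], []
    fatigue = 0.0                 # SD stock, kept within [0, 0.9]
    next_arrival, next_id = 0.0, 0
    for k in range(int(days / dt)):
        t = k * dt
        # DE: arrival events add new work items to the backlog
        while next_arrival <= t:
            backlog.append(Item(next_id, effort, next_arrival))
            next_id += 1
            next_arrival += arrival_every
        # DE: pull events, constrained by the WIP limit
        while backlog and len(doing) < wip_limit:
            doing.append(backlog.popleft())
        # SD: Euler integration of the fatigue stock
        # (inflow grows with board load, outflow is recovery)
        load = len(doing) / wip_limit
        fatigue += dt * (0.02 * load - 0.05 * fatigue)
        fatigue = min(max(fatigue, 0.0), 0.9)
        # Coupling: the SD level determines the DE service rate
        rate = base_rate * (1.0 - fatigue)
        if doing:
            share = rate * dt / len(doing)
            for item in doing:
                item.effort -= share
        # DE: completion events
        for item in [i for i in doing if i.effort <= 1e-9]:
            item.finished = t + dt
            doing.remove(item)
            done.append(item)
    return done, fatigue


done, fatigue = simulate()
lead_times = [item.finished - item.arrived for item in done]
print(len(done), "items done; mean lead time:", sum(lead_times) / len(lead_times))
```

Rerunning `simulate` with a different `wip_limit` gives a simple what-if comparison of lead times, which is the kind of question such a model is typically used to explore.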

Research Project 2:

To build and apply a high-fidelity software process simulation model, we need to systematically understand and manage data quality problems in software process data, explore which intelligent techniques can solve them, and examine how much the governed data improves the simulation model. Specifically, this project addresses traceability recovery for multi-source data in industrial settings. It mines the traceability relationships among the main development artifacts, such as requirements, code, defects, and test cases, and introduces several machine-learning-based enhancement strategies for data pre-processing, such as data unlabeling, semi-supervised learning, and active learning. In real large-scale software projects, these enhancement strategies recover the relationships among process data more effectively than traditional traceability recovery techniques developed on open-source projects.
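To make the task concrete, the sketch below shows a plain information-retrieval baseline for traceability recovery: it scores candidate links between two artifact sets by bag-of-words cosine similarity. This is a common baseline and not the enhancement strategies of this project. The artifact IDs and texts are hypothetical. `most_uncertain` only hints at how an active-learning step might choose pairs for manual labeling, here assumed to be the pairs whose score lies closest to a decision threshold.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercased bag of words for a piece of artifact text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def candidate_links(sources, targets):
    """Score every (source, target) pair, highest similarity first."""
    scored = [(cosine(tokens(s_text), tokens(t_text)), s_id, t_id)
              for s_id, s_text in sources.items()
              for t_id, t_text in targets.items()]
    return sorted(scored, reverse=True)


def most_uncertain(scored, threshold=0.5, k=2):
    """Pick the k pairs whose score is closest to the threshold (assumed query rule)."""
    return sorted(scored, key=lambda link: abs(link[0] - threshold))[:k]


# Hypothetical artifacts, for illustration only.
requirements = {
    "REQ-1": "User can reset password via email",
    "REQ-2": "Export report as PDF",
}
commits = {
    "c1": "Add password reset email flow",
    "c2": "Fix PDF export of report",
}

links = candidate_links(requirements, commits)
for score, req_id, commit_id in links:
    print(f"{req_id} -> {commit_id}: {score:.2f}")
print("ask an expert about:", most_uncertain(links))
```

Lexical similarity of this kind breaks down when artifacts share little vocabulary, which is one reason learning-based strategies are of interest for industrial data.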

Research Project 3: Code quality measurement

Quality, cost, and time are the three major elements of R&D management. Quality is the most fundamental of the three, because it determines the R&D efficiency and product quality of the whole project. In this project, we build a code quality assessment system that combines software process control with result measurement, extending code quality into the process so that it can be analyzed and improved through quantitative process management. Compared with traditional software quality models, this project takes process practices as the entry point for improving and controlling software process quality, and ultimately for assessing and improving code quality. Within an industry-university-research cooperation project, we have implemented measurement algorithms for eight code quality attributes and applied best-practice strategies for further improvement. The system has also been applied in enterprises, which lays a good foundation for promoting and refining the code quality assessment system in the future.

Research Project 4:

This project defines continuous integration efficiency as the number of successfully integrated code commits per unit of resource expenditure. We proposed a simulation-based framework for improving continuous integration efficiency as a whole, which integrates three lines of research. The first is continuous integration result prediction, which uses code commit logs, continuous integration logs, and other data to train machine-learning-based predictors of integration results. The current mainstream strategy is to predict the result before running continuous integration: if the predicted result is a pass, the execution is skipped; if it is a failure, the integration is executed. This reduces how often integration has to run. The second is test case prioritization, which ranks test cases using information such as line coverage and executes them in priority order, so that all defects can be found by executing fewer test cases. The third uses software process simulation to simulate a continuous integration process that incorporates the result predictor (the first study) and the test case set optimizer (the second study), so that the improvement can be evaluated from the perspective of the entire continuous integration process.
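The efficiency definition and the two optimizations can be made concrete with a small sketch. It rests on several assumptions: each commit is reduced to a hypothetical pair `(predicted_pass, actually_passes)`, every executed build costs the same, and a skipped commit counts as successfully integrated only if it would actually have passed. The prioritization function uses the common greedy "additional coverage" heuristic, which is not necessarily the technique used in this project.

```python
def ci_efficiency(commits, cost_per_build=1.0, skip_predicted_pass=True):
    """Return (efficiency, missed_failures) for a list of
    (predicted_pass, actually_passes) pairs.

    Efficiency = successfully integrated commits / resource expenditure.
    """
    executed = successes = missed_failures = 0
    for predicted_pass, actually_passes in commits:
        if skip_predicted_pass and predicted_pass:
            # Build skipped: no cost, but a wrong prediction lets a failure through.
            if actually_passes:
                successes += 1
            else:
                missed_failures += 1
        else:
            executed += 1
            if actually_passes:
                successes += 1
    cost = executed * cost_per_build
    efficiency = successes / cost if cost else float("inf")
    return efficiency, missed_failures


def prioritize(coverage):
    """Greedy additional-coverage ordering.

    coverage maps a test name to the set of code lines it covers.
    Ties are broken alphabetically by test name.
    """
    remaining = dict(coverage)
    covered, order = set(), []
    while remaining:
        best = max(sorted(remaining), key=lambda t: len(remaining[t] - covered))
        order.append(best)
        covered |= remaining.pop(best)
    return order


# Hypothetical data, for illustration only.
commits = [(True, True), (True, True), (True, False), (False, False), (False, True)]
print("always build:    ", ci_efficiency(commits, skip_predicted_pass=False))
print("skip on predicted pass:", ci_efficiency(commits))
print(prioritize({"t1": {1, 2}, "t2": {2, 3, 4}, "t3": {5}}))
```

The second return value makes the trade-off of the skip strategy visible: skipping raises efficiency but lets mispredicted failures through, which is why evaluating the combined effect over the whole process calls for simulation.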

List of Outcomes

Publications

1. Bohan Liu, Guoping Rong, Liming Dong, He Zhang, Danni Chen, Tiange Chen, Yuyan Chen, Tiantian Zhang: What are the factors affecting the handover process in open source development? J. Syst. Softw. 153: 238-254 (2019)

2. Guoping Rong, He Zhang, Bohan Liu, Qi Shan, Dong Shao: A replicated experiment for evaluating the effectiveness of pairing practice in PSP education. J. Syst. Softw. 136: 139-152 (2018)

3. Bohan Liu, He Zhang, Liming Dong: A survey of DevOps in China [J/OL]. Journal of Software (软件学报): 1-22 [2019-10-08] (in Chinese)

4. Bohan Liu, He Zhang, Saichun Zhu: An Incremental V-Model Process for Automotive Development. APSEC 2016: 225-232

5. Bohan Liu, He Zhang, Lanxin Yang, Liming Dong, Haifeng Shen, Kaiwen Song: An Experimental Evaluation of Imbalanced Learning and Time-Series Validation in the Context of CI/CD Prediction. EASE 2020: 21-30

6. Guoping Rong, Bohan Liu, He Zhang, Qiuping Zhang, Dong Shao: Towards Confidence with Capture-recapture Estimation: An Exploratory Study of Dependence within Inspections. EASE 2017: 242-251

7. Liming Dong, Bohan Liu, Zheng Li, Bingbing Xue, Danni Chen, Tiange Chen: Mining Handover Process in Open Source Development: An Exploratory Study. APSEC 2017: 378-387

8. Liming Dong, Bohan Liu, Zheng Li, Ou Wu, Muhammad Ali Babar, Bingbing Xue: A Mapping Study on Mining Software Process. APSEC 2017: 51-60

9. Haojie Gong, He Zhang, Dexian Yu, Bohan Liu: A systematic map on verifying and validating software process simulation models. ICSSP 2017: 50-59

10. Yue Li, He Zhang, Liming Dong, Bohan Liu, Jinyu Ma: Constructing a Hybrid Software Process Simulation Model in Practice: An Exemplar from Industry. ICSSP 2020: 135-144

Patents

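1. He Zhang, Bohan Liu, Guoping Rong, Lanxin Yang. An optimized prediction method for continuous integration and deployment results. Application No. 202010129434.7.

2. Xuefei Song, He Zhang, Bohan Liu, Guoping Rong, Dong Shao. A testing method and system for test case prioritization using ensemble learning. Application No. 202010432137.X.

3. Bohan Liu, Kaiwen Song, Guoping Rong, He Zhang. An evaluation method, computer device, and medium for software continuous integration. Application No. 202011635238.3.

4. Xueshan Gu, He Zhang, Bohan Liu. A recovery method for the fuzzy association relationship between test case defects and test cases. Application No. 20211069722.7.

5. Jian Chen, He Zhang, Bohan Liu, Guoping Rong, Dong Shao. A topic-model-based method, device, and medium for identifying microservice concerns. Application No. 202010431043.0.

6. Zan Gao, He Zhang, Xiaodong Zhang, Guoping Rong, Bohan Liu, Dong Shao. A method for converting a system dynamics model into an XML file. Application No. 202010373637.0.

7. Xiaodong Zhang, He Zhang, Zan Gao, Guoping Rong, Bohan Liu, Dong Shao. An XML-based system dynamics simulation modeling method and engine. Application No. 202010382055.9.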