Methodological Assessment of Data Suitability for Defect Prediction
Abstract
Purpose: This paper provides a domain-specific concept for assessing the suitability of data from various sources along the production chain for defect prediction.
Methodology/Approach: A seven-phase methodology is developed for assessing data suitability for defect prediction in interlinked production steps. For this purpose, the manufacturing process is mapped and variables that potentially influence the origin of defects are identified. The available data is evaluated and quantified against the criteria of relevancy, completeness, appropriate amount of data, accessibility and interpretability. The individual assessments are then visualized in an overview, gaps in data acquisition are identified and needs for action are derived.
Findings: The research results in a seven-phase methodology for systematically assessing data suitability for defect prediction and identifying data gaps in interlinked production steps.
Research Limitation/Implication: This research is limited to the analysis of contextual data quality for the use case of defect prediction. Other data analytics applications and processes outside of manufacturing are not covered.
Originality/Value of paper: The paper provides a new approach to identifying gaps in data acquisition by systematically assessing data suitability for defect prediction and deriving needs for action. Subsequent optimization of the data basis is intended to improve the accuracy of predictive defect models.
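The assessment and gap-identification steps named under Methodology/Approach can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names only the five criteria, so the 0 to 3 rating scale, the threshold of 2, the treatment of missing ratings, and the process steps and variables (`casting`, `melt_temperature`, `machining`, `tool_wear`) are all hypothetical assumptions chosen for illustration.

```python
from dataclasses import dataclass

# The five criteria named in the abstract.
CRITERIA = ("relevancy", "completeness", "amount_of_data",
            "accessibility", "interpretability")


@dataclass(frozen=True)
class Assessment:
    step: str      # production step (hypothetical example names below)
    variable: str  # potential influencing variable on defect origin
    scores: dict   # criterion -> rating on an ASSUMED 0..3 scale


def find_gaps(assessments, threshold=2):
    """Return (step, variable, criterion, score) for each rating below threshold.

    The threshold is an assumption; the abstract does not specify one.
    """
    gaps = []
    for a in assessments:
        for c in CRITERIA:
            score = a.scores.get(c, 0)  # missing rating counted as 0 (assumption)
            if score < threshold:
                gaps.append((a.step, a.variable, c, score))
    return gaps


def overview(assessments):
    """Render a plain-text matrix: one row per variable, one column per criterion."""
    header = f"{'step/variable':<28}" + "".join(f"{c[:6]:>8}" for c in CRITERIA)
    rows = [header]
    for a in assessments:
        label = f"{a.step}/{a.variable}"
        cells = "".join(f"{a.scores.get(c, 0):>8}" for c in CRITERIA)
        rows.append(f"{label:<28}" + cells)
    return "\n".join(rows)


# Hypothetical ratings for two interlinked production steps.
assessments = [
    Assessment("casting", "melt_temperature",
               {"relevancy": 3, "completeness": 1, "amount_of_data": 2,
                "accessibility": 3, "interpretability": 3}),
    Assessment("machining", "tool_wear",
               {"relevancy": 3, "completeness": 3, "amount_of_data": 3,
                "accessibility": 0, "interpretability": 2}),
]

gaps = find_gaps(assessments)
print(overview(assessments))
for step, variable, criterion, score in gaps:
    print(f"gap: {step}/{variable} -> {criterion} = {score}")
```

With these made-up ratings, the sketch flags the completeness of `melt_temperature` and the accessibility of `tool_wear` as gaps in data acquisition. In practice, each flagged entry would be the starting point for deriving a need for action.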
Authors
This is an open access journal, which means that all content is freely available without charge to the user or his/her institution. Users are allowed to read, download, copy, distribute, print, search, or link to the full texts of the articles in this journal without asking prior permission from the publisher or the author. This is in accordance with the BOAI definition of open access. This journal is licensed under a Creative Commons Attribution 4.0 License - http://creativecommons.org/licenses/by/4.0.
Authors who publish with Quality Innovation Prosperity agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.