Talk:Artificial intelligence in industry

Restructuring
I think this article could be an important resource for anyone interested in applications of AI and ML in industrial settings, but at the moment it lacks depth and structure. I therefore propose a significant rewrite and restructuring of the article to give a broader overview of the field and to provide the reader with more detail and additional starting points for further reading. I propose the following structure:

Introduction

General information about the topic.

Areas of Application

Introductory text. (Each of the following areas is covered in more depth.)

- Market & Trend Analysis

- Machinery & Equipment

- Intralogistics

- Production Process

- Supply Chain

- Building

- Product

Challenges

Challenges of the application of AI and ML in industrial settings.

Industrial Data Science Pipelines

Which processes exist for applying AI/ML methods in industrial settings.

Industrial Data Sources

Which open datasets are available for use in AI/ML applications in industrial settings.

---

I would appreciate your feedback on this proposed restructuring and rewrite. Dinohund (talk) 13:19, 13 June 2023 (UTC)


 * Agreed. The proposed structure seems target-oriented to me.
 * This article has a section "Manufacturing" that links to this article. That section should be edited according to this article. Victor ven Low (talk) 19:05, 13 June 2023 (UTC)
 * As proposed, a draft for the overhaul of the article can be found below. Please take a look and leave any feedback on the updated article.
 * Categories
 * Artificial intelligence (AI) and machine learning (ML) have become key enablers for leveraging data in production. Possible applications can be divided into seven application areas [1]:
 * Market & Trend Analysis
 * Machinery & Equipment
 * Intralogistics
 * Production Process
 * Supply Chain
 * Building
 * Product
 * Each application area can be further divided into specific application scenarios that describe concrete AI/ML applications in production. An overview is found in Figure X. While some application areas have a direct connection to production processes, others cover production-adjacent fields like logistics or the factory building.
 * [Figure]
 * Challenges
 * In contrast to entirely virtual systems, in which ML applications are already widespread today, real-world production processes are characterized by the interaction between the virtual and the physical world. Data is recorded using sensors and processed on computational entities and, if desired, actions and decisions are translated back into the physical world via actuators or by human operators [2]. This poses major challenges for the application of ML in production engineering systems. These challenges arise from the interplay of process, data and model characteristics: the production domain’s high reliability requirements, high risk and loss potential, the multitude of heterogeneous data sources and the non-transparency of ML model functionality impede a faster adoption of ML in real-world production processes. In particular, production data comprises a variety of different modalities, semantics and quality levels [3]. Furthermore, production systems are dynamic, uncertain and complex [3], and engineering and manufacturing problems are data-rich but information-sparse [4]. In addition, due to the variety of use cases and data characteristics, problem-specific data sets are required, which are difficult to acquire, hindering both practitioners and academic researchers in this domain [5].
 * [Figure]
 * 1. Process and Industry Characteristics
 * Production engineering can be considered a rather conservative industry when it comes to the adoption of advanced technologies and their integration into existing processes. This is due to the high demands on the reliability of production systems, which result from the potentially high economic harm of reduced process effectiveness caused by, e.g., additional unplanned downtime or insufficient product quality. In addition, the specifics of machining equipment and products prevent area-wide adoption across a variety of processes. Besides these technical reasons, the reluctant adoption of ML is fueled by a lack of IT and data science expertise across the domain.
 * 2. Data Characteristics
 * The data collected in production processes mainly stem from frequently sampling sensors that estimate the state of a product, a process, or the environment in the real world. Sensor readings are susceptible to noise and represent only an estimate of reality under uncertainty. Production data typically comprises multiple distributed data sources, resulting in various data modalities (e.g., images from visual quality control systems, time-series sensor readings, or cross-sectional job and product information). Inconsistencies in data acquisition lead to low signal-to-noise ratios, low data quality and great effort in data integration, cleaning and management. In addition, as a result of mechanical and chemical wear of production equipment, process data is subject to various forms of data drift.
 * 3. ML-Model Characteristics
 * ML models are considered black-box systems given their complexity and the lack of transparency of their input-output relations. This reduces the comprehensibility of the system behavior and thus also the acceptance by plant operators. Due to the lack of transparency and the stochasticity of these models, no deterministic proof of functional correctness can be achieved, which complicates the certification of production equipment. Given their inherently unrestricted prediction behavior, ML models are vulnerable to erroneous or manipulated data, which further puts the reliability of the production system at risk because of lacking robustness and safety. In addition to high development and deployment costs, data drift causes high maintenance costs, which is a disadvantage compared to purely deterministic programs.
 * Standard processes for data science in production
 * The development of ML applications – starting with the identification and selection of the use case and ending with the deployment and maintenance of the application – follows dedicated phases that can be organized in standard process models. The process models assist in structuring the development process and defining requirements that must be met in each phase to enter the next phase. The standard processes can be classified into generic and domain-specific ones. Generic standard processes (e.g., CRISP-DM, ASUM-DM, KDD, SEMMA, or Team Data Science Process) describe a generally valid methodology and are thus independent of individual domains [6]. Domain-specific processes, on the other hand, consider the specific peculiarities and challenges of particular application areas.
 * The Machine Learning Pipeline in Production is a domain-specific data science methodology that is inspired by the CRISP-DM model and was specifically designed to be applied in fields of engineering and production technology [7]. To address the core challenges of ML in engineering – process, data, and model characteristics – the methodology especially focuses on use-case assessment, achieving a common data and process understanding, data integration, data preprocessing of real-world production data, and the deployment and certification of real-world ML applications.
 * [Figure]
 * Industrial Data Sources
 * Comprehensive datasets from the respective fields form the foundation of most artificial intelligence and machine learning applications in industrial settings. Those datasets act as the basis for training the employed models [3]. In other domains, like computer vision, speech recognition or language models, extensive reference datasets [8–10] and data scraped from the open internet [11] are frequently used for this purpose. Such datasets rarely exist in the industrial context because of high confidentiality requirements [5] and the high specificity of the data. Industrial applications of artificial intelligence are therefore often faced with the problem of data availability [5].
 * For these reasons, existing open datasets applicable to industrial applications often originate from public institutions like governmental agencies [12] or universities [13–15] and from data analysis competitions hosted by companies [16, 17]. In addition, data sharing platforms like Kaggle [18], Unearthed [19], IEEE DataPort [20] or OpenML [21] exist. However, most of these platforms have no industrial focus and offer limited filtering abilities regarding industrial data sources.
 * Overviews of Open Industrial Data Sources
 * To provide a starting point for industrial applications of machine learning and artificial intelligence, several overviews of available open data sources from the industrial domain exist.
 * [Table]
 * [Figure]


 * References
 * 1.    Krauß J, Hülsmann T, Leyendecker L et al. (2023) Application Areas, Use Cases, and Data Sets for Machine Learning and Artificial Intelligence in Production. In: Liewald M, Verl A, Bauernhansl T et al. (eds) Production at the Leading Edge of Technology. Springer International Publishing, Cham, pp 504–513
 * 2.    Monostori L, Kádár B, Bauernhansl T et al. (2016) Cyber-physical systems in manufacturing. CIRP Annals 65:621–641. https://doi.org/10.1016/j.cirp.2016.06.005
 * 3.    Wuest T, Weimer D, Irgens C et al. (2016) Machine learning in manufacturing: advantages, challenges, and applications. Production & Manufacturing Research 4:23–45. https://doi.org/10.1080/21693277.2016.1192517
 * 4.    Lu SC-Y (1990) Machine learning approaches to knowledge synthesis and integration tasks for advanced engineering automation. Computers in Industry 15:105–120. https://doi.org/10.1016/0166-3615(90)90088-7
 * 5.    Jourdan N, Longard L, Biegel T et al. (2021) Machine Learning For Intelligent Maintenance And Quality Control: A Review Of Existing Datasets And Corresponding Use Cases. Hannover : Institutionelles Repositorium der Leibniz Universität Hannover
 * 6.    Azevedo A, Santos MF (2008) KDD, SEMMA and CRISP-DM: a parallel overview. IADS-DM
 * 7.    Krauß J, Dorißen J, Mende H et al. (2019) Machine Learning and Artificial Intelligence in Production: Application Areas and Publicly Available Data Sets. In: Wulfsberg JP, Hintze W, Behrens B-A (eds) Production at the leading edge of technology. Springer Berlin Heidelberg, Berlin, Heidelberg, pp 493–501
 * 8.    Russakovsky O, Deng J, Su H et al. (2014) ImageNet Large Scale Visual Recognition Challenge. arXiv
 * 9.    Panayotov V, Chen G, Povey D et al. (2015) Librispeech: An ASR corpus based on public domain audio books. In: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, pp 5206–5210
 * 10.  Galvez D, Diamos G, Ciro J et al. (2021) The People's Speech: A Large-Scale Diverse English Speech Recognition Dataset for Commercial Usage. arXiv
 * 11.  OpenAI (2023) GPT-4 Technical Report. arXiv
 * 12.  National Aeronautics and Space Administration Prognostics Center of Excellence Data Set Repository. https://www.nasa.gov/content/prognostics-center-of-excellence-data-set-repository. Accessed 15 Sep 2023
 * 13.  Lopes Luis C-ML (1999) Robot Execution Failures. http://archive.ics.uci.edu/dataset/138/robot+execution+failures
 * 14.  University of Ljubljana, Visual Cognitive Systems Lab Kolektor Surface-Defect Dataset 2 (KolektorSDD2/KSDD2). https://www.vicos.si/resources/kolektorsdd2/. Accessed 15 Sep 2023
 * 15.  University of Wuppertal (2021) Results of the graph and heuristic based topology optimization for crashworthiness profiles made of fiber reinforced thermoplastics. https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/C4YDIF. Accessed 15 Sep 2023
 * 16.  Bosch (2015) Bosch Production Line Performance: Reduce manufacturing failures. https://www.kaggle.com/competitions/bosch-production-line-performance/overview/description. Accessed 15 Sep 2023
 * 17.  Mercedes-Benz Mercedes-Benz Greener Manufacturing: Can you cut the time a Mercedes-Benz spends on the test bench? https://www.kaggle.com/c/mercedes-benz-greener-manufacturing. Accessed 15 Sep 2023
 * 18.  Kaggle Kaggle. https://www.kaggle.com/. Accessed 15 Sep 2023
 * 19.  Unearthed Unearthed. https://unearthed.solutions/. Accessed 15 Sep 2023
 * 20.  IEEE IEEE Dataport. https://ieee-dataport.org/. Accessed 15 Sep 2023
 * 21.  openML OpenML. https://www.openml.org/. Accessed 15 Sep 2023
 * 22.  National Aeronautics and Space Administration, Intelligent Systems Division Prognostics Center of Excellence Data Set Repository. https://www.nasa.gov/content/prognostics-center-of-excellence-data-set-repository. Accessed 15 Sep 2023
 * 23.  Fraunhofer FFB, Fraunhofer IPT Machine Learning in Production - Application areas and open data records. https://www.bigdata-ai.fraunhofer.de/s/datasets/index.html. Accessed 15 Sep 2023
 * 24.  AndreaPi Open-industrial-datasets. https://github.com/AndreaPi/Open-industrial-datasets. Accessed 15 Sep 2023
 * 25.  nicolasj92 Industrial ML Datasets. https://github.com/nicolasj92/industrial-ml-datasets. Accessed 15 Sep 2023 Dinohund (talk) 14:32, 9 October 2023 (UTC)