Machine vision

Machine vision is the technology and methods used to provide imaging-based automatic inspection and analysis for such applications as automatic inspection, process control, and robot guidance, usually in industry. Machine vision refers to many technologies, software and hardware products, integrated systems, actions, methods and expertise. Machine vision as a systems engineering discipline can be considered distinct from computer vision, a form of computer science. It attempts to integrate existing technologies in new ways and apply them to solve real world problems. The term is the prevalent one for these functions in industrial automation environments but is also used for these functions in other environment vehicle guidance.

The overall machine vision process includes planning the details of the requirements and project, and then creating a solution. During run-time, the process starts with imaging, followed by automated analysis of the image and extraction of the required information.

Definition
Definitions of the term "Machine vision" vary, but all include the technology and methods used to extract information from an image on an automated basis, as opposed to image processing, where the output is another image. The information extracted can be a simple good-part/bad-part signal, or more a complex set of data such as the identity, position and orientation of each object in an image. The information can be used for such applications as automatic inspection and robot and process guidance in industry, for security monitoring and vehicle guidance. This field encompasses a large number of technologies, software and hardware products, integrated systems, actions, methods and expertise. Machine vision is practically the only term used for these functions in industrial automation applications; the term is less universal for these functions in other environments such as security and vehicle guidance. Machine vision as a systems engineering discipline can be considered distinct from computer vision, a form of basic computer science; machine vision attempts to integrate existing technologies in new ways and apply them to solve real world problems in a way that meets the requirements of industrial automation and similar application areas. The term is also used in a broader sense by trade shows and trade groups such as the Automated Imaging Association and the European Machine Vision Association. This broader definition also encompasses products and applications most often associated with image processing. The primary uses for machine vision are automatic inspection and industrial robot/process guidance. In more recent times the terms computer vision and machine vision have converged to a greater degree. See glossary of machine vision.

Imaging based automatic inspection and sorting
The primary uses for machine vision are imaging-based automatic inspection and sorting and robot guidance.;  in this section the former is abbreviated as "automatic inspection". The overall process includes planning the details of the requirements and project, and then creating a solution. This section describes the technical process that occurs during the operation of the solution.

Methods and sequence of operation
The first step in the automatic inspection sequence of operation is acquisition of an image, typically using cameras, lenses, and lighting that has been designed to provide the differentiation required by subsequent processing. MV software packages and programs developed in them then employ various digital image processing techniques to extract the required information, and often make decisions (such as pass/fail) based on the extracted information.

Equipment
The components of an automatic inspection system usually include lighting, a camera or other imager, a processor, software, and output devices.

Imaging
The imaging device (e.g. camera) can either be separate from the main image processing unit or combined with it in which case the combination is generally called a smart camera or smart sensor. Inclusion of the full processing function into the same enclosure as the camera is often referred to as embedded processing. When separated, the connection may be made to specialized intermediate hardware, a custom processing appliance, or a frame grabber within a computer using either an analog or standardized digital interface (Camera Link, CoaXPress). MV implementations also use digital cameras capable of direct connections (without a framegrabber) to a computer via FireWire, USB or Gigabit Ethernet interfaces.

While conventional (2D visible light) imaging is most commonly used in MV, alternatives include multispectral imaging, hyperspectral imaging, imaging various infrared bands, line scan imaging, 3D imaging of surfaces and X-ray imaging. Key differentiations within MV 2D visible light imaging are monochromatic vs. color, frame rate, resolution, and whether or not the imaging process is simultaneous over the entire image, making it suitable for moving processes.

Though the vast majority of machine vision applications are solved using two-dimensional imaging, machine vision applications utilizing 3D imaging are a growing niche within the industry. The most commonly used method for 3D imaging is scanning based triangulation which utilizes motion of the product or image during the imaging process. A laser is projected onto the surfaces of an object. In machine vision this is accomplished with a scanning motion, either by moving the workpiece, or by moving the camera & laser imaging system. The line is viewed by a camera from a different angle; the deviation of the line represents shape variations. Lines from multiple scans are assembled into a depth map or point cloud. Stereoscopic vision is used in special cases involving unique features present in both views of a pair of cameras. Other 3D methods used for machine vision are time of flight and grid based. One method is grid array based systems using pseudorandom structured light system as employed by the Microsoft Kinect system circa 2012.

Image processing
After an image is acquired, it is processed. Central processing functions are generally done by a CPU, a GPU, a FPGA or a combination of these. Deep learning training and inference impose higher processing performance requirements. Multiple stages of processing are generally used in a sequence that ends up as a desired result. A typical sequence might start with tools such as filters which modify the image, followed by extraction of objects, then extraction (e.g. measurements, reading of codes) of data from those objects, followed by communicating that data, or comparing it against target values to create and communicate "pass/fail" results. Machine vision image processing methods include;


 * Stitching/Registration: Combining of adjacent 2D or 3D images.
 * Filtering (e.g. morphological filtering)
 * Thresholding: Thresholding starts with setting or determining a gray value that will be useful for the following steps. The value is then used to separate portions of the image, and sometimes to transform each portion of the image to simply black and white based on whether it is below or above that grayscale value.
 * Pixel counting: counts the number of light or dark pixels
 * Segmentation: Partitioning a digital image into multiple segments to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze.
 * Edge detection: finding object edges
 * Color Analysis: Identify parts, products and items using color, assess quality from color, and isolate features using color.
 * Blob detection and extraction: inspecting an image for discrete blobs of connected pixels (e.g. a black hole in a grey object) as image landmarks.
 * Neural network / deep learning / machine learning processing: weighted and self-training multi-variable decision making Circa 2019 there is a large expansion of this, using deep learning and machine learning to significantly expand machine vision capabilities. The most common result of such processing is classification. Examples of classification are object identification,"pass fail" classification of identified objects and OCR.
 * Pattern recognition including template matching. Finding, matching, and/or counting specific patterns. This may include location of an object that may be rotated, partially hidden by another object, or varying in size.
 * Barcode, Data Matrix and "2D barcode" reading
 * Optical character recognition: automated reading of text such as serial numbers
 * Gauging/Metrology: measurement of object dimensions (e.g. in pixels, inches or millimeters)
 * Comparison against target values to determine a "pass or fail" or "go/no go" result. For example, with code or bar code verification, the read value is compared to the stored target value.  For gauging, a measurement is compared against the proper value and tolerances. For verification of alpha-numberic codes, the OCR'd value is compared to the proper or target value. For inspection for blemishes, the measured size of the blemishes may be compared to the maximums allowed by quality standards.

Outputs
A common output from automatic inspection systems is pass/fail decisions. These decisions may in turn trigger mechanisms that reject failed items or sound an alarm. Other common outputs include object position and orientation information for robot guidance systems. Additionally, output types include numerical measurement data, data read from codes and characters, counts and classification of objects, displays of the process or results, stored images, alarms from automated space monitoring MV systems, and process control signals. This also includes user interfaces, interfaces for the integration of multi-component systems and automated data interchange.

Deep Learning
The term "Deep Learning" has variable meanings, most of which can be applied to techniques used in machine vision for over 20 years. However the usage of the term in Machine Vision began in the later 2010s with the advent of the capability to successfully apply such techniques to entire images in the industrial machine vision space. Conventional machine vision usually requires the "physics" phase of a machine vision automatic inspection solution to create reliable simple differentiation of defects. An example of "simple" differentiation is that the defects are dark and the good parts of the product are light. A common reason why some applications were not doable was when it was impossible to achieve the "simple"; deep learning removes this requirement, in essence "seeing" the object more as a human does, making it now possible to accomplish those automatic applications. The system learns from a large amount of images during a training phase and then executes the inspection during run-time use which is called "inference".

Imaging based robot guidance
Machine vision commonly provides location and orientation information to a robot to allow the robot to properly grasp the product. This capability is also used to guide motion that is simpler than robots, such as a 1 or 2 axis motion controller. The overall process includes planning the details of the requirements and project, and then creating a solution. This section describes the technical process that occurs during the operation of the solution. Many of the process steps are the same as with automatic inspection except with a focus on providing position and orientation information as the result.

Market
As recently as 2006, one industry consultant reported that MV represented a $1.5 billion market in North America. However, the editor-in-chief of an MV trade magazine asserted that "machine vision is not an industry per se" but rather "the integration of technologies and products that provide services or applications that benefit true industries such as automotive or consumer goods manufacturing, agriculture, and defense."