Vision AI: What is it and Why does it Matter?

Dec 17, 2019 6:28:00 PM

Computer Vision AI

Vision AI (also known as Computer Vision) is a field of computer science that trains computers to replicate the human vision system. This enables digital devices (such as face detectors and QR code scanners) to identify and process objects in images and videos, just as humans do.

Personalized image search on eCommerce stores, 3D model building (photogrammetry), aerial images on a map, OCR scanning in retail outlets, face recognition, image detectors, and MRI reconstruction are some of the innovative use cases of computer vision we have today.

But when was this technology introduced? How has it evolved? What future possibilities does it bring for businesses, irrespective of industry? The sections below discuss these three questions and briefly explain how vision AI works. So, let's get started.

The History and Evolution of Computer Vision:

The first experiments in computer vision took place in the 1950s, when early neural networks were used to detect the edges of an object and sort simple shapes like squares and circles. Later, in the 1970s, a commercial use case of computer vision was implemented: interpreting handwritten text using optical character recognition (OCR), a system used to read written text aloud for the blind. In the 1990s, as the internet matured, facial recognition programs thrived. From 2010 onward, deep learning has helped computers train themselves and self-improve over time.

Today, this technology has found use cases in various domains, from automotive and healthcare to retail and smartphones. Contributing to the progress of Vision AI are affordable computing power, better hardware, and new algorithms such as convolutional neural networks. As a result, outputs are more accurate and the technology's use cases keep expanding.

Owing to its potential, the computer vision AI market is expected to be worth USD 1.6 billion by the end of 2019. | Source: Statista

Computer Vision AI: How Does it Work?

Think of Computer Vision AI like a jigsaw puzzle: there are several pieces you have to assemble to create an image. That's roughly how neural networks for computer vision work. The network distinguishes between different pieces of an image and identifies the sub-components that make up the whole.

Instead of giving the computer hints for recognizing an object, it is fed with images that enable the precise identification of that object. Suppose you have to train a computer to identify a cat. Instead of describing features like tails, whiskers, or pointy ears, the system is fed hundreds (or millions) of images of cats. The resulting model learns the features that make up a cat and distinguish it from other look-alike animals.
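The learn-from-examples idea above can be sketched in a few lines of Python. This is a toy nearest-neighbor classifier, not a real neural network: the hand-crafted feature vectors (ear pointiness, whisker count, tail length) and the training samples are invented purely for illustration.

```python
import math

# Toy "training images", each reduced to a hypothetical feature vector:
# (ear_pointiness, whisker_count, tail_length)
training_data = [
    ((0.9, 12, 0.8), "cat"),
    ((0.8, 10, 0.9), "cat"),
    ((0.2, 0, 0.5), "rabbit"),
    ((0.3, 8, 1.0), "dog"),
]

def classify(features):
    """Label a new sample by its nearest training example (1-NN)."""
    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    nearest = min(training_data, key=lambda item: distance(features, item[0]))
    return nearest[1]

print(classify((0.85, 11, 0.85)))  # close to the cat examples -> "cat"
```

The more labeled examples the system sees, the better it can separate cats from look-alike animals; real systems learn the features themselves instead of being handed them.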

Applications of Computer Vision AI:

1. Image Segmentation

It's the process of partitioning an image into multiple regions, based on the characteristics of the pixels in the image. Generally used for analysis, image segmentation involves separating the foreground from the background, or clustering parts of an image by pixels based on similarity in color or shape. The image shown below exemplifies image segmentation, where parts of the image are differentiated by color.

Image Segmentation

Image Courtesy: ResearchGate
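As a minimal sketch of foreground/background separation, the snippet below thresholds a tiny hand-written "grayscale image" by brightness. Real segmentation methods are far more sophisticated; the 4x4 grid and the threshold value are assumptions made only for this example.

```python
# A tiny 4x4 "grayscale image": higher values = brighter pixels.
image = [
    [10,  12,  11,  13],
    [14, 200, 210,  12],
    [11, 205, 220,  10],
    [13,  12,  11,  14],
]

def segment(img, threshold=128):
    """Split pixels into foreground (1) and background (0) by brightness."""
    return [[1 if px >= threshold else 0 for px in row] for row in img]

mask = segment(image)
# The bright 2x2 block in the centre is separated from the dark background.
```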

2. Object Detection 

This field of computer vision AI deals with detecting one or more objects in an image or a video. For example, surveillance cameras can detect humans and their activities (no movement, objects like guns or knives, etc.) so that an alert is raised for suspicious activity.

Object Detection

Image Courtesy: ResearchGate
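A core output of object detection is a bounding box around each detected object. The sketch below assumes segmentation has already produced a binary mask and simply computes the smallest box enclosing the foreground pixels; real detectors work directly on raw images.

```python
def bounding_box(mask):
    """Return (top, left, bottom, right) of the detected object:
    the smallest box enclosing all foreground (1) pixels."""
    coords = [(r, c) for r, row in enumerate(mask)
              for c, v in enumerate(row) if v == 1]
    if not coords:
        return None  # nothing detected
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))

mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
print(bounding_box(mask))  # (1, 1, 2, 2)
```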

3. Facial Recognition

The facial recognition technique aims at detecting and identifying a human face in an image. It is one of the more complex applications of computer vision because of the variability in human faces: expression, pose, skin color, camera quality, position or orientation, image resolution, etc. Nonetheless, the technique is widely used. Smartphones use it for user authentication, and Facebook uses it to suggest tags for people in a picture.

Facial Recognition

Image Courtesy: Apple

4. Edge Detection

Edge detection deals with finding the boundaries of objects within an image. This is done by detecting discontinuities in brightness. Edge detection can be a great help in data extraction and image segmentation.

Edge Detection

Image Courtesy: Wikipedia
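The "discontinuity in brightness" idea can be shown with a very small sketch: compare each pixel to its right-hand neighbour and mark an edge where the jump is large. Real edge detectors (Sobel, Canny) use 2-D gradients and smoothing; the one-directional gradient and threshold here are simplifying assumptions.

```python
def detect_edges(img, threshold=50):
    """Mark a pixel as an edge (1) when the brightness jump to its
    right-hand neighbour exceeds the threshold."""
    edges = []
    for row in img:
        edge_row = []
        for i in range(len(row) - 1):
            edge_row.append(1 if abs(row[i + 1] - row[i]) > threshold else 0)
        edges.append(edge_row)
    return edges

image = [
    [10, 10, 200, 200],   # dark -> bright: edge between columns 1 and 2
    [10, 10, 200, 200],
]
print(detect_edges(image))  # [[0, 1, 0], [0, 1, 0]]
```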

5. Pattern Recognition

Pattern recognition is the ability of a system to detect arrangements of characteristics in data. Here, a pattern can be a recurring sequence within the data or a predefined set of data supplied to the system.

Pattern Recognition

Image Courtesy: Wikipedia
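The "recurring sequence of data" case can be illustrated with a short sketch that scans a stream of readings for any sub-sequence that occurs more than once. The sample data is invented for illustration.

```python
def find_repeating_pattern(data, length):
    """Return the first sub-sequence of the given length that occurs
    more than once in the data, or None if there is no repetition."""
    seen = set()
    for i in range(len(data) - length + 1):
        chunk = tuple(data[i:i + length])
        if chunk in seen:
            return list(chunk)
        seen.add(chunk)
    return None

readings = [3, 1, 4, 1, 5, 3, 1, 4, 9]
print(find_repeating_pattern(readings, 3))  # [3, 1, 4] recurs
```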

6. Image Classification

Image classification involves labeling an image based on the visual content present in it. The process includes focusing on the relationships between nearby pixels. A classification system comprises a database of predefined patterns; these patterns are compared with the detected object to determine its class. Image classification has significant applications in areas such as vehicle navigation, biometrics, video surveillance, and biomedical imaging.

Image Classification
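The "compare against a database of predefined patterns" step above can be sketched as simple template matching: label an input with whichever stored pattern it differs from in the fewest pixels. The 3x3 binary templates and their names are assumptions made for this example only.

```python
# Hypothetical "database" of predefined patterns: 3x3 binary templates.
templates = {
    "cross": [[0, 1, 0],
              [1, 1, 1],
              [0, 1, 0]],
    "box":   [[1, 1, 1],
              [1, 0, 1],
              [1, 1, 1]],
}

def classify_image(img):
    """Label an image with the template it differs from in the fewest pixels."""
    def mismatch(a, b):
        return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return min(templates, key=lambda name: mismatch(img, templates[name]))

noisy_cross = [[0, 1, 0],
               [1, 1, 1],
               [0, 1, 1]]   # a cross with one pixel flipped
print(classify_image(noisy_cross))  # "cross"
```

Production systems replace pixel-wise comparison with learned features, but the principle, matching an input against known classes, is the same.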

The Present and Future of Computer Vision: 

Computer Vision brings endless possibilities for consumers and businesses. Self-driving cars, medical diagnosis, image labeling, and cashier-less checkout are just some of the applications that exemplify its value across industries.

However, one of the major challenges in implementing computer vision at a larger scale is the huge volume of data needed to train models. With better training resources available, computer vision may eventually surpass the human ability to recognize, classify, and detect diverse images and videos.

Written by Archna Oberoi

Content strategist by profession and blogger by passion, Archna is avid about keeping up with the freshest dose of technology and sharing it with her readers. Stay tuned as she brings trending stories from the tech territory of mobile and web.