Digital eyes: The rise of image recognition - University of Wolverhampton

Image recognition is a burgeoning branch of AI. Sarah Harrop explores what it is, how it works, and how it is being used in our daily lives.

What is image recognition?

The human brain has evolved over millennia to recognise objects with ease, but computers have much more of a struggle on their hands to complete this task: it requires significant processing power and some sophisticated artificial intelligence (AI).

Image recognition is part of computer vision: the ability of a computer to ‘see’. Software is used to identify elements within digital images, such as people, places, objects and writing, and to recognise the category to which they belong. For image recognition to be achieved, computers use machine vision technologies along with one or more video cameras and AI software. By converting images into numerical or symbolic information, image recognition can make sense of the world in ways similar to human vision.
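
To make "converting images into numerical information" concrete, here is a minimal sketch using NumPy. The tiny 4x4 greyscale image is a made-up example, not data from any real system; real images are simply much larger arrays of the same kind.

```python
import numpy as np

# A hypothetical 4x4 greyscale image: 0 is black, 255 is white.
# The left half is dark and the right half is bright.
image = np.array([
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
], dtype=np.uint8)

# Scale pixel values to the range 0..1, a common preprocessing step.
pixels = image.astype(np.float32) / 255.0

# Flatten the grid into a plain list of 16 numbers.
flat = pixels.reshape(-1)

# A colour image adds a third axis holding red, green and blue channels.
rgb = np.stack([image, image, image], axis=-1)

print(image.shape, flat.shape, rgb.shape)
```

Everything a recognition model does downstream is arithmetic on arrays like these.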

With relatively recent AI breakthroughs such as deep learning, image recognition has come a long way. It is used across a broad range of fields, from healthcare to security, retail, and social media, allowing automation of many tasks that once needed human vision and brain processing.

“Today’s machines can recognize diverse images, pinpoint objects and facial features, and even generate pictures of people who’ve never existed,” says Altamira.

There are many frameworks and libraries that can help to build, train and deploy such machine learning models, including TensorFlow from Google, the Python-based library Keras, the open-source framework Caffe, and the Microsoft Cognitive Toolkit.

How does image recognition work?

Image processing by computers can be done in several different ways, from classical machine learning models to deep learning models, depending on the task in hand and its complexity.

Image recognition in computer vision has to tackle four initial problems:

Classification is where artificial neural networks identify objects in the image and assign them to a predefined group or classification.
Object detection is the process of finding the location of an object and classifying it. Once the object has been localised, a bounding box is drawn around it together with a confidence score, and techniques such as bounding box annotation and semantic segmentation are used for detection.
Tagging further hones the accuracy of classification by performing object recognition on multiple objects in an image and assigning one or more tags, e.g. vehicles, trees or water.
Segmentation locates objects in an image to the nearest pixel, with the algorithm identifying all pixels that belong to each class. This is used in medical imaging, such as for analysing an MRI scan, where precision is very important for an accurate diagnosis.
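
One way to see how these four tasks differ is to look at the shape of the answer each one produces. The sketch below uses NumPy with made-up scores and a hypothetical three-label set; it does not run a real model, it only shows what typical outputs might look like.

```python
import numpy as np

CLASSES = ["vehicle", "tree", "water"]  # hypothetical label set

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    e = np.exp(scores - scores.max())
    return e / e.sum()

def sigmoid(scores):
    """Turn each raw score into an independent probability."""
    return 1.0 / (1.0 + np.exp(-scores))

# Classification: a single label for the whole image.
class_scores = np.array([2.0, 0.5, -1.0])  # made-up model output
class_label = CLASSES[int(np.argmax(softmax(class_scores)))]

# Tagging: an independent yes/no decision per tag, so several can apply.
tag_scores = np.array([1.5, 0.8, -2.0])  # made-up model output
tags = [c for c, p in zip(CLASSES, sigmoid(tag_scores)) if p > 0.5]

# Object detection: a label, a confidence score and a bounding box
# given as (x_min, y_min, x_max, y_max) in pixels. Values are illustrative.
detection = {"label": "vehicle", "confidence": 0.91, "box": (12, 30, 96, 80)}

# Segmentation: one score per class for every pixel, reduced to a mask
# that stores the winning class index at each pixel.
pixel_scores = np.zeros((2, 3, len(CLASSES)))
pixel_scores[:, :2, 0] = 1.0  # left two columns look like "vehicle"
pixel_scores[:, 2, 2] = 1.0   # right column looks like "water"
mask = pixel_scores.argmax(axis=-1)

print(class_label, tags, detection["box"], mask.tolist())
```

Classification and tagging return labels only, detection adds a location, and segmentation returns a label for every pixel.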

The actual nuts and bolts of image recognition usually involve creating deep neural networks (machine learning models loosely inspired by the structure and function of the human brain) which can analyse each image pixel. These networks are trained to recognise related images by being fed as many labelled images as possible.

As TechTarget explains, there are typically three steps to this process:

Gather the dataset: A dataset of labelled images is put together. For instance, an image of an apple needs to be labelled “apple”, or with another name that people would recognise.
Train a neural network: The image dataset is fed to a neural network to train it. Convolutional neural networks (CNNs) are well suited to this task because they can automatically detect the significant features. They’re designed to adaptively learn spatial hierarchies of features from input images.

As Altamira describes it: “During training, a CNN learns the values of the filters and weights through a backpropagation algorithm, adjusting them to recognize patterns and features in images, such as edges, textures, or object parts, which then contribute to recognizing the whole object within the image. By stacking multiple convolutional, activation, and pooling layers, CNNs can learn a hierarchy of increasingly complex features.”

Lower layers, for example, may learn to spot different colours and the edges of objects. Intermediate layers might be dedicated to detecting more complex parts such as eyes or wheels. Deeper layers can detect whole objects such as faces, trees or buildings, which is critical for image recognition tasks.

Get predictions: A new image that isn’t in the training set is fed into the system to obtain predictions.
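
The three steps above can be sketched end to end in a few lines of NumPy. This is a toy illustration under stated assumptions, not a real CNN: the dataset is synthetic (vertical versus horizontal bars), the two edge filters are set by hand rather than learned, and only a small final layer is trained by gradient descent. All function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_image(label, size=8):
    """Toy data (an assumption): label 0 is a vertical bar, label 1 a horizontal bar."""
    img = rng.normal(0.0, 0.05, (size, size))  # faint background noise
    pos = int(rng.integers(2, size - 2))       # keep the bar away from the border
    if label == 0:
        img[:, pos] += 1.0
    else:
        img[pos, :] += 1.0
    return img

def conv2d(img, kernel):
    """Slide the kernel over the image ('valid' mode), as a convolutional layer does."""
    kh, kw = kernel.shape
    out = np.empty((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# Hand-set edge filters; a real CNN would learn these values by backpropagation.
VERTICAL_EDGE = np.array([[-1.0, 0.0, 1.0]] * 3)
HORIZONTAL_EDGE = VERTICAL_EDGE.T

def features(img):
    """Convolution, then ReLU activation, then global max pooling: one number per filter."""
    return np.array([np.maximum(conv2d(img, k), 0.0).max()
                     for k in (VERTICAL_EDGE, HORIZONTAL_EDGE)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Step 1: gather a labelled dataset (200 synthetic images, alternating labels).
labels = np.arange(200) % 2
X = np.stack([features(make_image(label)) for label in labels])

# Step 2: train. Only the final layer's weights are learned here.
w, b = np.zeros(2), 0.0
for _ in range(500):
    error = sigmoid(X @ w + b) - labels   # gradient of the log loss w.r.t. the logits
    w -= 0.1 * X.T @ error / len(labels)
    b -= 0.1 * error.mean()

def predict(img):
    """Return 0 (vertical bar) or 1 (horizontal bar) for an image."""
    return int(sigmoid(features(img) @ w + b) > 0.5)

train_accuracy = float(np.mean((sigmoid(X @ w + b) > 0.5) == labels))

# Step 3: get a prediction for a new image that was not in the training set.
new_image = make_image(1)
predicted = predict(new_image)
print(train_accuracy, predicted)
```

In a framework such as Keras or TensorFlow, the filters themselves would be trainable and many such layers would be stacked, which is what lets a CNN build up from edges to whole objects.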

Applications of image recognition

Every day we share vast amounts of image data through apps, social networks, and websites. The number of digital images and videos generated has rapidly accelerated with the advent of smartphones and high-resolution cameras: it’s estimated that a jaw-dropping 50 billion images have been uploaded to Instagram since the platform’s launch. That gives industries a huge volume of digital data to rely upon in delivering ever improved and innovative services.

Image recognition is integral to many modern technologies. Some real-world use cases include:

Facial recognition: Millions of people use facial recognition – one application of image recognition – every day to unlock their mobile phone screens. But image recognition algorithms are also, according to software company Comidor, helping marketers to get information about a person’s identity, gender and even their mood.

Healthcare diagnostics: Image recognition software is used to analyse medical imaging scans such as X-rays, magnetic resonance imaging (MRI) or computed tomography (CT) scans to diagnose diseases and detect abnormalities such as cancers. Specifically, it can help to spot patterns or anomalies in the images, sometimes much faster and more accurately than trained human experts, leading to better diagnoses and faster access to treatment for patients.

Improving reproductive success: An AI image recognition platform developed by Nix can help to minimise human bias and allow for individually tailored data-driven decision-making when selecting the embryos for transfer into a woman’s womb during in vitro fertilisation (IVF). The platform can identify which embryos have the best chance of going on to become a successful pregnancy, based on the patient’s electronic medical record, genomics, and visual data.

Retail: Self-checkout systems use image recognition to rapidly and easily identify items and make the checkout process smoother and more efficient. Furthermore, customers may want to take photos of things and see where they can purchase them using image recognition software in their phones.

Self-driving cars: Image recognition is essential in helping autonomous vehicles to understand their surroundings and drive safely, including accurately spotting obstacles, traffic signs, and pedestrians.

Education: Online lessons are common these days, but teachers can find it difficult to track students’ reactions through their webcams. Neural networks can help teachers to tell when a student doesn’t understand, or is bored or frustrated, by recognising facial expressions or even body language. Image recognition can also be used for handwriting recognition of students’ work, and to digitise learning resources, monitor student attendance, and enhance the security of school and university campuses.

Join the AI revolution

If you’re keen to gain the AI knowledge and skills needed to take on innovative roles in the creative industries, product design, games industry and more, then the University of Wolverhampton’s MSc Computer Science with Artificial Intelligence is for you. This flexible, 100% online Master’s degree is designed for forward-thinking people who are not necessarily from a computer science background. Studying on this Master’s degree, you will become well-versed in the application of AI in the real world, from chatbots to computer vision and speech processing. You’ll gain the knowledge and skills to recommend solutions and apply existing AI tools to real-world challenges in areas from health to e-commerce. Along the way you’ll leverage unique tools and platforms to deliver the capabilities demanded by this ever-evolving, high-growth domain.