Machine learning algorithms: An explainer - University of Wolverhampton

With artificial intelligence (AI) dominating conversations around technology and data science, there is an ever-increasing interest in machine learning, the subfield of AI that enables computers to learn from data in order to make predictions or decisions.

At its core, machine learning uses mathematical and statistical techniques to teach computers how to recognise patterns and relationships in data. The typical aim of machine learning is to create computer systems that can iteratively improve and adapt their performance, all without being explicitly programmed to do so – and machine learning algorithms form the heart of this technology.

What are machine learning algorithms?

Machine learning algorithms are the building blocks of machine learning models. They will take input data, process it, and then generate an output based on the information, or any patterns, contained within it.

Microsoft Azure describes machine learning algorithms as pieces of code that help people explore, analyse, and find meaning in complex data sets:

“Each algorithm is a finite set of unambiguous step-by-step instructions that a machine can follow to achieve a certain goal. In a machine learning model, the goal is to establish or discover patterns that people can use to make predictions or categorise information.”

Machine learning algorithm types

Machine learning algorithms are diverse and designed for different purposes, from pattern recognition to forecasting. They are categorised into various types, each suited to specific tasks and data characteristics.

Supervised machine learning algorithms

Supervised learning is a type of machine learning technique in which algorithms are trained on a labelled training data set. Labelled data means that the input data is paired with the correct output, allowing the algorithm to learn the relationship between the two. Common supervised learning algorithms include:

- Decision trees. Decision trees break down data into a tree-like structure in which each internal node tests a feature of the data and each leaf gives the resulting prediction. These algorithms are a popular choice for classification problems, and an ensemble of many decision trees is known as a random forest.
- Linear regression. Linear regression algorithms are used for regression problems, where the goal is to predict a continuous output. The predictive model aims to find a linear relationship between input data and the target variable.

- K-nearest neighbours (KNN). KNN is a supervised learning method used for both classification and regression. It predicts the output for a new data point from the k most similar examples in the training data.
- Logistic regression. Logistic regression is typically used for binary classification tasks, where the output is either 0 or 1 (true or false, yes or no, and so on). It models the probability of a data point belonging to one of these two classes.
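
To make the idea of learning from labelled pairs more concrete, here is a minimal Python sketch of simple linear regression fitted by ordinary least squares. The data set and the names `fit_line` and `predict` are illustrative assumptions, not part of any particular library, and a real project would normally use an established machine learning package instead.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations divided by sum of squared x-deviations gives the slope.
    cross_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sq_dev = sum((x - mean_x) ** 2 for x in xs)
    slope = cross_dev / sq_dev
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the fitted line to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical labelled training data: each input is paired with its correct output.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [3, 5, 7, 9, 11]

model = fit_line(hours_studied, exam_score)
prediction = predict(model, 6)  # continuous output for an unseen input
```

Because the toy data lies exactly on a straight line, the fitted model recovers that line. With noisy real-world data the fit would only approximate the underlying relationship.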

Unsupervised machine learning algorithms

Unsupervised learning algorithms, in contrast to supervised learning methods, work with unlabelled data. These algorithms aim to find patterns, clusters, or structure within data without any prior knowledge of the correct output. Examples include:

- K-means clustering. K-means groups data points into k clusters based on their similarity, assigning each point to the cluster with the nearest centre. It’s widely used for customer segmentation, and can also support recommendation systems and image segmentation.
- Principal component analysis (PCA). PCA is a dimensionality reduction technique. It helps in visualising and understanding large datasets – ones with lots of dimensions or features – by reducing redundancies and compressing data.
- Hierarchical clustering. Hierarchical cluster analysis groups similar data points into clusters, and then organises those clusters into a hierarchy.
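
As a rough illustration of how clustering can find structure without labels, the following Python sketch implements the basic k-means loop (often called Lloyd's algorithm). The points, the fixed starting centroids, and the function names are assumptions chosen to keep the example small and repeatable. Practical implementations usually pick starting centroids randomly or with a smarter seeding strategy.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and moving centroids."""
    centroids = [tuple(c) for c in centroids]
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda i: squared_distance(p, centroids[i]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(old)  # leave an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # nothing moved, so the clustering has settled
        centroids = new_centroids
    return centroids, assignments


# Hypothetical unlabelled data with two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
initial = [points[0], points[3]]  # assumed starting centroids
centroids, labels = k_means(points, initial)
```

Note that the algorithm is never told which group a point belongs to. The cluster numbers in `labels` emerge purely from the distances between points.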

Deep learning algorithms

Deep learning is a subset of machine learning that involves artificial neural networks with multiple hidden layers. These networks are capable of learning complex representations from data, and are widely used in areas such as natural language processing.

- Artificial neural networks (ANN). ANNs consist of layers of interconnected nodes – similar to neurons in the human brain – that can learn intricate patterns from data.
- Convolutional neural networks (CNN). CNNs are specialised neural networks designed for image recognition tasks.
- Recurrent neural networks (RNN). RNNs are suitable for sequence data and are used in tasks like text generation and speech recognition.
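
The idea of layers of interconnected nodes can be sketched as a forward pass through a very small network. In this Python example the weights are hand-picked rather than learned, purely as an assumption for illustration, so that a single hidden layer computes XOR, a function that one linear layer on its own cannot represent. Training, which is how real networks obtain their weights, is not shown.

```python
def relu(x):
    """Rectified linear unit, a common activation function."""
    return max(0.0, x)


def dense(inputs, weights, biases, activation=None):
    """One fully connected layer: weighted sums plus biases, then activation."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(total) if activation else total)
    return outputs


# Hand-picked (not learned) weights, assumed for illustration only.
HIDDEN_WEIGHTS = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_BIASES = [0.0, -1.0]
OUTPUT_WEIGHTS = [[1.0, -2.0]]
OUTPUT_BIASES = [0.0]


def forward(inputs):
    """Pass the inputs through the hidden layer, then the output layer."""
    hidden = dense(inputs, HIDDEN_WEIGHTS, HIDDEN_BIASES, activation=relu)
    return dense(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES)[0]


# Evaluate the network on all four combinations of two binary inputs.
xor_outputs = [forward([a, b]) for a in (0, 1) for b in (0, 1)]
```

Deep learning models follow the same pattern at a far larger scale, with many more layers and nodes and with weights adjusted automatically during training.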

Reinforcement learning

Reinforcement learning is a type of machine learning where an agent learns through trial and error, building up knowledge as it receives rewards (positive feedback) for good decisions and penalties (negative feedback) for poor ones. This type of learning is often used in robotics, gaming, and autonomous systems such as autonomous vehicles.
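
One common way to learn from feedback of this kind is tabular Q-learning. The Python sketch below uses a hypothetical five-cell corridor in which the agent is rewarded only for reaching the final cell. The environment, the constants, and the purely random exploration strategy are all simplifying assumptions made for illustration.

```python
import random

N_STATES = 5          # corridor cells 0..4; reaching cell 4 ends the episode
ACTIONS = (-1, +1)    # move left or move right
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor


def step(state, action):
    """Hypothetical environment: reward 1 for reaching the last cell, else 0."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            a = rng.randrange(len(ACTIONS))  # explore with purely random actions
            next_state, reward = step(state, ACTIONS[a])
            # Q-learning update: nudge the estimate towards the reward plus
            # the discounted value of the best action in the next state.
            target = reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


q_table = train()
# Greedy policy: the best-valued action in each non-terminal cell.
policy = [ACTIONS[row.index(max(row))] for row in q_table[:-1]]
```

After training, the greedy policy moves right in every cell, even though the agent was never told where the reward was. It inferred this from the feedback it accumulated.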

There are also semi-supervised learning algorithms that can use both labelled and unlabelled data.

What makes one machine learning algorithm different from others?

Several factors differentiate one machine learning algorithm from another:

- Task and objective. Machine learning algorithms are chosen based on the task at hand. For instance, if someone needs to classify data into multiple categories, they might choose a classifier like support vector machines (SVM) or Naive Bayes. If they want to find patterns in unlabelled data, a clustering algorithm may be more appropriate, while interpreting visual data will typically require computer vision algorithms.
- Data characteristics. The nature of the input data will also influence the choice of algorithm. For structured data, such as tables of records, traditional algorithms like decision trees might be more suitable, while unstructured data, such as images, audio, or free text, will typically call for deep learning algorithms.
- Model complexity. The complexity of the problem and the model required will also play a role. For complex tasks, deep learning algorithms are often more effective – but they require large amounts of data and computational resources.
- Overfitting. Overfitting is a common problem in machine learning where a model becomes too specialised to the training data and performs poorly on new data. Different algorithms have varying levels of susceptibility to it, so ensuring that a model generalises well to new data may be a key consideration.
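
Overfitting can be demonstrated with a deliberately small, made-up example. In the Python sketch below, a 1-nearest-neighbour model memorises the training set, including two noisy labels, while a much simpler threshold rule ignores them. The data and the hand-chosen threshold are assumptions for illustration, and the rule stands in for a less flexible model rather than being fitted here.

```python
def nearest_neighbour_predict(train_set, x):
    """1-nearest-neighbour: copy the label of the closest training point."""
    _, label = min(train_set, key=lambda pair: abs(pair[0] - x))
    return label


def threshold_predict(x, threshold=5.0):
    """A much simpler model: a single hand-chosen cut-off on the feature."""
    return 1 if x >= threshold else 0


def accuracy(predictions, labels):
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical data. The true rule is "label 1 when x >= 5", but two
# training labels (at x = 3 and x = 7) are deliberately noisy.
train_set = [(1, 0), (2, 0), (3, 1), (4, 0), (6, 1), (7, 0), (8, 1), (9, 1)]
test_set = [(1.5, 0), (2.9, 0), (3.2, 0), (6.8, 1), (7.3, 1), (8.5, 1)]

train_x, train_y = zip(*train_set)
test_x, test_y = zip(*test_set)

nn_train_acc = accuracy([nearest_neighbour_predict(train_set, x) for x in train_x], train_y)
nn_test_acc = accuracy([nearest_neighbour_predict(train_set, x) for x in test_x], test_y)
rule_train_acc = accuracy([threshold_predict(x) for x in train_x], train_y)
rule_test_acc = accuracy([threshold_predict(x) for x in test_x], test_y)
```

The memorising model is perfect on the training data but gets most of the new points wrong, whereas the simpler rule scores lower on the training data and higher on the test data. That gap between training and test performance is the usual warning sign of overfitting.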

Applications for machine learning algorithms

Healthcare

Machine learning algorithms are increasingly used to diagnose diseases, predict patient outcomes, and develop personalised treatment plans.

E-commerce

E-commerce is one of the most well-known use cases for machine learning algorithms. For example, recommendation systems powered by machine learning are improving digital user experiences by providing more accurate product recommendations.

Big data

Machine learning algorithms are essential for extracting valuable insights from vast datasets, supporting industries such as finance, marketing, and logistics.

Latest trends in machine learning algorithms

Machine learning is a rapidly evolving field, so staying up to date with the latest trends in its algorithms is essential for data scientists and other artificial intelligence experts who work within the technology. Some of the emerging trends include:

- Explainable AI. Machine learning models are becoming increasingly complex, which means that it can be more difficult for people to understand – and fact-check – their decisions, results, and behaviours. This is where explainable AI comes in, aiming to make AI and machine learning algorithms and decision-making processes more transparent and interpretable.
- Federated learning. With digital privacy concerns on the rise, federated learning algorithms allow models to be trained on decentralised data sources without sharing raw data, which helps safeguard privacy and security.
- Transfer learning. Transfer learning leverages pre-trained AI and machine learning models to improve the performance of new models, saving time and resources.
- Generative adversarial networks (GANs). GANs are deep learning algorithms that are typically used for generating images, text, and even audio. Because of this, they have widespread applications in art, entertainment, and content generation.
- Responsible AI. Responsible AI is a movement that pushes for AI solutions that are fair, protect people’s privacy, are interpretable, and so on. According to TensorFlow, it is “critical to work towards systems that are fair and inclusive to everyone” as the impact of AI increases across sectors and societies.
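
The federated learning idea in the list above, training on decentralised data without sharing it, can be sketched in a heavily simplified form. In this Python example each hypothetical client trains a one-parameter model on its own data, and a server averages the resulting weights in proportion to each client's sample count, in the style of federated averaging. The data, function names, and hyperparameters are assumptions, and production systems add many safeguards that are omitted here.

```python
def local_update(weight, data, lr=0.1, epochs=50):
    """Train y = weight * x on one client's private data with gradient descent."""
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to the weight.
        grad = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= lr * grad
    return weight


def federated_round(global_weight, clients):
    """One round: clients train locally, then the server averages their weights."""
    total = sum(len(data) for data in clients)
    updates = [(local_update(global_weight, data), len(data)) for data in clients]
    # Only weights and sample counts reach the server, never the raw data.
    return sum(w * n for w, n in updates) / total


# Hypothetical private data sets held by two clients (both follow y = 3x).
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(1.0, 3.0), (3.0, 9.0), (2.0, 6.0)],
]

global_weight = 0.0
for _ in range(3):
    global_weight = federated_round(global_weight, clients)
```

The server ends up with a model that reflects both clients' data, even though it only ever handled model weights.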

Launch your career in machine learning and AI

Study the learning and optimisation algorithms needed to develop artificial intelligence systems with the 100%-online MSc Computer Science with Artificial Intelligence at the University of Wolverhampton. This flexible Master’s degree has been developed for forward-thinking individuals who may not have a background in computer science.

On this course, you will focus on the theory and practice of programming in languages such as Python, and learn to apply machine learning, training, and reasoning algorithms. Other topics of study include:

- data science
- intelligent agents
- data mining
- cloud computing
- data visualisation