
Machine Learning: The Art of Guiding Computers to Learn and Evolve

Imagine computers that not only perform tasks but also learn and adapt over time. This is the essence of machine learning (ML). At its core, machine learning lets machines learn from data, uncover patterns, and make decisions without being explicitly programmed for each task. This post gives an approachable overview of how machine learning works and what its key components are.

A concise yet comprehensive definition captures the important components of machine learning and gives us a way to dissect the approach. It goes like this:

IMPORTANT

Machine learning is an approach to (1) learn (2) complex patterns in (3) existing data and then utilize this knowledge to (4) make predictions on (5) unseen data.

Now, let's unpack this definition:

(1) Learning: Learning in both humans and machines involves acquiring knowledge. Humans gather information through their senses, which the brain then processes to form understanding. Similarly, machine learning models acquire knowledge by processing digital data. There are three main approaches: (a) in supervised learning, the model learns from example input-output pairs (labeled data); (b) in unsupervised learning, the model discovers patterns in unlabeled data; (c) in reinforcement learning, the model learns through trial and error within a specific environment, adapting based on the rewards or penalties it receives.
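To make the supervised case concrete, here is a minimal sketch of learning from input-output pairs. It uses a 1-nearest-neighbor classifier, chosen only because it fits in a few lines; the feature values, labels, and the `predict` function are invented for illustration and do not come from any real dataset.

```python
import math

# Labeled training data: (features, label) pairs.
# Features are (weight in kg, ear length in cm); all values are made up.
training_pairs = [
    ((4.0, 6.5), "cat"),
    ((3.5, 7.0), "cat"),
    ((5.0, 6.0), "cat"),
    ((20.0, 12.0), "dog"),
    ((30.0, 14.0), "dog"),
    ((25.0, 11.0), "dog"),
]


def predict(features):
    """Return the label of the closest training example (1-nearest-neighbor)."""
    _, nearest_label = min(
        training_pairs,
        key=lambda pair: math.dist(pair[0], features),
    )
    return nearest_label


print(predict((4.2, 6.8)))    # an unseen light animal
print(predict((27.0, 13.0)))  # an unseen heavy animal
```

Nothing in `predict` says what a cat or a dog is. The answer comes entirely from the labeled examples, which is the defining trait of supervised learning.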

(2) Complex Patterns: Machine learning excels when data contains complex, hidden relationships. Random data has no discernible patterns, so there is nothing for ML to learn. At the other extreme, very simple data contains only straightforward patterns, for which a linear model or a look-up table may be perfectly adequate. The real power of ML lies in its ability to unearth intricate patterns that are not immediately obvious to the human eye. ML is also useful beyond predicting outcomes: it can help clarify the reasons behind those predictions. Explainability is therefore an important property of an effective ML system, because it lets us follow and comprehend the logic behind its predictions. An explainable ML model can reveal the hidden patterns within the data, fostering a deeper understanding and enabling us to form well-informed hypotheses based on these insights.

(3) Training Data: Essential to the success of machine learning is a large set of training data. The ability of a system to learn and recognize patterns is greatly enhanced by the number of examples it's trained on. For example, a machine learning model trained on a dataset of 100,000 images of cats and dogs is typically more accurate than one trained with only 100 images. But it's not just the amount of data that matters; the quality is equally important. The training data must accurately represent the problem we're trying to solve. For instance, in building a model to predict house prices, the training data should reflect the real conditions of the housing market. Using data from an unrelated market would hinder the model's ability to generalize and effectively address the real-world problem it's designed to solve.

(4) Predictions: Machine learning models, particularly the neural networks used in deep learning, can act as universal function approximators: given enough capacity and data, they can approximate a very broad class of functions. They are adept at estimating outcomes in complex scenarios where exact mathematical models fall short, either because the problem is too complex or because the required computation is too expensive. For instance, deep learning models can approximate solutions in physical simulations, offering speed that traditional numerical methods often cannot match. This capability stems from their ability to model non-linear, high-dimensional relationships within data. In addition, the process behind the observable data is often unknown, meaning that we have no mathematical model with which to describe the phenomenon. In such scenarios, machine learning becomes invaluable. It approximates the unknown data-generating process from observable data, aiming to capture the underlying dynamics and patterns that govern the data's behavior. This allows machine learning models to make predictions or decisions not by explicitly solving the hidden process, but by closely mirroring its outcomes.
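One common way to write this idea down is shown below. The notation is not from the original text and is introduced here only for illustration: the observations are assumed to come from an unknown function $f$ plus noise $\varepsilon_i$, and the model $\hat{f}_\theta$ with parameters $\theta$ is chosen to minimize the average loss $\ell$ over the $n$ observed pairs.

```latex
\begin{align}
  y_i &= f(x_i) + \varepsilon_i, \qquad i = 1, \dots, n \\
  \hat{\theta} &= \arg\min_{\theta} \; \frac{1}{n} \sum_{i=1}^{n}
      \ell\bigl(\hat{f}_{\theta}(x_i),\, y_i\bigr)
\end{align}
```

Here $\ell$ could be, for example, the squared error. The model never observes $f$ itself, only the pairs $(x_i, y_i)$, which is why it mirrors the outcomes of the hidden process rather than solving it.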

(5) Unseen Data: The ability to make accurate predictions on unseen data is crucial in machine learning. If a model performs well only on its training data, it is akin to a look-up table that recalls specific answers rather than capturing the underlying patterns. This is the classic sign of overfitting: the model is excessively tailored to the training data, including its noise, and fails to generalize to new data. Generalization is the hallmark of a robust machine learning model, allowing it to apply learned patterns to novel situations effectively.

Developing a machine learning model on a fixed, offline dataset is a relatively straightforward task. The true challenge emerges when building an ML system that operates on continuously updating online data in a production environment. Because the data keeps evolving, models trained on older data gradually become stale and less effective. A dedicated field of research, continual learning, addresses this challenge by developing strategies for models to adapt to new information without forgetting previously learned knowledge. In addition, practices such as MLOps have been established to monitor and update models in production efficiently, keeping them relevant and accurate in real-time applications.
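The difference between memorizing and generalizing can be sketched with a toy example. The data below follow a made-up rule (y = 2x + 1), and the helper names `lookup_predict`, `fit_line`, and `line_predict` are hypothetical. The sketch contrasts a literal look-up table with a straight line fitted by ordinary least squares.

```python
# Toy training set generated by the (hidden) rule y = 2x + 1.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [1.0, 3.0, 5.0, 7.0, 9.0]

# "Model" A: a look-up table that memorizes the training pairs.
lookup_table = dict(zip(train_x, train_y))


def lookup_predict(x):
    return lookup_table.get(x)  # None for any input it has never seen


# Model B: a straight line fitted by ordinary least squares.
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(train_x, train_y)


def line_predict(x):
    return slope * x + intercept


unseen_x = 10.0
print("seen input:  ", lookup_predict(2.0), line_predict(2.0))
print("unseen input:", lookup_predict(unseen_x), line_predict(unseen_x))
```

Both approaches reproduce the training data, but only the fitted line has anything to say about `unseen_x`. Real overfitting is subtler than a dictionary lookup, yet the failure mode is the same: good answers on seen inputs and poor ones elsewhere.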

Traditional Algorithms and ML

To better underscore the main properties of machine learning, let's compare it to traditional algorithms.

In the traditional algorithmic approach, we hand-craft heuristics, decision logic, and patterns. Take trading algorithms as an example: a simple rule might be to buy when the Relative Strength Index (RSI) is below 30. This approach is efficient for well-defined problems with simple patterns. However, traditional methods often fall short on complex problems where patterns are not easily discernible or where hand-crafting them is impractical. A well-known example of hand-crafted features is the use of Haar-like features in face detection algorithms. Hand-crafting features is laborious, because each feature must be designed and tested for effectiveness by hand.

This is where machine learning comes into play. In machine learning, the system doesn't just follow explicitly programmed instructions. Instead, it learns patterns from data during a training stage, and this learned knowledge is then applied to make predictions on new, unseen data. By automating feature learning, machine learning methods have outperformed hand-crafted techniques on many such tasks, marking a shift from manual to automated feature design. This is the main difference between traditional algorithms and machine learning.

As depicted in Fig. 1, traditional algorithms rely on hand-crafted logic. In contrast, as shown in Fig. 2, machine learning involves a training phase in which the system learns from examples. The key distinction lies in their approach to problem-solving: traditional algorithms depend on predefined rules, while machine learning algorithms derive rules from the data itself.
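The same contrast can be sketched in code using the RSI rule from above. In the hand-crafted version a person picks the threshold of 30. In the learned version the threshold is derived from labeled examples. The `examples` list is invented purely for illustration, and the simple threshold search (a one-feature decision stump) is just one possible learning method, not a recommended trading strategy.

```python
# Traditional approach: the threshold 30 is chosen by a human.
def handcrafted_rule(rsi):
    return "buy" if rsi < 30 else "hold"


# ML approach: learn the threshold from labeled examples.
# Hypothetical (RSI, action) pairs, invented for this sketch.
examples = [
    (12, "buy"), (18, "buy"), (22, "buy"), (27, "buy"),
    (35, "hold"), (48, "hold"), (61, "hold"), (75, "hold"),
]


def learn_threshold(data):
    """Pick the threshold that misclassifies the fewest training examples."""
    values = sorted(rsi for rsi, _ in data)
    # Candidate thresholds: midpoints between consecutive RSI values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(threshold):
        return sum(
            ("buy" if rsi < threshold else "hold") != action
            for rsi, action in data
        )

    return min(candidates, key=errors)


learned_threshold = learn_threshold(examples)


def learned_rule(rsi):
    return "buy" if rsi < learned_threshold else "hold"


print("learned threshold:", learned_threshold)
print(handcrafted_rule(25), learned_rule(25))
print(handcrafted_rule(50), learned_rule(50))
```

Both functions have the same shape. The difference is where the number comes from: one is typed in by a person, the other is computed during a training step and would change if the examples changed.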

References

This note is inspired by Chip Huyen's fascinating Designing Machine Learning Systems book, page 3, where she introduces the concept of machine learning in a similar way.
