How can machines think? Machine learning from scratch


I’m a Holberton student, and from time to time we are assigned to write articles about something technical that we’re learning in school. Usually it’s something I already know a bit about, which makes the writing much easier. But when I first saw this assignment, I knew I had to do a lot of research about machine learning before typing a single letter, because so far I’m completely ignorant on the subject. All I know is that it’s related to artificial intelligence (which is something I don’t fully understand either) and to making a computer think. That’s what the title says, but really, how do machines learn? How do you make a machine smart enough to learn by itself? IT’S A MACHINE!!

Well, I’m gonna be writing this and learning at the same time, so hop in, Grandpa, this is gonna be a fun ride!

At this exact point, artificial intelligence, machine learning, deep learning, and neural networks all mean exactly the same thing to me, so I started googling the differences to clear things up.

After swiping in and out of some websites, I think the easiest way to visualize the relationship is through a series of concentric circles. It turns out that they’re quite different, but I guess I wasn’t the only one making the mistake of lumping them into one thing. That one thing does exist, though: Artificial Intelligence. It is the mother of them all; it has everything else under its skin.


Okay then, what is Artificial Intelligence?

Have you ever browsed Netflix suggestions or told Alexa to order a pizza? You’ve probably been interacting with artificial intelligence more than you realise, and that’s kind of the point. AI is designed so that we don’t know that a computer is calling the shots.

To avoid confusion, let’s head back to the first definition of AI. The term was coined in 1956 by one of its principal pioneers, John McCarthy.

“Every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it. An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves.” -J. McCarthy

Well, you see, it is any code, technique, or algorithm that enables machines to mimic, develop, or demonstrate human cognition or behaviour.

The first question that popped into my head when I found out that AI has been around for almost 70 years was: what’s with the sudden interest? Why is it only getting attention now?

Like all exponential curves, it’s hard to tell when a line that’s slowly ticking upwards is going to skyrocket. In fact, AI research programs had to disguise themselves under different names in order to keep receiving funding (machine learning is one of them), until the past few years, when a couple of factors led to AI becoming the next “big” thing.
One is that the world is now filled with data waiting to be used. AI is driven by data: it’s all about acting on data, learning from new data, and improving over time, the same way a little child grows up by learning and taking in information to get smarter. The other is that, thanks to advances in processing speeds, computers can now make sense of all this information much more quickly. Because of this, tech giants have bought into AI and are infusing the market with cash and new applications.

But still, I don’t see any flying cars or robots in the street. What about that? Turns out, we are still in the era of weak AI: the technology is still in its infancy and is only good at a relatively narrow range of tasks. In the era of strong AI, though, it is expected to do anything and everything that humans do. To transition from weak AI to strong AI, machines need to learn the ways of humans, and that’s where machine learning gets in the game. The techniques and processes which help machines in this endeavor are broadly categorized under machine learning.

Currently, ML is used to recognize faces, voice commands, and objects, as well as to translate languages, to power optical character recognition (OCR) technology that converts images of text into machine-readable text, to drive recommendation engines, etc. It has been successfully implemented in chatbots and voice assistants such as Siri (Apple), Cortana (Microsoft), and Alexa (Amazon). It is increasingly accepted as a useful tool for decision making in the corporate world.

But how do they actually learn? How does a machine learn something?

Like any little kid, they have to learn by experience. With machine learning, a program analyzes thousands of examples to build an algorithm, then tweaks that algorithm based on whether it achieves its goal. Over time, the program actually gets smarter.

Arthur Samuel coined the phrase “machine learning” in 1959, not long after AI got its name, defining it as “the ability to learn without being explicitly programmed.”

Simply put, machine learning algorithms identify patterns and/or predict outcomes. ML techniques differ from traditional computational approaches. In traditional programming, we spend a lot of time writing a program that takes input data and runs on a computer to produce the output.

Traditional programming

But in machine learning, the input data and the output are fed to an algorithm to create a program. That program captures patterns that can be used to predict future outcomes.

Machine learning
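
To make the contrast concrete for myself, here is a minimal sketch in Python. The temperature example, the function names and the training values are all my own assumptions, not something taken from the resources I read. The first function is a rule written by hand; the second part estimates the same rule from input/output examples with a simple least-squares line fit.

```python
# Traditional programming: a human writes the rule.
def celsius_to_fahrenheit(celsius):
    return celsius * 9 / 5 + 32


# Machine learning: the rule is estimated from (input, output) examples.
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training examples: temperatures in Celsius and in Fahrenheit.
inputs = [-10.0, 0.0, 15.0, 30.0, 100.0]
outputs = [14.0, 32.0, 59.0, 86.0, 212.0]

slope, intercept = fit_line(inputs, outputs)


def learned_program(celsius):
    """The 'program' produced by the learning step."""
    return slope * celsius + intercept


print(celsius_to_fahrenheit(25))      # the hand-written rule
print(round(learned_program(25), 2))  # the rule recovered from data
```

Nobody typed the conversion formula into the second version; it only saw examples. Real problems are of course far messier than a straight line, so treat this as an illustration of the idea only.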

Machine Learning methods

There are various methods for machines to learn, and which one to follow depends entirely on the problem statement. Depending on the dataset and our problem, there are three different ways to go deeper: supervised learning, unsupervised learning, and reinforcement learning.

In supervised learning, machines learn to predict outcomes with the help of data scientists. It’s about labelling data so the machine can learn the function that gets us from input to output.

In unsupervised learning, machines learn to predict outcomes on the go by recognizing patterns in input data. They are presented with totally unlabelled data.

In reinforcement learning, machines learn how to find the best strategy to attain a certain objective.

I KNOW! I still don’t get it either. What does it mean?

Another round of swiping in and out of different websites helped me arrive at the best definitions of the three methods.

Supervised learning

Rock, paper, scissors

The purpose is for the algorithm to learn by comparing its actual output with the correct (labelled) outputs to find errors, and to modify the model accordingly. During training for supervised learning, systems are exposed to large amounts of labelled data, for example images of hands annotated to indicate which hand shape they are showing. Given sufficient examples, a supervised-learning system would learn to recognize the clusters of pixels and shapes associated with each hand shape and eventually be able to recognize all of them, so that it could reliably distinguish between “rock” and “scissors” in a game of rock, paper, scissors.

And now comes the obvious question: are there different kinds of supervised learning?

Actually, yes. It’s divided into two subparts: classification and regression.


The main difference between them is that the output variable in regression is numerical (or continuous), dealing with numbers in a range, while in classification it is categorical (or discrete). Categorical means the output variable is a category: red or black, apple or orange, diabetic or non-diabetic, etc.

There’s also another method of supervised learning which, to be honest, I wasn’t sure whether to mention, because only some resources talked about it. It’s called ensembling, and it means combining the predictions of multiple machine learning models that are individually weak to produce a more accurate prediction on a new sample.
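
Here is a small sketch of how I picture ensembling, using a majority vote, which is only one simple way of combining models. The three "weak models" are hand-written rules that I made up for the example (real ensembles combine trained models), and the fruit features are hypothetical too.

```python
from collections import Counter


# Three deliberately weak, hypothetical classifiers. Each one looks at a
# single feature of the fruit, so each one is easy to fool on its own.
def by_weight(fruit):
    return "apple" if fruit["weight"] < 160 else "orange"


def by_color(fruit):
    return "orange" if fruit["color"] == "orange" else "apple"


def by_texture(fruit):
    return "orange" if fruit["bumpy"] else "apple"


def ensemble_predict(models, sample):
    """Return the label that most of the models voted for."""
    votes = Counter(model(sample) for model in models)
    return votes.most_common(1)[0][0]


models = [by_weight, by_color, by_texture]
heavy_apple = {"weight": 180, "color": "red", "bumpy": False}

print(by_weight(heavy_apple))                 # one weak model, fooled by weight
print(ensemble_predict(models, heavy_apple))  # the vote of all three models
```

The weight rule alone gets this sample wrong, but the other two outvote it, which is the whole point of combining weak predictions.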

What are the most used supervised Learning algorithms?

Linear regression, logistic regression, and the K-nearest neighbor classifier (KNN) are all supervised learning algorithms. Let’s take KNN as an example and see what it’s about.

K-nearest neighbor classifier (KNN):

KNN is one of the simplest and laziest ones. It classifies a new data point based on its proximity to other data points: points that are close together are more similar in terms of features, and hence more likely to belong to the same class as their neighbors.


The result depends on which labelled points are closest to “?”: it is given the class that most of its k nearest neighbors belong to.
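
A minimal sketch of KNN in plain Python, assuming 2-D points, straight-line (Euclidean) distance and k = 3. The sample points and the name `knn_predict` are mine, chosen only for illustration.

```python
from collections import Counter
import math


def knn_predict(training_data, query, k=3):
    """Classify `query` by majority vote among its k closest training points."""
    by_distance = sorted(training_data, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical labelled points: (coordinates, class).
training_data = [
    ((1.0, 1.0), "red"),
    ((1.5, 2.0), "red"),
    ((2.0, 1.5), "red"),
    ((6.0, 6.0), "blue"),
    ((6.5, 7.0), "blue"),
    ((7.0, 6.5), "blue"),
]

print(knn_predict(training_data, (2.0, 2.0)))  # a "?" near the red group
print(knn_predict(training_data, (6.0, 7.0)))  # a "?" near the blue group
```

Notice there is no training step at all: the algorithm just keeps the examples and does the work at prediction time, which is why it gets called lazy.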

Unsupervised learning

On the other hand, when performing unsupervised learning, the machine is presented with totally unlabeled data, so the learning algorithm is left to identify patterns in its input data on its own, with the purpose of modelling the underlying structure or distribution of the data and learning more about it. Since there is far more unlabeled data than labeled data, unsupervised learning has come to be highly prized. It is called unsupervised because, unlike the supervised learning above, there are no correct answers (outputs) and there is no teacher to supply them.

Supervised learning VS unsupervised learning

Unsupervised learning algorithms are categorized into two parts:

  1. Clustering: for example, grouping customers by purchasing behavior. It mainly deals with finding a structure or pattern in a collection of uncategorized data.
  2. Association: it establishes associations among data objects inside large databases. Basically, it’s about discovering interesting relationships between variables.
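
To get a feel for the association idea, here is a sketch of support and confidence, two measures commonly used to score a rule like "people who buy bread also buy butter". This is not a full association-rule algorithm, and the shopping baskets are invented for the example.

```python
# Hypothetical shopping baskets; each set is one transaction.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)


def confidence(antecedent, consequent):
    """How often `consequent` shows up in baskets that contain `antecedent`."""
    return support(antecedent | consequent) / support(antecedent)


print(support({"bread", "butter"}))       # share of baskets with both items
print(confidence({"bread"}, {"butter"}))  # strength of "bread -> butter"
```

Here three of the five baskets contain both bread and butter, and three of the four bread baskets also contain butter, so the rule looks fairly strong on this tiny dataset.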

Here’s a photo to demonstrate the differences.

[Image: clustering vs. association]

What are the most used unsupervised Learning algorithms?

K-means clustering and hierarchical clustering for clustering problems, and the Apriori algorithm for association rule learning problems, etc.

K-means clustering is one of the simplest. It groups similar data points together to discover the underlying patterns. To achieve this, K-means looks for a fixed number (k) of clusters in a dataset.
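
Below is a bare-bones sketch of K-means in plain Python. It rests on a few assumptions of mine: the points are 2-D, the first k points serve as the starting centroids (real implementations usually pick them more carefully), and the loop runs a fixed number of rounds instead of checking for convergence.

```python
import math


def k_means(points, k, iterations=10):
    """Plain K-means; the first k points are used as starting centroids."""
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its closest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            closest = min(range(k), key=lambda i: math.dist(point, centroids[i]))
            clusters[closest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
    return centroids, clusters


# Hypothetical unlabelled points that visibly form two groups.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, clusters = k_means(points, k=2)

for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

No labels were given anywhere, yet the points near (1, 1) and the points near (8.5, 9) end up in separate groups.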

Reinforcement learning

Unlike supervised learning, in reinforcement learning machines try different scenarios to discover which actions yield the greatest reward, rather than being told which actions to take. So in the absence of a training dataset, there is no answer key; it all depends on the reinforcement agent to call the shots.

Sick of the technical words? Let’s look at things from a simpler view.

Typically, an RL setup is composed of two components, an agent and an environment. RL is best explained through the example of a game. Let’s take chess, where the goal of each player is to capture the opponent’s king on the board while avoiding having its own king captured by the opponent. The board is the interactive environment. A player receives a reward for capturing the king (winning) and a punishment if its own king is captured by the other player (losing). The states are the positions of the pieces on the board over time, and the total cumulative reward corresponds to one of the players winning the game.
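
Chess is far too big to fit in a few lines, so here is a sketch of the same agent/environment loop on a toy game that I made up: a corridor of five cells with a trap at one end and a goal at the other. The class `Corridor`, the rewards of +1 and -1, and the two fixed policies are all assumptions for illustration.

```python
class Corridor:
    """Hypothetical environment: cells 0..4, a trap at cell 0, a goal at cell 4."""

    def __init__(self):
        self.state = 2  # the agent starts in the middle

    def step(self, action):
        """Apply an action (-1 = left, +1 = right); return (state, reward, done)."""
        self.state += action
        if self.state == 4:
            return self.state, 1, True    # reward for reaching the goal
        if self.state == 0:
            return self.state, -1, True   # punishment for falling into the trap
        return self.state, 0, False


def run_episode(policy):
    """Let the agent act until the game ends; return the cumulative reward."""
    env = Corridor()
    state, total_reward, done = env.state, 0, False
    while not done:
        action = policy(state)                   # the agent picks an action
        state, reward, done = env.step(action)   # the environment responds
        total_reward += reward
    return total_reward


print(run_episode(lambda state: +1))  # an agent that always moves right
print(run_episode(lambda state: -1))  # an agent that always moves left
```

The agent only ever sees states and rewards. These two policies are fixed, so nothing is learned yet; the learning part is about finding a good policy from those rewards.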

Let’s be more specific. There are two types of reinforcement: positive and negative.

Positive: It’s when an event that occurs because of a specific behavior increases the strength and frequency of that behavior, impacting positively on the action taken by the agent. Basically, it helps maximize performance and sustain change for a more extended period.

Negative: It is defined as the strengthening of a behavior because a negative condition is stopped or avoided. It helps hold the agent to a minimum standard of performance.

Positive vs negative

What are the most used reinforcement Learning algorithms?

There are two approaches to implementing a reinforcement learning algorithm: one focuses on the model, the other ignores it completely.

Model-Based: Uses experience to build an internal model of the transitions and outcomes in the environment. Actions are then chosen by searching or planning in this world model, which means that if we can define a cost function ourselves, we can calculate the optimal actions using the model directly.

Model-Free: On the other hand, in model-free RL we ignore the model. We depend on sampling and simulation to estimate rewards, so we don’t need to know the inner workings of the system.
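
As an illustration of the model-free approach, here is a sketch of tabular Q-learning, one well-known model-free algorithm (my choice for the example, not one named by my sources). The toy corridor world, the hyperparameter values and the fixed random seed are all assumptions. The agent never inspects the rules inside `step`; it only samples them by acting.

```python
import random

N_STATES = 5          # cells 0..4; cell 0 is a trap, cell 4 is the goal
ACTIONS = (-1, +1)    # move left or move right


def step(state, action):
    """Environment dynamics; the agent never reads this code, it only samples it."""
    next_state = state + action
    if next_state == N_STATES - 1:
        return next_state, 1.0, True
    if next_state == 0:
        return next_state, -1.0, True
    return next_state, 0.0, False


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Estimate the value of each (state, action) pair from sampled experience."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 2, False
        while not done:
            # Explore sometimes, otherwise take the best-known action.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Nudge the estimate toward the reward plus the discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


q = q_learning()
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in (1, 2, 3)}
print(policy)  # best-known action for each non-terminal cell
```

After enough episodes, the estimated values favour moving right in every cell, even though the agent was never told where the goal is.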

For a quick overview, here’s a photo of what we just saw.

Machine Learning

What about deep learning and neural networks? Where do they fit in?

When machines can draw meaningful inferences from large volumes of data, they demonstrate the ability to learn deeply. Deep learning is a subset of machine learning. It attempts to imitate how the human brain processes light and sound stimuli into vision and hearing. It requires artificial neural networks (ANNs), which are modelled on the biological neural networks in humans. These networks contain nodes in different layers that are connected and communicate with each other to make sense of the voluminous input data.
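
To see what "nodes in different layers" can look like, here is a tiny sketch of a two-layer network. It is nowhere near deep, and the weights are hand-picked by me rather than learned, which is an important assumption: a real network would find its weights through training. With these particular values the network computes XOR ("one or the other, but not both").

```python
def step_activation(value):
    """A node 'fires' (outputs 1) only when its input is positive."""
    return 1 if value > 0 else 0


def layer(inputs, weights, biases):
    """Each node sums its weighted inputs, adds a bias, then applies the activation."""
    return [
        step_activation(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


# Hand-picked (not learned) weights that make the network compute XOR.
HIDDEN_WEIGHTS = [[1, 1], [1, 1]]
HIDDEN_BIASES = [-0.5, -1.5]   # first node acts like OR, second like AND
OUTPUT_WEIGHTS = [[1, -1]]
OUTPUT_BIASES = [-0.5]         # fires for "OR but not AND"


def network(inputs):
    """Pass the inputs through the hidden layer, then through the output layer."""
    hidden = layer(inputs, HIDDEN_WEIGHTS, HIDDEN_BIASES)
    return layer(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES)[0]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", network([a, b]))
```

Each layer feeds its outputs to the next one, which is the "connected and communicating" part. Deep learning stacks many more layers and learns the weights from data.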

Software engineering student at Holberton School Tunis