01-Machine Learning


1.1 Data Science

Data science is the domain of study that deals with vast volumes of data using modern tools and techniques to find unseen patterns, derive meaningful information, and make business decisions. Data science uses complex machine learning algorithms to build predictive models.

Data Science is a field which incorporates Artificial Intelligence, Data Mining, Big Data, Machine Learning, and Deep Learning.

1.2 Why Data Science

Gives you decision-making power
Is one of the most in-demand skills sought by major tech companies globally
Provides ample freelancing opportunities
Offers job stability in the years to come, as AI is being adopted across major domains

1.3 AI vs ML vs DL

Artificial Intelligence (AI)
- Smart applications that can perform their tasks without human intervention
- Eg: Self-Driving Car, Robots
Machine Learning (ML)
- Provides statistical tools to learn from, analyze, and visualize data and to develop predictive models
- Eg: Recommendation System
Deep Learning (DL)
- Mimics the way humans learn, using multi-layered neural networks
- Eg: Object Detection, Image Recognition, Chatbot


1.4 Machine Learning

Machine Learning is a subset of artificial intelligence that is mainly concerned with developing algorithms which allow a computer to learn from data and past experience on its own, without being explicitly programmed for each task.

Classification of Machine Learning

Supervised Learning
Unsupervised Learning
Semi-Supervised Learning
Reinforcement Learning

1.4.1 Supervised Learning

Supervised learning is a type of machine learning method in which we provide sample labeled data to the machine learning system in order to train it, and on that basis, it predicts the output.

Training data contains both inputs and outputs (labels); after training, the model predicts the output for new inputs.

Types
- Classification
- Regression

Classification

Classification is used to identify the category of new observations on the basis of training data, i.e. a categorical output is predicted from the input.

Eg: Email Spam Detector
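
As a rough illustration of classification, the sketch below labels an email by copying the label of the most similar training example (a 1-nearest-neighbour rule, chosen here only because it is short). The two features and all the numbers are made up for the example; a real spam detector would use far richer features.

```python
from math import dist

# Hypothetical labeled training data:
# (number_of_links, number_of_exclamation_marks) -> label
training_data = [
    ((0, 0), "ham"),
    ((1, 0), "ham"),
    ((0, 1), "ham"),
    ((5, 7), "spam"),
    ((6, 4), "spam"),
    ((8, 9), "spam"),
]

def predict_label(features):
    """Return the label of the closest training example (1-nearest neighbour)."""
    nearest_features, nearest_label = min(
        training_data, key=lambda example: dist(example[0], features)
    )
    return nearest_label

print(predict_label((7, 6)))  # lies close to the spam examples
print(predict_label((1, 1)))  # lies close to the ham examples
```

The output is always one of the categories seen in training ("spam" or "ham"), which is what separates classification from regression.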

Regression

Regression is used to predict the continuous/real output based on the input and output training data. It predicts continuous/real values such as temperature, age, salary, price, etc.

Eg: Predicting sales from advertisement spend
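
A minimal sketch of regression for the advertisement and sales example, assuming a straight-line relationship and fitting it with ordinary least squares. The spend and sales figures are invented for illustration.

```python
# Hypothetical training data: advertisement spend (input) and sales (output).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]

def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(ad_spend, sales)

def predict_sales(spend):
    """Predict a continuous sales value for a new spend amount."""
    return slope * spend + intercept

print(slope, intercept)
print(predict_sales(6.0))
```

Unlike the classifier, the prediction here is a real number that need not match any output seen during training.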

1.4.2 Unsupervised Learning

Training data contains only inputs (no labels); the model finds structure in the data on its own, for example by grouping similar samples into clusters.

Types
- Clustering
- Association
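
To make clustering concrete, here is a small sketch of k-means on one-dimensional data. The points, the starting centroids, and the fixed iteration count are assumptions for the example; note that no labels are supplied anywhere.

```python
def k_means_1d(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid
    and moving each centroid to the mean of its assigned points."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)), key=lambda i: abs(point - centroids[i])
            )
            clusters[nearest].append(point)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical unlabeled inputs with two visible groups.
points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = k_means_1d(points, centroids=[1.0, 11.0])
print(centroids)
print(clusters)
```

The algorithm only ever sees the inputs; the two groups it returns are discovered from the distances between points.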

1.4.3 Semi-Supervised Learning

Combination of Supervised + Unsupervised Learning

Labeling data is expensive because it requires the knowledge of skilled human experts. The input data is therefore a combination of labeled and unlabeled data, typically a small labeled portion and a much larger unlabeled one.

1.4.4 Reinforcement Learning

Reinforcement Learning is an area of ML concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward.

The reinforcement learning process is similar to how a human learns; for example, a child learns various things through day-to-day experience.

Types
- Positive Reinforcement Learning
- Negative Reinforcement Learning
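
The agent, action, and reward loop described above can be sketched with an epsilon-greedy agent. This is a toy setup, not a full reinforcement learning algorithm: the environment is assumed to pay a fixed reward per action, and `run_agent`, `action_rewards`, and `epsilon` are names invented for the example.

```python
import random

def run_agent(action_rewards, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent in a toy environment with fixed rewards per action."""
    rng = random.Random(seed)
    pull_counts = [0] * len(action_rewards)
    value_estimates = [0.0] * len(action_rewards)
    cumulative_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(len(action_rewards))
        else:
            # Exploit: take the action with the best estimate so far.
            action = max(
                range(len(action_rewards)), key=lambda a: value_estimates[a]
            )
        reward = action_rewards[action]
        cumulative_reward += reward
        pull_counts[action] += 1
        # Incremental average of the rewards seen for this action.
        value_estimates[action] += (
            reward - value_estimates[action]
        ) / pull_counts[action]
    return value_estimates, cumulative_reward

value_estimates, cumulative_reward = run_agent([1.0, 5.0])
best_action = max(range(len(value_estimates)), key=lambda a: value_estimates[a])
print("estimated value per action:", value_estimates)
print("best action:", best_action)
```

The agent is never told which action is better; it learns that from the rewards it receives, much like the child learning from experience.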

1.5 Dataset

The dataset is divided into different subsets before training the model or making any predictions.

Training Dataset
- Train the model
Validation Dataset
- Hyperparameter tuning of the model
Test Dataset
- Test the model accuracy

Eg:

Training Dataset: Books - Learn Q&A - Train
Validation Dataset: Different Book - Practice Q&A - Tune
Test Dataset: Exam Paper - Unseen Q&A - Test
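
A minimal sketch of the three-way split, assuming a 60/20/20 ratio and a shuffle before splitting. The ratio, the seed, and the function name `split_dataset` are choices made for this example, not fixed rules.

```python
import random

def split_dataset(samples, train_fraction=0.6, validation_fraction=0.2, seed=42):
    """Shuffle the samples and cut them into train, validation, and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    train_end = round(len(shuffled) * train_fraction)
    validation_end = train_end + round(len(shuffled) * validation_fraction)
    train = shuffled[:train_end]
    validation = shuffled[train_end:validation_end]
    test = shuffled[validation_end:]  # whatever remains is held out for testing
    return train, validation, test

train, validation, test = split_dataset(range(10))
print(len(train), len(validation), len(test))
```

Each sample lands in exactly one subset, so the test data stays unseen until the final evaluation, like the exam paper in the analogy.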

1.6 Errors in Machine Learning

Reducible errors
- These errors can be reduced to improve the model accuracy.
- Types - Bias and Variance
Irreducible errors
- These errors will always be present, whichever model is used, because they come from noise in the data itself

Bias

Bias is the error caused by overly simple assumptions in the model. It shows up in the accuracy on the training dataset.

Low Bias - High Training Accuracy
High Bias - Low Training Accuracy

Variance

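Variance is how sensitive the model is to the particular training data it saw. It shows up as a drop in accuracy from the training dataset to the test dataset.

Low Variance - Test accuracy stays close to training accuracy
High Variance - Test accuracy is much lower than training accuracy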
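
The split into reducible and irreducible error can be written as the standard bias-variance decomposition of the expected squared error. This assumes the data follow y = f(x) + ε with zero-mean noise of variance σ²; the symbols f, \hat{f}, ε, and σ are notation introduced here for illustration.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{Bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{Variance}}
  + \underbrace{\sigma^2}_{\text{Irreducible error}}
```

The first two terms depend on the model and can be reduced; the last term depends only on the noise and cannot.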

1.7 Overfitting and Underfitting

Overfitting --> Low Bias and High Variance
- For training dataset - accuracy is high
- But for new/test dataset - accuracy is low
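Underfitting --> High Bias and Low Variance
- For training dataset - accuracy is low
- Also for new/test dataset - accuracy is low, because the model is too simple and is consistently wrong, not because it is sensitive to the training data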


Generalized Model --> Low Bias and Low Variance
- For both training and test dataset - accuracy is high
- Model should be of this type
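
The three cases can be compared on a toy regression problem. This is a sketch under stated assumptions: the data are invented (roughly y = 2x with noise of +1 or -1), the three models are stand-ins picked for brevity, and mean squared error is used in place of accuracy because the outputs are continuous (lower error corresponds to higher accuracy).

```python
# Hypothetical data: y is roughly 2 * x, with noise of +1 or -1 added.
train_x = [0.0, 2.0, 4.0, 6.0, 8.0]
train_y = [1.0, 3.0, 9.0, 11.0, 17.0]
test_x = [1.5, 3.5, 5.5, 7.5]
test_y = [4.0, 6.0, 12.0, 14.0]

def mean_squared_error(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit_mean_model(xs, ys):
    """Too simple: always predicts the average training output."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_memorizing_model(xs, ys):
    """Too flexible: returns the output of the nearest memorized training point."""
    examples = list(zip(xs, ys))
    return lambda x: min(examples, key=lambda e: abs(e[0] - x))[1]

def fit_line_model(xs, ys):
    """Matches the true trend: least-squares straight line."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

models = {
    "underfit": fit_mean_model(train_x, train_y),
    "overfit": fit_memorizing_model(train_x, train_y),
    "generalized": fit_line_model(train_x, train_y),
}
errors = {
    name: (
        mean_squared_error(model, train_x, train_y),
        mean_squared_error(model, test_x, test_y),
    )
    for name, model in models.items()
}
for name, (train_error, test_error) in errors.items():
    print(f"{name}: train error {train_error:.2f}, test error {test_error:.2f}")
```

On this data the memorizing model has zero training error but a larger test error than the straight line, the mean-only model is poor on both sets, and the straight line has low error on both.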
