🎗️ Breast Cancer Prediction with DecisionTree

🔮 The Project aims to ...

This project focuses on training and fine-tuning a Decision Tree Classifier to predict breast cancer outcomes as either positive or negative based on a diverse range of significant attributes.

The dataset used for this project is the Breast Cancer Wisconsin (Diagnostic) Data Set, whose features are used to predict the class: Malignant (+ve) or Benign (-ve).

More about the work 📔:

A basic overview of the breast cancer dataset is covered in 📒notebook-1, with simple plots showing the distribution of the features.

A basic DecisionTree with default parameters is built and trained on the training dataset in 📒notebook-2.
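A minimal sketch of such a baseline is given here for orientation. It is not the notebook's code: it assumes scikit-learn's bundled copy of the Wisconsin (Diagnostic) dataset, and the 80/20 split and `random_state` values are illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: scikit-learn's bundled dataset stands in for the project's data files.
# In this copy the target is encoded as 0 = malignant, 1 = benign.
X, y = load_breast_cancer(return_X_y=True)

# Assumption: an 80/20 stratified split; the notebook may split differently.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Decision tree with default hyperparameters (random_state only fixes tie-breaking).
baseline = DecisionTreeClassifier(random_state=42)
baseline.fit(X_train, y_train)

baseline_accuracy = accuracy_score(y_test, baseline.predict(X_test))
print(f"Baseline test accuracy: {baseline_accuracy:.3f}")
```

An unconstrained tree usually fits the training split perfectly, so the held-out accuracy is the number to compare against the tuned model later.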

The least important features found in the previous notebook are then reduced to n optimal dimensions using Principal Component Analysis (PCA) in 📒notebook-3. The top n principal components, those with the highest eigenvalues, are chosen for model training. A sample is shown below of the data variance explained by the top 3 eigenvectors.
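The idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the notebook's implementation: feature importance is taken from a default tree's `feature_importances_`, the cutoff of 20 "least important" features is hypothetical, and features are standardized before PCA.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Rank features by the importance a default tree assigns them (ascending order).
tree = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
# Assumption: the 20 lowest-ranked features are the ones compressed with PCA.
least_important = np.argsort(tree.feature_importances_)[:20]

# Fit scaler and PCA on the training split only, then apply to both splits.
scaler = StandardScaler().fit(X_train[:, least_important])
pca = PCA(n_components=3).fit(scaler.transform(X_train[:, least_important]))

X_train_reduced = pca.transform(scaler.transform(X_train[:, least_important]))
X_test_reduced = pca.transform(scaler.transform(X_test[:, least_important]))

# Share of variance carried by each of the top 3 components (largest first).
explained = pca.explained_variance_ratio_
print("Explained variance ratio:", np.round(explained, 3))
```

The reduced columns could then be concatenated with the retained, more important features before training; how the notebook combines them is not stated here.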

Further along, in 📒notebook-4, hyperparameters are tuned and the optimal parameters are then used for the prediction. RESULT: Tuning each hyperparameter individually shows better results than GridSearchCV.
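The two tuning styles can be compared with a sketch like the one below. The candidate values, the 5-fold cross-validation, and the choice of `max_depth`, `min_samples_leaf`, and `criterion` are assumptions made for illustration; the notebook's actual search space may differ.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Style 1: sweep a single hyperparameter while the others stay at their defaults.
depths = [2, 3, 4, 5, 6, 8]
depth_scores = {}
for depth in depths:
    model = DecisionTreeClassifier(max_depth=depth, random_state=42)
    depth_scores[depth] = cross_val_score(model, X_train, y_train, cv=5).mean()
best_depth = max(depth_scores, key=depth_scores.get)
print("Best max_depth from the individual sweep:", best_depth)

# Style 2: exhaustive search over a joint grid with GridSearchCV.
param_grid = {
    "max_depth": depths,
    "min_samples_leaf": [1, 2, 5, 10],
    "criterion": ["gini", "entropy"],
}
search = GridSearchCV(
    DecisionTreeClassifier(random_state=42), param_grid, cv=5, scoring="accuracy"
)
search.fit(X_train, y_train)

grid_test_accuracy = search.score(X_test, y_test)
print("Best grid parameters:", search.best_params_)
print(f"Grid-search test accuracy: {grid_test_accuracy:.3f}")
```

Cross-validated scores on a dataset this small can differ by less than their fold-to-fold noise, so a gap between the two approaches is best confirmed on the held-out split.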

Prediction performance of the ✨tuned model:

To understand how the tuned model works and how it makes predictions, the SHAP library is used in 📒notebook-5 for model interpretability, enabling both global and local analyses of the optimized model's behavior. The Decision Plot below illustrates how individual features contribute to the prediction process, providing a clear understanding of the model's decision-making logic.

📝 Installation Guide (Building Predictions)

Clone the repository

git clone https://github.com/PragyanTiwari/Breast-Cancer-Prediction-with-DecisionTree-Classifier.git

Using Makefile:

# install uv if it is not already installed
pip install --upgrade uv

# to create virtual env
make create_environment

# install python dependencies
make requirements

# build predictions
make breast_cancer_prediction

Using uv (if not using the Makefile):

# to create virtual env
uv venv

# install python dependencies
uv add --requirements 'requirements.txt' --dev

# build predictions
uv run make_predictions

❕ The output will be saved as predictions.csv in the data\result directory.
