Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: The purpose of this project is to construct a predictive model using various machine learning algorithms and to document the end-to-end steps using a template. The Superconductor Critical Temperature dataset presents a regression problem in which we try to predict the value of a continuous variable.

INTRODUCTION: The research team wishes to create a statistical model for predicting the superconducting critical temperature based on the features extracted from the superconductor’s chemical formula. The model seeks to examine the features that can contribute the most to the model’s predictive accuracy.

From iteration Take1, we constructed and tuned machine learning models for this dataset using TensorFlow with five layers. We also observed the best result that we could obtain using the tuned models with the validation and test datasets.

From iteration Take2, we constructed and tuned machine learning models for this dataset using TensorFlow with dropout layers. We also observed the best result that we could obtain using the tuned models with the validation and test datasets.

From iteration Take3, we constructed and tuned a TensorFlow model with five layers using the additional material attributes available for modeling. Furthermore, we also applied the tuned model to a test dataset and observed the best result that we could obtain from the model.

In this Take4 iteration, we will construct and tune a TensorFlow model with dropout layers using the additional material attributes available for modeling. Furthermore, we will apply the tuned model to a test dataset and observe the best result that we can obtain from the model.
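For readers unfamiliar with the technique, the following is a minimal pure-Python sketch of what a dropout layer does to a vector of activations. It assumes the common "inverted dropout" formulation, in which surviving values are scaled up during training so that no rescaling is needed at inference. The function name `dropout`, its parameters, and the example values are illustrative assumptions and are not taken from the project's code.

```python
import random


def dropout(activations, rate, training, rng=random):
    """Apply inverted dropout to a list of activation values.

    During training, each value is zeroed with probability `rate` and the
    survivors are scaled by 1 / (1 - rate). At inference (training=False),
    the values pass through unchanged.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in the range [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    keep_probability = 1.0 - rate
    return [
        value / keep_probability if rng.random() < keep_probability else 0.0
        for value in activations
    ]


# Hypothetical activations from one hidden layer.
hidden = [0.8, 1.5, 0.2, 2.1]
train_out = dropout(hidden, rate=0.25, training=True, rng=random.Random(7))
infer_out = dropout(hidden, rate=0.25, training=False)
print(train_out)
print(infer_out)
```

Because units are dropped at random only during training, the network cannot rely too heavily on any single unit, which is why dropout is typically tried as a regularizer when a model overfits.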

ANALYSIS: From iteration Take1, the baseline performance of the TensorFlow algorithm achieved an RMSE benchmark of 11.109. After a series of tuning trials, the TensorFlow model processed the validation dataset with an RMSE score of 10.564. When we applied the TensorFlow model to the previously unseen test dataset, we obtained an RMSE score of 10.540.

From iteration Take2, the baseline performance of the TensorFlow algorithm achieved an RMSE benchmark of 10.580. After a series of tuning trials, the TensorFlow model processed the validation dataset with an RMSE score of 10.905. When we applied the TensorFlow model to the previously unseen test dataset, we obtained an RMSE score of 10.885.

From iteration Take3, the baseline performance of the TensorFlow algorithm achieved an RMSE benchmark of 12.298. After a series of tuning trials, the TensorFlow model processed the validation dataset with an RMSE score of 10.299. When we applied the TensorFlow model to the previously unseen test dataset, we obtained an RMSE score of 10.144.

In this Take4 iteration, the baseline performance of the TensorFlow algorithm achieved an RMSE benchmark of 10.304. After a series of tuning trials, the TensorFlow model processed the validation dataset with an RMSE score of 11.048. When we applied the TensorFlow model to the previously unseen test dataset, we obtained an RMSE score of 10.476.
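All of the scores above are root mean squared errors (RMSE), expressed in the same units as the critical temperature. As a reminder of how the metric is computed, here is a minimal sketch using only the standard library. The function name `rmse` and the sample values are illustrative assumptions, not data from the project.

```python
import math


def rmse(y_true, y_pred):
    """Return the root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical observed and predicted critical temperatures.
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
print(round(rmse(observed, predicted), 3))
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones, so a lower score indicates predictions that are closer to the observed values overall.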

CONCLUSION: In this iteration, adding dropout layers to the TensorFlow model did not noticeably improve its performance on this dataset: the test RMSE of 10.476 is slightly higher than the 10.144 obtained from the five-layer model in Take3. However, we should still consider TensorFlow for further modeling.
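The comparison behind this conclusion can be reproduced from the scores reported in the ANALYSIS section. The short sketch below simply tabulates those numbers. The dictionary layout and the helper name `dropout_delta` are illustrative assumptions, and no new results are introduced.

```python
# RMSE scores copied from the ANALYSIS section.
reported = {
    "Take1": {"baseline": 11.109, "validation": 10.564, "test": 10.540},
    "Take2": {"baseline": 10.580, "validation": 10.905, "test": 10.885},
    "Take3": {"baseline": 12.298, "validation": 10.299, "test": 10.144},
    "Take4": {"baseline": 10.304, "validation": 11.048, "test": 10.476},
}


def dropout_delta(five_layer, dropout):
    """Return test RMSE of the dropout iteration minus that of the five-layer one."""
    delta = reported[dropout]["test"] - reported[five_layer]["test"]
    return round(delta, 3)


# Take1/Take2 use the original attributes; Take3/Take4 add the material attributes.
for five_layer, dropout in [("Take1", "Take2"), ("Take3", "Take4")]:
    change = dropout_delta(five_layer, dropout)
    print(f"{five_layer} -> {dropout}: test RMSE change = {change:+.3f}")

best = min(reported, key=lambda name: reported[name]["test"])
print(f"Lowest test RMSE: {best} ({reported[best]['test']:.3f})")
```

In both pairings the dropout iteration has a slightly higher test RMSE than its five-layer counterpart, and Take3 has the lowest test RMSE of the four iterations.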

Dataset Used: Superconductivity Data Set

Dataset ML Model: Regression with numerical attributes

Dataset Reference: https://archive.ics.uci.edu/ml/datasets/Superconductivty+Data

One potential source of performance benchmarks: https://doi.org/10.1016/j.commatsci.2018.07.052

The HTML-formatted report can be found here on GitHub.