
ML Algorithms for Materials Science

Harvard (Charles River) IMG 7698
(Harvard University - Harvard Taiwan Student Association)

- Overview

Data science has always played an important role in experimental materials research. The classic research process consists of three steps:

  1. Hypothesis: Researchers form hypotheses based on their understanding of the physical system.
  2. Experiment: Researchers select and conduct experiments to test those hypotheses.
  3. Analysis: Researchers use data science methods to determine the statistical confidence with which each hypothesis is supported or refuted.


The research process has traditionally relied on significant researcher contributions in all three steps, which has made materials development slow and expensive.

As this description shows, the main impact of data science on experimental materials research has historically been through data analysis. This is no longer the case in modern materials research.


- Data Science in Materials Research

Over the past few decades, with the increase in computing power and the widespread use of open-source tools, data science methods, including machine learning (ML) and artificial intelligence (AI), have had a rapidly growing impact on materials research.

Notably, data science methods are now ingrained in all parts of the research process, not just the analysis step. As a result, research projects that would not be possible without AI/ML are now common in the literature.

In addition to allowing researchers to ask novel materials questions, a major outcome of the rapid proliferation of data science in materials research is that many scientific questions are now being asked about the best ways to leverage AI/ML in experimental materials research.

This new avenue of basic research is still in its relative infancy, but is likely to have a huge impact on the field of AI/ML and the development of advanced materials to solve the myriad challenges facing humanity.

- ML Algorithms in Materials Science

Artificial intelligence (AI), particularly machine learning (ML), is used by a growing number of disciplines to automate complex problem-solving tasks. Advances in ML mean that decision rules can, in some cases, be derived automatically from data by specific algorithms.

Presently, there is a lack of efficient data-handling algorithms that extract pertinent information from materials datasets in a manner that is (a) automated, (b) fast, (c) user-independent, and (d) error-quantified. Hence, there is a great need for efficient algorithms and data mining routines.

Common ML tasks in materials science include regression, classification, and clustering. Algorithms used for these tasks include linear regression, logistic regression, decision trees, random forests, support vector machines (SVMs), and neural networks.
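As a minimal illustration of the regression task, the sketch below fits a straight line by ordinary least squares using only the Python standard library. The function name `fit_line`, the variable names, and the data values are hypothetical and chosen purely for illustration; they are not measurements from any real material.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and the x-y cross term
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up example data: a composition variable and a measured property
dopant_fraction = [0.00, 0.05, 0.10, 0.15, 0.20]
band_gap_ev = [1.10, 1.20, 1.30, 1.40, 1.50]

slope, intercept = fit_line(dopant_fraction, band_gap_ev)

# Use the fitted line to estimate the property at an unmeasured composition
predicted = slope * 0.12 + intercept
print(f"slope={slope:.3f}, intercept={intercept:.3f}, predicted={predicted:.3f}")
```

In practice a library such as scikit-learn would normally be used instead; the hand-written version is shown only to make the fitting step explicit.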

Some algorithms are well suited to modeling with small datasets, including:

  • Support vector machine
  • Gaussian process regression
  • Random forest
  • Gradient boosting decision tree
  • XGBoost (an implementation of gradient boosted decision trees)
  • Symbolic regression
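To make one of the methods above concrete, here is a minimal sketch of Gaussian process regression on five training points, written with only the Python standard library. It assumes a squared-exponential (RBF) kernel with fixed, hand-picked hyperparameters, and the training data are synthetic values of sin(x). The helper names (`rbf_kernel`, `solve`, `gp_predict`) are hypothetical. A real project would more likely use an established library and fit the hyperparameters to the data.

```python
import math


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return variance * math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gp_predict(x_train, y_train, x_new, noise=1e-6, length_scale=1.0):
    """Posterior mean and variance of a zero-mean GP at a single point x_new."""
    n = len(x_train)
    # Training covariance matrix with a small noise term on the diagonal
    K = [[rbf_kernel(x_train[i], x_train[j], length_scale)
          + (noise if i == j else 0.0) for j in range(n)] for i in range(n)]
    k_star = [rbf_kernel(x_new, xi, length_scale) for xi in x_train]
    alpha = solve(K, y_train)            # K^{-1} y
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve(K, k_star)                 # K^{-1} k_star
    var = rbf_kernel(x_new, x_new, length_scale) - sum(
        k * vi for k, vi in zip(k_star, v))
    return mean, max(var, 0.0)           # clip tiny negative round-off


# Synthetic small dataset: five samples of sin(x)
x_train = [0.0, 1.0, 2.0, 3.0, 4.0]
y_train = [math.sin(x) for x in x_train]

mean_at_train, var_at_train = gp_predict(x_train, y_train, 1.0)
mean_between, var_between = gp_predict(x_train, y_train, 1.5)
mean_far, var_far = gp_predict(x_train, y_train, 10.0)

print("at a training point:", mean_at_train, var_at_train)
print("between points:     ", mean_between, var_between)
print("far from the data:  ", mean_far, var_far)
```

The predictive variance is what makes this kind of model attractive when data are scarce: it is close to zero at measured points and grows toward the prior variance far from them, which gives a rough indication of where further experiments would be most informative.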


ML can help scientists analyze data that was previously inaccessible. It can also help reduce the cost, risks, and time involved in identifying useful materials. For example, ML algorithms can predict phase diagrams for carbon.



[More to come ...]
