Advances In Solid State Physics

Ermelindo Klatt

Aug 4, 2024, 8:01:48 PM

to incetdecam

Solid state physics is a branch of physics that studies the physical properties of solid materials, such as crystals and amorphous solids. It involves understanding how atoms and molecules behave in solids and how they contribute to the overall properties of the material.

Solid state physics has a wide range of applications, including the development of electronic devices such as transistors, integrated circuits, and solar cells. It is also crucial in the study of materials for renewable energy, such as batteries and fuel cells. Other applications include the development of new materials for medicine, construction, and transportation.

A solid state physics book typically covers topics such as crystal structure and symmetry, electronic band structure, thermal properties, and magnetic properties of solids. It also includes discussions on the effects of defects and impurities, as well as the behavior of materials under extreme conditions such as high pressure and low temperatures.

Solid state physics can be a challenging subject, as it requires a strong understanding of concepts in quantum mechanics and thermodynamics. However, with a solid foundation in these areas, it is possible to grasp the principles of solid state physics and its applications.

Commonly recommended solid state physics textbooks include "Introduction to Solid State Physics" by Charles Kittel and "Solid State Physics" by Neil W. Ashcroft and N. David Mermin. "Principles of Condensed Matter Physics" by P.M. Chaikin and T.C. Lubensky is a more advanced, graduate-level text. It is also helpful to consult textbooks and resources recommended by your professor or university curriculum.

In recent years, the availability of large datasets, combined with improved algorithms and the exponential growth in computing power, has led to an unparalleled surge of interest in machine learning. Nowadays, machine learning algorithms are successfully employed for classification, regression, clustering, or dimensionality reduction tasks on large, and especially high-dimensional, sets of input data [1]. In fact, machine learning has proved to have superhuman abilities in numerous fields (such as playing Go [2], self-driving cars [3], image classification [4], etc.). As a result, huge parts of our daily life, for example, image and speech recognition [5,6], web searches [7], fraud detection [8], email/spam filtering [9], credit scores [10], and many more, are powered by machine learning algorithms.

While data-driven research, and more specifically machine learning, already have a long history in biology [11] and chemistry [12], they rose to prominence only recently in the field of solid-state materials science.

Traditionally, experiments played the key role in finding and characterizing new materials. Experimental research must be conducted over a long time period for an extremely limited number of materials, as it imposes high requirements in terms of resources and equipment. Owing to these limitations, important discoveries happened mostly through human intuition or even serendipity [13]. A first computational revolution in materials science was fueled by the advent of computational methods [14], especially density functional theory (DFT) [15,16], Monte Carlo simulations, and molecular dynamics, which allowed researchers to explore the phase and composition space far more efficiently. In fact, the combination of experiments and computer simulations has substantially cut the time and cost of materials design [17,18,19,20]. The constant increase in computing power and the development of more efficient codes also allowed for computational high-throughput studies [21] of large material groups in order to screen for the ideal experimental candidates. These large-scale simulations and calculations, together with experimental high-throughput studies [22,23,24,25], are producing an enormous amount of data, making it possible to apply machine learning methods to materials science.

Machine learning algorithms have already revolutionized other fields, such as image recognition. However, the development from the first perceptron [53,54] up to modern deep convolutional neural networks was a long and tortuous process. In order to produce significant results in materials science, one must not only play to the strengths of machine learning techniques but also apply the lessons already learned in other fields.

As the introduction of machine learning methods to materials science is still recent, many published applications are quite basic in nature and complexity. Often they involve fitting models to extremely small training sets or even applying machine learning methods to composition spaces that could possibly be mapped out in hundreds of CPU hours. It is of course possible to use machine learning methods as a simple fitting procedure for small low-dimensional datasets. However, this does not play to their strengths and will not allow us to replicate the success machine learning methods have had in other fields.

One of the major criticisms of machine learning algorithms in science is the lack of novel laws, understanding, and knowledge arising from their use. This comes from the fact that machine learning algorithms are often treated as black boxes, as machine-built models are too complex and alien for humans to understand. We will discuss the validity of the criticism and different approaches to this challenge.

Finally, there have already been a number of excellent reviews of materials informatics and machine learning in materials science in general [13,58,59,60,61,62], as well as some others specifically covering machine learning in the chemical sciences [63], in materials design of thermoelectrics and photovoltaics [64], in the development of lithium-ion batteries [65], and in atomistic simulations [66]. However, owing to the explosion in the number of works using machine learning, an enormous amount of research has been published since these reviews, and the research landscape has quickly transformed.

Here we concentrate on the various applications of machine learning in solid-state materials science (especially the most recent ones) and discuss and analyze them in detail. As a starting point, we provide an introduction to machine learning, and in particular to machine learning principles, algorithms, descriptors, and databases in materials science. We then review numerous applications of machine learning in solid-state materials science: the discovery of new stable materials and the prediction of their structure, the machine learning calculation of material properties, the development of machine learning force fields for simulations in materials science, the construction of DFT functionals by machine learning methods, the optimization of the adaptive design process by active learning, and the interpretability of, and the physical understanding gained from, machine learning models. Finally, we discuss the challenges and limitations machine learning faces in materials science and suggest a few research strategies to overcome or circumvent them.

Machine learning algorithms aim to optimize the performance of a certain task by using examples and/or past experience [67]. Generally speaking, machine learning can be divided into three main categories, namely, supervised learning, unsupervised learning, and reinforcement learning.

Supervised machine learning is based on the same principles as a standard fitting procedure: it tries to find the unknown function that connects known inputs to unknown outputs. The desired result for unknown domains is estimated by extrapolating patterns found in the labeled training data. Unsupervised learning is concerned with finding patterns in unlabeled data, for example, in the clustering of samples. Finally, reinforcement learning treats the problem of finding optimal or sufficiently good actions for a situation in order to maximize a reward [68]. In other words, it learns from interactions.
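To make the distinction concrete, here is a minimal sketch in plain Python. It is only an illustration, not taken from the review: the function names `fit_line` and `two_means` and the toy numbers are assumptions. The first function fits a straight line to labeled pairs (supervised), the second groups unlabeled values into two clusters (unsupervised).

```python
def fit_line(xs, ys):
    """Supervised: least-squares fit of y = a*x + b to labeled (x, y) pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def two_means(values, iterations=20):
    """Unsupervised: split unlabeled 1-D values into two clusters (2-means)."""
    centers = [min(values), max(values)]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Keep the old center if a cluster happens to be empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return sorted(centers)


# Labeled toy data: the targets ys are known for every input in xs.
a, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])

# Unlabeled toy data: only the values themselves are available.
centers = two_means([1.0, 1.2, 0.8, 5.0, 5.2, 4.8])

print(a, b, centers)
```

In the supervised case the labels drive the fit directly. In the unsupervised case the algorithm has to discover the two groups on its own.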

Finally, halfway between supervised and unsupervised learning lies semi-supervised learning. In this case, the algorithm is provided with both unlabeled and labeled data. Techniques of this category are particularly useful when the available data are incomplete and for learning representations [69].

As supervised learning is by far the most widespread form of machine learning in materials science, we will concentrate on it in the following discussion. Figure 1 depicts the workflow applied in supervised learning. First, one generally chooses a subset of the relevant population for which values of the target property are known, or creates the data if necessary. This process is accompanied by the selection of a machine learning algorithm that will be used to fit the desired target quantity. Most of the work consists of generating, finding, and cleaning the data to ensure that it is consistent, accurate, etc. Second, it is necessary to decide how to map the properties of the system, i.e., the input for the model, in a way that is suitable for the chosen algorithm. This means translating the raw information into certain features that will be used as inputs for the algorithm. Once this process is finished, the model is trained by optimizing its performance, usually measured through some kind of cost function. Usually this entails the adjustment of hyperparameters that control the training process, structure, and properties of the model. The data are split into various sets. Ideally, a validation dataset separate from the test and training sets is used for the optimization of the hyperparameters.
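The workflow described above can be sketched in a few lines of plain Python. Everything specific here is an assumption made for illustration: the synthetic data, the quadratic feature map `featurize`, the L2-penalized linear model, the candidate hyperparameter values, and the 60/20/20 split. The point is only the order of the steps: featurize, split, train on a cost function, tune on the validation set, and report on the test set.

```python
import random


def featurize(x):
    """Map a raw input to model features; (1, x, x^2) is an illustrative choice."""
    return [1.0, x, x * x]


def predict(weights, features):
    return sum(w * f for w, f in zip(weights, features))


def mse(weights, data):
    """Cost function: mean squared error over (features, target) pairs."""
    return sum((predict(weights, f) - y) ** 2 for f, y in data) / len(data)


def train(data, l2, steps=1000, learning_rate=0.1):
    """Minimize MSE + l2 * |weights|^2 by plain gradient descent."""
    weights = [0.0] * len(data[0][0])
    n = len(data)
    for _ in range(steps):
        grad = [2.0 * l2 * w for w in weights]
        for f, y in data:
            err = predict(weights, f) - y
            for j, fj in enumerate(f):
                grad[j] += 2.0 * err * fj / n
        weights = [w - learning_rate * g for w, g in zip(weights, grad)]
    return weights


# Synthetic stand-in for a real dataset (hypothetical target function plus noise).
rng = random.Random(0)
raw = [rng.uniform(-1.0, 1.0) for _ in range(100)]
data = [(featurize(x), 1.0 + 2.0 * x - 3.0 * x * x + rng.gauss(0.0, 0.1))
        for x in raw]

# Split into training, validation, and test sets (60/20/20).
rng.shuffle(data)
train_set, val_set, test_set = data[:60], data[60:80], data[80:]

# Tune the hyperparameter (the penalty strength l2) on the validation set only.
candidates = [0.0, 0.01, 1.0]
val_errors = {l2: mse(train(train_set, l2), val_set) for l2 in candidates}
best_l2 = min(val_errors, key=val_errors.get)

# Report the final error once, on the untouched test set.
final_weights = train(train_set, best_l2)
test_error = mse(final_weights, test_set)
print(best_l2, test_error)
```

Keeping the test set out of the hyperparameter search is the design choice that matters: the test error then remains an honest estimate of performance on new data.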

Every machine learning application has to consider the aspects of overfitting and underfitting. The reason for underfitting usually lies either in the model, which lacks the ability to express the complexity of the data, or in the features, which do not adequately describe the data. This inevitably leads to a high training error. On the other hand, an overfitted model interprets part of the noise in the training data as relevant information and therefore fails to reliably predict new data. Usually, an overfitted model contains more free parameters than are required to capture the complexity of the training data. In order to avoid overfitting, it is essential to monitor during training not only the training error but also the error on the validation set. Once the validation error stops decreasing while the training error continues to fall, the model may have started to overfit. This problem is also discussed as the bias-variance trade-off in machine learning [70,71]. In this context, the bias is an error based on wrong assumptions in the trained model, while high variance is the error resulting from too much sensitivity to noise in the training data. As such, underfitted models possess high bias while overfitted models have high variance.
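A small experiment can illustrate both failure modes. The sketch below is an assumption-laden toy, not a method from the review: it uses k-nearest-neighbor regression on noisy samples of a sine curve, with the hypothetical helper names `knn_predict` and `knn_mse`. With k = 1 the model memorizes the training set (overfitting, high variance). With k equal to the training set size it predicts the same mean value everywhere (underfitting, high bias).

```python
import math
import random


def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def knn_mse(train, data, k):
    """Mean squared error of the k-NN model on (x, y) pairs in data."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in data) / len(data)


rng = random.Random(1)


def sample(n):
    """Noisy samples of sin(x) on [0, 6]; a synthetic stand-in for real data."""
    xs = [rng.uniform(0.0, 6.0) for _ in range(n)]
    return [(x, math.sin(x) + rng.gauss(0.0, 0.2)) for x in xs]


train_set = sample(40)
val_set = sample(40)

# Monitor the training error and the validation error side by side.
errors = {}
for k in (1, 5, 40):
    errors[k] = (knn_mse(train_set, train_set, k), knn_mse(train_set, val_set, k))
    print(k, errors[k])
```

For k = 1 the training error is zero because every training point is its own nearest neighbor, yet the validation error is not. That gap is the signature of overfitting. For k = 40 both errors are large, which is the signature of underfitting.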