Article
Open access
Published: 21 November 2024

Dielectric tensor prediction for inorganic materials using latent information from preferred potential

Zetian Mao¹,
WenWen Li² &
Jethro Tan²

npj Computational Materials volume 10, Article number: 265 (2024) Cite this article

724 Accesses
4 Altmetric
Metrics details

Subjects

Abstract

Dielectrics are crucial for technologies like flash memory, CPUs, photovoltaics, and capacitors, but public data on these materials are scarce, restricting research and development. Existing machine learning models have focused on predicting scalar polycrystalline dielectric constants, neglecting the directional nature of dielectric tensors essential for material design. This study leverages multi-rank equivariant structural embeddings from a universal neural network potential to enhance predictions of dielectric tensors. We develop an equivariant readout decoder to predict total, electronic, and ionic dielectric tensors while preserving O(3) equivariance, and benchmark its performance against state-of-the-art algorithms. Virtual screening of thermodynamically stable materials from Materials Project for two discovery tasks, high-dielectric and highly anisotropic materials, identifies promising candidates including Cs₂Ti(WO₄)₃ (band gap E_g = 2.93eV, dielectric constant ε = 180.90) and CsZrCuSe₃ (anisotropic ratio α_r = 121.89). The results demonstrate our model’s accuracy in predicting dielectric tensors and its potential for discovering novel dielectric materials.

Machine learning and atomistic origin of high dielectric permittivity in oxides

Article Open access 14 December 2023

High dielectric ternary oxides from crystal structure prediction and high-throughput screening

Article Open access 06 March 2020

Generative learning facilitated discovery of high-entropy ceramic dielectrics for capacitive energy storage

Article Open access 10 June 2024

Introduction

Dielectrics are an essential group of materials identified by their distinct electronic characteristics. They find extensive applications in various fields such as photovoltaics¹, energy storage^2,3, microwave communication^4,5, etc., in contemporary society. When subjected to an external electric field, a dielectric material is polarized by internally generating electric dipole moments. The dielectric constant ε, also known as permittivity, is a measure of such dielectric effect described by the proportionality between the externally applied electric field to the generated field within the material:

$${E}_{i}^{{\rm{int}}}=\sum _{j}{\varepsilon }_{ij}^{-1}{E}_{j}^{{\rm{ext}}}$$

(1)

where E^int and E^ext are the internal and external electric, respectively, and the indices i, j ∈ {1, 2, 3} refer to the directions in the 3D space. The 3 × 3 dielectric tensor has a minimum of 1 and a maximum of 6 independent elements for various types of systems due to the crystal symmetry. It can be represented by a sum of two components: the electronic contribution (ε^∞) and ionic contribution (ε⁰), which are generated by electron cloud distortion and atomic displacement, respectively, i.e., ${\varepsilon }_{ij}={\varepsilon }_{ij}^{\infty }+{\varepsilon }_{ij}^{0}$. Both high and low dielectric constants play an essential role in material design. For instance, high-κ dielectrics enable higher-density charge storage to shrink device sizes, while low-κ dielectrics are important to device packaging by minimizing signal interference between neighboring interconnects.

As of the present, only a limited number of materials in few hundreds have had their dielectric constants measured⁶ because of the requirement of huge experimental efforts. In the past decades, first-principle calculations played a critical role for accelerating the discovery of new materials⁷. Density functional theory (DFT)^8,9 has gained significant popularity due to its commendable balance between computational speed and accuracy for simulating materials in silico. This has, in turn, facilitated the emergence of a wide range of diverse new materials^10,11. Similarly, density functional perturbation theory (DFPT) serves as a simulation tool for establishing the relationship between structure and dielectric properties, and the calculated data in the order of thousands are publicly accessible for material exploration^6,12. However, the DFT-based methods entail significant computational cost, making them less feasible for high-throughput screening, and it is only applicable to small systems with the number of atoms typically less than 1000¹³, since its complexity scales poorly with the number of particles. Alternatively, machine learning is becoming an important research tool, providing an alternative method for quantitative structure property relationship (QSPR) analysis for material science, There are numerous successful works in designing materials for various applications, including fluorescent molecules¹⁴, electrets^15,16, DDR1 kinase prohibitors¹⁷, thermal-electric materials¹⁸, etc.

Several existing works made efforts to make predictions for dielectric property directly from crystal/molecule descriptors by using the machine learning model. Umeda et al. adopted a random forest (RF) model to estimate experimental dielectric constants with an error range as large as 50% in the logarithmic scale on the purpose of developing ferroelectric material¹⁹. Takahashi et al. studied the prediction for static dielectric constants of metal oxides and analyzed the importance of descriptors using RF models. Lin et al. employed a simple Gradient boosting regression (GBR) model to predict the electronic contribution of polycrystalline dielectric constants with RMSE of 0.824 and did analysis on feature importance for ABO₃-type compounds²⁰. Similarly, Kim et al. used a GBR model to predict both the electronic and ionic contribution of ABO₃-type perovskites, achieving RMSE of 0.12 and 0.26 respectively²¹. Although these works proved the strong ability of machine learning models to rapidly estimate dielectric properties on unseen materials and are undoubtedly helpful to guide material designers before conducting wet experiments, there are still several limitations that hinder their full potential for exploration in the dielectric material space. These tasks focus on the average polycrystalline dielectric constants approximated from the dielectric tensor, given by

$$\varepsilon =\mathop{\sum }\limits_{i=1}^{3}{\lambda }_{i}/3$$

(2)

where λ_i is the eigenvalue of ${{\boldsymbol{\varepsilon}}}_{}^{}$. The Matbench dataset²² includes the dielectric task for benchmarking various material property models. The property in this task is the refractive index (η) which is related to the electronic dielectric constant, as given by $\eta =\sqrt{{\varepsilon }^{\infty }}$. Various GNN-based models are evaluated on this task. However, these works ignored the inherent anisotropy in crystal materials. Additionally, the off-diagonal elements in the dielectric tensor may reach large values and sometimes cannot be neglected in practical²³, thus prediction of the dielectric constants as tensors assists human researchers to understand the behavior of materials in the presence of a directed electric field. To distinguish, dielectric constants as scalars and as tensors are denoted with ε and ${{\boldsymbol{\varepsilon }}}_{}^{}$, respectively. ${{\boldsymbol{\varepsilon }}}_{}^{}$ could provide crucial comprehensive measures for the designs of electronic devices in various applications.

In recent years, graph neural networks (GNN) has been becoming a powerful tool for rapid atomic simulation in material science^24,25 as graphs are a natural representation for crystals and molecules where nodes are atoms and edges are bonds between atom pairs. Many great GNN-based models^{26,27,28,29,30,31} are proposed and demonstrated their efficacy in accurately establishing a quantitative structure-property relationship. Recent works proposed the application of equivariant graph neural networks (EGNN) in materials science^31,32 and discussed the necessity of equivariant representations to enhance model expressivity. TeaNet³² is an EGNN architecture that satisfies Euclidean group O(3) equivariance including translation, rotation and reflection. Based on TeaNet with certain modifications, PreFerred Potential (PFP)³³ is a universal neural network potential (NNP) model trained by massive data and has been released for multiple versions. What especially makes PFP robust for universal interatomic interactions is its enhanced representations of the local electronic environments centered around atoms and bonds by converting raw input data (atomic numbers and positions) to rank-0,1,2 tensors (i.e., scalars, vectors and tensors). Pretrained on the original dataset with 22 million diverse structures (calculated by 1144 GPU years), the up-to-date PFP covers 72 elements in the periodic table for empirical universal interatomic potential simulation with strong transferability as the time of this study³⁴. We refer readers to³³ for more details about this original training dataset across all systems including molecular, crystal, slab, cluster, adsorption, and disordered structures. Its abundant intermediate interatomic information in pretrained layers of graph convolutional networks (GCN) and inclusion of equivariant rank-2 tensor features make it potentially suitable to transfer to diverse tensorial regression tasks with relatively limited datasets such as dielectric tensors. During the same period as this study, Lou et al.³⁵ proposed using equivariant graph neural networks to learn irreducible representations for predicting dielectric materials with highly anisotropic electronic dielectric tensors. In contrast, our study leverages transfer learning of pretrained geometric features across properties with varying tensor orders to enable dielectric tensor predictions over a broader numerical range. Moreover, instead of focusing exclusively on electronic components, we evaluate our models on electronic, ionic, and total dielectric tasks, demonstrating state-of-the-art performance in predicting both scalar and tensor properties.

In this study, we propose Dielectric Tensor Neural Network(DTNet) to predict the three types of dielectric tensors, ${\varepsilon }_{ij}^{\infty }$, ${\varepsilon }_{ij}^{0}$ and ε_ij, for inorganic materials across 72 supported elements with the equivairance to the input structure. By leveraging a pretrained PFP model as an efficient and expressive encoder, we demonstrated that the pretrained PFP processes a rich set of high-order latent compositional and structural information for tensorial property prediction. The ablation study shows that both latent atomic features and bond features from PFP contribute to the final prediction. To evaluate the performance of DTNet, we compared it against existing methods, PaiNN³¹, M3GNet³⁶ and MatTen³⁷, on the dataset obtained from Materials Project (MP)³⁸. Finally, we applied our model to virtual screening for two discovery tasks: identifying high-dielectric materials and highly anisotropic dielectrics.

Results

Overview

The challenge of GNN in materials science is that they usually require a large number of data to prevent models from overfitting, while such chemical data are generally expensive to collect either from simulation or experiments. Additionally, the numbers of calculated data for diverse properties can have large difference. For example, by the time of this work, there are more than ~ 154k DFT-relaxed material structures with energies in MP, but only ~ 7k (<5%) of them are available with dielectric constant data. In this work, we took a PFP as the parent encoder model, i.e., N = 5. And for convenience, we denote the PFP with the n(1 ≤ n ≤ 5)-th GCN layer as featurizer as PFP-L₁, …,PFP-L₅. Our method leverages a PFP model consisting of 5 GCN layers, which was pre-trained on 22 million first-principle calculations of potential energies, as a multi-order feature encoder to generate universal compositional and structural information for the second-order dielectric property prediction (see Fig. 1). We design a separate model (see Fig. 2) to capture interactions among the intermediate knowledge from PFP-L_n and make predictions on downstream tasks without the need to finetune parameters in the parent PFP model. This design ensures the effective utilization of high-order common structural and compositional information, leading to improved performance even with a relatively small region of available training data. Details about the module implementation are available in the Methods Section.

**Fig. 1: Diagram of the equivariant model for dielectric tensor prediction.**

Fig. 2: Schematic architecture of the equivariant dielectric model *DTNet.*

Equivariance

Atomic systems are usually represented with coordinates in the 3D Euclidean space. The same system could have transformations under the 3D symmetries: rotations, translations, and inversion. Under orthogonal transformations in 3D space, certain properties of the system remain invariant, e.g., energy, meaning they do not change regardless of the rotation. Equivariant properties, including forces, stresses, and dielectric tensors, are expected to transform accordingly with the system. In other words, as the system rotates or reflects, these properties undergo corresponding transformations to maintain their relationship with the system’s new orientation. Formally, we consider a function ϕ : X → Y and a group transformation G_T that acts on the vectors spaces X and Y. The function ϕ is equivariant to the transformation G_T if the following condition holds for every element x ∈ X : $\phi ({G}_{T}(x))=G^{\prime} \,(\phi (x))$, where G_T(x) represents the group transformation applied to the input x, and $G^{\prime} (\phi(x))$ represents a related transformation applied to the output ϕ(x). Here, we focus on the rotation/inversion equivariance of the second-rank dielectric tensor ${{\boldsymbol{\varepsilon }}}_{}^{}$ with respect to the input vector space ${\mathcal{X}}$ (Fig. 3), whose equivariant condition is $R\Phi ({\mathcal{X}}){R}^{T}=\Phi (R{\mathcal{X}})$ where R is the arbitrary orthogonal matrix and Φ is the neural networks to map ${\mathcal{X}}$ to ${{\boldsymbol{\varepsilon }}}_{}^{}$.

Dielectric tensor prediction covering 72 elements

Data preparation

We leverage on one of the largest open databases of DFT-calculated crystal structure properties, Materials Project³⁸, to obtain a wide range of training structures. The dataset (MP v2023.11.1) contains 7277 relaxed materials along with calculated dielectric constants. We clean up the dataset by restricting the element in total dielectric constants ε_ij ∈ [− 10, 100] for ∀ i, j∈{1,2,3} and filtering out structures that contain elements not supported by PFP. Finally there are 6648 structures retained for model training. The data covers the electronic component ε^∞, ionic component ε⁰, and total component ε with its scalar values in the range of [1.0, 96.688], [0.0, 90.338] and [1.155, 98.889], respectively (Fig. 4a). Figure 4b exhibits that all seven types of crystal systems are included in our dateset. Among them, the orthorhombic crystal system is the most prevalent, comprising 1636 instances. In contrast, the triclinic and hexagonal systems are the least represented, with only 440 and 443 instances, respectively. The version of PFP architecture introduced in this work supports 72 elements in total, covering a wide range of elements across the periodic table. Excluding noble gas elements, He, Ne, Ar, and Kr, 68 elements are covered in the 6648 available structures (Fig. 4c).

**Fig. 4: Distribution of cleaned data from the MP dataset.**

The above dataset was partitioned into training, validation, and test subsets in a 0.8:0.1:0.1 ratio for the experimental analysis. To mitigate the impact of random variation, each experiment was independently performed five times with distinct random data splits. The outcomes are presented as the mean and standard deviation of these five iterations.

Performance of DTNet

We start by investigating the appropriate intermediate GCN layer of PFP as the input to DTNet. Note that neuron parameters related to ${{\boldsymbol{a}}}_{v}^{5}$, ${{\boldsymbol{a}}}_{t}^{5}$ and ${{\boldsymbol{b}}}_{v}^{5}$ in the 5-th GCN layer are not trained in PFP because only ${{\boldsymbol{a}}}_{s}^{5}$ and ${{\boldsymbol{b}}}_{s}^{5}$ are used as readout features. Hence, for the multi-order features generated from the PFP-L_n, we take ${{\boldsymbol{a}}}_{s}^{n}$, ${{\boldsymbol{a}}}_{v}^{n}$, ${{\boldsymbol{a}}}_{t}^{n}$, ${{\boldsymbol{b}}}_{s}^{n}$, and ${{\boldsymbol{b}}}_{v}^{n}$ for PFP-L_n(1 ≤ n ≤ 4), and we take ${{\boldsymbol{a}}}_{s}^{5}$, ${{\boldsymbol{a}}}_{v}^{4}$, ${{\boldsymbol{a}}}_{t}^{4}$, ${{\boldsymbol{b}}}_{s}^{5}$, and ${{\boldsymbol{b}}}_{s}^{4}$ for the PFP-L₅. We have performed hyperparameter optimization for the DTNet architecture via the optuna framework³⁹. The hyperparameter configurations are detailed in Method Section. Finally, 2 equivariant blocks (see Fig. 2b) are stacked and combined with the PFP-L_n. For a batch of B test structures, four mean absolute error (MAE) metrics for the symmetric dielectric tensor, the tensor MAE_ten, mean diagonal MAE_mean-diag, diagonal MAE_diag, and off-diagonal MAE_off-diag, are defined respectively as:

$${{\rm{MAE}}}_{{\rm{ten}}}=\frac{1}{6B}\sum ^{B}\sum _{1\le i\le 3,i\le j\le 3}\left| {\varepsilon }_{ij}^{{\rm{dft}}}-{\varepsilon }_{ij}^{{\rm{pred}}}\right|$$

(3)

$${{\rm{MAE}}}_{{\rm{mean}}-{\rm{diag}}}=\frac{1}{B}\sum ^{B}\left| \frac{1}{3}\sum _{1\le i=j\le 3}{\varepsilon }_{ij}^{{\rm{dft}}}-\frac{1}{3}\sum _{1\le i=j\le 3}{\varepsilon }_{ij}^{{\rm{pred}}}\right|$$

(4)

$${{\rm{MAE}}}_{{\rm{diag}}}=\frac{1}{3B}\sum ^{B}\sum _{1\le i=j\le 3}\left| {\varepsilon }_{ij}^{{\rm{dft}}}-{\varepsilon }_{ij}^{{\rm{pred}}}\right|$$

(5)

$${{\rm{MAE}}}_{{\rm{off}}-{\rm{diag}}}=\frac{1}{3B}\sum ^{B}\sum _{1\le i\le 3,i < j\le 3}\left| {\varepsilon }_{ij}^{{\rm{dft}}}-{\varepsilon }_{ij}^{{\rm{pred}}}\right|$$

(6)

These MAE metrics help assess the accuracy of the model predictions for the dielectric tensor properties. In this work, all models are trained using MAE_ten as the loss function. Due to structural symmetry, the number of independent elements in the dielectric tensor varies according to system types (see Table 1). Figure 5a, b shows that generally the intermediate knowledge from the deeper GCN layer contributes more to the dielectric prediction. When PFP-L₄ and PFP-L₅ are taken as the feature generator for the child model, lower MAE_ten prediction errors are observed on the test structures. This could be explained by that the shallower layers only capture shorter-range local interactions. Meanwhile, quite close accuracy between n = 4 and n = 5 is because only their scalar representations are distinct. Considering the better computational efficiency, PFP-L₄ is taken as the feature generator and is utilized for the subsequent study. The comparison of error metrics between models utilizing transfer learning and those trained from scratch highlights the significance of incorporating compositional and structural encodings from PFP. Specifically, the use of pretrained PFP-L₄ resulted in a reduction of 10.0% for RMSE and 14.1% in MAE compared to a PFP-L₄ model with the same architecture but trained with all parameters initialized from scratch. Figure 5c–e shows cases the performance of DTNet on each crystal system for the prediction task of the electronic, ionic and total dielectric constants. Three models DTNet-ε^∞, DTNet-ε⁰ and DTNet-ε are separately trained with their MAE_ten, MAE_mean-diag, MAE_diag, MAE_off-diag evaluated. Dielectric tensors of cubic, hexagonal, orthorhombic, tetragonal and trigonal systems hold 0 on their off-diagonal elements due to geometric symmetry. The zero off-diagonal errors for them demonstrates the our model keeps equivariance for correct prediction of 0 on all their off-diagonal elements. Additionally, the same MAE_ten and MAE_mean-diag of the cubic system further confirm the output equivariance is guaranteed due to its symmetric structure along all three directions in the Euclidean space. The model performs uniformly for different crystal systems according to all MAE_ten in the range of [0.27, 0.57] for ${{\boldsymbol{\varepsilon }}}_{}^{\infty },\,[1.07,1.93]$ for ${{\boldsymbol{\varepsilon }}}_{}^{0}$ and [1.41, 2.14] for ${{\boldsymbol{\varepsilon }}}_{}^{}$.

Table 1 Number of independent components in the dielectric tensor for different crystal systems

Full size table

**Fig. 5: Performance of *DTNet* models on dielectric data from Materials Project.**

Figure 6a, b illustrates the performance of DTNet by calculating the polycrystalline dielectric constants ε and tensor eigenvalues λ_i from the predicted dielectric tensors, with the eigenvalues being crucial for evaluating anisotropy. The ground trues of ε and λ_i in the test set fall within the range of [2.129, 98.889] and [1.460, 98.889], respectively. DTNet provides predictions that closely match DFT reference values, with MAE of 2.568 for ε and 3.740 for λ_i, within a prediction range of 96.760 and 97.429, respectively. To further validate the accuracy of our method, we evaluated its performance on the dielectric task using the Matbench dataset²², which is a subset of structures from the Materials Project. In Matbench, we assessed our method by calculating refractive index η from the dielectric tensors predicted by DTNet.

**Fig. 6: Performance of *DTNet* on the test set of our refined MP dataset for two distinct tasks.**

Figure 7 compares the mean MAEs with error bars representing standard deviations and the mean RMSEs across 5-fold runs, as defined by the Matbench pipeline. DTNet surpasses all other methods published on the official Matbench leaderboards for the dielectric task, achieving an average MAE of 0.2371 with a standard deviation of 0.0765, and an RMSE of 1.6830 over the 5-fold test data. In comparison, the top-performing model MODNet⁴⁰ reported on Matbanch has an MAE of 0.2711 with a standard deviation of 0.0714 and an RMSE of 1.6832.

**Fig. 7: Benchmarking our approach against other algorithms on the Matbench dataset.**

Model comparison for dielectric tensor prediction

At the time of conducting this work, there were limited open-source works that discussed the prediction of tensorial properties of materials. The models of PaiNN³¹, M3GNet³⁶ and MatTen³⁷ are chosen for comparison in this work. They use different approaches to produce rank-2 equivariant tensors by incorporation of equivariant atomwise representations, partial derivative with respect to the Cartesian tensor, and spherical harmonics expansion, respectively. PaiNN encodes each atom i with an invariant representation ${{\boldsymbol{s}}}_{i}^{}$ and an equivariant vector representation ${{\boldsymbol{v}}}_{i}^{}$. Incorporating ${{\boldsymbol{v}}}_{i}^{}$ has been demonstrated to be effective for structure recognition through ablation studies. The dielectric tensor in PaiNN is constructed using a rank-1 tensor decomposition, and the atom positions ${{\boldsymbol{r}}}_{i}^{}$ are incorporated to hold the global structure information:

$${\boldsymbol{\varepsilon }}=\mathop{\sum }\limits_{i=1}^{{N}^{a}}s({{\boldsymbol{s}}}_{i}){{\boldsymbol{I}}}_{3}+\nu ({{\boldsymbol{v}}}_{i})\otimes {{\boldsymbol{r}}}_{i}+{{\boldsymbol{r}}}_{i}\otimes \nu ({{\boldsymbol{v}}}_{i})$$

(7)

where ${{\boldsymbol{I}}}_{3}^{}$ is an identity matrix with size of three, s and ν are invariant and equivariant nonlinearities, respectively. The invariant M3GNet model is able to generate equivariant 3 × 3 tensor from the partial derivatives of its scalar output using the lattice matrix information. This scalar output is not directly trained as a target variable, and instead we train the model based on the loss computed from the derivative values of this output. Specifically, we calculate the loss of MAE_ten between these derivative results and labeled targets. After training, these derivative values are utilized for dielectric prediction. In MatTen, the atom feature contains a set of high-rank tensors produced by spherical harmonics, namely, a geometric object consisting of a direct sum of irreducible representations of the SO(3) rotation group. Its interaction blocks follow the design of Tensor Field Network⁴¹ and NequIP³⁰. The second-order features of all atoms are ultimately aggregated as the prediction of dielectric tensor. To investigate the necessity of including edge-related representations generated by PFP-L₄, we trained two models, DTNet and DTNet-simple. DTNet leverages all latent embeddings $({{\boldsymbol{a}}}_{s}^{n},{{\boldsymbol{a}}}_{v}^{n},{{\boldsymbol{a}}}_{t}^{n},{{\boldsymbol{b}}}_{s}^{n},{{\boldsymbol{b}}}_{v}^{n})$, while DTNet-simple only utilizes $({{\boldsymbol{a}}}_{s}^{n},{{\boldsymbol{a}}}_{v}^{n},{{\boldsymbol{a}}}_{t}^{n})$ and the layers related to Eq. (14) and Eq. (17) are removed.

Table 2 provides a comparison of the three aforementioned models with DTNet and DTNet-simple. In the prediction task across different systems, the models are trained on the entire training data, while the test structures are categorized according to crystal systems. It is shown that DTNet performs best in 22 among all 24 tasks in comparison with all models. The aggregation of potential edge information showcases significant improvements in accuracy, as evidenced by DTNet beating DTNet-simple in 18 out of the 24 tasks. It should be noted that DTNet-simple still gives much better accuracy than the other 3 models, outperforming 22 tasks. Only two tasks lost to M3GNet when making ionic and total dielectric tensor predictions on triclinic systems. Hence, the latent edge scalar and vector features provide additional useful geometric information which benefits the dielectric prediction.

Table 2 Performance comparison of models across all crystal systems

Full size table

Discovery of dielectric materials

Preparation of candidate structures

The DTNet model has been proven effective predicting dielectric tensors which account for the direction and magnitude of the electric field. To prepare the candidate set for screening, we downloaded 14,375 non-metal materials from the MP database. These materials are specifically selected based on the energy above convex hull E_hull = 0 to estimate their thermodynamic stability, so that only stable materials are included in the candidate set. The candidates have unknown dielectric constant properties, and there is no overlap between this candidate set and our training dataset. We aim to leverage our models for virtual screening on all these candidates to identify potential materials for high-dielectric materials and highly anisotropic materials, incorporating both electronic and ionic contributions. To improve accuracy and assess model uncertainty, we use ensemble results from the five DTNet models trained in the previous section. The expected property μ(χ) is estimated by averaging the predictions, while the uncertainty σ(χ) is quantified by the standard deviations. We conducted iterative active learning to identify top candidates through virtual screening. See details in the Method section.

High-dielectric materials

Dielectrics are crucial functional materials for microelectronic device manufacturing. Materials with high dielectric constants have the potential to enable higher energy density storage, leading to performance enhancement and device miniaturization⁴². Meanwhile, large band gaps (E_g) are also desirable as they prevent leakage currents when materials are exposed to large electric fields, especially at nanoscale thicknesses. Hence, novel dielectric materials with both high dielectric constants ε and large band gaps E_g are highly preferred for such applications. To represent the capacity of electric energy storage, dielectric tensors ${{\boldsymbol{\varepsilon }}}_{}^{}$ can be simplified to scalar quantities by Eq. (2).

It is known that there exists inverse relationship between the electronic part of dielectric constants ε^∞ and band gaps E_g. On the other hand, the ionic contribution ε⁰ does not show a clear correlation with E_g. The theoretical background of this phenomenon is explained in reference⁴³. This trend is also observed in the materials present in the training set as shown in Fig. 8a, b. To showcase the capability of our model in capturing such latent quantum mechanism knowledge underlying chemical structures, we employ the DTNet-${\varepsilon }_{}^{\infty }$ and DTNet-${\varepsilon }_{}^{0}$ models to predict the two dielectric constant components for the 14,375 candidates, respectively. The density of data points are drawn in contours for structures in the training set (red) and candidate set (blue) using the kernel density estimation. Despite the fact that the model does not learn explicitly from band gap data, the distribution of model-predicted electronic dielectric constants of the candidates and their DFPT-computed band gaps from MP indicates the model’s capacity to comprehend this negative relationship between the two properties through materials’ graph representations. This phenomenon can be explained by the interplay between electronic polarization and electronic excitations. Materials with smaller band gaps allow for easier excitation of electrons to higher energy levels, leading to higher electronic polarizability and larger electronic dielectric constants. Conversely, materials with larger band gaps hinder electron excitation, resulting in lower electronic dielectric constants⁴⁴. In contrast to the case of ${\varepsilon }_{}^{\infty }$, no discernible correlation is observed between DTNet-${\varepsilon }_{}^{0}$-predicted ionic part and band gaps. From a microscopic perspective, the ionic dielectric constant is influenced by the displacement of cations and anions from their equilibrium positions under the influence of electric fields. The strength of bonds or phonon frequencies is highly sensitive to variations in bond lengths. Consequently, materials with similar compositions can exhibit a wide range of ε⁰ due to differences in bond lengths⁴⁵. The weak constraint of E_g to ε⁰ along with the broad value spectrum of ε⁰ present the opportunity for the exploration and discovery of novel dielectric materials characterized by both high ε and E_g.

**Fig. 8: ML-predicted and DFT-calculated dielectric constants for the candidate materials.**

We conducted the virtual screening by employing DTNet-ε to predict total dielectric constant scalar ε for all structures in the candidate set. Following the previous work⁴⁵, E_g ⋅ ε is assigned as the figure of merit (FOM) to quantify the performance of dielectrics, since E_g and ε are approximately proportional to the logarithm of the leakage current density⁴⁶. To validate the promising materials identified by DTNet, we conducted a three-round active exploration, with each round proposing 20 top candidates for validation. The details of virtual screening and DFPT configurations are available in the Method section. Finally, totally 35 out of the 60 structures across 3 iterations are successfully converged through first-principle calculations.

Figure 8c illustrates the distribution of the FOM for stable materials in both the training set and screened candidate set. The thermaldynamically stable material with the highest FOM in the training dataset is Sn₂ClF₃ (mp-29252, E_g = 3.56eV, ε^∞ = 3.81, ε⁰ = 72.77, ε = 76.55). We successfully identified 3 new materials, Cs₂Ti(WO₄)₃ (mp-1226157, E_g = 2.83eV, ε^∞ = 5.34, ε⁰ = 175.55, ε = 180.89), RbNbWO₆ (mp-1219587, E_g = 2.93eV, ε^∞ = 5.27, ε⁰ = 150.30, ε = 155.57) and Ba₂SmTaO₆ (mp-1214622, E_g = 3.36eV, ε^∞ = 4.85, ε⁰ = 88.96, ε = 93.81), with structures visualized in Fig. 8d. Their FOM scores are higher than any calculated stable structures in our training database.

Highly anisotropic dielectrics

Crystalline dielectrics can exhibit anisotropic dielectric properties w.r.t. the direction of the applied electric field due to their cell structures and atomic arrangements. Crystals with highly anisotropic dielectric properties can demonstrate unique characteristics, such as enhanced piezoelectricity, ferroelectricity⁴⁷, and optical performance⁴⁸. Lou et al. explored the discovery of crystals with high anisotropy in high-frequency dielectric tensors, focusing solely on the electronic contribution³⁵. In this task, we tackle the more challenging task of discovering highly anisotropic static dielectric tensors, incorporating both electronic and ionic contributions, to achieve significantly greater anisotropy under the static external electric field.

Given the limited number of training samples with high anisotropy, we conducted a 3-round active learning for virtual screening using the same strategy to more effectively identify marginal candidates exhibiting high anisotropic ratio defined by ${\alpha }_{r}={\lambda }_{\max }/{\lambda }_{\min }$³⁵. All molecular crystalline candidates were excluded prior to virtual screening due to their typically high anisotropy and potential instability under the conditions of interest. For example, we evaluated the molecular crystal BrCl (mp-1066781). Despite exhibiting a high total anisotropy of α_r = 26.774, its melting point is as low as −66 °C, making it unstable under room temperature conditions. Consequently, 41 out of the 60 screened structures were successfully validated via DFPT computations. Figure 9a presents the distribution of anisotropic ratios α_r. Notably, only 17 structures in the training dataset exhibit α_r exceeding 10, with the highest of 35.027. In contrast to the average α_r of 1.665 observed in stable crystals within the dataset, the three iterative rounds produced significantly improved average α_r of 5.608, 12.946, and 17.687, respectively. The top three DFPT-validated candidates were identified during the second and third iterations, with their structures visualized in Fig. 9b. Notably, CsZrCuSe₃ and BaNiO₃ are 3D materials, which may offer greater robustness and suitability for industrial applications compared to the 1D material SeI₂. The ionic components significantly contribute to the anisotropy in all three materials. For CsZrCuSe₃, the eigenvalue ranges for the electronic, ionic, and total contributions are [5.152, 12.851], [4.306, 1141.694], and [9.458, 1152.884], respectively, achieving electronic, ionic and total ratios of 2.494, 265.140 and 128.890, respectively. The anisotropic ratios for SeI₂ are 2.745 (electronic), 316.090 (ionic), and 96.763 (total), and for BaNiO₃ they are 1.749 (electronic), 92.904 (ionic), and 61.026 (total). Compared to the largest α_r = 35.027 in the training set (BrNO₃, mp-29526), CsZrCuSe₃ is 93.863 larger and 2.680 times greater. A recent research has also highlighted the excellent optoelectronic and thermoelectric properties of CsZrCuSe₃⁴⁹, underscoring its great potential for practical applications.

Fig. 9: DFT-validated highly anisotropic candidates proposed by *DTNet.*

Discussion

Transfer learning is the most popular approach to tackle the data bottleneck which is common in materials science. Previous studies have utilized transfer learning in GNN models, both for the same properties and across different properties, to demonstrate enhanced accuracy in materials science^15,50,51. For example, Chen et al. developed the AtomSets framework⁵², which employs a straightforward model architecture to process structure-embedded scalar information read from a pretrained MEGNet model²⁶, enabling high accuracy predictions with limited datasets. However, the feasibility of transfer learning for properties across different tensorial orders has been less explored. Our study indicates that more than scalar features, it is also possible to transfer higher-rank equivariant compositional and structural information from a pretrained EGNN energy model to a different tensorial property prediction with a small amount of available data, while maintaining the equivariance. These equivariant representations may hold shared structural or even electronic-level information, since the PFP model aims to emulate iterative electronic relaxation. This is evident through the accurate predictions our model achieved for both electronic and ionic dielectric constants, which are produced by different microscopic mechanisms. Furthermore, the pretrained PFP serves as an encoder that integrates electron density information, assisting in comprehending the electronic-level relationship between the E_g and ε^∞. Apart from dielectric constants, several works attempted to predict tensorial properties of materials such as the polarizability⁵³, quadrupole⁵⁴, by using equivariant architectures. The accuracy of such tasks could potentially be improved by fixing parameters in the shallower layers of an EGNN model pretrained on commonly computed properties like energy.

The PFP model is pretrained on over 22 million structures covering the diverse chemical space. It is efficient and expressive to take a a pretrained PFP as the parent model to generate locally-interacted features as the input to a much simpler child model. In the current readout architecture, it consists of only 0.6 million trainable parameters for the downstream tasks of dielectric constant prediction. The model performing best on the validation accuracy appears at the average epoch of 263 out of 5 training runs for total dielectric constants. By stopping the training after 200 epochs without improvement in validation accuracy, the model training can be efficiently completed within approximately 2 hours using about 6GB memory on an Nvidia Tesla-V100 GPU. We proved that such efficient training on encodings from the parent model can bring a promising property predictor even on a small data region as demonstrated in Table 2. The incorporation of second-order equivariant features in our model may also contribute to high data efficiency as the symmetry of the dielectric tensors is automatically guaranteed due to the model architecture. Based on only 6.6k sample structures with labeled tensor containing up to 6 independent elements from MP, our model achieved a mean tensor prediction error of 0.368 for ${{\boldsymbol{\varepsilon }}}_{}^{\infty }$, 1.677 for ${{\boldsymbol{\varepsilon }}}_{}^{0}$ and 1.911 for ${{\boldsymbol{\varepsilon }}}_{}^{}$.

Thanks to widely-supported elements in the pretrained PFP, our model can also give relatively good prediction on rare earth elements, e.g., the Sm element in Ba₂SmTaO₆. This compound emerged as one of the top high-dielectric candidates identified from a dataset of over 14,000 structures, based directly on its structural graph. The rare earth elements have diverse industrial applications such as glass, lights, magnets, batteries, and catalytic converters and so on⁵⁵, due to their unique physical and chemical properties. Our model has the potential to make predictions about properties concerning rare elements, which typically suffer from limited data availability.

In this study, we demonstrate the applicability of our model to two design challenges, utilizing different metrics derived from the predicted dielectric tensors with high accuracy. Accurate predictions of dielectric constants as tensors can contribute to understanding material behavior in the presence of electric fields and their interactions with other materials. This knowledge can facilitate the development of new materials with tailored properties for specific applications, such as energy storage, signal transmission, electronic component design, and cutting-edge fields such as dark matter detection⁵⁶.

To conclude, an equivariant model is introduced to predict the second-order tensorial property, dielectric constants, for inorganic materials across 72 supported elements. By leveraging transfer learning, the PFP was treated as the frozen base model to encode material graphs with integrated elemental and structural information. A light-weight trainable equivariant readout module is connected to integrate information in different tensorial orders for the final outputs. It was discovered that embedded features from deeper GCN layers in PFP contribute to better accuracy in predicting total dielectric constants. After efficiently training the equivariant readout module, our DTNet model achieved superior results on the dielectric task in Matbench, and also higher accuracy in predicting the electronic, ionic, total dielectric tensors compared with state-of-the-art models, PaiNN, M3GNet and MatTen, across various crystal system types. To assess the model’s capability in discovering novel dielectric materials, we utilized it to identify high-score candidates near the tail of the training distribution for two distinct design targets by conducting virtual screening on total static dielectric constants. For applications requiring both high dielectric constants and large band gaps in microelectronic manufacturing, the model successfully identified the top three dielectric materials, Cs₂Ti(WO₄)₃ (mp-1226157, E_g = 2.83eV, ε = 180.89), RbNbWO₆ (mp-1219587, E_g = 2.93eV, ε = 155.57) and Ba₂SmTaO₆ (mp-1214622, E_g = 3.36eV, ε = 93.81), from a comprehensive pool of over 14,000 candidates, with 35 DFPT calculations performed. For dielectrics with high anisotropy, the top three materials (including two 3D materials) identified by the model exhibit high anisotropic ratios of 128.890 (CsZrCuSe₃, mp-7152), 96.763 (SeI₂, mp-861871) and 61.026 (BaNiO₃, mp-19138), respectively, validated by 41 DFPT calculations. Promising candidates that significantly outperform the training distribution were successfully identified in both tasks.

Our work highlights that pretrained equivariant graph neural networks, which capture higher-order tensor representations, can incorporate common compositional and structural information, leading to improved accuracy in predicting properties of different orders. This approach also represents a viable alternative for enhancing prediction accuracy for other higher-order properties such as polarizability, multi-poles, and elasticity, thereby accelerating new material discoveries for specific applications.

Methods

Data source

The Materials Project (v2023.11.1) contains a dataset of 7277 dielectric tensors. The dielectric properties are calculated using the Vienna Ab-Initio Simulation Package (VASP version 5.3.4), employing the generalized gradient approximation GGA/PBE exchange-correlation functional⁵⁷ with the +U correction^58,59 to account for electron-electron interactions within the transition metal orbitals. Projector Augmented Wave pseudopotentials^60,61 were also included. In the majority of cases, the dielectric constants obtained from these calculations exhibit a relative deviation of less than +/− 25% when compared to experimental values at room temperature. For each structure, it contains the value of the two components, namely, the electronic component and the ionic component. The total dielectric tensor can be calculated by summing up the two components. We clean up the dataset by filtering out dielectric constants that contain any element out of the range of [−10, 100]. Structures that contain elements not supported by PFP (shown in Fig. 4c) are cleaned up, and finally there are 6,648 structures retained for model training.

The dataset is randomly split into the training, validation and test data in the ratio of 80%, 10% and 10%, respectively. For each experiment conducted in the present study, we do 5 different random splits first and we report their average performance as well as the deviation of the 5 runs to reduce the impact of random.

Model Architectures

Overview

The input materials are preprocessed to identify appropriate neighbor atoms under the periodic boundary conditions for graph construction. Each material is represented by a graph ${\mathcal{G}}=({\mathcal{V}},{\mathcal{E}})$. The initialized atom attributes contain three types of information with different ranks: ${\mathcal{V}}={\{{({{\boldsymbol{a}}}_{s}^{0},{{\boldsymbol{a}}}_{v}^{0},{{\boldsymbol{a}}}_{t}^{0})}_{i}\}}_{i = 1:{N}^{a}}$, where ${{\boldsymbol{a}}}_{s}^{0}\in {{\mathbb{R}}}^{{C}_{s}^{a}}$, ${{\boldsymbol{a}}}_{v}^{0}\in {{\mathbb{R}}}^{{C}_{v}^{a}\times 3}$, ${{\boldsymbol{a}}}_{t}^{0}\in {{\mathbb{R}}}^{{C}_{t}^{a}\times 3\times 3}$, and N^a is the number of atoms. ${{\boldsymbol{a}}}_{s}^{0}$ is initialized by looking up a predefined table, in which each element is mapped to a high-dimension encoding with summation equal to the atomic number divided by 2. Both atom vector ${{\boldsymbol{a}}}_{v}^{0}$ and atom tensor ${{\boldsymbol{a}}}_{t}^{0}$ are initialized to be zeros. Two types of bond attributes are prepared: ${\mathcal{E}}={\{{({{\boldsymbol{b}}}_{s}^{0},{{\boldsymbol{b}}}_{v}^{0})}_{k}\}}_{k = 1:{N}^{b}}$, in which ${{\boldsymbol{b}}}_{s}^{0}\in {{\mathbb{R}}}^{{C}_{s}^{b}}$, ${{\boldsymbol{b}}}_{v}^{0}\in {{\mathbb{R}}}^{{C}_{v}^{b}\times 3}$, N^b is the number of bonds between atom pairs within the cutoff radius R_c, and ${C}_{s}^{a}$, ${C}_{v}^{a}$, ${C}_{t}^{a}$, ${C}_{s}^{b}$, ${C}_{v}^{b}$ are the corresponding number of channels for each feature. Both ${{\boldsymbol{b}}}_{v}^{0}$ and ${{\boldsymbol{b}}}_{s}^{0}$ are filled with zeros initially in this work.

TeaNet³² is used as the base GNN architecture in PFP^33,34. The TeaNet architecture incorporates a second-order Euclidean tensor for high-order geometry interaction and performs equivariant graph convolutions using its information. The original material³² provides a step-by-step explanation for TeaNet implementation for more details. PFP has several modifications on TeaNet, such as introducing the Morse-style two-body potential term and so on, see the corresponding material³³ for details. PFP holds the scalar features ${{\boldsymbol{a}}}_{s}^{},\,{{\boldsymbol{b}}}_{s}^{}$ invariant and the vector/tensor features ${{\boldsymbol{a}}}_{v}^{},\,{{\boldsymbol{a}}}_{t}^{},\,{{\boldsymbol{b}}}_{v}^{}$ equivariant to rotations/inversions. PFP calculates local interactions in each of its GCN layers. 5-layer GCN layers are constructed with the cutoff radius R_c as 3, 3, 4, 6 and 6Å, respectively. In other words, latent embeddings in the earlier convolutional layers hold less distant information. As a result, the maximum distance of atomic interactions counted in PFP is 22Å after the 5-layer information propagation. The intermediate atom/bond features ${{\boldsymbol{a}}}_{s}^{n}$, ${{\boldsymbol{a}}}_{v}^{n}$, ${{\boldsymbol{a}}}_{t}^{n}$, ${{\boldsymbol{b}}}_{s}^{n}$, and ${{\boldsymbol{b}}}_{v}^{n}$ extracted from the n-th GCN layer are used as the input to the designed DTNet which contains M equivariant blocks and a readout neural network.

Neural network definition

We denote the one-layer linear transformation without the bias term as

$${{\mathcal{L}}}^{k}:x\mapsto {{\boldsymbol{W}}}_{{\boldsymbol{k}}}x$$

(8)

and one layer of the perceptron model as

$${{\mathcal{L}}}_{g}^{k}:x\,\mapsto \,g({{\boldsymbol{W}}}_{{\boldsymbol{k}}}x+{{\boldsymbol{b}}}_{{\boldsymbol{k}}})$$

(9)

where ${{\boldsymbol{W}}}_{{\boldsymbol{k}}}^{}$ and ${{\boldsymbol{b}}}_{{\boldsymbol{k}}}^{}$ are learnable parameters, g(x) is the activation function which can be substituted by s(x) as the SiLU activation function and σ(x) as the sigmoid activation function. The SiLU activation function s(x)^62,63,64 is defined by the sigmoid function σ(x) multiplied by its input x

$$s(x)=x* \sigma (x)=\frac{x}{1+{e}^{-x}}$$

(10)

Thus the K-layer MLP with s(x) in the intermediate layers and without the activation function in the final layer can be expressed as

$${\xi }^{K}(x)=\left({{\mathcal{L}}}^{K}\,{\circ}\,\,{{\mathcal{L}}}_{s}^{K-1}\,{\circ}\, \ldots {{\mathcal{L}}}_{s}^{1}\right)(x)$$

(11)

And the K-layer gated MLP⁶⁵ is represented by

$${\phi }^{K}(x)=\left(\left({{\mathcal{L}}}^{K}\,{\circ}\,\, {{\mathcal{c}}}_{s}^{K-1}\,{\circ}\, \ldots {{\mathcal{c}}}_{s}^{1}\right)(x)\right)\odot \left(\left({{\mathcal{c}}}_{\sigma }^{K}\,{\circ}\,\, {{\mathcal{c}}}_{s}^{K-1}\,{\circ}\, \ldots {{\mathcal{L}}}_{s}^{1}\right)(x)\right)$$

(12)

where ⊙ denotes the element-wise product. It comprises two networks, i.e., a normal MLP defined as defined in Eq. (11) and a gate network with the final layer activated by a sigmoid function.

Equivariant block

The equivariant block takes ${{\boldsymbol{a}}}_{s}^{(n,m)}$, ${{\boldsymbol{a}}}_{v}^{(n,m)}$, ${{\boldsymbol{a}}}_{t}^{(n,m)}$, ${{\boldsymbol{b}}}_{s}^{n}$, and ${{\boldsymbol{b}}}_{v}^{n}$ as inputs, where n denotes the GCN layer number in PFP, and m denotes the layer number of the equivariant block. The perceptron models are applied to the atom scalar features ${{\boldsymbol{a}}}_{s}^{(n,m)}$ and edge scalar features ${{\boldsymbol{b}}}_{s}^{n}$, respectively,

$${{\boldsymbol{a}}}_{s1}^{(n,m)}={\xi }_{{a}_{s}1}^{K}\left({{\boldsymbol{a}}}_{s}^{(n,m)}\right)$$

(13)

$${{\boldsymbol{b}}}_{s1}^{n}={\xi }_{{b}_{s}1}^{K}\left({{\boldsymbol{b}}}_{s}^{n}\right)$$

(14)

To keep the equivariance of vector features and tensor features, the linear transformations without the bias term are applied to the atom vector features ${{\boldsymbol{a}}}_{v}^{(n,m)}$, atom tensor features ${{\boldsymbol{a}}}_{t}^{(n,m)}$, and bond vector features ${{\boldsymbol{b}}}_{v}^{n}$.

$${{\boldsymbol{a}}}_{v1}^{(n,m)}={{\mathcal{L}}}_{{a}_{v}1}\left({{\boldsymbol{a}}}_{v}^{(n,m)}\right)$$

(15)

$${{\boldsymbol{a}}}_{t1}^{(n,m)}={{\mathcal{L}}}_{{a}_{t}1}\left({{\boldsymbol{a}}}_{t}^{(n,m)}\right)$$

(16)

$${{\boldsymbol{b}}}_{v1}^{n}={{\mathcal{L}}}_{{b}_{v}1}\left({{\boldsymbol{b}}}_{v}^{n}\right)$$

(17)

The neighbor set of atom i is denoted as ${{\mathcal{N}}}_{i}$. Let ${{\boldsymbol{a}}}_{s1,i}^{(n,m)}$ denote the scalar feature of atom i, and ${{\boldsymbol{b}}}_{s1,\{i,j\}}^{n}$ denote the scalar feature of the bond connecting atom i and atom j. Other vector and tensor features are represented in a similar manner. The bond scalar features ${{\boldsymbol{b}}}_{s1,\{i,j\}}^{n}$ and vector features ${{\boldsymbol{b}}}_{v1,\{i,j\}}^{n}$ are aggregated to atom scalar features ${{\boldsymbol{a}}}_{s1,i}^{(n,m)}$ and vector features ${{\boldsymbol{a}}}_{v1,i}^{(n,m)}$ accordingly, thus for each atom i

$${{\boldsymbol{a}}}_{s2,i}^{(n,m)}={{\boldsymbol{a}}}_{s1,i}^{(n,m)}+\sum _{j\in {{\mathcal{N}}}_{i}}{{\boldsymbol{b}}}_{s1,\{i,j\}}^{n}$$

(18)

$${{\boldsymbol{a}}}_{v2,i}^{(n,m)}={{\boldsymbol{a}}}_{v1,i}^{(n,m)}+\sum _{j\in {{\mathcal{N}}}_{i}}{{\boldsymbol{b}}}_{v1,\{i,j\}}^{n}$$

(19)

Then the Frobenius norm ∥ ⋅ ∥ of node vector features and tensor features are combined with the node scalar features and fed into another perceptron model.

$${{\boldsymbol{a}}}_{s3}^{(n,m)}={\xi }_{{a}_{s}3}^{K}\left({\rm{concat}}\left({{\boldsymbol{a}}}_{s2}^{(n,m)}+\parallel {{\boldsymbol{a}}}_{v2}^{(n,m)}\parallel +\parallel {{\boldsymbol{a}}}_{t2}^{(n,m)}\parallel \right)\right.$$

(20)

Inspired by the gate nonlinearity design⁶⁶, the interaction output of different-rank node features are split, and the vector and tensor components are multiplied by the sigmoid activation function to form a gate equivariant nonlinearity for the transformed input of the vector feature and tensor feature, respectively.

$${{\boldsymbol{a}}}_{s4}^{(n,m+1)},{{\boldsymbol{a}}}_{v3}^{(n,m)},{{\boldsymbol{a}}}_{v3}^{(n,m)}={\rm{split}}\left({{\boldsymbol{a}}}_{s3}^{(n,m)}\right)$$

(21)

$${{\boldsymbol{a}}}_{v4}^{(n,m+1)}=\sigma \left({{\boldsymbol{a}}}_{v3}^{(n,m)}\right)\odot {{\mathcal{L}}}_{{a}_{v}2}\left({{\boldsymbol{a}}}_{v}^{(n,m)}\right)$$

(22)

$${{\boldsymbol{a}}}_{t4}^{(n,m+1)}=\sigma \left({{\boldsymbol{a}}}_{t3}^{(n,m)}\right)\odot {{\mathcal{L}}}_{{a}_{t}2}\left({{\boldsymbol{a}}}_{t}^{(n,m)}\right)$$

(23)

Finally, the updated ${{\boldsymbol{a}}}_{s4}^{(n,m+1)}$, ${{\boldsymbol{a}}}_{v4}^{(n,m+1)}$ and ${{\boldsymbol{a}}}_{t4}^{(n,m+1)}$ are used as the input for the next equivariant readout block.

Readout

The final layer output, i.e., layer M, of the equivariant blocks ${{\boldsymbol{a}}}_{s}^{(n,M)}$, ${{\boldsymbol{a}}}_{v}^{(n,M)}$ and ${{\boldsymbol{a}}}_{t}^{(n,M)}$ are aggregated for the tensorial property prediction. Since the dielectric constant is an intensive property, we take the average aggregation of the atom features in the graph representation of materials. The gated MLP applied to the scalar features ${{\boldsymbol{a}}}_{s}^{(n,M)}$ is found to help improving the accuracy compared with the normal MLP. Therefore, the atom scalar features is processed by

$${\boldsymbol{a}}^{\prime}_{s} ={\phi }^{K}\left({{\boldsymbol{a}}}_{s}^{(n,\,M)}\right)$$

(24)

The channels of atom vector and tensor features are compressed to 1 by

$${\boldsymbol{a}}^{\prime}_{\nu} ={{\mathcal{L}}}_{{a}_{v}3}\left({{\boldsymbol{a}}}_{v}^{(n,M)}\right)$$

(25)

$${\boldsymbol{a}}^{\prime}_{t} ={{\mathcal{L}}}_{{a}_{t}3}\left({{\boldsymbol{a}}}_{t}^{(n,\,M)}\right)$$

(26)

Finally, we construct the dielectric tensors using

$${\boldsymbol{\varepsilon }}=\frac{1}{{N}^{a}}\sum ^{{N}^{a}}\left({\boldsymbol{a}}^{\prime}_{s} {{\boldsymbol{I}}}_{3}+{\boldsymbol{a}}^{\prime}_{\nu} \otimes {\boldsymbol{a}}^{\prime}_{\nu} +{\boldsymbol{a}}^{\prime}_{t} \right)$$

(27)

where ${{\boldsymbol{I}}}_{3}^{}$ is the 3 × 3 identity tensor, and ⊗ represents the outer product operation. The first models isotropic, and the final two terms add the anisotropic components of the dielectric tensor. Therefore, the final output of our model is the 3 × 3 dielectric tensor following the equivariance of the materials input.

Hyperparameters

The DTNet model contains 2 equivariant blocks (Fig. 3a). All the latent dimensions in MLP in Fig. 3b, c are set to 128 with a dropout rate of 0.2. The batch size is set to 64 during training. The AdamW optimizer⁶⁷ is utilized to optimize parameters with β₁ = 0.9, β₂ = 0.999, λ = 0.01. The learning rate starts at 1 × 10⁻⁴ and a cosine annealing schedule is applied for tighter convergence. The gradient norm is clipped at 2.0 to stabilize the training. Early stopping is applied during training if no improvement on the validation accuracy is observed after 200 epochs.

Virtual screening

We performed virtual screening based on iterative active learning to search for candidates for the two problems. The acquisition function uses an upper confidence bound (UCB) to balance exploration and exploitation during multi-round screening, defined by

$$a(\chi ;\kappa )=\mu (\chi )+\kappa \cdot \sigma (\chi )$$

(28)

where χ is the input crystal structure, κ is a hyperparameter controlling the trade-off and here κ = 1.5 is selected for both tasks. μ(χ) and σ(χ) represent the mean and standard deviation of polycrystalline dielectric constants and anisotropy ratios in the two respective tasks. In each task, we select the top 20 highest a(x; κ = 1.5) candidates with unit cell sizes of up to 20 atoms for validation by DFPT in each round in consideration of computational budget constraints. After each round, the dataset was updated with the newly validated results, and the next iteration was performed. In this case, finally, there were 60 new structures validated by DFPT computations across 3-round active learning for each discovery task.

DFPT configuration

The density functional perturbation theory (DFPT) validation of the screened results was performed using the Vienna Ab initio Simulation Package (VASP)⁶⁸. For the calculations, the generalized-gradient approximation (GGA) with the Perdew-Burke-Ernzerhof (PBE) functional was employed, augmented by Hubbard U corrections with the same settings on the MP public documentation⁶⁹. The projector augmented wave (PAW) method⁶¹ was used to represent the core and valence electrons. The plane-wave basis set was truncated at an energy cutoff of 600 eV. The k-point density of 3000 k-points per reciprocal atom was used. A Gaussian smearing with a width of 0.05 eV was applied to account for partial occupancies in the electronic structure. A tight global electronic relaxation convergence criterion is applied with an energy difference tolerance of 1 × 10⁻⁷ eV. The electronic minimization utilizes a fairly robust mixture of the blocked-Davidson and RMM-DIIS algorithms to achieve efficient convergence. The precision setting was configured to “Accurate" to enhance the overall precision of the calculations.

Data availability

The materials with dielectric constants are freely accessible using The Materials Project API (https://next-gen.materialsproject.org/api). Our preprocessed data from MP used in this study and our validated DFPT results are provided at https://github.com/pfnet-research/dielectric-pred.

Code availability

The implementation code of DTNet is available at https://github.com/pfnet-research/dielectric-pred. The pretrained model PFP is provided in the proprietary software named Matlantis (https://matlantis.com/). The code and trained parameters are not open-source. The related information of latest-version PFP can be found at https://matlantis.com/news/pfp-v7-release.

References

Brebels, J., Manca, J. V., Lutsen, L., Vanderzande, D. & Maes, W. High dielectric constant conjugated materials for organic photovoltaics. J. Mater. Chem. A 5, 24037–24050 (2017).
Article CAS Google Scholar
Wu, X., Chen, X., Zhang, Q. M. & Tan, D. Q. Advanced dielectric polymers for energy storage. Energy Storage Mater. 44, 29–47 (2022).
Article Google Scholar
Zha, J-W., Zheng, M-S., Fan, B-H. & Dang, Z-M. Polymer-based dielectrics with high permittivity for electric energy storage: A review. Nano Energy 89, 106438 (2021).
Article CAS Google Scholar
Hill, M. D., Cruickshank, D. B. & MacFarlane, I. A. Perspective on ceramic materials for 5g wireless communication systems. Appl. Phys. Lett. 118, 120501 (2021).
Shehbaz, M., Du, C., Zhou, D., Xia, S. & Xu, Z. Recent progress in dielectric resonator antenna: Materials, designs, fabrications, and their performance. Appl. Phys. Lett. 10, 021303 (2023).
Petousis, I. et al. High-throughput screening of inorganic compounds for the discovery of novel dielectric and optical materials. Sci. Data 4, 1–12 (2017).
Article Google Scholar
Hautier, G., Jain, A. & Ong, S.P. From the computer to the laboratory: materials discovery and design using first-principles calculations. J. Mater. Sci. 47, 7317–7340 (2012).
Article CAS Google Scholar
Hohenberg, P. & Kohn, W. Inhomogeneous electron gas. Phys. Rev. 136, B864 (1964).
Article Google Scholar
Kohn, W. & Sham, Lu. Jeu Self-consistent equations including exchange and correlation effects. Phys. Rev. 140, A1133 (1965).
Article Google Scholar
Choudhary, K. et al. High-throughput density functional perturbation theory and machine learning predictions of infrared, piezoelectric, and dielectric responses. npj Comput. Mater. 6, 64 (2020).
Article Google Scholar
Greeley, J., Jaramillo, T. F., Bonde, J., Chorkendorff, I. B. & Nørskov, J. K. Computational high-throughput screening of electrocatalytic materials for hydrogen evolution. Nat. Mater. 5, 909–913 (2006).
Article CAS PubMed Google Scholar
Petousis, I. et al. Benchmarking density functional perturbation theory to enable high-throughput screening of materials for dielectric constant and refractive index. Phys. Rev. B 93, 115151 (2016).
Article Google Scholar
Chen, C. et al. A critical review of machine learning of energy materials. Adv. Energy Mater. 10, 1903242 (2020).
Article CAS Google Scholar
Sumita, M. et al. De novo creation of a naked eye–detectable fluorescent molecule based on quantum chemical computation and machine learning. Sci. Adv. 8, eabj3906 (2022).
Article CAS PubMed PubMed Central Google Scholar
Mao, Z., Chen, C., Zhang, Y., Suzuki, K. & Suzuki, Y. Ai-driven discovery of amorphous fluorinated polymer electret with improved charge stability for energy harvesting. Adv. Mater. 36, 18, (2024).
Zhang, Y. et al. Discovery of polymer electret material via de novo molecule generation and functional group enrichment analysis. Appl. Phys. Lett. 118, 223904 (2021).
Zhavoronkov, A. et al. Deep learning enables rapid identification of potent ddr1 kinase inhibitors. Nat. Biotechnol. 37, 1038–1040 (2019).
Article CAS PubMed Google Scholar
Oliynyk, A. O. & Mar, A. Discovery of intermetallic compounds from traditional to machine-learning approaches. Acc. Chem. Res. 51, 59–68 (2018).
Article CAS PubMed Google Scholar
Umeda, Y., Hayashi, H., Moriwake, H. & Tanaka, I. Prediction of dielectric constants using a combination of first principles calculations and machine learning. Jpn. J. Appl. Phys. 58, SLLC01 (2019).
Article CAS Google Scholar
Lin, X., Li, C., Hao, H., Zhao, G. & Liu, H. Accelerated search for abo3-type the electronic contribution of polycrystalline dielectric constants by machine learning. Comput. Mater. Sci. 193, 110404 (2021).
Article CAS Google Scholar
Kim, E., Kim, J. & Min, K. Prediction of dielectric constants of abo 3-type perovskites using machine learning and first-principles calculations. Phys. Chem. Chem. Phys. 24, 7050–7059 (2022).
Article CAS PubMed Google Scholar
Dunn, A., Wang, Q., Ganose, A., Dopp, D. & Jain, A. Benchmarking materials property prediction methods: the matbench test set and automatminer reference algorithm. npj Comput. Mater. 6, 138 (2020).
Article Google Scholar
Sturm, C., Furthmüller, Jürgen, Bechstedt, F., Schmidt-Grund, Rüdiger, & Grundmann, M. Dielectric tensor of monoclinic ga2o3 single crystals in the spectral range 0.5–8.5 ev. APL Mater., 3, 106106 (2015).
Battaglia, P. W. et al. Relational inductive biases, deep learning, and graph networks. arXiv preprint arXiv:1806.01261 (2018).
Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O., & Dahl, G. E. Neural message passing for quantum chemistry. In International Conference on Machine Learning, 1263–1272 (2017).
Chen, C., Ye, W., Zuo, Y., Zheng, C. & Ong, ShyuePing Graph networks as a universal machine learning framework for molecules and crystals. Chem. Mater. 31, 3564–3572 (2019).
Article CAS Google Scholar
Schütt, K. T., Sauceda, H. E., Kindermans, P-J, Tkatchenko, A., & Müller, K-R. Schnet–a deep learning architecture for molecules and materials. J. Chem. Phys. 148, 241722 (2018).
Gasteiger, J., Groß, J. & Günnemann, S. Directional message passing for molecular graphs. In International Conference on Learning Representations (2020) https://openreview.net/forum?id=B1eWbxStPH.
Liao, Y.-L. & Smidt, T. Eqiformer: Equivariant graph attention transformer for 3d atomistic graphs. In The Eleventh International Conference on Learning Representations (2022) https://openreview.net/forum?id=KwmPfARgOTD.
Batzner, S. et al. E (3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nat. Commun. 13, 2453 (2022).
Article CAS PubMed PubMed Central Google Scholar
Schütt, K., Unke, O., & Gastegger, M. Equivariant message passing for the prediction of tensorial properties and molecular spectra. In International Conference on Machine Learning 9377–9388 (2021).
Takamoto, S., Izumi, S. & Li, J. Teanet: Universal neural network interatomic potential inspired by iterative electronic relaxations. Comput. Mater. Sci. 207, 111280 (2022).
Article CAS Google Scholar
Takamoto, S. et al. Towards universal neural network potential for material discovery applicable to arbitrary combination of 45 elements. Nat. Commun. 13, 2991 (2022).
Article CAS PubMed PubMed Central Google Scholar
Takamoto, S., Okanohara, D., Li, Qing-Jie & Li, J. Towards universal neural network interatomic potential. J. Materiomics 9, 447–454 (2023).
Article Google Scholar
Lou, Y. & Ganose, A. M. Discovery of highly anisotropic dielectric crystals with equivariant graph neural networks. Faraday Discuss. https://doi.org/10.1039/d4fd00096j (2024).
Chen, C. & Ong, ShyuePing A universal graph deep learning interatomic potential for the periodic table. Nat. Computational Sci. 2, 718–728 (2022).
Article Google Scholar
Wen, M., Horton, M. K., Munro, J. M., Huck, P. & Persson, K. A. A universal equivariant graph neural network for the elasticity tensors of any crystal system. Digit. Discov. 3, 869–882 (2024).
Article Google Scholar
Jain, A. et al. Commentary: The materials project: A materials genome approach to accelerating materials innovation. APL Mater. 1(1), 011002 (2013).
Akiba, T., Sano, S., Yanase, T., Ohta, T., & Koyama, M. Optuna: A next-generation hyperparameter optimization framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2623–2631 (2019).
De Breuck, Pierre-Paul, Hautier, G. & Rignanese, Gian-Marco Materials property prediction for limited datasets enabled by feature selection and joint learning with modnet. npj computational Mater. 7, 83 (2021).
Article Google Scholar
Thomas, N. et al. Tensor field networks: Rotation-and translation-equivariant neural networks for 3d point clouds. arXiv preprint arXiv:1802.08219, 2018.
Qu, J., Zagaceta, D., Zhang, W. & Zhu, Q. High dielectric ternary oxides from crystal structure prediction and high-throughput screening. Sci. Data 7, 81 (2020).
Article PubMed PubMed Central Google Scholar
Takahashi, A., Kumagai, Y., Miyamoto, J., Mochizuki, Y. & Oba, F. Machine learning models for predicting the dielectric constants of oxides based on high-throughput first-principles calculations. Phys. Rev. Mater. 4, 103801 (2020).
Article CAS Google Scholar
Pilania, G., Wang, C., Jiang, X., Rajasekaran, S. & Ramprasad, R. Accelerating materials property predictions using machine learning. Sci. Rep. 3, 2810 (2013).
Article PubMed PubMed Central Google Scholar
Lee, M., Youn, Y., Yim, K. & Han, S. High-throughput ab initio calculations on dielectric constant and band gap of non-oxide dielectrics. Sci. Rep. 8, 14794 (2018).
Article PubMed PubMed Central Google Scholar
Yim, K. et al. Novel high-κ dielectrics for next-generation electronic devices screened by automated ab initio calculations. NPG Asia Mater. 7, e190–e190 (2015).
Article CAS Google Scholar
Newnham, R. E. Properties of Materials: Anisotropy, Symmetry, Structure (Oxford University Press, 2005).
Tudi, A., Han, S., Yang, Z. & Pan, S. Potential optical functional crystals with large birefringence: Recent advances and future prospects. Coord. Chem. Rev. 459, 214380 (2022).
Article CAS Google Scholar
Goumri-Said, S. et al. Unraveling essential optoelectronic and thermoelectric excellence in cszrcuse3 with hybrid functional and boltzmann transport insights. Results Phys. 57, 107395 (2024).
Article Google Scholar
Jha, D. et al. Enhancing materials property prediction by leveraging computational and experimental data using deep transfer learning. Nat. Commun. 10, 5316 (2019).
Article CAS PubMed PubMed Central Google Scholar
Hutchinson, M. L. et al. Overcoming data scarcity with transfer learning. arXiv preprint arXiv:1711.05099, 2017.
Chen, C. & Ong, ShyuePing Atomsets as a hierarchical transfer learning framework for small and large materials datasets. npj Comput. Mater. 7, 173 (2021).
Article Google Scholar
Wilkins, D. M. et al. Accurate molecular polarizabilities with coupled cluster theory and machine learning. Proc. Natl Acad. Sci. 116, 3401–3406 (2019).
Article CAS PubMed PubMed Central Google Scholar
Thurlemann, M., Böselt, L. & Riniker, S. Learning atomic multipoles: prediction of the electrostatic potential with equivariant graph neural networks. J. Chem. Theory Comput. 18, 1701–1710 (2022).
Article CAS PubMed Google Scholar
Balaram, V. Rare earth elements: A review of applications, occurrence, exploration, analysis, recycling, and environmental impact. Geosci. Front. 10, 1285–1303 (2019).
Article CAS Google Scholar
Coskuner, A., Mitridate, A., Olivares, A. & Zurek, K. M. Directional dark matter detection in anisotropic dirac materials. Phys. Rev. D. 103, 016006 (2021).
Article CAS Google Scholar
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865 (1996).
Article CAS PubMed Google Scholar
Dudarev, S. L., Botton, G. A., Savrasov, S. Y., Humphreys, C. J. & Sutton, A. P. Electron-energy-loss spectra and the structural stability of nickel oxide: An lsda+ u study. Phys. Rev. B 57, 1505 (1998).
Article CAS Google Scholar
Jain, A. et al. A high-throughput infrastructure for density functional theory calculations. Comput. Mater. Sci. 50, 2295–2310 (2011).
Article CAS Google Scholar
Blöchl, P. E. Projector augmented-wave method. Phys. Rev. B 50, 17953 (1994).
Article Google Scholar
Kresse, G. & Joubert, D. From ultrasoft pseudopotentials to the projector augmented-wave method. Phys. Rev. b 59, 1758 (1999).
Article CAS Google Scholar
Hendrycks, D. & Gimpel, K. Gaussian error linear units (gelus). arXiv preprint arXiv:1606.08415, 2016.
Elfwing, S., Uchibe, E. & Doya, K. Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. Neural Netw. 107, 3–11 (2018).
Article PubMed Google Scholar
Ramachandran, P., Zoph, B. & Le, Q. V. Swish: a self-gated activation function. arXiv Prepr. arXiv:1710. 05941 7, 5 (2017).
Google Scholar
Liu, H., Dai, Z., So, D. & Le, Q. V. Pay attention to mlps. Adv. Neural Inf. Process. Syst. 34, 9204–9215 (2021).
Google Scholar
Weiler, M., Geiger, M., Welling, M., Boomsma, W., & Cohen, T. S. 3D steerable cnns: Learning rotationally equivariant features in volumetric data. Adv. Neural Inf. Process. Syst. 31, (2018).
Loshchilov, I. & Hutter, F. Decoupled weight decay regularization. In International Conference on Learning Representations (2019) https://openreview.net/forum?id=Bkg6RiCqY7.
Kresse, G. & Furthmüller, J. ürgen Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B 54, 11169 (1996).
Article CAS Google Scholar
Materials Project. MP Public Docs, Hubbard U Values. https://docs.materialsproject.org/methodology/materials-methodology/calculation-details/gga+u-calculations/hubbard-u-values (2023).

Download references

Acknowledgements

The authors thank So Takamoto for fruitful discussions and helpful comments. The authors thank Chikashi Shinagawa for assistance in DFPT computations using VASP. The research is supported by Preferred Networks Inc.

Author information

Authors and Affiliations

Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, 277-8561, Chiba, Japan
Zetian Mao
Preferred Networks Inc., 1-6-1 Otemachi, Chiyoda-ku, 100-0004, Tokyo, Japan
WenWen Li & Jethro Tan

Authors

Zetian Mao
View author publications
You can also search for this author in PubMed Google Scholar
WenWen Li
View author publications
You can also search for this author in PubMed Google Scholar
Jethro Tan
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

Z. M. implemented the codes and conducted experiments. W. L, J. T. and Z. M. conceived the idea. W. L. and J. T. provided technical support of PFP, and supervised the research. All authors wrote and confirmed the manuscript.

Corresponding author

Correspondence to WenWen Li.

Ethics declarations

Competing interests

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Zetian Mao was an intern/part-time employee at Preferred Networks, Inc. during this study. Jethro Tan and Wenwen Li are full-time employees working at Preferred Networks, Inc.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Mao, Z., Li, W. & Tan, J. Dielectric tensor prediction for inorganic materials using latent information from preferred potential. npj Comput Mater 10, 265 (2024). https://doi.org/10.1038/s41524-024-01450-z

Download citation

Received: 17 May 2024
Accepted: 21 October 2024
Published: 21 November 2024
DOI: https://doi.org/10.1038/s41524-024-01450-z

Subjects

Abstract

Similar content being viewed by others

Machine learning and atomistic origin of high dielectric permittivity in oxides

High dielectric ternary oxides from crystal structure prediction and high-throughput screening

Generative learning facilitated discovery of high-entropy ceramic dielectrics for capacitive energy storage

Introduction

Results

Overview

Equivariance

Dielectric tensor prediction covering 72 elements

Data preparation

Performance of DTNet

Model comparison for dielectric tensor prediction

Discovery of dielectric materials

Preparation of candidate structures

High-dielectric materials

Highly anisotropic dielectrics

Discussion

Methods

Data source

Model Architectures

Overview

Neural network definition

Equivariant block

Readout

Hyperparameters

Virtual screening

DFPT configuration

Data availability

Code availability

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Additional information

Rights and permissions

About this article

Cite this article

Share this article

Search

Quick links