Comprehensive database released to advance data-driven research in the field of quasicrystals

Three datasets comprising HYPOD-X and their data collection procedures. Credit: The Institute of Statistical Mathematics

Quasicrystals are materials with unique, non-periodic symmetry that distinguishes them from conventional crystals. Approximant crystals, often regarded as precursor materials closely related to quasicrystals, share similar compositional and structural features but retain periodic atomic arrangements.

These materials exhibit distinct physical properties, such as temperature dependencies of electrical and thermal conductivity that differ from those of conventional metals. However, the lack of a comprehensive database has long been a significant barrier to advancing machine-learning-driven quasicrystal research.

Furthermore, there is a growing need for a comprehensive open database, both to deepen our understanding of the relationship between quasicrystal structures and their properties and to stimulate the development of new materials.

The research group has developed the world's first open database for quasicrystals and their approximants, called "HYPOD-X" (Hypermaterials Open Database for X, where X represents a wildcard for application targets, such as machine learning). The work is published in the journal Scientific Data.

HYPOD-X provides structured data on the composition, structure, and physical properties of quasicrystals and approximant crystals, extracted from texts and figures in scientific papers and books, in a format accessible to researchers and engineers. This database serves as a foundation for data-driven quasicrystal research.

HYPOD-X comprises three datasets: the composition dataset, the phase diagram dataset, and the property dataset. The data, which have been manually or semi-automatically extracted, undergo rigorous expert review before being added to the database.

The composition dataset serves as a foundational source of information on quasicrystals and approximants. The data, including compositions, structural types, and heat treatment conditions, have been manually collected and entered into the database after rigorous validation by experts.

Automated algorithms for detecting erroneous data have also served to enhance data quality. The data volume is approximately ten times greater than that of a previous study that compiled the compositions of quasicrystals. Using this dataset, the research group successfully discovered new quasicrystals with a machine learning algorithm called TSAI 1.0.

The properties dataset includes temperature-dependent data for thermal conductivity, electrical properties, and magnetic properties, extracted from figures and tables in scientific papers and books.

Analyzing these data could reveal new patterns that have been overlooked even by experts in quasicrystals. For instance, quasicrystals tend to exhibit an increase in thermal conductivity at higher temperatures, which is not typically observed in conventional metals or crystals.

This unique property could be utilized in the development of thermal rectifying materials that direct heat flow in specific directions. Identifying quasicrystals with promising temperature dependencies from this dataset may accelerate the development of new thermal management devices.
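As a rough illustration of how such a screening might look, the Python sketch below fits a straight line to the high-temperature part of a thermal conductivity curve and flags curves whose conductivity rises with temperature. The function names, the 300 K cutoff, and the sample curves are assumptions made for this example; they are not taken from HYPOD-X or from the published paper.

```python
def high_temperature_slope(temps_k, kappa, t_min_k=300.0):
    """Least-squares slope of conductivity vs. temperature for T >= t_min_k.

    temps_k and kappa are equal-length sequences of temperatures (K) and
    thermal conductivities. The 300 K default cutoff is an arbitrary
    choice for this sketch.
    """
    points = [(t, k) for t, k in zip(temps_k, kappa) if t >= t_min_k]
    if len(points) < 2:
        raise ValueError("need at least two points at or above t_min_k")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_k = sum(k for _, k in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    if sxx == 0:
        raise ValueError("temperatures must not all be identical")
    sxy = sum((t - mean_t) * (k - mean_k) for t, k in points)
    return sxy / sxx


def rises_with_temperature(temps_k, kappa, t_min_k=300.0):
    """True if the fitted high-temperature slope is positive."""
    return high_temperature_slope(temps_k, kappa, t_min_k) > 0.0


# Made-up curves for illustration only (not measured data).
rising_curve = ([300, 400, 500, 600], [1.0, 1.5, 2.1, 2.8])
falling_curve = ([300, 400, 500, 600], [400.0, 390.0, 385.0, 380.0])

print(rises_with_temperature(*rising_curve))
print(rises_with_temperature(*falling_curve))
```

A linear fit is only a first-pass filter; curves flagged this way would still need inspection by someone familiar with the measurements.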

The phase diagram dataset contains digitized data extracted from figures in the extensive literature published to date. Specifically, it stores data quantifying the boundary composition of each phase region, providing compositional ranges and other conditions under which quasicrystals and approximant crystals are thermodynamically stabilized. Applying machine learning to this dataset enables the prediction of new phases for quasicrystals and approximant crystals.
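To make the idea of a digitized phase boundary concrete, the Python sketch below tests whether a given composition lies inside a phase region using a standard ray-casting point-in-polygon check. It assumes that a region is stored as an ordered list of boundary points in two composition coordinates (for example, the atomic percentages of two elements at a fixed temperature). That layout, the function name, and the sample region are hypothetical; the article does not describe the actual HYPOD-X file format.

```python
def inside_phase_region(point, boundary):
    """Ray-casting test: is `point` inside the polygon `boundary`?

    point    -- (x, y) composition coordinates, e.g. at.% of two elements
    boundary -- ordered list of (x, y) vertices outlining one phase region
                (an assumed representation, not the real HYPOD-X schema)
    """
    x, y = point
    inside = False
    n = len(boundary)
    for i in range(n):
        x1, y1 = boundary[i]
        x2, y2 = boundary[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            # Count crossings to the right of the point.
            if x < x_cross:
                inside = not inside
    return inside


# Made-up rectangular region for illustration only.
example_region = [(60.0, 10.0), (70.0, 10.0), (70.0, 20.0), (60.0, 20.0)]

print(inside_phase_region((65.0, 15.0), example_region))
print(inside_phase_region((50.0, 15.0), example_region))
```

Points lying exactly on a boundary are not handled specially in this sketch, so real use would need a tolerance suited to the digitization accuracy.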

Future prospects

HYPOD-X offers a valuable new resource to advance quasicrystal research, and the research group plans to continually expand the database. While data-driven research is becoming popular across various fields of materials science, the limited availability of data has hindered the progress of data-driven quasicrystal research.

With the launch of HYPOD-X, a diverse array of data-driven research is expected to emerge. Furthermore, by providing a comprehensive view of extensive data, the database is anticipated to lead to new insights and scientific principles in quasicrystal science.

More information: Erina Fujita et al, Comprehensive experimental datasets of quasicrystals and their approximants, Scientific Data (2024). DOI: 10.1038/s41597-024-04043-z

Journal information: Scientific Data

Provided by Research Organization of Information and Systems