Big Data for Designing Amazingly Useful Materials

An x-ray beam aimed at a sample of tiny particles provides information about its properties. Here, physicist Simon Billinge prepares a sample for analysis. (Jane Nisselson/Columbia Engineering)

With the right material, batteries might one day store enough solar energy to power cities, and unconventional drugs might quickly be absorbed in the body to fight cancer. Nanotechnology, the science of manipulating atoms and molecules to make materials with new and useful properties, has the potential to change how we get energy, treat disease and more.

Simon Billinge, a physicist at Columbia Engineering and the Data Science Institute, is on a hunt for the next wonder material. The process has typically taken time and luck, with promising candidates scavenged from nature or formulated in a lab. But in a new approach, Billinge and others dream of arranging atoms on the computer to design custom materials that can be manufactured quickly and cheaply.

The U.S. National Science Foundation recently awarded Billinge and his team a three-year $983,000 grant to pursue this work, part of a larger Materials Genome Initiative coordinated by the U.S. government to fast-track the discovery of new materials.

This new approach depends on building databases that describe the properties of known materials (their electrical conductivity, strength, durability and so on) and how the atoms and electrons around them are arranged. Researchers have performed experiments on tiny samples to extract this information, but too often the data wind up on computer hard drives, lost to the larger research world.

Billinge and his colleagues plan to convert their experimental data to a machine-readable format where they can be shared and analyzed on powerful computers.

Billinge and his colleagues will convert experimental data into a digital format that can be analyzed with data science techniques. (Jane Nisselson/Columbia Engineering)

“We’re getting a ton of high-quality experimental data, but need data analytic methods to pick through it to get reliable, robust solutions for nanostructure (the arrangements of atoms in minute substructures of a material), critical to understanding its properties,” he said.

Their data will come from the National Synchrotron Light Source II at Brookhaven National Laboratory on Long Island, where beams of high-energy x-rays are aimed at a sample to reveal its underlying structure and chemical makeup.

Among the tiny particles Billinge has studied so far are cadmium selenide (CdSe) and lead telluride (PbTe), which are used in flat-panel displays and to image living cells, among other applications. With further research, the materials could be used to make more efficient devices to convert sunlight to energy, and to store that energy for later use—reducing the cost and impracticality of going solar.

Other experiments have focused on methods for measuring the active ingredients in complex, next-generation drugs. A growing number of drugs in the pipeline are highly insoluble and need extensive processing before they can be turned into pills that are easily swallowed and absorbed into the bloodstream. The problem for drug makers is that the pills are so complex that verifying and measuring their active ingredients is not yet possible. Billinge and his team are looking for a way.

Looming over their work is a theoretical problem that they also hope to solve. The materials are so complex that as you drill down to smaller and smaller samples, to minute nanoparticle sizes, relatively little information is left. Physicists call it the Nanostructure Inverse Problem.

“It’s like trying to make out a view through blurry, steamed-up glasses,” said Billinge. “By using data analytic methods we hope to make individual images sharper, and to stitch together multiple views to create a clearer and more informative picture.”

The project brings together a range of experts at Columbia, among them applied mathematician Qiang Du and computer scientist Daniel Hsu, and will make use of techniques in image recognition, information theory and machine learning.

— Kim Martineau

©2017 Columbia University