Theory and Experiment in “Step” on Semiconductors

February 8th, 2010 by George Fitzgerald, PhD

A recent news article by the University of Texas at Dallas (UTD) highlighted joint work by the Department of Materials Science and Engineering and Accelrys on critical surface reactions of silicon. The research points the way to “improve semiconductor devices’ performance in health care and solar power applications in particular.”

Who cares? Anybody who uses chips, solar cells, or any other device containing semiconductors (in other words, all of us.)  

Insertion of a nitrogen atom is predicted to occur preferentially at the step edge of Si(111)

How does the latest research help? A typical semiconductor device consists of a metal oxide semiconductor layer (e.g., HfO2) deposited on a silicon substrate. As explained by co-author Dr. Mat Halls, formation of an SiO2 interlayer between the silicon substrate and the metal oxide can decrease semiconductor performance. One approach to solving this is to introduce a nitride barrier that prevents the growth of interfacial SiO2. The ability to introduce such heteroatoms into the topmost layers of Si also affords additional opportunities to tune the surface properties by enhancing chemical reactivity at these sites to form functional surfaces. But how do you get the nitrogen to stick to the surface?

In the latest research, published in Nature Materials, the authors used infrared spectroscopy to explore the possible formation mechanisms of nitride on hydrogen-terminated silicon surfaces. Calculations using density functional theory (DFT) demonstrated how important step edges are to the formation of the nitride layers. The reaction mechanism on the stepped surface provides a means of controlling the reaction. As the authors wrote: “The ability to control the reaction … enables the realization of applications … including sensing, electrical and thermal transport, and molecular computing.” This is a beautiful demonstration of the complementarity of theory and experiment. Experiment deals with facts, but requires interpretation. Theory provides detailed explanations at the atomic level, but sometimes requires an anchor to the “real world.” Together they can do more. Wouldn’t it be great if all viewpoints could be reconciled this well?


Can Automated NMR Analysis Cure Gephyrophobia?

October 8th, 2009 by Max Petersen, Ph.D.

It wasn’t until my career steered towards marketing that I was diagnosed with gephyrophobia – the pathological fear of bridges. Years of riding motorcycles in Southern California’s backcountry never triggered this anxiety disorder; what did was the realization that a simple query on istockphoto for “bridge” and “puzzle” returns 56 variations of chasm-crossing stereotypes. My guess is that if you are gainfully employed and work outside the shelter of academic or government institutions, you have been exposed: The “bridge” as a symbol. To make you work better. With your colleagues. With other departments. In general: To cross a chasm.

A car about to cross a chasm. Or is it?


Working as a product manager for Pipeline Pilot, I had no choice but to face my fears. There are no two ways about it: Pipeline Pilot is a “bridge,” it helps scientists and departments work together – better. Now, don’t get me wrong – I have nothing against teamwork, I think it makes it worthwhile commuting 60 miles to work every day and wondering what frequent flyer status comes after “Platinum.” What I fear are empty marketing promises that create customer reactions ranging anywhere from confusion to disappointment.

The customers I mainly interact with work in analytical labs. Last time I checked, they are concerned with analyzing boatloads of data in the shortest time possible. Sure, they could use a bridge when it comes time to toss the results over the fence to the chemists who originally requested the work. And after quite a bit of effort from my colleagues in Cambridge R&D, we can now process NMR data in Pipeline Pilot. And although I have no delusions of grandeur that Accelrys would invest in this project simply to address my exotic condition, I still hope that the next time I drive over the Bay Bridge to see my favorite customer, my heart will be beating at its customary 120 bpm.

I would also like to thank all the great people at Modgraph for helping us integrate their first-rate NMR prediction engine, NMRPredict, into Pipeline Pilot. Without them, our NMR release would have been significantly less complete. Their contribution also brings to life Accelrys’ vision of Pipeline Pilot as an open integration platform for scientific applications.


Innovative chemical solutions – tired of tweaking the process parameters knob?

September 1st, 2009 by Michael Doyle, PhD

I am just back from the ACS meeting in Washington, and have been reading the Presidential Awards nominees and winners document from the EPA. What struck me in the list of excellent and erudite projects was the degree of chemical innovation that companies from Dow to P&G, BASF and Eli Lilly were pursuing and achieving.

What was interesting to me was that in these tough economic times, it can be hard to justify and develop innovative chemical solutions, rather than just tweaking the process parameters knob again. These companies had all managed the former.

Also, as someone who has worked in the materials simulation area for 20 years, and whose role can be summed up in the word ideation, I was stunned by the scope and creativity of these solutions. What particularly interested me was that many of these companies use virtual chemistry as part of their innovation process, since relative comparisons, mechanistic understanding and what-if questions can only be accurately and systematically probed using this approach.

Consider these examples:

  • improvements in scale inhibitors for power stations leading to lower energy usage,
  • benign corrosion inhibitors,
  • improved fuel cell polymer membranes for more efficient energy storage, and
  • improved chemical synthesis and coating materials.

All of these have been studied, and innovations achieved, using modeling approaches. That shows a direction we could take to innovate our way out of energy dependency issues.

I look forward to seeing the 2010 nominees and hope that chemical simulation and informatics-linked projects are key in that innovation path.


Webapps: Making a (Zero Footprint) Mark on Applied Research

June 25th, 2009 by Max Petersen, Ph.D.

It is now a couple of weeks since Accelrys invited its customers and prospects to try out a web-enabled version of Synthia, a popular tool for quick estimations of polymer properties. For me, the intellectual fallout of this exercise was a close look at what companies are looking for when thinking about using web applications for their day-to-day research needs. Here are two points I wanted to discuss in this blog:

  • Hosted services vs. IP protection
  • Zero footprint molecular editors

Hosted services vs. IP protection: As expected for a tool that allows users to test viability of new materials prior to experimental synthesis, our web tool allowed users to sketch their own molecular structures, save them, and run predictions based on those structures. This almost immediately prompted customer responses that an Accelrys hosted trial of this functionality was not an option. Instead, these customers preferred running a trial version within the comfort of their own intranet.

On the other hand, I had the chance to see CCDC’s new web-based version of the CSD, WebCSD, at this spring’s ACS meeting. Here, hosted services give a great advantage over bi-annual or annual distributions of their popular crystal structure database. Updates to the hosted DB become immediately available to users, and IP issues are non-existent, as it is virtually impossible to relate a CSD query to a specific compound in development. Unless, of course, you use the CSD framework as a repository for your proprietary structures. In that case, you can host the database install in-house and still enjoy all the other benefits of a web application.

Zero footprint molecular editors: Web enabled research tools that cater to chemists and materials scientists will seldom avoid displaying an atomistic representation of a structure at some point. Generally, a scientist will also have to edit a structure or create one from scratch.

Our approach was to allow users to pick a molecular editor of their choice that can be invoked from within the webapp. This has the benefit that users can use a tool they are comfortable with and the disadvantage that the web app is “not so zero footprint” any more.

Structure editor for polymer properties web application

Figure 1: We allowed users to choose between a selection of popular molecular editors. These are of course not thin clients, but traditional thick clients that need to be installed on individual machines.

Looking outside the Accelrys box, the promising candidates to become standards in web-based molecular viewers/editors are Jmol, OpenAstexViewer, and JME. While Jmol follows the classic open-source community development model, the AstexViewer and JME have their roots directly in industry research. They have impressive deployments (JME states 10,000 users in over 150 companies) and are used as third-party tools by commercial institutions, including the above-mentioned CCDC and Accelrys.

The fact that the industry invests, deploys, and shares these tools with peers is the most impressive testimony that scientific web applications are starting to make a mark on today’s R&D environments, even if it’s a zero footprint one.


Look Before You Learn

June 23rd, 2009 by Dana Honeycutt, Ph.D.

Part of my job is creating and maintaining learner components for building statistical models in Accelrys’s Pipeline Pilot product. A statistical model is an empirically derived equation or set of rules for predicting some unknown property (say the toxicity of a chemical compound) from a set of known properties (say descriptors derived from the compound’s structure).

A statistical model–as contrasted to a mechanistic model–is built from a specific set of data, called the training set, using a specific learning algorithm (such as linear least-squares, recursive partitioning, etc.). The quality of the model is crucially dependent on the quality of the training data.

Pipeline Pilot makes it really easy to build statistical models from your data. All it takes is dropping in a data reader component, choosing an appropriate learner component, and specifying the variables you wish to use. Because of this ease, you may be tempted to build models from a data set before taking a look at the data.

Don’t do it!

Here’s why: more often than you might think, data sets are dirty. Some values are missing or invalid. What you thought was a scalar property appears as an array in the data. A few extreme outliers are present which (depending on the learner) may seriously skew the results. Extra commas in your CSV file have shifted some values to the wrong columns. You’re trying to build a classification model, but all data records have been assigned the same class. You thought that your data set contained only small organic molecules, but somehow a few organometallics got in there. Unbeknownst to you, the creator of the data set used 99 as a missing value tag. And so on.

Pairs Plot of Contaminated Data Set


I am sometimes called upon to diagnose problems that customers or colleagues have when trying to build a model. Often the root of the problem is that something is wrong with the input data. In many such cases, just looking at the data in a table makes the problem obvious. Other times, simple analysis (such as univariate analysis) or plots (such as pairs plots) show what’s wrong.
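A few of the problems listed above can be caught with a very small script before any learner sees the data. The following is only a sketch in plain Python, not a Pipeline Pilot component: the function name `audit_rows`, the column names, the sample values, and the list of suspected missing-value tags are all made up for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical training set: every value below is invented for illustration.
SAMPLE = """mol_weight,logp,toxic
180.2,1.2,yes
99,0.4,yes
250.1,,yes
310.5,2.8,yes,extra
"""


def audit_rows(csv_text, class_column, sentinel_values=("99", "-999", "NA", "")):
    """Return a list of human-readable warnings about a CSV training set."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    warnings = []
    columns = {name: [] for name in header}

    for line_no, row in enumerate(reader, start=2):
        # Extra or missing commas shift values into the wrong columns.
        if len(row) != len(header):
            warnings.append(
                f"line {line_no}: expected {len(header)} fields, found {len(row)}"
            )
            continue
        for name, value in zip(header, row):
            columns[name].append(value.strip())

    for name, values in columns.items():
        counts = Counter(values)
        # Flag empty cells and values that look like missing-value tags.
        # A flagged value may be legitimate; the point is to go and look.
        for sentinel in sentinel_values:
            if counts[sentinel]:
                label = "empty" if sentinel == "" else repr(sentinel)
                warnings.append(
                    f"column {name}: {counts[sentinel]} value(s) equal to {label}"
                )

    # A classification model cannot learn anything from a single class.
    if len(set(columns[class_column])) < 2:
        warnings.append(f"column {class_column}: only one class present")
    return warnings


if __name__ == "__main__":
    for message in audit_rows(SAMPLE, class_column="toxic"):
        print(message)
```

A check like this does not replace looking at a table or a pairs plot; it only makes the most mechanical problems hard to miss.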

The more worrisome cases are the ones we may never hear about. Not all problems with a training data set will make a learner fail or produce obviously incorrect results. So even if you have gone ahead and successfully built a model before looking at the data, you should still look at the data afterward.

Whether you build models in Pipeline Pilot, R, Weka, or some other program, remember to Look before you Learn.


Accelrys Joins Microsoft at ACHEMA

May 24th, 2009 by George Fitzgerald, PhD
Microsoft booth at ACHEMA 2009


I just spent a week working in the Microsoft booth at ACHEMA 2009 in Frankfurt. Microsoft offered several of its software partners the chance to participate in its booth. Accelrys was there along with OSIsoft, Sycor, AspenTech, and others. Accelrys was demonstrating the integration between Pipeline Pilot and Microsoft SharePoint as a solution for data integration and reporting. Pipeline Pilot provides SharePoint customers with the means to access large amounts of scientific data, improving the discovery process. At the same time, embedding this within SharePoint means that Accelrys solutions can be executed in a web environment, which is more familiar to most users than the specialized “thick clients” normally employed for these applications. Read more about the integration here.

Accelrys station in the Microsoft booth


More about the conference: Held only every 3 years, the “Ausstellungskongress für Chemische Technik, Umweltschutz und Biotechnologie” is similar to the ACS or AIChE national meetings in the US in that it brings together scientists, engineers, and exhibitors from a broad range of chemical industries and universities, but the similarities end there. According to the ACHEMA web site the congress has some very impressive statistics:

  • 4,000 exhibitors
  • 180,000 participants
  • 900 lectures

Many of you will be familiar with national conventions like ACS, AIChE, and MRS so you have an idea what a typical expo looks like. This show was amazing. There were 3 floors of exhibits for pumps, fittings, and valves alone. There’s a total of 140,000 m² of exhibition space (that’s 1.5 million ft² to Americans). Keep in mind that the Germans only get to do this every 3 years, so they really need to make the most of it. I hope to do this again in 2012. See you there!

 


Informatics lessons from the MRS

May 19th, 2009 by George Fitzgerald, PhD

The Materials Research Society (MRS) aims to “encourage communication and technical information exchange across the various fields of science affecting materials.” It sponsors a spring meeting held in San Francisco and a fall meeting in Boston. This year’s spring meeting covered topics ranging from amorphous materials to methods for environmental stability to multiple topics in nanotechnology. (See all symposium titles here.)

Most interesting to me was Symposium Z: “Computational Nanoscience – How to Exploit Synergy between Predictive Simulations and Experiment,” which fits with the comments I made in my previous posting, and shows just how much active interest there is in this topic. Prof. Krishna Rajan, who heads the Combinatorial Sciences and Materials Informatics Collaboratory, demonstrated how he uses data mining and statistical analysis to understand the formation of apatites (minerals of the form A10(BO4)6X2). How do you get your head around an N-dimensional space? How do you grasp trends when there are dozens of variables to consider? Use methods like recursive partitioning and Principal Component Analysis (PCA).

Simpler than the modeling approaches I mentioned in my earlier posting, these require only a statistical analysis of the data (some experimental results, some modeling output). The results reduce N-dimensional datasets to 2 or 3 dimensions that are “grasp-able” by mere humans. Applying these approaches to the apatite data clearly shows how the choices of cation and anion influence the stability of the crystal.
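To make the idea of reducing N dimensions to two concrete, here is a minimal PCA sketch in plain Python. It is not the method or the data from the talk: the numbers in `DATA` are invented, and the principal axes are found by power iteration with deflation, a simple technique chosen here only to keep the example free of external libraries.

```python
import math

# Invented data set: 5 samples described by 4 variables.
DATA = [
    [1.0, 2.0, 0.5, 3.0],
    [2.0, 4.1, 0.4, 2.9],
    [3.0, 6.0, 0.6, 3.0],
    [4.0, 7.9, 0.5, 3.1],
    [5.0, 10.0, 0.5, 3.0],
]


def center(rows):
    """Subtract the column mean from every value."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    return [[r[j] - means[j] for j in range(dims)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, dims = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dims)]
        for i in range(dims)
    ]


def dominant_eigenpair(matrix, iterations=500):
    """Power iteration: largest eigenvalue and its unit eigenvector.

    Assumes the start vector is not orthogonal to the dominant axis.
    """
    dims = len(matrix)
    vec = [float(i + 1) for i in range(dims)]
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    value = sum(
        vec[i] * sum(matrix[i][j] * vec[j] for j in range(dims)) for i in range(dims)
    )
    return value, vec


def pca(rows, n_components=2):
    """Return (principal axes, projected coordinates of each row)."""
    centered = center(rows)
    cov = covariance(centered)
    dims = len(cov)
    components = []
    for _ in range(n_components):
        value, vec = dominant_eigenpair(cov)
        components.append(vec)
        # Deflation: remove the axis just found so the next one can emerge.
        cov = [
            [cov[i][j] - value * vec[i] * vec[j] for j in range(dims)]
            for i in range(dims)
        ]
    scores = [
        [sum(r[j] * c[j] for j in range(dims)) for c in components] for r in centered
    ]
    return components, scores


if __name__ == "__main__":
    axes, coords = pca(DATA, n_components=2)
    for point in coords:
        print(point)
```

Each row of `coords` is a sample reduced from four variables to two, ready for an ordinary scatter plot. For real data sets a tested numerical library would be the more sensible choice.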

Just think how many other research problems we could understand if we had the tools to look at the data in the right way.


High Throughput – What’s a Researcher to do?

May 12th, 2009 by George Fitzgerald, PhD

High-throughput experimentation has been a mainstay in pharmaceutical discovery since the mid-1990s. In a 1999 C&E News article (C&EN, vol. 77, pp. 33-48, March 8, 1999) this approach was hailed as the next great thing. Unfortunately, we chemists soon realized that quantity is no replacement for quality; a notable article in the WSJ, “Drug Industry’s Big Push into Technology Falls Short,” was critical of this approach.

 

At the time, I was working on a DOE-funded project (DE-FC26-02NT41218) for high-throughput catalyst discovery for NOx catalysis in lean diesel engines, together with GM and Engelhard (now BASF). In practice, our method was not to generate thousands of samples and hope for the best, but to screen fewer, carefully selected samples quickly and subject the “winners” to more sophisticated testing.

 

The approach employed in our NOx project was based on analysis of experimental data, design of experiment, and fitting response surfaces – and it worked. As pointed out in a recent Bio-IT World article, however, experimental data alone are usually too noisy to build reliable statistical models. What’s a researcher to do? Molecular modeling, of course – hey, I’m a modeller: you knew I was going to suggest that.

 

The key to success, it seems, is to employ a plurality of methods, both experimental and computational. Given even a modest amount of experimental data, you’ll need a database with decent search & query tools and basic statistical approaches like principal component analysis. But atomistic modeling is also important. Work by a number of research groups has shown that you can generate good predictive models from quantum mechanical (QM) methods for lots of different kinds of materials. (Keep in mind that these examples barely scratch the surface of the available literature.)

 

But how do we get to the point that anybody can make use of QM-based results? Doing these calculations typically takes a long time.

 

QSAR (Quantitative Structure Activity Relationship) is a terrific way to leverage QM results for complex research topics. These research groups followed the same basic procedure:

  • Start with some experimental data
  • Generate a statistical model
  • Grind through a lot of calculations
  • Forward the “winners” for experimental testing
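The four steps above can be sketched in a few lines of Python. This is a toy illustration rather than the procedure used by any of the research groups: the descriptor values, the activities, and the candidate names are invented, the statistical model is a one-variable least-squares line, and a lookup table stands in for the expensive QM calculations.

```python
# Step 1: experimental data (invented numbers).
# Each pair is (QM-derived descriptor, measured activity).
TRAINING = [(0.1, 1.0), (0.2, 2.1), (0.3, 2.9), (0.4, 4.0)]

# Stand-in for step 3: in practice each value would come from a QM calculation.
CANDIDATE_DESCRIPTORS = {
    "cand_A": 0.15,
    "cand_B": 0.55,
    "cand_C": 0.35,
    "cand_D": 0.05,
}


def fit_line(points):
    """Step 2: ordinary least-squares fit of activity = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("descriptor values are all identical")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / sxx
    return slope, mean_y - slope * mean_x


def pick_winners(descriptors, model, top_n):
    """Steps 3 and 4: predict every candidate, keep the top_n for the lab."""
    slope, intercept = model
    predicted = {name: slope * x + intercept for name, x in descriptors.items()}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]


if __name__ == "__main__":
    model = fit_line(TRAINING)
    print(pick_winners(CANDIDATE_DESCRIPTORS, model, top_n=2))
```

The design point is the split of labor: the slow QM step produces descriptors once, and the cheap statistical model decides which few candidates deserve experimental testing.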

 

You can see in the examples above that the approach can actually work. But how do you figure out what QM calculations to perform, and how do you create good statistical models? Well, that’s a story for next month.
