The Case against “Lazy” E&L-Identifications in GC/MS

In chemical characterization studies for medical devices, as well as in extractables & leachables (E&L) studies for pharmaceutical packaging systems, it is of the utmost importance that the compounds detected by the screening methods are identified correctly. The reason is that a chemical compound can only be linked to its potential toxicological properties when it is correctly identified. If a lab fails to identify compounds properly, that failure becomes a fatal error in the overall safety assessment of a device.

When we started performing E&L studies at Nelson Labs Europe in Leuven almost 20 years ago, the only resources we could rely on were the NIST/WILEY Mass Spectral Libraries. So, for all Headspace GC/MS (VOC) and GC/MS (SVOC) evaluations, we relied heavily on the mass spectral match between the compound being identified and a mass spectrum in one of those libraries. Our lab procedures described this process very well, so from a quality perspective we were covered. We also thought that by specifying the library match factors in our reports, sponsors could judge the quality of each reported identification for themselves and take appropriate safety measures. However, this turned out not to be the case; and to be honest, we struggled with this process ourselves.

After a while, we became more familiar with mass spectral interpretation and discovered that match factors may work well when they are very high, but do not work at all in the lower percentage ranges. For example, if you have a mass spectral (MS) match factor of 98% for a compound, there is a very high probability (although not 100%) that you have correctly identified the compound. Even if the match factor is above 80%, there may still be some value in reporting the compound identified through NIST/WILEY, although the probability of correct identification will be substantially lower than for the 98% mass spectral fit. If the compound was not identified correctly, there is a distinct probability that its true structure is related to the one being reported.

Reporting compounds with a fit lower than 80% brings you into the grey zone. The identification could have some value, but it could also be completely off. The lower you go in mass spectral fit scores, the lower the probability that the reported compound is closely related to the true structure of the compound. In reality, given what we know today about the need for accurate compound identifications in safety assessments, reporting compounds with spectral match factors substantially below 80% does not make sense.

The problem lies in how users of this data perceive it.

If you do not understand what these spectral match factors really mean, you could think that the match factor represents the probability that the identification is correct. For example, with a spectral match factor of 98% you might assume you are 98% certain that you identified the right compound, with a spectral match factor of 53% that you are 53% certain, and so on. This is absolutely incorrect! What most users of the data do not understand is that the match factor DOES NOT REPRESENT the percentage of correctness of the identification at all. Mass spectral match factors simply express the “quality of resemblance” between a target mass spectrum and a reference mass spectrum; they do NOT quantify the likelihood of correct identification! Nor is the relationship between match factor and that likelihood linear: if the match factor is lower than 70%, the probability of having identified the compound correctly is next to 0%. That is the correct, and widely misunderstood, interpretation of the data.
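
To make the distinction concrete, here is a minimal sketch of a resemblance score between two spectra. It uses a plain cosine similarity scaled to 0-100, which is an assumption chosen for illustration and not the scoring used by any commercial library search; the function name and the two spectra are invented.

```python
import math

def match_factor(target, reference):
    """Cosine similarity between two spectra, scaled to 0-100.

    Each spectrum is a dict mapping m/z (int) to relative abundance.
    Illustrative only: not the scoring of any commercial library search.
    """
    mz_values = set(target) | set(reference)
    dot = sum(target.get(mz, 0.0) * reference.get(mz, 0.0) for mz in mz_values)
    norm_t = math.sqrt(sum(v * v for v in target.values()))
    norm_r = math.sqrt(sum(v * v for v in reference.values()))
    if norm_t == 0 or norm_r == 0:
        return 0.0  # an empty spectrum resembles nothing
    return 100.0 * dot / (norm_t * norm_r)

# Invented spectra that share three fragments and differ in a fourth
unknown = {43: 100.0, 57: 80.0, 71: 40.0, 85: 20.0}
library_hit = {43: 100.0, 57: 85.0, 71: 35.0, 99: 15.0}

score = match_factor(unknown, library_hit)
print(f"match factor: {score:.1f}")
```

Identical spectra score 100 and spectra with no shared fragments score 0. Nothing in the calculation refers to how many other compounds, in the library or outside it, would produce a similar spectrum, which is why the number cannot be read as a probability of correct identification.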

Unfortunately, we started to see full assessments being made on the basis of identifications that we had provided in line with the procedures in our Quality System, even though we knew that the structure being assessed was not the correct one.

That left us with a problem. While we had not initially been aware of all the consequences of providing our customers with low-quality identifications, we were now confronted with them. Companies spent a lot of time and money assessing compounds for which we knew there was serious doubt about the accuracy of the identification. Money was spent on assessing the wrong compounds and, even more importantly, compounds that were not identified correctly posed potential safety risks to the patient. This left us with a sour feeling.

After these observations, in 2006 we concluded three things:

  1. All suggested identities based on mass spectral matching needed to be reviewed, regardless of the match factor. A visual comparison of the mass spectrum of the analyte with the mass spectrum of the “best fit” compound will often tell more about the likelihood of correct identification than the match factor itself. Mass fragments that are present in the target mass spectrum but not in the reference mass spectrum (or vice versa), or deviations in their relative abundances, are already a good predictor of this likelihood.
  2. Spectral Match Factor Threshold: We determined that we would not report chemical names and structures if the spectral match factor was below 80%, even after a mass spectral review process. Instead, we would report conclusions about the chemical structure (e.g. it is a branched alkane; the molecule contains N and O; it is a brominated compound, etc.) based upon mass spectral interpretation. If a company wanted to understand more about the chemical structure of such a partially identified compound, we could perform a “second pass” test using high-end analytical equipment, such as accurate mass platforms, to get a more complete profile of the unidentified compound.
  3. Nelson Labs Unique Compounds Screener Database: These results made us realize that we urgently needed to start building our own mass spectral database by analysing authentic analytical standards, so that we could confirm the identity of compounds based not only on their mass spectral features but also on their retention time in the chromatographic method. This paved the way for the Nelson Discovery and Screener database, which now contains close to 6,000 compounds for unique identifications. This is important because it allowed us to move away from match factors and toward true identification of compounds. I will elaborate on this in one of the next blogs.
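
The three conclusions above can be sketched as a simple reporting rule. This is an illustrative sketch and not the actual Nelson Labs procedure: the names `Candidate`, `fragment_differences` and `report_entry`, the retention-time tolerance of 0.05 min, and the example compounds are all assumptions. Only the 80% threshold, the review step, the partial-structure reporting, and the idea of confirming against an authentic standard by spectrum plus retention time come from the text.

```python
from dataclasses import dataclass
from typing import Optional

MATCH_THRESHOLD = 80.0    # from the text: no name reported below 80%
RT_TOLERANCE_MIN = 0.05   # hypothetical retention-time window, in minutes

@dataclass
class Candidate:
    name: str
    match_factor: float                  # 0-100 library match score
    reviewed_ok: bool                    # outcome of the analyst's spectrum review
    observed_rt: float                   # retention time of the unknown peak (min)
    standard_rt: Optional[float] = None  # RT of an authentic standard, if one was run
    partial_structure: str = "unidentified"

def fragment_differences(target, reference):
    """Return the m/z values seen in only one of the two spectra (a review aid)."""
    only_target = sorted(set(target) - set(reference))
    only_reference = sorted(set(reference) - set(target))
    return only_target, only_reference

def report_entry(c: Candidate) -> str:
    # Confirmed: spectrum passed review and RT agrees with an authentic standard
    rt_confirmed = (
        c.standard_rt is not None
        and abs(c.observed_rt - c.standard_rt) <= RT_TOLERANCE_MIN
    )
    if c.reviewed_ok and rt_confirmed:
        return f"{c.name} (confirmed against authentic standard)"
    # Library-only hit: name it only if review passed and the score is at least 80
    if c.reviewed_ok and c.match_factor >= MATCH_THRESHOLD:
        return f"{c.name} (tentative, library match {c.match_factor:.0f}%)"
    # Otherwise report what the spectrum supports, not a name
    return f"{c.partial_structure} (partial identification)"

# Invented example hits
hits = [
    Candidate("Compound A", 96.0, True, 12.31, standard_rt=12.30),
    Candidate("Compound B", 85.0, True, 15.02),
    Candidate("Compound C", 53.0, True, 18.40, partial_structure="branched alkane"),
]
for hit in hits:
    print(report_entry(hit))
```

In this sketch the review outcome is a flag set by the analyst rather than something computed, because the text describes the review as a visual inspection; `fragment_differences` only lists the fragments an analyst would want to look at first.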

In general, an identification strategy based on mass spectral matching with existing commercial libraries (such as NIST/WILEY) starts from the assumption that ALL mass spectra of ALL compounds are present in these MS libraries. However, such a strategy can only be successful if the compounds are actually present in these libraries! Based upon our 20 years of experience, we have seen that a large number of E&L compounds are not in any MS library: for example, degradation, oxidation and hydrolysis products of polymers, their additives or their impurities, reaction products (after elastomer vulcanization), etc. Trying to find an identity for every single E&L compound in an MS library oversimplifies the task of an analytical chemist in the area of E&L.

I want to share this because I see a lot of other E&L practitioners using the practices we used 15-20 years ago, assuming that mass spectral match factors equal compound identifications, which is not correct. “Lazy” identification (simply reporting the best fit with mass spectral libraries for every single compound detected in an extraction study, regardless of its mass spectral match factor) leads in many cases to incorrect safety assessments, which puts the end users of these devices or pharmaceutical packaging systems at risk. In general, a reviewer of analytical data is helped by a chemical name, CAS number and structure, as this information can link the compound's identity to its toxicological profile. This is the reason there is a trend for analytical labs to report chemical names and structures, even if they have a very low probability of being “the right compound”. However, companies providing compound identifications should be aware of the risks inherent in those practices. As I see it, a provider of E&L services should supply reliable data to reviewers and authorities, use scientific discretion to judge whether a given level of identification has sufficient value, and recognize the limitations of the identification made. We cannot expect every reviewer to have a full understanding of the analytics behind E&L testing, so the scientific community must act as a nonsense filter in this regard. This is the responsibility of the lab, regardless of what is in your SOPs. The nonsense filter should be at the level of the lab, not at the level of the reviewer.

Piet Christiaens

Scientific Director

Piet Christiaens received his Ph.D. from the Analytical Chemistry Department of the University of Leuven (Belgium) in 1991. From 1992 to 1997, he was Lab Manager in two CROs. From 1997 to 2000, he worked as an independent consultant with Shell Chemical Company in Houston, Texas (US), working on hydrogenated triblock co-polymers. Since 2001, Piet...