Thursday, August 24, 2023

Emotion recognition

From Wikipedia, the free encyclopedia
https://en.wikipedia.org/wiki/Emotion_recognition

Emotion recognition is the process of identifying human emotion. People vary widely in their accuracy at recognizing the emotions of others. Use of technology to help people with emotion recognition is a relatively nascent research area. Generally, the technology works best if it uses multiple modalities in context. To date, the most work has been conducted on automating the recognition of facial expressions from video, spoken expressions from audio, written expressions from text, and physiology as measured by wearables.

Human

Humans show a great deal of variability in their abilities to recognize emotion. A key point to keep in mind when learning about automated emotion recognition is that there are several sources of "ground truth," or truth about what the real emotion is. Suppose we are trying to recognize the emotions of Alex. One source is "what would most people say that Alex is feeling?" In this case, the 'truth' may not correspond to what Alex feels, but may correspond to what most people would say it looks like Alex feels. For example, Alex may actually feel sad, but he puts on a big smile and then most people say he looks happy. If an automated method achieves the same results as a group of observers it may be considered accurate, even if it does not actually measure what Alex truly feels. Another source of 'truth' is to ask Alex what he truly feels. This works if Alex has a good sense of his internal state, and wants to tell you what it is, and is capable of putting it accurately into words or a number. However, some people are alexithymic and do not have a good sense of their internal feelings, or they are not able to communicate them accurately with words and numbers. In general, getting to the truth of what emotion is actually present can take some work, can vary depending on the criteria that are selected, and will usually involve maintaining some level of uncertainty.

Automatic

Decades of scientific research have been conducted developing and evaluating methods for automated emotion recognition. There is now an extensive literature proposing and evaluating hundreds of different kinds of methods, leveraging techniques from multiple areas such as signal processing, machine learning, computer vision, and speech processing. Different methodologies and techniques may be employed to interpret emotion, such as Bayesian networks, Gaussian mixture models, hidden Markov models, and deep neural networks.

Approaches

The accuracy of emotion recognition is usually improved when it combines the analysis of human expressions from multimodal forms such as text, physiology, audio, or video. Different emotion types are detected through the integration of information from facial expressions, body movement and gestures, and speech. The technology is said to contribute to the emergence of the so-called emotional or emotive Internet.

The existing approaches in emotion recognition to classify certain emotion types can be generally classified into three main categories: knowledge-based techniques, statistical methods, and hybrid approaches.

Knowledge-based techniques

Knowledge-based techniques (sometimes referred to as lexicon-based techniques) utilize domain knowledge and the semantic and syntactic characteristics of language in order to detect certain emotion types. In this approach, it is common to use knowledge-based resources during the emotion classification process, such as WordNet, SenticNet, ConceptNet, and EmotiNet, to name a few. One advantage of this approach is the accessibility and economy brought about by the wide availability of such knowledge-based resources. A limitation of this technique, on the other hand, is its inability to handle concept nuances and complex linguistic rules.

Knowledge-based techniques can be mainly classified into two categories: dictionary-based and corpus-based approaches. Dictionary-based approaches find opinion or emotion seed words in a dictionary and search for their synonyms and antonyms to expand the initial list of opinions or emotions. Corpus-based approaches, on the other hand, start with a seed list of opinion or emotion words and expand the database by finding other words with context-specific characteristics in a large corpus. While corpus-based approaches take context into account, their performance still varies across domains, since a word can have a different orientation in one domain than in another.
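
As a rough illustration of the dictionary-based idea, the sketch below expands a small seed lexicon with WordNet synonyms via NLTK and then scores a sentence by counting lexicon hits. The seed words, emotion categories, and scoring rule are illustrative assumptions, not a standard resource.

```python
# A minimal sketch of a dictionary-based (lexicon) approach, assuming NLTK
# with the WordNet corpus installed (nltk.download("wordnet")).
from collections import Counter
from nltk.corpus import wordnet as wn

# Illustrative seed words per emotion; real resources (SenticNet, EmotiNet,
# WordNet-Affect) are far richer.
SEEDS = {
    "joy": ["happy", "delighted"],
    "sadness": ["sad", "gloomy"],
    "anger": ["angry", "furious"],
}

def expand_with_synonyms(seed_words):
    """Grow a seed list by collecting WordNet synonym lemmas."""
    expanded = set(seed_words)
    for word in seed_words:
        for synset in wn.synsets(word):
            expanded.update(l.name().replace("_", " ") for l in synset.lemmas())
    return expanded

LEXICON = {emotion: expand_with_synonyms(words) for emotion, words in SEEDS.items()}

def score_sentence(sentence):
    """Count lexicon hits per emotion; zero hits means 'unknown'."""
    tokens = sentence.lower().split()
    counts = Counter({e: sum(tok in vocab for tok in tokens) for e, vocab in LEXICON.items()})
    best, hits = counts.most_common(1)[0]
    return best if hits > 0 else "unknown"

print(score_sentence("I felt gloomy and down all day"))   # likely "sadness"
```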

Statistical methods

Statistical methods commonly involve the use of different supervised machine learning algorithms in which a large set of annotated data is fed into the algorithms for the system to learn and predict the appropriate emotion types. Machine learning algorithms generally provide more reasonable classification accuracy than other approaches, but one of the challenges in achieving good results in the classification process is the need for a sufficiently large training set.

Some of the most commonly used machine learning algorithms include Support Vector Machines (SVM), Naive Bayes, and Maximum Entropy. Deep learning, which spans both supervised and unsupervised machine learning, is also widely employed in emotion recognition. Well-known deep learning algorithms include different architectures of Artificial Neural Network (ANN) such as the Convolutional Neural Network (CNN), Long Short-term Memory (LSTM), and Extreme Learning Machine (ELM). The popularity of deep learning approaches in the domain of emotion recognition may be mainly attributed to their success in related applications such as computer vision, speech recognition, and Natural Language Processing (NLP).
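
For the supervised route, a common baseline is a bag-of-words (TF-IDF) representation fed to a linear classifier such as an SVM. The following sketch, assuming scikit-learn and an invented toy training set, shows the shape of such a pipeline; real systems need thousands of annotated examples.

```python
# A minimal supervised baseline, assuming scikit-learn is installed.
# The tiny training set below is invented purely for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

train_texts = [
    "I am so happy about the results",
    "This is wonderful news",
    "I feel terribly sad today",
    "Everything about this makes me miserable",
    "I am furious about the delay",
    "This makes me really angry",
]
train_labels = ["joy", "joy", "sadness", "sadness", "anger", "anger"]

# TF-IDF features plus a linear SVM, the kind of statistical method described above.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(train_texts, train_labels)

print(model.predict(["the delay makes me angry"]))  # expected: ['anger']
```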

Hybrid approaches

Hybrid approaches in emotion recognition are essentially a combination of knowledge-based techniques and statistical methods, exploiting complementary characteristics of both. Some of the works that have applied an ensemble of knowledge-driven linguistic elements and statistical methods include sentic computing and iFeel, both of which have adopted the concept-level knowledge-based resource SenticNet. The role of such knowledge-based resources in the implementation of hybrid approaches is highly important in the emotion classification process. Since hybrid techniques gain from the benefits offered by both knowledge-based and statistical approaches, they tend to have better classification performance than either method employed independently. A downside of using hybrid techniques, however, is the computational complexity of the classification process.
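
The sketch below illustrates the hybrid idea at its simplest: blend a knowledge-based lexicon score with a statistical classifier's probability estimate. The inputs and the 0.4/0.6 weighting are assumptions for illustration; published systems such as sentic computing are far more elaborate.

```python
# A toy hybrid combiner: weighted blend of a lexicon score and a classifier
# probability per emotion label. Both score dictionaries and the weights
# below are illustrative assumptions.
def hybrid_scores(lexicon_scores, classifier_probs, w_lexicon=0.4):
    """Blend two per-emotion score dictionaries into one ranking."""
    labels = set(lexicon_scores) | set(classifier_probs)
    return {
        label: w_lexicon * lexicon_scores.get(label, 0.0)
        + (1.0 - w_lexicon) * classifier_probs.get(label, 0.0)
        for label in labels
    }

lex = {"joy": 0.1, "sadness": 0.7, "anger": 0.2}     # e.g. normalized lexicon hit counts
probs = {"joy": 0.2, "sadness": 0.5, "anger": 0.3}   # e.g. classifier output
blended = hybrid_scores(lex, probs)
print(max(blended, key=blended.get))                 # "sadness"
```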

Datasets

Data is an integral part of the existing approaches in emotion recognition and in most cases it is a challenge to obtain annotated data that is necessary to train machine learning algorithms. For the task of classifying different emotion types from multimodal sources in the form of texts, audio, videos or physiological signals, the following datasets are available:

  1. HUMAINE: provides natural clips with emotion words and context labels in multiple modalities
  2. Belfast database: provides clips with a wide range of emotions from TV programs and interview recordings
  3. SEMAINE: provides audiovisual recordings between a person and a virtual agent and contains emotion annotations such as angry, happy, fear, disgust, sadness, contempt, and amusement
  4. IEMOCAP: provides recordings of dyadic sessions between actors and contains emotion annotations such as happiness, anger, sadness, frustration, and neutral state 
  5. eNTERFACE: provides audiovisual recordings of subjects from seven nationalities and contains emotion annotations such as happiness, anger, sadness, surprise, disgust, and fear 
  6. DEAP: provides electroencephalography (EEG), electrocardiography (ECG), and face video recordings, as well as emotion annotations in terms of valence, arousal, and dominance of people watching film clips 
  7. DREAMER: provides electroencephalography (EEG) and electrocardiography (ECG) recordings, as well as emotion annotations in terms of valence, arousal, and dominance of people watching film clips 
  8. MELD: is a multiparty conversational dataset in which each utterance is labeled with emotion and sentiment. MELD provides conversations in video format and is hence suitable for multimodal emotion recognition and sentiment analysis, as well as for dialogue systems and emotion recognition in conversations.
  9. MuSe: provides audiovisual recordings of natural interactions between a person and an object. It has discrete and continuous emotion annotations in terms of valence, arousal and trustworthiness as well as speech topics useful for multimodal sentiment analysis and emotion recognition.
  10. UIT-VSMEC: is a standard Vietnamese Social Media Emotion Corpus with 6,927 human-annotated sentences labeled with six emotions, contributing to emotion recognition research in Vietnamese, a low-resource language in Natural Language Processing (NLP).
  11. BED: provides electroencephalography (EEG) recordings, as well as emotion annotations in terms of valence and arousal of people watching images. It also includes electroencephalography (EEG) recordings of people exposed to various stimuli (SSVEP, resting with eyes closed, resting with eyes open, cognitive tasks) for the task of EEG-based biometrics.

Applications

Emotion recognition is used in society for a variety of reasons. Affectiva, which spun out of MIT, provides artificial intelligence software that makes it more efficient to do tasks previously done manually by people, mainly to gather facial expression and vocal expression information related to specific contexts where viewers have consented to share this information. For example, instead of filling out a lengthy survey about how you feel at each point watching an educational video or advertisement, you can consent to have a camera watch your face and listen to what you say, and note during which parts of the experience you show expressions such as boredom, interest, confusion, or smiling. (Note that this does not imply it is reading your innermost feelings—it only reads what you express outwardly.) Other uses by Affectiva include helping children with autism, helping people who are blind to read facial expressions, helping robots interact more intelligently with people, and monitoring signs of attention while driving in an effort to enhance driver safety.

A patent filed by Snapchat in 2015 describes a method of extracting data about crowds at public events by performing algorithmic emotion recognition on users' geotagged selfies.

Emotient was a startup company which applied emotion recognition to reading frowns, smiles, and other expressions on faces, namely artificial intelligence to predict "attitudes and actions based on facial expressions". Apple bought Emotient in 2016 and uses emotion recognition technology to enhance the emotional intelligence of its products.

nViso provides real-time emotion recognition for web and mobile applications through a real-time API. Visage Technologies AB offers emotion estimation as a part of their Visage SDK for marketing and scientific research and similar purposes.

Eyeris is an emotion recognition company that works with embedded system manufacturers including car makers and social robotic companies on integrating its face analytics and emotion recognition software; as well as with video content creators to help them measure the perceived effectiveness of their short and long form video creative.

Many products also exist to aggregate information from emotions communicated online, including via "like" button presses and via counts of positive and negative phrases in text. Affect recognition is also increasingly used in some kinds of games and virtual reality, both for educational purposes and to give players more natural control over their social avatars.

Subfields of emotion recognition

Emotion recognition is likely to yield the best results when multiple modalities are applied, combining different sources, including text (conversation), audio, video, and physiology, to detect emotions.
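
One simple way to combine modalities is late fusion: run a separate recognizer per modality and take a weighted average of their per-emotion probabilities. The sketch below, with made-up probability vectors and weights, shows the idea.

```python
# A minimal late-fusion sketch, assuming each modality already yields a
# probability distribution over the same emotion labels. The numbers and
# weights are invented for illustration.
import numpy as np

LABELS = ["joy", "sadness", "anger", "fear"]

modality_probs = {
    "text":  np.array([0.10, 0.60, 0.20, 0.10]),
    "audio": np.array([0.20, 0.50, 0.20, 0.10]),
    "video": np.array([0.25, 0.40, 0.25, 0.10]),
}
weights = {"text": 0.4, "audio": 0.3, "video": 0.3}  # assumed modality reliabilities

fused = sum(weights[m] * p for m, p in modality_probs.items())
fused /= fused.sum()  # renormalize to a proper distribution

print(LABELS[int(np.argmax(fused))])  # "sadness"
```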

Emotion recognition in text

Text data is a favorable research object for emotion recognition because it is free and available everywhere in human life. Compared to other types of data, text is lighter to store and easy to compress to high ratios due to the frequent repetition of words and characters in language. Emotions can be extracted from two essential text forms: written texts and conversations (dialogues). For written texts, many scholars focus on working at the sentence level to extract words and phrases representing emotions.

Emotion recognition in audio

Unlike emotion recognition in text, emotion recognition in audio relies on vocal signals to extract emotions from speech.
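
As a sketch of the audio pipeline, the snippet below extracts mel-frequency cepstral coefficients (MFCCs), a feature set commonly used in speech emotion recognition, and summarizes them into a fixed-length vector that a classifier could consume. It assumes librosa is installed; the filename and feature choices are illustrative placeholders.

```python
# A minimal audio feature-extraction sketch, assuming librosa is installed
# and "speech.wav" is a placeholder path to a speech recording.
import numpy as np
import librosa

def mfcc_features(path, n_mfcc=13):
    """Load audio and summarize MFCCs as per-coefficient means and std-devs."""
    signal, sr = librosa.load(path, sr=16000)           # resample to 16 kHz
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

features = mfcc_features("speech.wav")   # placeholder filename
print(features.shape)                    # (26,) -> ready for an SVM or neural net
```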

Emotion recognition in video

Video data is a combination of audio data, image data, and sometimes text (in the case of subtitles).

Emotion recognition in conversation

Emotion recognition in conversation (ERC) extracts opinions between participants from massive conversational data on social platforms such as Facebook, Twitter, YouTube, and others. ERC can take input data such as text, audio, video, or a combination of these to detect several emotions such as fear, lust, pain, and pleasure.

Progress in artificial intelligence

[Figure: Progress in machine classification of images. The error rate of AI by year; the red line shows the error rate of a trained human on a particular task.]

Progress in Artificial Intelligence (AI) refers to the advances, milestones, and breakthroughs that have been achieved in the field of artificial intelligence over time. AI is a multidisciplinary branch of computer science that aims to create machines and systems capable of performing tasks that typically require human intelligence. Artificial intelligence applications have been used in a wide range of fields including medical diagnosis, economic-financial applications, robot control, law, scientific discovery, video games, and toys. However, many AI applications are not perceived as AI: "A lot of cutting edge AI has filtered into general applications, often without being called AI because once something becomes useful enough and common enough it's not labeled AI anymore." "Many thousands of AI applications are deeply embedded in the infrastructure of every industry." In the late 1990s and early 21st century, AI technology became widely used as elements of larger systems, but the field was rarely credited for these successes at the time.

Kaplan and Haenlein structure artificial intelligence along three evolutionary stages: 1) artificial narrow intelligence – applying AI only to specific tasks; 2) artificial general intelligence – applying AI to several areas and able to autonomously solve problems they were never even designed for; and 3) artificial super intelligence – applying AI to any area capable of scientific creativity, social skills, and general wisdom.

To allow comparison with human performance, artificial intelligence can be evaluated on constrained and well-defined problems. Such tests have been termed subject matter expert Turing tests. Also, smaller problems provide more achievable goals and there are an ever-increasing number of positive results.

Humans still substantially outperform both GPT-4 and models trained on the ConceptARC benchmark, which scored 60% on most categories and 77% on one category, while humans scored 91% on all categories and 97% on one.

Current performance

Game | Champion year | Legal states (log10) | Game tree complexity (log10) | Game of perfect information?
Draughts (checkers) | 1994 | 21 | 31 | Perfect
Othello (reversi) | 1997 | 28 | 58 | Perfect
Chess | 1997 | 46 | 123 | Perfect
Scrabble | 2006 |  |  |
Shogi | 2017 | 71 | 226 | Perfect
Go | 2016 | 172 | 360 | Perfect
2p no-limit hold 'em | 2017 |  |  | Imperfect
StarCraft | – | 270+ |  | Imperfect
StarCraft II | 2019 |  |  | Imperfect

There are many useful abilities that can be described as showing some form of intelligence. This gives better insight into the comparative success of artificial intelligence in different areas.

AI, like electricity or the steam engine, is a general purpose technology. There is no consensus on how to characterize which tasks AI tends to excel at. Some versions of Moravec's paradox observe that humans are more likely to outperform machines in areas such as physical dexterity that have been the direct target of natural selection. While projects such as AlphaZero have succeeded in generating their own knowledge from scratch, many other machine learning projects require large training datasets. Researcher Andrew Ng has suggested, as a "highly imperfect rule of thumb", that "almost anything a typical human can do with less than one second of mental thought, we can probably now or in the near future automate using AI."

Games provide a high-profile benchmark for assessing rates of progress; many games have a large professional player base and a well-established competitive rating system. AlphaGo brought the era of classical board-game benchmarks to a close when artificial intelligence proved its competitive edge over humans in 2016: DeepMind's AlphaGo program defeated Lee Sedol, one of the world's best professional Go players. Games of imperfect knowledge provide new challenges to AI in the area of game theory; the most prominent milestone in this area was Libratus' poker victory in 2017. E-sports continue to provide additional benchmarks; Facebook AI, DeepMind, and others have engaged with the popular StarCraft franchise of videogames.

Broad classes of outcome for an AI test may be given as:

  • optimal: it is not possible to perform better (note: some of these entries were solved by humans)
  • super-human: performs better than all humans
  • high-human: performs better than most humans
  • par-human: performs similarly to most humans
  • sub-human: performs worse than most humans

Optimal

Super-human

High-human

Par-human

Sub-human

  • Optical character recognition for printed text (nearing par-human for Latin-script typewritten text)
  • Object recognition
  • Various robotics tasks that may require advances in robot hardware as well as AI, including:
    • Stable bipedal locomotion: Bipedal robots can walk, but are less stable than human walkers (as of 2017)
    • Humanoid soccer
  • Speech recognition: "nearly equal to human performance" (2017)
  • Explainability. Current medical systems can diagnose certain medical conditions well, but cannot explain to users why they made the diagnosis.
  • Many tests of fluid intelligence (2020)
  • Bongard visual cognition problems, such as the Bongard-LOGO benchmark (2020)
  • Visual Commonsense Reasoning (VCR) benchmark (as of 2020)
  • Stock market prediction: Financial data collection and processing using Machine Learning algorithms
  • Angry Birds video game, as of 2020
  • Various tasks that are difficult to solve without contextual knowledge, including:

Proposed tests of artificial intelligence

In his famous Turing test, Alan Turing picked language, the defining feature of human beings, for its basis. The Turing test is now considered too exploitable to be a meaningful benchmark.

The Feigenbaum test, proposed by the inventor of expert systems, tests a machine's knowledge and expertise about a specific subject. A paper by Jim Gray of Microsoft in 2003 suggested extending the Turing test to speech understanding, speaking and recognizing objects and behavior.

Proposed "universal intelligence" tests aim to compare how well machines, humans, and even non-human animals perform on problem sets that are generic as possible. At an extreme, the test suite can contain every possible problem, weighted by Kolmogorov complexity; however, these problem sets tend to be dominated by impoverished pattern-matching exercises where a tuned AI can easily exceed human performance levels.

Exams

According to OpenAI, in 2023 GPT-4 scored in the 90th percentile on the Uniform Bar Exam. On the SAT, GPT-4 scored in the 89th percentile on math and the 93rd percentile in Reading & Writing. On the GRE, it scored in the 54th percentile on the writing test, the 88th percentile on the quantitative section, and the 99th percentile on the verbal section. It scored in the 99th to 100th percentile on the 2020 USA Biology Olympiad semifinal exam. It scored a perfect "5" on several AP exams.

Independent researchers found in 2023 that ChatGPT (GPT-3.5) "performed at or near the passing threshold" for the three parts of the United States Medical Licensing Examination. GPT-3.5 was also assessed to attain a low, but passing, grade on exams for four law school courses at the University of Minnesota. GPT-4 passed a text-based radiology board–style examination.

Competitions

Many competitions and prizes, such as the ImageNet Challenge, promote research in artificial intelligence. The most common areas of competition include general machine intelligence, conversational behavior, data-mining, robotic cars, and robot soccer, as well as conventional games.

Past and current predictions

An expert poll around 2016, conducted by Katja Grace of the Future of Humanity Institute and associates, gave median estimates of 3 years for championship Angry Birds, 4 years for the World Series of Poker, and 6 years for StarCraft. On more subjective tasks, the poll gave 6 years for folding laundry as well as an average human worker, 7–10 years for expertly answering 'easily Googleable' questions, 8 years for average speech transcription, 9 years for average telephone banking, and 11 years for expert songwriting, but over 30 years for writing a New York Times bestseller or winning the Putnam math competition.

Chess

[Photo: Deep Blue at the Computer History Museum]

An AI defeated a grandmaster in a regulation tournament game for the first time in 1988; rebranded as Deep Blue, it beat the reigning human world chess champion in 1997 (see Deep Blue versus Garry Kasparov).

Estimates of when computers would exceed humans at chess
Year prediction made | Predicted year | Number of years | Predictor | Contemporaneous source
1957 | 1967 or sooner | 10 or less | Herbert A. Simon, economist |
1990 | 2000 or sooner | 10 or less | Ray Kurzweil, futurist | Age of Intelligent Machines

Go

AlphaGo defeated a European Go champion in October 2015 and, in March 2016, Lee Sedol, one of the world's top players (see AlphaGo versus Lee Sedol). According to Scientific American and other sources, most observers had expected superhuman computer Go performance to be at least a decade away.

Estimates of when computers would exceed humans at Go
Year prediction made | Predicted year | Number of years | Predictor | Affiliation | Contemporaneous source
1997 | 2100 or later | 103 or more | Piet Hut, physicist and Go fan | Institute for Advanced Study | New York Times
2007 | 2017 or sooner | 10 or less | Feng-Hsiung Hsu, Deep Blue lead | Microsoft Research Asia | IEEE Spectrum
2014 | 2024 | 10 | Rémi Coulom, computer Go programmer | CrazyStone | Wired

Human-level artificial general intelligence (AGI)

AI pioneer and economist Herbert A. Simon inaccurately predicted in 1965: "Machines will be capable, within twenty years, of doing any work a man can do". Similarly, in 1970 Marvin Minsky wrote that "Within a generation... the problem of creating artificial intelligence will substantially be solved."

Four polls conducted in 2012 and 2013 suggested that the median estimate among experts for when AGI would arrive was 2040 to 2050, depending on the poll.

The Grace poll around 2016 found results varied depending on how the question was framed. Respondents asked to estimate "when unaided machines can accomplish every task better and more cheaply than human workers" gave an aggregated median answer of 45 years and a 10% chance of it occurring within 9 years. Other respondents asked to estimate "when all occupations are fully automatable. That is, when for any occupation, machines could be built to carry out the task better and more cheaply than human workers" estimated a median of 122 years and a 10% probability of 20 years. The median response for when "AI researcher" could be fully automated was around 90 years. No link was found between seniority and optimism, but Asian researchers were much more optimistic than North American researchers on average; Asians predicted 30 years on average for "accomplish every task", compared with the 74 years predicted by North Americans.

Estimates of when AGI will arrive
Year prediction made | Predicted year | Number of years | Predictor | Contemporaneous source
1965 | 1985 or sooner | 20 or less | Herbert A. Simon | The Shape of Automation for Men and Management
1993 | 2023 or sooner | 30 or less | Vernor Vinge, science fiction writer | "The Coming Technological Singularity"
1995 | 2040 or sooner | 45 or less | Hans Moravec, robotics researcher | Wired
2008 | Never / Distant future | – | Gordon E. Moore, inventor of Moore's Law | IEEE Spectrum
2017 | 2029 | 12 | Ray Kurzweil | Interview

Liquid chromatography–mass spectrometry

[Image: Bruker Amazon Speed ETD ion trap LC–MS system with ESI interface]
Acronym: LC–MS
Classification: Chromatography; mass spectrometry
Analytes: Organic molecules; biomolecules
Manufacturers: Agilent, Bruker, PerkinElmer, SCIEX, Shimadzu Scientific, Thermo Fisher Scientific, Waters Corporation
Related technique: Gas chromatography–mass spectrometry

Liquid chromatography–mass spectrometry (LC–MS) is an analytical chemistry technique that combines the physical separation capabilities of liquid chromatography (or HPLC) with the mass analysis capabilities of mass spectrometry (MS). Coupled chromatography–MS systems are popular in chemical analysis because the individual capabilities of each technique are enhanced synergistically. While liquid chromatography separates mixtures with multiple components, mass spectrometry provides spectral information that may help to identify (or confirm the suspected identity of) each separated component. MS is not only sensitive, but provides selective detection, relieving the need for complete chromatographic separation. LC–MS is also appropriate for metabolomics because of its good coverage of a wide range of chemicals. This tandem technique can be used to analyze biochemical, organic, and inorganic compounds commonly found in complex samples of environmental and biological origin. Therefore, LC–MS may be applied in a wide range of sectors including biotechnology, environmental monitoring, food processing, and the pharmaceutical, agrochemical, and cosmetic industries. Since the early 2000s, LC–MS (or more specifically LC–MS–MS) has also begun to be used in clinical applications.

In addition to the liquid chromatography and mass spectrometry devices, an LC–MS system contains an interface that efficiently transfers the separated components from the LC column into the MS ion source. The interface is necessary because the LC and MS devices are fundamentally incompatible. While the mobile phase in an LC system is a pressurized liquid, the MS analyzers commonly operate under high vacuum. Thus, it is not possible to directly pump the eluate from the LC column into the MS source. Overall, the interface is a mechanically simple part of the LC–MS system that transfers the maximum amount of analyte, removes a significant portion of the mobile phase used in LC, and preserves the chemical identity of the chromatography products (chemically inert). As a requirement, the interface should not interfere with the ionizing efficiency and vacuum conditions of the MS system. Nowadays, the most extensively applied LC–MS interfaces are based on atmospheric pressure ionization (API) strategies like electrospray ionization (ESI), atmospheric-pressure chemical ionization (APCI), and atmospheric pressure photoionization (APPI). These interfaces became available in the 1990s after a two-decade-long research and development process.

History of LC–MS

The coupling of chromatography with MS is a well-developed chemical analysis strategy dating back to the 1950s. Gas chromatography–mass spectrometry (GC-MS) was originally introduced in 1952, when A. T. James and A. J. P. Martin were trying to develop tandem separation–mass analysis techniques. In GC, the analytes are eluted from the separation column as a gas, so the connection with electron ionization (EI) or chemical ionization (CI) ion sources in the MS system was a technically simpler challenge. Because of this, the development of GC-MS systems was faster than that of LC–MS, and such systems were first commercialized in the 1970s. The development of LC–MS systems took longer than GC-MS and was directly related to the development of proper interfaces. V. L. Tal'roze and collaborators started the development of LC–MS in the late 1960s, when they first used capillaries to connect LC columns to an EI source. A similar strategy was investigated by McLafferty and collaborators in 1973, who coupled the LC column to a CI source, which allowed a higher liquid flow into the source. This was the first and most obvious way of coupling LC with MS and was known as the capillary inlet interface. This pioneering interface for LC–MS had the same analysis capabilities as GC-MS and was limited to rather volatile analytes and non-polar compounds with low molecular mass (below 400 Da). In the capillary inlet interface, evaporation of the mobile phase inside the capillary was one of the main issues. Within the first years of development of LC–MS, on-line and off-line alternatives were proposed as coupling options. In general, off-line coupling involved fraction collection, evaporation of the solvent, and transfer of the analytes to the MS using probes. The off-line analyte treatment process was time-consuming and carried an inherent risk of sample contamination. It was rapidly realized that the analysis of complex mixtures would require the development of a fully automated on-line coupling solution for LC–MS.

The key to the success and widespread adoption of LC–MS as a routine analytical tool lies in the interface and ion source between the liquid-based LC and the vacuum-based MS. The following interfaces were stepping-stones on the way to the modern atmospheric-pressure ionization interfaces, and are described for historical interest.

Moving-belt interface

The moving-belt interface (MBI) was developed by McFadden et al. in 1977 and commercialized by Finnigan. This interface consisted of an endless moving belt onto which the LC column effluent was deposited in a band. On the belt, the solvent was evaporated by gently heating and efficiently exhausting the solvent vapours under reduced pressure in two vacuum chambers. After the liquid phase was removed, the belt passed over a heater which flash-desorbed the analytes into the MS ion source. One of the significant advantages of the MBI was its compatibility with a wide range of chromatographic conditions. MBI was successfully used for LC–MS applications between 1978 and 1990 because it allowed coupling of LC to MS devices using EI, CI, and fast-atom bombardment (FAB) ion sources. The most common MS systems connected by MBI interfaces to LC columns were magnetic sector and quadrupole instruments. MBI interfaces for LC–MS allowed MS to be widely applied in the analysis of drugs, pesticides, steroids, alkaloids, and polycyclic aromatic hydrocarbons. This interface is no longer used because of its mechanical complexity and the difficulties associated with belt renewal, as well as its inability to handle very labile biomolecules.


Direct liquid-introduction interface

The direct liquid-introduction (DLI) interface was developed in 1980. This interface was intended to solve the problem of evaporation of liquid inside the capillary inlet interface. In DLI, a small portion of the LC flow was forced through a small aperture or diaphragm (typically 10 μm in diameter) to form a liquid jet composed of small droplets that were subsequently dried in a desolvation chamber. The analytes were ionized using a solvent-assisted chemical ionization source, where the LC solvents acted as reagent gases. To use this interface, it was necessary to split the flow coming out of the LC column because only a small portion of the effluent (10 to 50 μl/min out of 1 ml/min) could be introduced into the source without raising the vacuum pressure of the MS system too high. Alternatively, Henion at Cornell University had success using micro-bore LC methods so that the entire (low) flow of the LC could be used. One of the main operational problems of the DLI interface was the frequent clogging of the diaphragm orifices. The DLI interface was used between 1982 and 1985 for the analysis of pesticides, corticosteroids, metabolites in horse urine, erythromycin, and vitamin B12. However, this interface was replaced by the thermospray interface, which removed the flow rate limitations and the issues with clogging diaphragms.

A related device was the particle beam interface (PBI), developed by Willoughby and Browner in 1984. Particle beam interfaces took over the wide applications of MBI for LC–MS in 1988. The PBI operated by using a helium gas nebulizer to spray the eluant into the vacuum, drying the droplets and pumping away the solvent vapour (using a jet separator) while the stream of monodisperse dried particles containing the analyte entered the source. Drying the droplets outside of the source volume, and using a jet separator to pump away the solvent vapour, allowed the particles to enter and be vapourized in a low-pressure EI source. As with the MBI, the ability to generate library-searchable EI spectra was a distinct advantage for many applications. Commercialized by Hewlett Packard, and later by VG and Extrel, it enjoyed moderate success, but has been largely supplanted by the atmospheric pressure interfaces such as electrospray and APCI which provide a broader range of compound coverage and applications.

Thermospray interface

The thermospray (TSP) interface was developed in 1980 by Marvin Vestal and co-workers at the University of Houston. It was commercialized by Vestec and several of the major mass spectrometer manufacturers. The interface resulted from a long-term research project intended to find an LC–MS interface capable of handling high flow rates (1 ml/min) and avoiding the flow split of DLI interfaces. The TSP interface was composed of a heated probe, a desolvation chamber, and an ion-focusing skimmer. The LC effluent passed through the heated probe and emerged as a jet of vapor and small droplets flowing into the desolvation chamber at low pressure. Initially operated with a filament or discharge as the source of ions (thereby acting as a CI source for vapourized analyte), it was soon discovered that ions were also observed when the filament or discharge was off. This could be attributed either to direct emission of ions from the liquid droplets as they evaporated, in a process related to electrospray ionization or ion evaporation, or to chemical ionization of vapourized analyte molecules by buffer ions (such as ammonium acetate). The fact that multiply charged ions were observed from some larger analytes suggests that direct analyte ion emission was occurring under at least some conditions. The interface was able to handle up to 2 ml/min of eluate from the LC column and would efficiently introduce it into the MS vacuum system. TSP was also more suitable for LC–MS applications involving reversed-phase liquid chromatography (RP-LC). With time, the mechanical complexity of TSP was simplified, and this interface became popular as the first ideal LC–MS interface for pharmaceutical applications comprising the analysis of drugs, metabolites, conjugates, nucleosides, peptides, natural products, and pesticides. The introduction of TSP marked a significant improvement for LC–MS systems and was the most widely applied interface until the beginning of the 1990s, when it began to be replaced by interfaces involving atmospheric pressure ionization (API).

FAB based interfaces

The frit fast atom bombardment (frit-FAB) and continuous flow-FAB (CF-FAB) interfaces were developed in 1985 and 1986, respectively. Both interfaces were similar, but they differed in that the first used a porous frit probe as the connecting channel, while CF-FAB used a probe tip. Of these, CF-FAB was more successful as an LC–MS interface and was useful for analyzing non-volatile and thermally labile compounds. For stable operation, the FAB-based interfaces were able to handle liquid flow rates of only 1–15 μl/min and were also restricted to microbore and capillary columns. In these interfaces, the LC effluent passed through the frit or CF-FAB channels to form a uniform liquid film at the tip. There, the liquid was bombarded with ion beams or high-energy atoms (fast atoms). In order to be used in FAB MS ionization sources, the analytes of interest had to be mixed with a matrix (e.g., glycerol) that could be added before or after the separation in the LC column. FAB-based interfaces were extensively used to characterize peptides, but lost applicability with the advent of electrospray-based interfaces in 1988.

Liquid chromatography

Diagram of an LC–MS system

Liquid chromatography is a method of physical separation in which the components of a liquid mixture are distributed between two immiscible phases, i.e., stationary and mobile. The practice of LC can be divided into five categories, i.e., adsorption chromatography, partition chromatography, ion-exchange chromatography, size-exclusion chromatography, and affinity chromatography. Among these, the most widely used variant is the reverse-phase (RP) mode of the partition chromatography technique, which makes use of a nonpolar (hydrophobic) stationary phase and a polar mobile phase. In common applications, the mobile phase is a mixture of water and other polar solvents (e.g., methanol, isopropanol, and acetonitrile), and the stationary matrix is prepared by attaching long-chain alkyl groups (e.g., n-octadecyl or C18) to the external and internal surfaces of irregularly or spherically shaped 5 μm diameter porous silica particles.

In HPLC, typically 20 μl of the sample of interest is injected into the mobile phase stream delivered by a high-pressure pump. The mobile phase containing the analytes permeates through the stationary phase bed in a definite direction. The components of the mixture are separated depending on their chemical affinity with the mobile and stationary phases. The separation occurs after repeated sorption and desorption steps as the liquid interacts with the stationary bed. The liquid solvent (mobile phase) is delivered under high pressure (up to 400 bar or 5,800 psi) into a packed column containing the stationary phase. The high pressure is necessary to achieve a constant flow rate for reproducible chromatography experiments. Depending on the partitioning between the mobile and stationary phases, the components of the sample will flow out of the column at different times. The column is the most important component of the LC system and is designed to withstand the high pressure of the liquid. Conventional LC columns are 100–300 mm long with an outer diameter of 6.4 mm (1/4 inch) and an internal diameter of 3.0–4.6 mm. For applications involving LC–MS, the chromatography columns can be shorter (30–50 mm) with 3–5 μm diameter packing particles. In addition to the conventional model, other LC columns are the narrow-bore, microbore, microcapillary, and nano-LC models. These columns have smaller internal diameters, allow for a more efficient separation, and handle liquid flows under 1 ml/min (the conventional flow rate). In order to improve separation efficiency and peak resolution, ultra-high-performance liquid chromatography (UHPLC) can be used instead of HPLC. This LC variant uses columns packed with smaller silica particles (~1.7 μm diameter) and requires higher operating pressures in the range of 310,000 to 775,000 torr (6,000 to 15,000 psi, 400 to 1,034 bar).

Mass spectrometry

[Figure: LC–MS spectrum of each resolved peak]

Mass spectrometry (MS) is an analytical technique that measures the mass-to-charge ratio (m/z) of charged particles (ions). Although there are many different kinds of mass spectrometers, all of them make use of electric or magnetic fields to manipulate the motion of ions produced from an analyte of interest and determine their m/z. The basic components of a mass spectrometer are the ion source, the mass analyzer, the detector, and the data and vacuum systems. The ion source is where the components of a sample introduced into an MS system are ionized by means of electron beams, photon beams (UV light), laser beams, or corona discharge. In the case of electrospray ionization, the ion source moves ions that exist in liquid solution into the gas phase. The ion source converts and fragments the neutral sample molecules into gas-phase ions that are sent to the mass analyzer. While the mass analyzer applies electric and magnetic fields to sort the ions by their masses, the detector measures and amplifies the ion current to calculate the abundance of each mass-resolved ion. In order to generate a mass spectrum that a human eye can easily recognize, the data system records, processes, stores, and displays data on a computer.

The mass spectrum can be used to determine the mass of the analytes, their elemental and isotopic composition, or to elucidate the chemical structure of the sample. MS is an experiment that must take place in the gas phase and under vacuum (1.33 × 10−2 to 1.33 × 10−6 Pa). Therefore, the development of devices facilitating the transition from samples at higher pressure and in condensed phase (solid or liquid) into a vacuum system has been essential to develop MS as a potent tool for identification and quantification of organic compounds like peptides. MS is now in very common use in analytical laboratories that study physical, chemical, or biological properties of a great variety of compounds. Among the many different kinds of mass analyzers, the ones that find application in LC–MS systems are the quadrupole, time-of-flight (TOF), ion trap, and hybrid quadrupole-TOF (QTOF) analyzers.

Interfaces

Interfacing a liquid-phase technique (HPLC), with a continuously flowing eluate, to a gas-phase technique carried out in a vacuum was difficult for a long time. The advent of electrospray ionization changed this. Currently, the most common LC–MS interfaces are electrospray ionization (ESI), atmospheric pressure chemical ionization (APCI), and atmospheric pressure photo-ionization (APPI). These are newer MS ion sources that facilitate the transition from a high-pressure environment (HPLC) to the high-vacuum conditions needed at the MS analyzer. Although these interfaces are described individually, they can also be commercially available as dual ESI/APCI, ESI/APPI, or APCI/APPI ion sources. Various deposition and drying techniques were used in the past (e.g., moving belts), but the most common of these was off-line MALDI deposition. A newer approach still under development, called the direct-EI LC–MS interface, couples a nano-HPLC system and an electron-ionization-equipped mass spectrometer.

Electrospray ionization (ESI)

The ESI interface for LC–MS systems was developed by Fenn and collaborators in 1988. This ion source/interface can be used for the analysis of moderately polar and even very polar molecules (e.g., metabolites, xenobiotics, peptides, nucleotides, polysaccharides). The liquid eluate coming out of the LC column is directed into a metal capillary kept at 3 to 5 kV and is nebulized by a high-velocity coaxial flow of gas at the tip of the capillary, creating a fine spray of charged droplets in front of the entrance to the vacuum chamber. To avoid contamination of the vacuum system by buffers and salts, this capillary is usually positioned perpendicular to the inlet of the MS system, in some cases with a counter-current of dry nitrogen in front of the entrance through which ions are directed by the electric field. In some sources, rapid droplet evaporation, and thus maximum ion emission, is achieved by mixing an additional stream of hot gas with the spray plume in front of the vacuum entrance. In other sources, the droplets are drawn through a heated capillary tube as they enter the vacuum, promoting droplet evaporation and ion emission. These methods of increasing droplet evaporation now allow liquid flow rates of 1–2 ml/min to be used while still achieving efficient ionization and high sensitivity. Thus, while the use of 1–3 mm microbore columns and lower flow rates of 50–200 μl/min was commonly considered necessary for optimum operation, this limitation is no longer as important, and the higher column capacity of larger-bore columns can now be advantageously employed with ESI LC–MS systems. Positively and negatively charged ions can be created by switching polarities, and it is possible to acquire alternating positive- and negative-mode spectra rapidly within the same LC run. While most large molecules (greater than MW 1500–2000) produce multiply charged ions in the ESI source, the majority of smaller molecules produce singly charged ions.
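
Because ESI commonly yields multiply charged ions, the neutral mass of a large analyte can be estimated from two adjacent charge-state peaks. Assuming simple protonation, two neighboring peaks m1 > m2 (with charges z and z + 1) satisfy z = (m2 − mH)/(m1 − m2) and M = z·(m1 − mH), where mH ≈ 1.00728 Da is the proton mass. The short sketch below applies this to an invented pair of peaks.

```python
# Estimating neutral mass from two adjacent ESI charge-state peaks,
# assuming simple protonation ([M + zH]z+). The peak values below are invented.
PROTON = 1.00728  # Da

def deconvolute(m1, m2):
    """m1 > m2 are adjacent m/z peaks carrying charges z and z + 1."""
    z = round((m2 - PROTON) / (m1 - m2))   # charge of the higher-m/z peak
    mass = z * (m1 - PROTON)               # neutral mass in Da
    return z, mass

z, mass = deconvolute(998.12, 942.73)      # hypothetical protein peaks
print(z, round(mass, 1))                   # -> 17, ~16950.9 Da
```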

Atmospheric pressure chemical ionization (APCI)

The development of the APCI interface for LC–MS started with Horning and collaborators in 1973. However, its commercial application was introduced at the beginning of the 1990s, after Henion and collaborators improved the LC–APCI–MS interface in 1986. The APCI ion source/interface can be used to analyze small, neutral, relatively non-polar, and thermally stable molecules (e.g., steroids, lipids, and fat-soluble vitamins). These compounds are not well ionized using ESI. In addition, APCI can also handle mobile phase streams containing buffering agents. The liquid from the LC system is pumped through a capillary and is likewise nebulized at the tip, where a corona discharge takes place. First, the ionizing gas surrounding the interface and the mobile phase solvent are subject to chemical ionization at the ion source. These ions then react with the analyte and transfer their charge. The sample ions then pass through small-orifice skimmers by means of ion-focusing lenses. Once inside the high-vacuum region, the ions are subject to mass analysis. This interface can be operated in positive and negative charge modes, and mainly singly charged ions are produced. The APCI ion source can also handle flow rates between 500 and 2000 μl/min and can be directly connected to conventional 4.6 mm ID columns.

Atmospheric pressure photoionization (APPI)

The APPI interface for LC–MS was developed simultaneously by Bruins and Syage in 2000. APPI is another LC–MS ion source/interface for the analysis of neutral compounds that cannot be ionized using ESI. This interface is similar to the APCI ion source, but instead of a corona discharge, ionization occurs by using photons coming from a discharge lamp. In the direct-APPI mode, singly charged analyte molecular ions are formed by absorption of a photon and ejection of an electron. In the dopant-APPI mode, an easily ionizable compound (the dopant) is added to the mobile phase or the nebulizing gas to promote a charge-exchange reaction between the dopant molecular ion and the analyte. The ionized sample is later transferred to the mass analyzer at high vacuum as it passes through small-orifice skimmers.

Applications

The coupling of MS with LC systems is attractive because liquid chromatography can separate delicate and complex natural mixtures whose chemical composition needs to be well established (e.g., biological fluids, environmental samples, and drugs). Further, LC–MS has applications in volatile explosive residue analysis. Nowadays, LC–MS has become one of the most widely used chemical analysis techniques because more than 85% of natural chemical compounds are polar and thermally labile and GC-MS cannot process these samples. As an example, HPLC–MS is regarded as the leading analytical technique for proteomics and pharmaceutical laboratories. Other important applications of LC–MS include the analysis of food, pesticides, and plant phenols.

Pharmacokinetics

LC–MS is widely used in the field of bioanalysis and is especially prominent in pharmacokinetic studies of pharmaceuticals. Pharmacokinetic studies are needed to determine how quickly a drug will be cleared from the body organs and the hepatic blood flow. MS analyzers are useful in these studies because of their shorter analysis time and higher sensitivity and specificity compared to the UV detectors commonly attached to HPLC systems. One major advantage is the use of tandem MS–MS, where the detector may be programmed to select certain ions to fragment. The measured quantity is the sum of molecule fragments chosen by the operator. As long as there are no interferences or ion suppression in LC–MS, the LC separation can be quite quick.

Proteomics/metabolomics

LC–MS is used in proteomics as a method to detect and identify the components of a complex mixture. The bottom-up proteomics LC–MS approach generally involves protease digestion and denaturation, using trypsin as a protease, urea to denature the tertiary structure, and iodoacetamide to modify the cysteine residues. After digestion, LC–MS is used for peptide mass fingerprinting, or LC–MS/MS (tandem MS) is used to derive the sequences of individual peptides. LC–MS/MS is most commonly used for proteomic analysis of complex samples where peptide masses may overlap even with high-resolution mass spectrometry. Complex biological samples (e.g., human serum) may be analyzed in modern LC–MS/MS systems, which can identify over 1000 proteins. However, this high level of protein identification is possible only after separating the sample by means of SDS-PAGE gel or HPLC-SCX. Recently, LC–MS/MS has been applied to search for peptide biomarkers. Examples are the recent discovery and validation of peptide biomarkers for four major bacterial respiratory tract pathogens (Staphylococcus aureus, Moraxella catarrhalis, Haemophilus influenzae, and Streptococcus pneumoniae) and for the SARS-CoV-2 virus.

LC–MS has emerged as one of the most commonly used techniques in global metabolite profiling of biological tissue (e.g., blood plasma, serum, urine). LC–MS is also used for the analysis of natural products and the profiling of secondary metabolites in plants. In this regard, MS-based systems are useful for acquiring more detailed information about the wide spectrum of compounds in a complex biological sample. LC–nuclear magnetic resonance (NMR) is also used in plant metabolomics, but this technique can only detect and quantify the most abundant metabolites. LC–MS has been useful in advancing the field of plant metabolomics, which aims to study the plant system at the molecular level, providing a non-biased characterization of the plant metabolome in response to its environment. The first application of LC–MS in plant metabolomics was the detection of a wide range of highly polar metabolites, oligosaccharides, amino acids, amino sugars, and sugar nucleotides from Cucurbita maxima phloem tissues. Another example of LC–MS in plant metabolomics is the efficient separation and identification of glucose, sucrose, raffinose, stachyose, and verbascose from leaf extracts of Arabidopsis thaliana.

Drug development

LC–MS is frequently used in drug development because it allows quick molecular weight confirmation and structure identification. These features speed up the process of generating, testing, and validating a discovery, starting from a vast array of products with potential application. LC–MS applications for drug development are highly automated methods used for peptide mapping, glycoprotein mapping, lipidomics, natural product dereplication, bioaffinity screening, in vivo drug screening, metabolic stability screening, metabolite identification, impurity identification, quantitative bioanalysis, and quality control.

Operator (computer programming)

From Wikipedia, the free encyclopedia https://en.wikipedia.org/wiki/Operator_(computer_programmin...