Cheminformatics
From Wikipedia, the free encyclopedia
History
The term chemoinformatics was defined by F.K. Brown[1][2] in 1998:
"Chemoinformatics is the mixing of those information resources to transform data into information and information into knowledge for the intended purpose of making better decisions faster in the area of drug lead identification and optimization."
Since then, both spellings have been used: Cheminformatics has become established in some quarters,[3] while European academia settled on Chemoinformatics in 2006.[4] The recent establishment of the Journal of Cheminformatics is a strong push towards the shorter variant.
Basics
Cheminformatics combines the scientific working fields of chemistry, computer science and information science, for example in the areas of topology, chemical graph theory, information retrieval and data mining in chemical space.[5][6][7][8] Cheminformatics can also be applied to data analysis in various industries, such as paper and pulp, dyes and allied industries.
Applications
Storage and retrieval
The primary application of cheminformatics is the storage, indexing and search of information relating to compounds. Searching such stored information efficiently draws on topics studied in computer science, such as data mining, information retrieval, information extraction and machine learning. Related research topics include:
File formats
The in silico representation of chemical structures uses specialized formats such as the XML-based Chemical Markup Language or SMILES. These representations are often used for storage in large chemical databases. While some formats are suited to visual representation in two or three dimensions, others are better suited to studying physical interactions, modeling and docking studies.
Virtual libraries
Chemical data can pertain to real or virtual molecules. Virtual libraries of compounds may be generated in various ways to explore chemical space and hypothesize novel compounds with desired properties. Virtual libraries of classes of compounds (drugs, natural products, diversity-oriented synthetic products) were recently generated using the FOG (fragment optimized growth) algorithm.[9] This was done by using cheminformatic tools to train transition probabilities of a Markov chain on authentic classes of compounds, and then using the Markov chain to generate novel compounds that were similar to the training database.
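The general idea of training a Markov chain on known compounds and then sampling from it can be illustrated with a minimal Python sketch. This is not the FOG algorithm itself. It is a toy first-order chain over fragment labels, and it assumes each molecule has already been split into an ordered list of fragments. The function names (train_transitions, generate), the START and END markers and the three-molecule training set are all hypothetical, and no chemical validity check is performed on the output.

```python
import random
from collections import defaultdict

# Hypothetical markers for the beginning and end of a fragment sequence.
START, END = "^", "$"

def train_transitions(sequences):
    """Estimate first-order transition probabilities from fragment sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        path = [START, *seq, END]
        # Count every observed (current fragment -> next fragment) pair.
        for current, following in zip(path, path[1:]):
            counts[current][following] += 1
    probs = {}
    for state, nxt in counts.items():
        total = sum(nxt.values())
        probs[state] = {frag: n / total for frag, n in nxt.items()}
    return probs

def generate(probs, rng, max_len=10):
    """Sample one fragment sequence by walking the chain from START."""
    state, out = START, []
    # Stop at END, or after max_len fragments to bound cyclic walks.
    while len(out) < max_len:
        choices = probs[state]
        state = rng.choices(list(choices), weights=list(choices.values()))[0]
        if state == END:
            break
        out.append(state)
    return out

# Illustrative training set (an assumption): each molecule is a list of
# SMILES-like fragment labels in the order they are attached.
training = [
    ["c1ccccc1", "C(=O)", "N", "C"],
    ["c1ccccc1", "C(=O)", "O", "C"],
    ["c1ccccc1", "N", "C(=O)", "C"],
]

probs = train_transitions(training)
rng = random.Random(0)  # fixed seed so repeated runs give the same library
library = [generate(probs, rng) for _ in range(5)]
```

Each sampled sequence uses only fragment-to-fragment steps seen in the training set, so the generated library stays close to the training compounds while still allowing combinations that do not occur in it. A practical generator would additionally need to verify that each candidate corresponds to a chemically valid structure.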