A Medley of Potpourri

Nanoinformatics is the application of informatics to nanotechnology. It is an interdisciplinary field that develops methods and software tools for understanding nanomaterials, their properties, and their interactions with biological entities, and using that information more efficiently. It differs from cheminformatics in that nanomaterials usually involve nonuniform collections of particles that have distributions of physical properties that must be specified. The nanoinformatics infrastructure includes ontologies for nanomaterials, file formats, and data repositories.

Nanoinformatics has applications for improving workflows in fundamental research, manufacturing, and environmental health, allowing the use of high-throughput data-driven methods to analyze broad sets of experimental results. Nanomedicine applications include analysis of nanoparticle-based pharmaceuticals for structure–activity relationships in a similar manner to bioinformatics.

Background

While conventional chemicals are specified by their chemical composition and concentration, nanoparticles have other physical properties that must be measured for a complete description, such as size, shape, surface properties, crystallinity, and dispersion state. In addition, preparations of nanoparticles are often non-uniform, having distributions of these properties that must also be specified. These molecular-scale properties influence the macroscopic chemical and physical properties of nanoparticles, as well as their biological effects. They are important in both the experimental characterization of nanoparticles and their representation in an informatics system. The context of nanoinformatics is that effective development and implementation of potential applications of nanotechnology requires the harnessing of information at the intersection of safety, health, well-being, and productivity; risk management; and emerging nanotechnology.
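As a minimal sketch of what this means for an informatics system, the record below stores a size distribution for a preparation instead of a single size value. The class name, field names, and numeric values are hypothetical and chosen only for illustration; they do not follow any particular nanoinformatics schema.

```python
from dataclasses import dataclass, field
from statistics import mean, stdev
from typing import Dict, List


@dataclass
class NanomaterialSample:
    """Hypothetical record for one nanoparticle preparation."""
    composition: str          # chemical composition of the particles
    shape: str                # e.g. "sphere" or "rod"
    surface_coating: str      # free-text description of surface properties
    diameters_nm: List[float] = field(default_factory=list)  # per-particle sizes

    def size_summary(self) -> Dict[str, float]:
        """Summarize the size distribution rather than reporting one value."""
        return {
            "mean_nm": mean(self.diameters_nm),
            "stdev_nm": stdev(self.diameters_nm),
            "min_nm": min(self.diameters_nm),
            "max_nm": max(self.diameters_nm),
        }


# Illustrative values only, not measured data.
sample = NanomaterialSample(
    composition="TiO2",
    shape="sphere",
    surface_coating="uncoated",
    diameters_nm=[18.0, 20.0, 22.0, 20.0, 20.0],
)
summary = sample.size_summary()
print(summary)
```

Keeping the raw per-particle values lets other summaries (for example percentiles) be derived later without re-measuring; a real repository would typically also record the measurement method and dispersion medium.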

One working definition of nanoinformatics, developed through the community-based Nanoinformatics 2020 Roadmap and subsequently expanded, is:

Determining which information is relevant to meeting the safety, health, well-being, and productivity objectives of the nanoscale science, engineering, and technology community;
Developing and implementing effective mechanisms for collecting, validating, storing, sharing, analyzing, modeling, and applying the information;
Confirming that appropriate decisions were made and that desired mission outcomes were achieved as a result of that information; and finally
Conveying experience to the broader community, contributing to generalized knowledge, and updating standards and training.

Data representations

Although nanotechnology is the subject of significant experimentation, much of the data is neither stored in standardized formats nor broadly accessible. Nanoinformatics initiatives seek to coordinate the development of data standards and informatics methods.

Ontologies

In the context of information science, an ontology is a formal representation of knowledge within a domain, using hierarchies of terms including their definitions, attributes, and relations. Ontologies provide a common terminology in a machine-readable framework that facilitates sharing and discovery of data. Having an established ontology for nanoparticles is important for cancer nanomedicine because researchers need to search, access, and analyze large amounts of data.

The NanoParticle Ontology is an ontology for the preparation, chemical composition, and characterization of nanomaterials involved in cancer research. It uses the Basic Formal Ontology framework and is implemented in the Web Ontology Language. It is hosted by the National Center for Biomedical Ontology and maintained on GitHub. The eNanoMapper Ontology is more recent and reuses existing domain ontologies wherever possible. As such, it reuses and extends the NanoParticle Ontology, as well as the BioAssay Ontology, Experimental Factor Ontology, Unit Ontology, and ChEBI.
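The way a term hierarchy supports data discovery can be sketched in a few lines. The example below is a toy illustration only: the term names, the single-parent "is_a" table, and the record identifiers are hypothetical, and it does not use the actual NanoParticle Ontology or eNanoMapper Ontology, which are expressed in the Web Ontology Language and are far richer.

```python
# Hypothetical mini-hierarchy: each term maps to its parent ("is_a") term.
IS_A = {
    "nanomaterial": None,
    "nanoparticle": "nanomaterial",
    "metal nanoparticle": "nanoparticle",
    "gold nanoparticle": "metal nanoparticle",
    "metal oxide nanoparticle": "nanoparticle",
}


def ancestors(term):
    """Return all broader terms of `term`, nearest first."""
    chain = []
    parent = IS_A[term]
    while parent is not None:
        chain.append(parent)
        parent = IS_A[parent]
    return chain


def is_a(term, candidate):
    """True if `term` equals `candidate` or is a narrower term of it."""
    return term == candidate or candidate in ancestors(term)


def search(records, query_term):
    """Return ids of records annotated with `query_term` or any narrower term."""
    return [rec_id for rec_id, term in records if is_a(term, query_term)]


# Hypothetical dataset annotations: (record id, ontology term).
records = [
    ("study-1", "gold nanoparticle"),
    ("study-2", "metal oxide nanoparticle"),
    ("study-3", "nanomaterial"),
]

hits = search(records, "nanoparticle")
print(hits)
```

Because the records are annotated with shared terms, a query for a broad term also finds data labelled with narrower terms, which plain keyword matching would miss.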

File formats

ISA-TAB-Nano is a set of four spreadsheet-based file formats for representing and sharing nanomaterial data based on the ISA-TAB metadata standard. In Europe, other templates have been adopted that were developed by the Institute of Occupational Medicine, and by the Joint Research Centre for the NANoREG project.

Tools

Nanoinformatics is not limited to aggregating and sharing information about nanotechnologies, but has many complementary tools, some originating from chemoinformatics and bioinformatics.

Databases and repositories

Over the past decade, various publicly available nanomaterials databases and repositories have been constructed to support nanoinformatics and toxicology modelling. These databases often store standardised physicochemical, biological, and toxicological data on engineered nanomaterials and offer model-ready datasets to the scientific community to enable data reuse. Because data availability limits many nanoinformatics tasks, the curation of large datasets and their storage in accessible repositories is prioritised to support computational modelling, regulatory assessment, and data-driven research.

caNanoLab, developed by the U.S. National Cancer Institute, focuses on nanotechnologies related to biomedicine. The NanoMaterials Registry, maintained by RTI International, is a curated database of nanomaterials, and includes data from caNanoLab.

The eNanoMapper database, a project of the EU NanoSafety Cluster, is a deployment of the database software developed in the eNanoMapper project. It has since been used in other settings, such as the EU Observatory for NanoMaterials (EUON).

Other databases include the Center for the Environmental Implications of NanoTechnology's NanoInformatics Knowledge Commons (NIKC) and NanoDatabank, PEROSH's Nano Exposure & Contextual Information Database (NECID), Data and Knowledge on Nanomaterials (DaNa), NanoPharos and Springer Nature's Nano database.

Applications

Nanoinformatics has applications for improving workflows in fundamental research, manufacturing, and environmental health, allowing the use of high-throughput data-driven methods to analyze broad sets of experimental results.

Nanoinformatics is especially useful in nanoparticle-based cancer diagnostics and therapeutics. These nanoparticles are very diverse in nature because of the combinatorially large number of chemical and physical modifications that can be made to them, which can cause drastic changes in their functional properties. This leads to a combinatorial complexity that far exceeds that of, for example, genomic data. Nanoinformatics can enable structure–activity relationship modelling for nanoparticle-based drugs. Nanoinformatics and biomolecular nanomodeling provide a route for effective cancer treatment. Nanoinformatics also enables a data-driven approach to the design of materials to meet health and environmental needs.

Modeling and NanoQSAR

Viewed as a workflow process, nanoinformatics deconstructs experimental studies using data, metadata, controlled vocabularies and ontologies to populate databases so that trends, regularities and theories will be uncovered for use as predictive computational tools. Models are involved at each stage, some material (experiments, reference materials, model organisms) and some abstract (ontology, mathematical formulae), and all intended as a representation of the target system. Models can be used in experimental design, may substitute for experiment or may simulate how a complex system changes over time.

At present, nanoinformatics is an extension of bioinformatics due to the great opportunities for nanotechnology in medical applications, as well as to the importance of regulatory approvals to product commercialization. In these cases, the models' targets, that is, their purposes, may be physico-chemical, estimating a property based on structure (quantitative structure–property relationship, QSPR); or biological, predicting biological activity based on molecular structure (quantitative structure–activity relationship, QSAR) or the time-course development of a simulation (physiologically based toxicokinetics, PBTK). Each of these has been explored for small-molecule drug development, with a supporting body of literature.
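The basic idea behind a QSPR or QSAR model, relating a structural descriptor to a measured property or activity, can be illustrated with a one-descriptor least-squares fit. This is a minimal sketch under strong assumptions: the descriptor and response values are made-up numbers with no physical meaning, the function names are hypothetical, and published models generally use curated descriptors, more data, and validation steps that are omitted here.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict the response for a new descriptor value."""
    return slope * x + intercept


# Made-up training data: one descriptor value and one response per material.
descriptor = [1.0, 2.0, 3.0, 4.0]
response = [2.1, 3.9, 6.2, 7.8]

slope, intercept = fit_line(descriptor, response)
estimate = predict(slope, intercept, 5.0)
print(slope, intercept, estimate)
```

Once fitted, such a model is used to estimate the response for a material that has not been tested, which is only reasonable when that material resembles the ones used for training.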

Particles differ from molecular entities, especially in having surfaces that challenge nomenclature systems and QSAR/PBTK model development. For example, particles do not exhibit an octanol–water partition coefficient, which acts as a motive force in QSAR/PBTK models; and they may dissolve in vivo or have band gaps. Illustrative of current QSAR and PBTK models are those of Puzyn et al. and Bachler et al. The OECD has codified regulatory acceptance criteria, and there are guidance roadmaps with supporting workshops to coordinate international efforts.

Communities

Communities active in nanoinformatics include the European Union NanoSafety Cluster, the U.S. National Cancer Institute National Cancer Informatics Program's Nanotechnology Working Group, and the US–EU Nanotechnology Communities of Research.

Individuals who engage in nanoinformatics can be viewed as fitting across four categories of roles and responsibilities for nanoinformatics methods and data:

Customers, who need either the methods to create the data, the data itself, or both, and who specify the scientific applications and characterization methods and data needs for their intended purposes;
Creators, who develop relevant and reliable methods and data to meet the needs of customers in the nanotechnology community;
Curators, who maintain and ensure the quality of the methods and associated data; and
Analysts, who develop and apply methods and models for data analysis and interpretation that are consistent with the quality and quantity of the data and that meet customers' needs.

In some instances, the same individuals perform all four roles. More often, many individuals must interact, with their roles and responsibilities extending over significant distances, organizations, and time. Effective communication is important across each of the twelve links (in both directions across each of the six pairwise interactions) that exist among the various customers, creators, curators, and analysts.
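The count of six pairwise interactions and twelve directed links follows from simple combinatorics on the four roles, as this short sketch shows (the variable names are illustrative):

```python
from itertools import combinations, permutations

ROLES = ["customer", "creator", "curator", "analyst"]

# Unordered pairs of distinct roles: the pairwise interactions.
pairs = list(combinations(ROLES, 2))

# Ordered pairs of distinct roles: one communication link per direction.
links = list(permutations(ROLES, 2))

print(len(pairs), len(links))  # 6 12
```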

History

One of the first mentions of nanoinformatics was in the context of handling information about nanotechnology.

An early international workshop with substantial discussion of the need for sharing all types of information on nanotechnology and nanomaterials was the First International Symposium on Occupational Health Implications of Nanomaterials held 12–14 October 2004 at the Palace Hotel, Buxton, Derbyshire, UK. The workshop report included a presentation on Information Management for Nanotechnology Safety and Health that described the development of a Nanoparticle Information Library (NIL) and noted that efforts to ensure the health and safety of nanotechnology workers and members of the public could be substantially enhanced by a coordinated approach to information management. The NIL subsequently served as an example for web-based sharing of characterization data for nanomaterials.

In 2009, the National Cancer Institute prepared a rough vision of what was then still called nanotechnology informatics, outlining various aspects of what nanoinformatics should comprise. This was later followed by two roadmaps detailing existing solutions, needs, and ideas on how the field should develop further: the Nanoinformatics 2020 Roadmap and the EU US Roadmap Nanoinformatics 2030.

A 2013 workshop on nanoinformatics described current resources, community needs and the proposal of a collaborative framework for data sharing and information integration.