Guide for chemical information

Four basic steps for constructing an effective substructure

  • Identify the portion of the molecule(s) that is of chemical interest.
  • Determine whether the stereochemistry, geometry, and bond order of each connection in your structure are fixed or flexible.
    • Stereochemistry: if the configuration of a center matters, use the correct bond types to indicate the directions in which the atoms are pointing; if any configuration is acceptable, use plain single bonds rather than stereo bonds (in Reaxys, make sure that the stereo search is turned off for the best results).
    • Double-bond geometry: search engines often ignore it; if you draw a plain double bond, some search algorithms, such as SciFinder's, allow a Z configuration to be replaced by E geometry.
    • Bond order: a dashed bond is used to represent single, double, or triple bonds, but some systems, such as Reaxys, also offer bond orders that are single/double or double/triple; aromatic systems are recognized and searched for all variants.
  • Decide where and how your substructure may be further substituted.
  • Determine the topology of each atom and connection in the structure.
    • It is important to understand the topology defaults, because systems handle chains differently; most systems let you specify whether a chain that you draw must be chain only or may be ring/chain.
      SciFinder: the lock-out tool allows you to lock out ring formation.
      Reaxys: the topology tools are reached through a right-click or through the atom or bond attributes menu.
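The effect of fixing versus relaxing bond order can be sketched with a toy matcher. This is a minimal illustration under stated assumptions, not the algorithm used by SciFinder or Reaxys: molecules are hand-written lists of heavy atoms and bonds (hydrogens implicit), `is_substructure` is a hypothetical helper, and the label `"any"` stands in for the dashed any-order bond. Stereochemistry and topology constraints are deliberately left out.

```python
def bond_map(bonds):
    """Turn [(i, j, order), ...] into {frozenset({i, j}): order}."""
    return {frozenset((i, j)): order for i, j, order in bonds}


def is_substructure(q_atoms, q_bonds, m_atoms, m_bonds):
    """Return True if the query graph can be embedded in the molecule graph.

    Atoms are element symbols; bond orders are 1, 2, 3, or "any" (query only).
    """
    qb, mb = bond_map(q_bonds), bond_map(m_bonds)

    def compatible(qi, mi, mapping):
        # Elements must agree.
        if q_atoms[qi] != m_atoms[mi]:
            return False
        # Every query bond to an already-mapped atom must exist in the
        # molecule, with the same order unless the query says "any".
        for qj, mj in mapping.items():
            key = frozenset((qi, qj))
            if key in qb:
                order = mb.get(frozenset((mi, mj)))
                if order is None:
                    return False
                if qb[key] != "any" and qb[key] != order:
                    return False
        return True

    def extend(mapping):
        # Backtracking: map query atoms 0, 1, 2, ... one at a time.
        if len(mapping) == len(q_atoms):
            return True
        qi = len(mapping)
        for mi in range(len(m_atoms)):
            if mi not in mapping.values() and compatible(qi, mi, mapping):
                mapping[qi] = mi
                if extend(mapping):
                    return True
                del mapping[qi]
        return False

    return extend({})


# Two small molecules (heavy atoms only): ethanol C-C-O and acetaldehyde C-C=O.
ethanol = (["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
acetaldehyde = (["C", "C", "O"], [(0, 1, 1), (1, 2, 2)])

query_atoms = ["C", "O"]
fixed_double = [(0, 1, 2)]      # C=O drawn with a fixed bond order
flexible = [(0, 1, "any")]      # C~O drawn with an any-order bond

fixed_hits = [is_substructure(query_atoms, fixed_double, *m)
              for m in (ethanol, acetaldehyde)]
flexible_hits = [is_substructure(query_atoms, flexible, *m)
                 for m in (ethanol, acetaldehyde)]
```

With the bond drawn as a fixed double bond only acetaldehyde is retrieved; relaxing the bond order also retrieves ethanol, which is the kind of broadening to expect from every query feature left flexible.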

SciFinder's substructure algorithm is more permissive, so analyse your results by precision (the share of retrieved hits that are actually relevant).
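Precision can be computed once a sample of the answer set has been inspected by hand. The sketch below assumes the usual information-retrieval definition (relevant retrieved hits divided by all retrieved hits); the function name and the placeholder identifiers are hypothetical.

```python
def precision(hits, relevant):
    """Fraction of retrieved hits that are relevant (0.0 for an empty hit set)."""
    hits = set(hits)
    if not hits:
        return 0.0
    return len(hits & set(relevant)) / len(hits)


# Placeholder identifiers for four retrieved substances, two judged relevant.
sample_precision = precision(["A", "B", "C", "D"], ["A", "B"])
```

A low value on a permissive search suggests tightening the query (for example, locking out ring formation or fixing bond orders) before reviewing the full answer set.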

References: J.N. Currano and D.L. Roth. Chemical Information for Chemists: A Primer, RSC Publishing, Cambridge (UK), 2014, pp. 114–116.

Name | Content | Publisher
SciFinder
  • Content: indexes conference papers, reports, some arXiv preprints with a chemical background, CERN preprints, and ACS Meeting Abstracts
  • Publisher: CAS (since 1907, CAS has abstracted and indexed the chemical literature, substances, and concepts)
ACS (accessed December 29 2015)
  • Content: technical programming archive of past National Meetings since March 2004; browsable and searchable
  • Publisher: American Chemical Society (ACS), a not-for-profit organization
arXiv (accessed December 29 2015)
  • Content: electronic preprint archive covering the fields of mathematics, physics, computer science, quantitative biology, statistics, and quantitative finance
  • Publisher: maintained and operated by the Cornell University Library with guidance from the arXiv Scientific & Member Advisory Board and with the help of subject moderators
BASE, Bielefeld Academic Search Engine (accessed December 29 2015)
  • Content: harvests OAI metadata from institutional repositories and other academic digital libraries; also indexes selected web sites and local data collections, all of which can be searched via a single search interface
ECD, Energy Citations Database (accessed December 29 2015)
  • Content: free public access to over 2.5 million report citations from 1943 to the present in chemistry, materials, renewable energy, … (funded by DOE); results cover the report literature, conference papers, journal articles, books, dissertations, and patents; includes the Nuclear Science Abstracts (1948-1976) and the ERDA/Energy Research Abstracts (1976-1994) databases
  • Publisher: U.S. Department of Energy (DOE)
International Nuclear Information System (INIS) (accessed December 29 2015)
  • Content: focused on peaceful uses of nuclear science and technology (nuclear safety); keyword default search and advanced search ("include" and "exclude" options); successor to Atomindex, which first appeared in 1970
  • Publisher: International Atomic Energy Agency
NASA Technical Reports Server (NTRS) (accessed December 29 2015)
  • Content: indexing and abstracting information for over 500,000 aerospace citations and information on images and videos (incl. chemistry and materials science); NTRS integrates the NACA citations and reports (1916–1958) and the NASA citations and documents (1958–present) with the NIX (NASA Image Exchange)
National Technical Information Service (accessed December 29 2015)
  • Content: bibliographic database (2 million records) for nearly 40 subject areas (e.g. chemistry, combustion, energy, and materials science) produced by U.S. government agencies
  • Publisher: U.S. Department of Commerce
Science (accessed December 29 2015)
  • Content: free, publicly available deep web search engine that uses advanced "federated search technology" to return high quality results by submitting your search query; results are ranked and de-duplicated; offers text search and subject options

Evaluation of data quality

NIST (National Institute of Standards and Technology) has established several levels of data evaluation to investigate data reliability (NIST interactive Data Evaluation Assessment tool [3]).
The following questions, based on the NIST protocol, should always be asked:

  • Is the compound identified?
  • Is the measurement method described?
  • Was the data compared to certified reference values?
  • Is the data consistent with other properties of the compound?
  • Does the data agree with other independent measurements of the same property of the same compound?
  • Is the source peer-reviewed or a commercial manufacturer?
  • Is the source the original report or part of a secondary compilation?
  • Is the data considered preliminary or included as supplemental information?
  • If the source reports unexpected results, does the research follow up and indicate methods to determine an explanation?

Properties | Definition | Techniques | Searching strategy / Sources
Molar Mass and Dispersity
  • Definition:
    • Molar mass: mass of one mole of polymer molecules; the most common types are the number-average (Mn), weight-average (Mw), and z-average (Mz)
    • Dispersity: ratio of the weight-average molar mass to the number-average molar mass (Mw/Mn)
  • Techniques:
    • GPC: entire molar mass distribution, but requires comparison with a standard material
    • SLS: absolute value for Mw
    • DLS: diffusion coefficient and hydrodynamic radius
    • Mass spectrometry: structure and molar mass
    • MALDI-TOF: for high molecular weight compounds; gives a distribution of travel times, with heavier species taking longer to travel to the detector
  • Searching strategy / Sources:
    • Different measurement techniques may yield different values, so it is important to have a general idea of the molar mass range and to be aware of the measurement technique
    • Molar mass ranges and measurement techniques should be combined with Boolean operators when constructing the search string
    • Sources: listing of scattering factors in Section VII of the Polymer Handbook [1] and the Polymer Data Handbook [2] (compiles molar mass by name within polymer classes and types)
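The three averages and the dispersity follow directly from a discrete distribution of chain counts N_i and molar masses M_i: Mn = ΣN_i·M_i / ΣN_i, Mw = ΣN_i·M_i² / ΣN_i·M_i, Mz = ΣN_i·M_i³ / ΣN_i·M_i², and dispersity = Mw/Mn. The sketch below applies these standard definitions; the function name and the two-component example distribution are made up for illustration.

```python
def molar_mass_averages(counts, masses):
    """Return Mn, Mw, Mz and dispersity for a discrete chain distribution.

    counts[i] is the number (or moles) of chains with molar mass masses[i].
    """
    if len(counts) != len(masses) or not counts:
        raise ValueError("counts and masses must be non-empty and equally long")
    s0 = sum(counts)                                         # sum of N_i
    s1 = sum(n * m for n, m in zip(counts, masses))          # sum of N_i * M_i
    s2 = sum(n * m ** 2 for n, m in zip(counts, masses))     # sum of N_i * M_i^2
    s3 = sum(n * m ** 3 for n, m in zip(counts, masses))     # sum of N_i * M_i^3
    mn, mw, mz = s1 / s0, s2 / s1, s3 / s2
    return {"Mn": mn, "Mw": mw, "Mz": mz, "dispersity": mw / mn}


# Illustrative blend: equal numbers of 10,000 and 30,000 g/mol chains.
blend = molar_mass_averages([1, 1], [10_000, 30_000])
# Mn = 20,000, Mw = 25,000, Mz = 28,000 g/mol, dispersity = 1.25
```

Because Mn <= Mw <= Mz for any distribution, a reported value is only comparable with another once it is clear which average the technique delivers (for example, Mw from SLS).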
Spectral Analysis
  • Techniques:
    • IR: absorption of infrared radiation by chemical bonds; used to identify substances by measuring characteristic frequencies associated with specific molecular motions; different types include FT-IR, and ATR-IR for surface characterization
    • Raman spectroscopy: frequency change of inelastically scattered light; used to measure specific chemical bond energy properties and to detect morphological changes; can be employed in different environments
    • NMR: measures the absorption of radio-frequency radiation relative to a reference compound
X-ray Diffraction and Scattering Analysis
  • Definition: used to elucidate crystalline and semicrystalline phases in polymers
  • Techniques:
    • X-ray diffraction: scattering pattern, from which the size of the unit cell can be calculated
    • WAXS: measures the spacing between individual chains in ordered regions, used to calculate the degree of crystallinity and the density
    • SAXS: information about larger molecular structure, crystal thickness, and periodicity
  • Searching strategy / Sources:
    • Sources: crystallographic data in the International Tables for Crystallography (ITC), WebCSD (further explanation in physical properties and spectra), and Section V of the Polymer Handbook [1] and the Polymer Data Handbook [2]
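Spacings are usually obtained from a diffraction pattern via Bragg's law, n·λ = 2·d·sin θ, which the notes above do not state explicitly. The sketch below converts a peak position in 2θ into a d-spacing; the function name is hypothetical, and the default wavelength (Cu Kα1, 1.5406 Å) is an assumption that must be replaced by the wavelength actually reported with the data.

```python
import math

CU_K_ALPHA1_ANGSTROM = 1.5406  # assumed default; check the reported source


def bragg_d_spacing(two_theta_deg, wavelength=CU_K_ALPHA1_ANGSTROM, order=1):
    """d-spacing (same unit as wavelength) from a peak position in degrees 2-theta."""
    if not 0.0 < two_theta_deg < 180.0:
        raise ValueError("2-theta must lie between 0 and 180 degrees")
    theta = math.radians(two_theta_deg / 2.0)   # Bragg angle is half of 2-theta
    return order * wavelength / (2.0 * math.sin(theta))


# At 2-theta = 60 degrees, sin(theta) = 0.5, so d equals the wavelength.
d_at_60 = bragg_d_spacing(60.0)
```

Since d depends on the wavelength, peak positions from different sources are only comparable after conversion to d-spacings, which is one more reason to record the measurement conditions when searching.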
Glass Transition (Tg) and Melting (Tm) Temperatures
  • Definition:
    • The types of thermophysical properties available for any polymer depend on its structure (polymers can have the same chemical structure but different physical structures, depending on their formation and processing conditions)
    • Tg: for highly regular structures there is no variation with measurement method; amorphous polymers display only a glass transition temperature; semi-crystalline or crystalline polymers may exhibit both Tg and Tm
  • Techniques:
    • XRD: to determine the crystallinity of polymers
    • DSC: to measure crystallinity as well as glass and melting transitions and heat capacity; the main experimental parameter is the rate of change in the heating element, which should be noted
  • Searching strategy / Sources:
    • Sources: Section V of the Polymer Handbook [1], the Polymer Data Handbook [2], and Section 13 of the CRC Handbook of Chemistry and Physics [4]
    • Attention should be paid to the method by which the value was determined
Decomposition Temperature
  • Techniques:
    • TGA: measures mass loss as a function of temperature; experiments can be carried out at different heating rates in various gaseous environments
  • Searching strategy / Sources:
    • Heating rates and gaseous environment should be included when searching
    • Sources: Section II of the Polymer Handbook [1]
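The advice to include measurement techniques and conditions in the search string can be sketched as a small query builder. It assumes a generic Boolean syntax (AND, OR, quoted phrases); operators, truncation, and proximity syntax differ between databases, and the helper names and example terms are illustrative only.

```python
def quote(term):
    """Wrap multi-word terms in double quotes so they are searched as phrases."""
    return f'"{term}"' if " " in term else term


def any_of(terms):
    """Join synonyms with OR, parenthesised when there is more than one."""
    if not terms:
        raise ValueError("each concept needs at least one term")
    quoted = [quote(t) for t in terms]
    return quoted[0] if len(quoted) == 1 else "(" + " OR ".join(quoted) + ")"


def build_query(*concepts):
    """AND together concept groups, each given as a list of synonyms."""
    return " AND ".join(any_of(c) for c in concepts)


# One group per concept: property, technique, atmosphere.
tga_query = build_query(
    ["decomposition temperature", "thermal degradation"],
    ["TGA", "thermogravimetric analysis"],
    ["nitrogen", "air"],
)
```

Here `tga_query` evaluates to `("decomposition temperature" OR "thermal degradation") AND (TGA OR "thermogravimetric analysis") AND (nitrogen OR air)`; a further group for the heating rate or a molar mass range can be added in the same way.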
Polymer Solubility and Miscibility
  • Definition: has implications for many experimental applications, including synthesis and characterization
  • Techniques: solubility is directly related to polymer solution phase behavior in terms of upper critical solution temperatures (UCST) and lower critical solution temperatures (LCST)
  • Searching strategy / Sources:
    • Sources: list of common solvents in Section III of the Polymer Handbook [1] and in Section 13 of the CRC Handbook of Chemistry and Physics [4]
    • UCST & LCST: CRC Handbook of Chemistry and Physics [4], pp. 13-26 to 13-43
Viscoelastic Behavior
  • Definition: the mechanical properties of polymers are related through the time-temperature superposition (TTS) principle: properties measured at a high rate are equivalent to properties measured at a low temperature, and vice versa (at low temperature there is less thermal energy to effect molecular rearrangement, and at high rates there is less time)
  • Techniques:
    • Static testing: relevant mechanical properties include tensile modulus, compressive modulus, shear modulus, ultimate tensile strength, and Poisson's ratio; the temperature of measurement is vital, as properties can change drastically
    • Dynamic testing: mechanical properties such as storage modulus, loss modulus, tan delta, and dynamic viscosity; specific to temperature, frequency of measurement, and intrinsic polymer properties; can also vary with the instrumentation used; experimental conditions (temperature, shear rate, and frequency or frequency spectrum) should be noted
  • Searching strategy / Sources:
    • When searching, keep the TTS principle in mind and be aware of the conditions under which reported values were obtained (temperature and rate of testing)
    • Sources: mechanical properties of polymers in Section V of the Polymer Handbook [1], in the Polymer Data Handbook [2] (more easily navigable by polymer name than by property value), and in Section 13 of the CRC Handbook of Chemistry and Physics [4]
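One common way to quantify the TTS principle is the empirical Williams-Landel-Ferry (WLF) equation, log10(aT) = -C1·(T - Tref) / (C2 + T - Tref), which the notes above do not name. The sketch below uses it purely as an illustration; the default constants are the often-quoted "universal" values for Tref = Tg (C1 = 17.44, C2 = 51.6 K), which are an assumption here and only a rough approximation for any specific polymer.

```python
def wlf_log_shift(temp, ref_temp, c1=17.44, c2=51.6):
    """log10 of the time-temperature shift factor aT from the WLF equation.

    temp and ref_temp must be in the same unit (K or degrees C); c2 is in kelvin.
    """
    delta = temp - ref_temp
    if c2 + delta <= 0:
        raise ValueError("temperature too far below the reference for WLF")
    return -c1 * delta / (c2 + delta)


# 51.6 K above the reference the denominator doubles: log10(aT) = -17.44 / 2.
log_shift = wlf_log_shift(temp=151.6, ref_temp=100.0)
```

A negative log10(aT) means that relaxation at the higher temperature is faster, so a modulus measured there corresponds to one measured at the reference temperature at a lower rate. This is why the temperature and the rate of testing both belong in the search.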

Name | Content | Search process | Publisher
CASREACT
  • Content: over 58 million graphical reactions, with coverage beginning in 1840 and most comprehensive from 1985 to the present; single or multi-step reactions of organics and organometallics, synthetic preparations of natural products, and biotransformations
  • Search process:
    • Search: either by clicking the "Get Reactions" prompt from the substance or reference context, or by searching the reaction context directly (exact reaction search); partial or complete graphical reaction search; simple or complex reaction structure or substructure query
    • Records: CAS registry number and structures for the reactants, reagents, catalysts, solvents, and products of the reaction
    • Refine: by yield, number of steps, reaction classification, solvent, availability of an experimental procedure, and bibliographic information (publication year and document type)
  • Publisher: available through the reaction context of SciFinder
Reaxys
  • Content: derives from Beilstein's Handbuch der Organischen Chemie, Gmelin's Handbuch der Anorganischen Chemie, and the Patent Chemistry database; over 32 million graphical organic, organometallic, and inorganic reactions
  • Search process:
    • Search: either draw only the graphical reaction scheme with no refinement (refining afterwards is possible), or specify conditions such as reagents, catalysts, solvents, temperature, pressure, time, pH value, reaction classification, and number of steps and stages at the time of the search (for a comprehensive search, leave out the conditions)
    • Text field "Subject Studied": indicates the focus of the paper with respect to the reaction (incl. values such as product distribution, kinetics, and mechanistic or kinetic studies); helpful when looking for mechanistic or kinetic studies, but avoid it for a comprehensive reaction search
    • Records: structure and identifier for reactants and products, with reagents, catalysts, and solvents included
SPRESIweb (Speicherung und Recherche Struktur-chemischer Informationen)
  • Content: 11.8 million substances and 4.2 million reactions from about 1300 different literature sources
  • Search process:
    • Reaction search interface: use structure and substructure techniques to perform either partial or complete reaction searches
    • Reactions can be classified according to the atoms in the immediate neighbourhood of the reaction center, in broad (within the reaction center), medium (one atom from the reaction center), and narrow (two atoms from the reaction center) categories, to get a sense of the types of transformations
    • It is also possible to locate transformations similar to a reaction of interest and to find name reactions
  • Publisher: InfoChem (German company)
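The broad, medium, and narrow categories differ only in how many bonds away from the reaction center atoms are still taken into account. The sketch below illustrates that idea with a breadth-first search over an adjacency list; it is not InfoChem's classification code, and the function name, the `LEVELS` mapping, and the six-atom chain are hypothetical.

```python
from collections import deque

# Number of bonds from the reaction center considered at each level.
LEVELS = {"broad": 0, "medium": 1, "narrow": 2}


def atoms_within(adjacency, center, radius):
    """Atoms at most `radius` bonds away from any reaction-center atom."""
    distance = {atom: 0 for atom in center}
    queue = deque(center)
    while queue:
        atom = queue.popleft()
        if distance[atom] == radius:
            continue  # do not expand beyond the requested radius
        for neighbour in adjacency[atom]:
            if neighbour not in distance:
                distance[neighbour] = distance[atom] + 1
                queue.append(neighbour)
    return set(distance)


# Linear chain of six atoms, 0-1-2-3-4-5, reacting at atom 2.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
environment = {name: atoms_within(chain, [2], r) for name, r in LEVELS.items()}
```

The narrow level compares the largest environment, so two reactions fall into the same narrow class only if they agree up to two atoms away from the center, while the broad level groups everything that shares the same center.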
The Encyclopedia of Reagents for Organic Synthesis
  • Content: about 4000 detailed articles on a variety of reagents and catalysts used in the synthesis of organic molecules (around 70000 examples of reactions); information about the reactivity of each reagent
  • Search process:
    • Search: articles are arranged alphabetically and can be browsed by name; e-EROS allows searching by keywords and structure or substructure searches for substances and reactions; a text field beneath the structure editor lets you enhance the search with some reaction conditions
    • Records: structure, CAS registry number, formula, molecular weight, InChIKey, and a brief description of the function of the reagent or catalyst; physical data and solubility; the form in which it is available, or preparatory information or references; purification information. The rest of the record details the types of reactions that the reagent facilitates and includes graphical reactions, tables, and descriptive text
  • Publisher: Wiley; print and electronic formats; the published information has also been reproduced in a series called the Handbooks of Reagents for Organic Synthesis