FAQ
1. What kind of information can I find in SuperToxic?
2. What are the data sources for SuperToxic?
3. What is toxicity?
4. What is LC50?
5. What if my structure was not found in the database?
6. How can I search for a toxin?
7. What is the difference between “similarity search” and “substructure search”?
8. What is structural information?
9. What's KEGG?
10. What are toxicity classes?
11. How can I add a new compound to the database?
1. What kind of information can I find in SuperToxic?
SuperToxic is a database designed primarily for pharmacists, biochemists and medical scientists, but also for researchers working in related disciplines. It provides access to information about toxic compounds (names, synonyms and structures).
SuperToxic predicts the toxicity of compounds, provides information on potential targets in biochemical pathways and shows potential binding partners. The database also includes links to suppliers from which a compound can be obtained for further investigation.
2. What are the data sources for SuperToxic?
Data for SuperToxic were collected and merged from several sources providing toxicity information: DSSTox, PubChem, NCI60 and manually curated data extracted from the literature.
3. What is toxicity?
Toxicity is the ability of a chemical or physical agent to induce detrimental temporary or permanent tissue change or to detrimentally interfere with normal biochemical processing. A central concept of toxicity is dose dependence: even water can cause water intoxication when taken in excess, whereas below a certain dose threshold even the most toxic substance is nonhazardous.
In general, toxicity can be divided into biological (viruses, bacteria), chemical (medications, organic or inorganic substances) and physical (radiation, vibration) effects. The degree of toxicity can be determined from effects on targets. One such measure is LC50, which is the one mainly used in this database. Toxicity predictions can be made by comparison with known similar toxic compounds or with exposures of similar organisms.
4. What is LC50?
LC50 is the median lethal concentration of a toxic substance or radiation, that is, the concentration required to kill 50% of the members of a test population. It allows a comparison of the relative toxicity of different substances. LC50 is usually expressed as a concentration in the surrounding medium (e.g. in milligrams per litre of air or water). The related LD50, the median lethal dose, is expressed as the amount of substance administered per unit mass of the test subject (e.g. in milligrams per kilogram of body weight).
Other measurements for toxicity:
- LCt50 (lethal concentration and time), which is often expressed in mg·min/m³.
- GI50 (growth inhibition) is the concentration of the tested substance that inhibits growth by 50%, i.e. the concentration at which 100 × (T - T0)/(C - T0) = 50. Here T is the optical density of the test well after a 48-h period of exposure to the test drug, T0 is the optical density at time zero, and C is the control optical density. The "50" is called the GI50PRCNT, a T/C-like parameter that can have values from +100 to -100.
- TGI (total growth inhibition) describes the activity of a compound on cell cultures at the point of zero net growth. The effect of the structure is expressed as percent reduction of the control signal. TGI values are determined from the dose-response curves by graphical extrapolation to the point where 100 × (T - T0)/(C - T0) = 0.
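The growth expression shared by GI50 and TGI can be evaluated directly from optical densities. The following Python sketch is illustrative only: the function name `percent_growth` and the sample readings are assumptions, not part of SuperToxic or of any screening protocol.

```python
def percent_growth(t: float, t0: float, c: float) -> float:
    """Return 100 * (T - T0) / (C - T0) for one test well.

    t  -- optical density of the test well after exposure (T)
    t0 -- optical density at time zero (T0)
    c  -- optical density of the untreated control (C)
    """
    if c == t0:
        raise ValueError("control shows no growth; the ratio is undefined")
    return 100.0 * (t - t0) / (c - t0)


# Hypothetical readings: the treated well grew half as much as the control,
# which is the GI50 condition (value 50).
print(percent_growth(t=0.75, t0=0.25, c=1.25))  # 50.0

# A well that ends at its starting density meets the TGI condition (value 0).
print(percent_growth(t=0.25, t0=0.25, c=1.25))  # 0.0
```

In practice the value is computed for each tested concentration, and GI50 or TGI is read from the resulting dose-response curve at 50 or 0, respectively.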
5. What if my structure was not found in the database?
Among medicinal chemists it is widely known that structurally similar molecules have similar biological activities. Thus, on the basis of similarities between compounds it might be possible to predict toxicities of unknown compounds.
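If there is no entry for my search structure in SuperToxic, the results of a similarity search with more than 85% similarity might be used to predict the toxicity of my compound, pathways my compound might be involved in (via KEGG), and potential binding partners (via PDB).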
6. How can I search for a toxin?
Depending on the information available, one can choose between the following options implemented on this site:
- Searching via structure: The user can either draw the structure of the compound or upload an existing file in mol format. It is also possible to enter a SMILES string or an InChI.
- Searching via identifier: On the "Toxin search" page it is possible to search by name, CAS or NSC number, and empirical formula. It is not necessary to know the common name of a compound, because all known synonyms are stored in the database to facilitate the search. CAS (Chemical Abstracts Service) numbers: every known chemical compound is registered with a unique numerical identifier. NSC: the NSC number is a universally recognized unique identification number. NSC refers to the Cancer Chemotherapy National Service Center (CCNSC).
- Searching via physicochemical properties: The user can give as much information about structural properties as possible to narrow the search:
  - molecular weight
  - number of atoms
  - number of rings
  - number of bonds
  - number of rotatable bonds
  - logP
  - number of H-bond donors
  - number of H-bond acceptors
  - toxicity value
- Browse the database: To browse the whole database, the user can choose a letter or a number to display all database entries starting with that character. Alternatively, all CASRN or NSC numbers available in the database can be listed.
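7. What is the difference between “similarity search” and “substructure search”?
During a similarity search the fingerprint of a search structure is compared with the fingerprints of all the compounds stored in SuperToxic in order to find structurally similar molecules. This comparison is performed by calculating the Tanimoto coefficient, which is defined as:

T(a, b) = Nab / (Na + Nb - Nab)

where:
- Nab = number of "1" bits that occur in both fingerprint a and fingerprint b
- Na = number of "1" bits in fingerprint a
- Nb = number of "1" bits in fingerprint b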
Two structures with a Tanimoto coefficient greater than or equal to 0.85 (which corresponds to a similarity of 85%) are considered similar enough to transfer the biological activities of one molecule to the other and thus to predict toxicities, pathways the molecule might participate in, and potential binding partners.
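As a minimal sketch of this comparison, the Tanimoto coefficient can be computed in Python from two fingerprints represented as sets of the positions of their "1" bits. The set representation, the function name `tanimoto` and the example bit positions are assumptions made for illustration, and they do not describe how SuperToxic stores fingerprints.

```python
def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of "1" bit positions."""
    n_ab = len(fp_a & fp_b)  # "1" bits present in both fingerprints
    n_a = len(fp_a)          # "1" bits in fingerprint a
    n_b = len(fp_b)          # "1" bits in fingerprint b
    denominator = n_a + n_b - n_ab
    if denominator == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return n_ab / denominator


SIMILARITY_THRESHOLD = 0.85  # threshold quoted in the text above

# Hypothetical fingerprints: 20 bits set in the query, 19 of them in the candidate.
query = set(range(20))
candidate = set(range(1, 20))

score = tanimoto(query, candidate)
print(round(score, 2), score >= SIMILARITY_THRESHOLD)  # 0.95 True
```

Identical fingerprints give a coefficient of 1, and fingerprints without any common "1" bit give 0.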
In order to find a small structure in a bigger molecule a substructure search is performed. Within SuperToxic this process is divided into three steps:
- Comparison of the number of atoms and rings: In the first step, molecules that are smaller than the search structure are excluded from the search. This narrows down the search space.
- Identification of identical bits in both fingerprints: Since the single bits of a fingerprint represent certain structural properties of a compound, a logical AND comparison of the fingerprints determines whether two molecules share the same structural features.
- Common substructure search algorithm: The final exact search algorithm builds trees of the substructure and of the compound to be searched, in which atoms function as nodes and bonds as branches. The tree of the scanned compound is then traversed in order to find the subtree of the substructure.
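The first two steps act as cheap filters before the expensive exact search. The Python sketch below illustrates only these two filters. The `Compound` record, the packing of the fingerprint into an integer, and the rule that every bit set in the query must also be set in the candidate are assumptions about a typical screening setup, not the SuperToxic implementation. The exact tree-based matching of step three is not shown.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    n_atoms: int
    n_rings: int
    fingerprint: int  # fingerprint bits packed into one integer


def passes_size_filter(query: Compound, candidate: Compound) -> bool:
    # Step 1: a candidate smaller than the query cannot contain it.
    return candidate.n_atoms >= query.n_atoms and candidate.n_rings >= query.n_rings


def passes_fingerprint_filter(query: Compound, candidate: Compound) -> bool:
    # Step 2: logical AND of both fingerprints; every feature bit of the
    # query must also be present in the candidate (assumed screening rule).
    return (query.fingerprint & candidate.fingerprint) == query.fingerprint


def prescreen(query: Compound, database: list[Compound]) -> list[Compound]:
    """Return the candidates that survive both filters and need the exact search."""
    return [
        c for c in database
        if passes_size_filter(query, c) and passes_fingerprint_filter(query, c)
    ]


# Hypothetical data for illustration.
query = Compound("query", n_atoms=6, n_rings=1, fingerprint=0b0101)
database = [
    Compound("A", n_atoms=12, n_rings=2, fingerprint=0b1101),  # passes both filters
    Compound("B", n_atoms=4, n_rings=0, fingerprint=0b0101),   # too small
    Compound("C", n_atoms=10, n_rings=1, fingerprint=0b1001),  # lacks a query bit
]
print([c.name for c in prescreen(query, database)])  # ['A']
```

Only the surviving candidates would be passed on to the exact substructure matching, which keeps the costly step to a small part of the database.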
8. What is structural information?
- 2D/3D structure: The 2D structure of compounds can be stored as molfiles. A molfile contains information about the atoms, bonds, connectivity and coordinates of the molecule. It can be divided into the following parts: header information, the Connection Table (CT) that contains atom coordinates, bond types and connections, and a section for more complex information. For the representation of 3D structures Jmol is used. This interactive web browser applet displays molecules in different modes (e.g. “ball and stick”) and supports zooming and rotating.
- SMILES: SMILES (Simplified Molecular Input Line Entry System) is a chemical language in which atoms and bonds are represented by ASCII characters. A SMILES string can be used as a universal identifier for a specific chemical structure, and both molecules and reactions can be written in it.
- InChI: InChI (IUPAC International Chemical Identifier) is a character string that uniquely represents a chemical substance, following a nomenclature defined by IUPAC. InChIs are created in three steps: normalization, canonicalization and serialization. The information is structured as a sequence of layers:
  - Main layer
  - Charge layer
  - Stereochemical layer
  - Isotopic layer
  - Fixed-H layer
  - Reconnected layer
- Fingerprint: Fingerprints represent certain structural features of a molecule. They are used for two purposes: calculating similarity measures and screening. A fingerprint is a boolean array or bitmap, but unlike a structural key it has no assigned meaning for each bit. The patterns for a molecule's fingerprint are generated from the molecule itself. The fingerprinting algorithm examines the molecule and generates:
  - a pattern for each atom
  - a pattern representing each atom and its nearest neighbours (plus the bonds that join them)
  - a pattern representing each group of atoms and bonds connected by paths up to two bonds long
  - a pattern for atoms and bonds connected by paths up to three bonds long
  - and so on, with paths up to 4, 5, 6 and 7 bonds long.
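The pattern enumeration described for fingerprints can be sketched in Python as follows. This is a simplified illustration under stated assumptions and not the algorithm used by SuperToxic: the molecule is given as a list of element symbols plus a dictionary of bonds, only linear paths are enumerated, each pattern sets a single bit, and `zlib.crc32` stands in for whatever hash function a real toolkit uses. The name `path_fingerprint` and its parameters are hypothetical.

```python
import zlib


def path_fingerprint(atoms, bonds, n_bits=1024, max_bonds=7):
    """Return the set of bit positions set by all linear paths of 0..max_bonds bonds.

    atoms -- list of element symbols, e.g. ["C", "C", "O"]
    bonds -- dict mapping (i, j) atom index pairs to a bond symbol, e.g. "-" or "="
    """
    neighbours = {i: [] for i in range(len(atoms))}
    for (i, j), bond in bonds.items():
        neighbours[i].append((j, bond))
        neighbours[j].append((i, bond))

    bits = set()

    def extend(path, tokens):
        # Read the path in both directions and keep one canonical spelling,
        # so that "C-O" and "O-C" set the same bit.
        pattern = min("".join(tokens), "".join(reversed(tokens)))
        bits.add(zlib.crc32(pattern.encode()) % n_bits)
        if len(path) - 1 == max_bonds:
            return
        for nxt, bond in neighbours[path[-1]]:
            if nxt not in path:  # do not revisit atoms within one path
                extend(path + [nxt], tokens + [bond, atoms[nxt]])

    for start in range(len(atoms)):
        extend([start], [atoms[start]])
    return bits


# Hypothetical inputs: ethanol (C-C-O) and methanol (C-O), hydrogens omitted.
ethanol = path_fingerprint(["C", "C", "O"], {(0, 1): "-", (1, 2): "-"})
methanol = path_fingerprint(["C", "O"], {(0, 1): "-"})

# Every pattern of the smaller molecule also occurs in the larger one,
# so its bits form a subset; this is what makes fingerprint screening possible.
print(methanol <= ethanol)  # True
```

Because different patterns can hash to the same position, a set bit cannot be traced back to one specific feature, which matches the statement that fingerprint bits have no assigned meaning.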
9. What's KEGG?
KEGG (Kyoto Encyclopedia of Genes and Genomes) is a collection of database resources for linking genomes to life and the environment. It is a database of biological systems, consisting of:
- genetic building blocks of genes and proteins (KEGG GENES)
- chemical building blocks of both endogenous and exogenous substances (KEGG LIGAND)
- molecular wiring diagrams of interaction and reaction networks (KEGG PATHWAY)
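- hierarchies and relationships of various biological objects (KEGG BRITE)
In SuperToxic the user can find information about the pathways in which the search structure or its target participates.
The “KEGG button” links to the KEGG database for more information regarding the compound or the pathway of the compound and its target, respectively.
This information can help to find out more about the mode of action of the search structure.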
10. What are toxicity classes?
There are many ways to classify toxic compounds. Classifications are typically based on the results of acute toxicity studies, such as the determination of LD50 values in animal experiments, which measure the acute death rate caused by an agent. The U.S. Environmental Protection Agency (EPA) distinguishes four toxicity classes:
- Toxicity Class I: most toxic; signal word: “Danger-Poison”, with skull and crossbones symbol
- Toxicity Class II: moderately toxic; signal word: “Warning”
- Toxicity Class III: slightly toxic; signal word: “Caution”
- Toxicity Class IV: practically nontoxic; no signal word required since 2002
The World Health Organization (WHO) classification of pesticides by hazard distinguishes:
- Class Ia: extremely hazardous
- Class Ib: highly hazardous
- Class II: moderately hazardous
- Class III: slightly hazardous
A further scheme uses three classes:
- Class I: very toxic
- Class II: toxic
- Class III: harmful
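11. How can I add a new compound to the database?
If you wish to add a new toxic compound to SuperToxic, you can choose the option “Add a new compound”.
There you can either draw a molecule, upload a molfile, or enter a SMILES string or InChI of the compound.
Additionally, you are required to provide toxicity information as well as a contact address. All other information is optional.
Your entry will be manually curated and then added to the database.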