SMILES molecule parser.
#include <SMILES_grammar.hh>
void addAtom (const std::string &label) const
void addBond (const int atom1, const int atom2, const std::string &label) const
This class defines the rules of Daylight's (tm) SMILES BNF grammar.
It parses a SMILES string and generates a graph of the encoded
molecule. The graph is represented as a boost graph (Molecule); the
atom and bond labels are stored in the property maps given by
PropNodeLabel and PropEdgeLabel.
BNF grammar of Daylight's SMILES
smiles ::= chain (chain | branch)*
chain ::= bond? (simple_atom | complex_atom) ringclosure*
branch ::= '(' chain (chain | branch)* ')'
ringclosure ::= digit | ('%' digit digit)
simple_atom ::= simple_symbol
complex_atom ::= '[' isotope? (simple_symbol | complex_symbol | group_symbol) chirality? hcount? charge? name? ']'
isotope ::= integer
simple_symbol ::= 'Br' | 'Cl' | 'B' | 'c' | etc.
complex_symbol ::= 's' | 'p' | 'o' | 'Zn' | etc.
group_symbol ::= '{' anyChar+ '}'
chirality ::= '@' '@'?
hcount ::= 'H' integer?
charge ::= '+' ('+'* | integer)
| '-' ('-'* | integer)
name ::= ':' integer
bond ::= bond_symbol
bond_symbol ::= '-' | '=' | '#' | ':' | etc.
integer ::= digit+
digit ::= [1-9]
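As an illustration of how one of these rules can be read, the following is a minimal hand-written sketch of the `charge` rule. It is not the implementation used by SMILES_grammar. The names `isGrammarDigit` and `readCharge` are hypothetical, and the interpretation that `++` and `+2` both mean a charge of +2 is the usual SMILES convention, assumed here rather than taken from this class. The digit test follows the `digit` rule exactly as printed above.

```cpp
#include <cstddef>
#include <string>

// Digit test following the grammar as printed: digit ::= [1-9].
inline bool isGrammarDigit(char c) { return c >= '1' && c <= '9'; }

// Hypothetical helper, not part of the documented class.
// Reads   charge ::= '+' ('+'* | integer) | '-' ('-'* | integer)
// starting at text[pos]. On success it stores the signed value in
// `charge`, advances `pos` past the consumed characters and returns true.
// If text[pos] is neither '+' nor '-', it returns false and leaves
// `pos` and `charge` untouched.
bool readCharge(const std::string& text, std::size_t& pos, int& charge) {
    if (pos >= text.size() || (text[pos] != '+' && text[pos] != '-')) {
        return false;
    }
    const char sign = text[pos];
    std::size_t i = pos + 1;
    int magnitude = 1;  // a lone sign is assumed to mean one unit of charge

    if (i < text.size() && isGrammarDigit(text[i])) {
        // Sign followed by an integer, e.g. "+2".
        magnitude = 0;
        while (i < text.size() && isGrammarDigit(text[i])) {
            magnitude = magnitude * 10 + (text[i] - '0');
            ++i;
        }
    } else {
        // Sign followed by repetitions of the same sign, e.g. "---".
        while (i < text.size() && text[i] == sign) {
            ++magnitude;
            ++i;
        }
    }
    pos = i;
    charge = (sign == '+') ? magnitude : -magnitude;
    return true;
}
```

Called on `"-3]"` with `pos = 0`, the sketch stores -3 and leaves `pos` at the closing bracket.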
NOTE : chirality and isotope information is currently ignored!
NOTE : Supported atom labels are defined by MoleculeUtil::getAtomData().
NOTE : Supported bond labels are defined by MoleculeUtil::getBondData().
NOTE further : we allow an extension of the SMILES encoding:
complex atoms may hold a group_symbol, i.e. an ID string enclosed in
curly braces of the form '{SOMEID}'. Each such ID is replaced by the
corresponding group subgraph if it is found in the provided group map.
Otherwise parsing is aborted.
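The group extension can be pictured with the following minimal sketch. It is not the library's code. `DemoGroupMap` is a hypothetical stand-in that maps a group ID to a text placeholder, whereas the documented GroupMap maps IDs to subgraphs. `resolveGroup` is likewise a hypothetical name, and signalling an unknown ID with an exception is a choice made for this sketch; the class itself only states that parsing is aborted.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for GroupMap: group ID -> text placeholder.
typedef std::map<std::string, std::string> DemoGroupMap;

// Hypothetical helper, not part of the documented class.
// Reads   group_symbol ::= '{' anyChar+ '}'   starting at text[pos]
// and returns the entry mapped to the enclosed ID.
// Throws std::invalid_argument if the symbol is malformed or the ID is
// not present in `groups`.
const std::string& resolveGroup(const std::string& text,
                                std::size_t pos,
                                const DemoGroupMap& groups) {
    if (pos >= text.size() || text[pos] != '{') {
        throw std::invalid_argument("expected '{' at start of group symbol");
    }
    const std::size_t close = text.find('}', pos + 1);
    if (close == std::string::npos || close == pos + 1) {
        throw std::invalid_argument("empty or unterminated group ID");
    }
    const std::string id = text.substr(pos + 1, close - pos - 1);
    const DemoGroupMap::const_iterator hit = groups.find(id);
    if (hit == groups.end()) {
        // Mirrors "Otherwise parsing is aborted" from the note above.
        throw std::invalid_argument("unknown group ID: " + id);
    }
    return hit->second;
}
```

With a map holding the made-up entry `"Ph"`, calling `resolveGroup("[{Ph}]", 1, groups)` returns that entry, while an ID missing from the map raises the exception.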
- Author
- Christoph Flamm (c) 2008 http://www.tbi.univie.ac.at/~xtof/
- Martin Mann (c) 2008 http://www.bioinf.uni-freiburg.de/~mmann/
Definition at line 99 of file SMILES_grammar.hh.
ggl::chem::SMILES_grammar::SMILES_grammar (Molecule & toFill) [explicit]
Constructs the rule definitions of Daylight's SMILES grammar, which parse a SMILES string and fill the encoded molecule into the given boost graph object.
- Parameters
- toFill | the boost graph object to add nodes and edges to
ggl::chem::SMILES_grammar::SMILES_grammar (Molecule & toFill, const GroupMap & groups) [explicit]
Constructs the rule definitions of Daylight's SMILES grammar, which parse a SMILES string and fill the encoded molecule into the given boost graph object. Nodes that match a group ID in groups are replaced by the mapped subgraph.
- Parameters
- toFill | the boost graph object to add nodes and edges to
- groups | a container of group IDs; each node matching an ID is replaced by the corresponding mapped subgraph
void ggl::chem::SMILES_grammar::addAtom (const std::string & label) const [protected]
Adds an atom to the internal molecule graph that is being filled.
- Parameters
- label | the atom label to set
void ggl::chem::SMILES_grammar::addBond (const int atom1, const int atom2, const std::string & label) const [protected]
Adds a bond to the internal molecule graph that is being filled.
- Parameters
- atom1 | the first bond partner
- atom2 | the second bond partner
- label | the bond label to set
static std::pair< Molecule, int > ggl::chem::SMILES_grammar::parseSMILES (const std::string & SMILES_string) throw (std::invalid_argument) [static]
Parses a SMILES string and generates a graph representation of the molecule.
- Parameters
- SMILES_string | the string to parse
- Returns
- pair.first = the graph encoding of the molecule
- pair.second = -1 if parsing was successful; otherwise the string position that caused the parsing error
- Exceptions
- std::invalid_argument | in case a check fails
static std::pair< Molecule, int > ggl::chem::SMILES_grammar::parseSMILES (const std::string & SMILES_string, const GroupMap & groups) throw (std::invalid_argument) [static]
Parses a SMILES string and generates a graph representation of the molecule. Nodes that match a group ID in groups are replaced by the mapped subgraph.
- Parameters
- SMILES_string | the string to parse
- groups | a container of group IDs; each node matching an ID is replaced by the corresponding mapped subgraph
- Returns
- pair.first = the graph encoding of the molecule
- pair.second = -1 if parsing was successful; otherwise the string position that caused the parsing error
- Exceptions
- std::invalid_argument | in case a check fails
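Both parseSMILES overloads report failure through pair.second. The following minimal sketch shows one way a caller might turn that value into a readable diagnostic. `describeParseError` is a hypothetical helper, not part of the library, and it assumes the reported position is a 0-based index into the SMILES string, which the documentation above does not state.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the documented class.
// Takes the parsed string and the value found in pair.second.
// Returns an empty string when errorPos is -1 (successful parse),
// otherwise the SMILES string followed by a second line with a caret
// under the reported position (assumed to be 0-based).
std::string describeParseError(const std::string& smiles, int errorPos) {
    if (errorPos < 0) {
        return std::string();
    }
    // Clamp so the caret can at most point just past the last character.
    std::size_t column = static_cast<std::size_t>(errorPos);
    if (column > smiles.size()) {
        column = smiles.size();
    }
    return smiles + "\n" + std::string(column, ' ') + "^";
}
```

A caller would pass the input string together with the `second` member of the pair returned by parseSMILES and print the result only if it is non-empty.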
Molecule& ggl::chem::SMILES_grammar::g2fill [protected]
The boost graph object that is filled to represent the next parsed SMILES string.
Definition at line 106 of file SMILES_grammar.hh.
const GroupMap* ggl::chem::SMILES_grammar::groups [protected]
Container of group IDs; each node matching an ID is replaced by the corresponding mapped subgraph.
Definition at line 110 of file SMILES_grammar.hh.