Marvin Documents
Codename: mrv
Marvin Documents format
An XML-based format capable of storing graphics objects (lines,
text boxes, etc.) and molecule objects.
The following tags are recognized:
- <cml> - root element
Children:
- <MDocument>
- top level container of a record
Attributes:
- atomSetRGB, bondSetRGB - atom/bond set colors.
Comma-separated list of entries in
"k:color" format, where k
is the set sequence number and color is the color specification
in one of the following forms: "#RRGGBB" - RGB components
as a 6-digit hexadecimal number, "D" - default set color,
"N" - no set color (normal atom/bond coloring is used).
- multipageEnabled - enables the multipage molecular document.
Its value is "true" or "false".
- multipageSelectedPage - the selected page in multipage molecular document.
Its value is "k" where k is a non-negative integer.
- multipageColumnCount - number of columns in multipage molecular document.
Its value is "k" where k is a non-negative integer.
- multipageRowCount - number of rows in multipage molecular document.
Its value is "k" where k is a non-negative integer.
- multipageWidth - width of a page in multipage molecular document.
Its value is "d" where d is a floating point number.
- multipageHeight - height of a page in multipage molecular document.
Its value is "d" where d is a floating point number.
- multipageLeft - left margin of a page in multipage molecular document.
Its value is "d" where d is a floating point number.
- multipageRight - right margin of a page in multipage molecular document.
Its value is "d" where d is a floating point number.
- multipageTop - top margin of a page in multipage molecular document.
Its value is "d" where d is a floating point number.
- multipageBottom - bottom margin of a page in multipage molecular document.
Its value is "d" where d is a floating point number.
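The "k:color" entry format of atomSetRGB and bondSetRGB can be illustrated with a small parser. This is only a sketch in Python: the function name parse_set_colors and the choice to return "default" and None for "D" and "N" are illustrative assumptions, not part of the format.

```python
def parse_set_colors(value):
    """Parse an atomSetRGB/bondSetRGB value into {set number: color}."""
    colors = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        seq, _, color = entry.partition(":")
        if color == "D":
            # "D" - default set color
            colors[int(seq)] = "default"
        elif color == "N":
            # "N" - no set color, normal atom/bond coloring is used
            colors[int(seq)] = None
        elif len(color) == 7 and color.startswith("#"):
            # "#RRGGBB" -> (red, green, blue) as integers 0..255
            colors[int(seq)] = tuple(int(color[i:i + 2], 16) for i in (1, 3, 5))
        else:
            raise ValueError("unrecognized color specification: %r" % entry)
    return colors


set_colors = parse_set_colors("0:N,1:#ff0000,2:D")
```

Here set 0 has no set color, set 1 is pure red and set 2 uses the default set color.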
Children:
- <MChemicalStruct>
- chemical structure
Children:
- <molecule> - It can contain both
CML
and Marvin attributes. Currently, there is only one Marvin attribute:
Children:
- <atomArray> - It can contain both
CML
and Marvin attributes. Marvin attributes are explained below.
- residueType, residueId - residue type/ID
value or zero
- residueAtomName - PDB atom name or zero.
- radical - Radical center value or zero.
- reactionStereo - Reaction stereo value or zero.
- rgroupRef - R-group reference value or zero.
- sgroupRef - S-group reference value or zero.
- attachmentPoint - R-group attachment point value or
zero.
- mrvValence - Valence or `-'.
- mrvQueryProps - Query atom properties or zero.
- mrvAlias - Atom alias or zero.
- mrvExtraLabel - Atom extra label or zero.
- mrvPseudo - Pseudoatom name or zero.
- mrvStereoGroup, mrvMap
- mrvSetSeq - Atom set sequence number.
- mrvLinkNodeRep - Number of repetitions for link nodes.
A number n in the list is the maximum number of
repetitions for the corresponding atom (value 1 means that
the atom is not a link node), m-n sets both
the minimum and maximum values.
- mrvLinkNodeOut - Outer bond references for link nodes.
Comma-separated numbers in the list define the indices of bonds
(amongst bonds of the link atom) leading to the outer atoms
(non-repeating neighbours) of the link nodes, "-" means no
outer bonds.
- mrvSpecIsotopeSymbolPreferred - Special symbols
are preferred for Hydrogen isotopes (D, T) if the value
is 1, normal element symbol (H) is used if the value is
0.
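The array form of these attributes can be read as parallel columns, one entry per atom. The sketch below assumes the whitespace-separated column syntax of CML arrays and uses the CML attributes atomID and elementType; the sample values and the helper name atoms_from_array are hypothetical. Values are kept as raw strings, so "0" entries (meaning "not set" for most Marvin attributes) are left for the caller to interpret.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: three atoms, the third one carries an alias.
ATOM_ARRAY = (
    '<atomArray atomID="a1 a2 a3" elementType="C C O" '
    'mrvAlias="0 0 OH" mrvSetSeq="0 1 1"/>'
)


def atoms_from_array(atom_array):
    """Turn the column attributes of an <atomArray> into one dict per atom."""
    # Assumption: every attribute holds one whitespace-separated entry per atom.
    columns = {name: value.split() for name, value in atom_array.attrib.items()}
    count = len(columns["atomID"])
    for name, column in columns.items():
        if len(column) != count:
            raise ValueError("attribute %s has %d entries, expected %d"
                             % (name, len(column), count))
    return [{name: column[i] for name, column in columns.items()}
            for i in range(count)]


atoms = atoms_from_array(ET.fromstring(ATOM_ARRAY))
```

A value that itself contains whitespace would break this simple split, so a real reader would need a more careful tokenizer.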
Children:
- <atom> - It can contain both
CML
and Marvin attributes. Marvin attributes are explained
below.
- residueType, residueId - They were
present in CML1 but removed later.
- residueAtomName - PDB atom name.
- radical - Radical center:
"monovalent" (doublet), "divalent",
"divalent1" (singlet), "divalent3"
(triplet), "trivalent", "trivalent2"
(doublet) or "trivalent4" (quartet).
- reactionStereo - Reaction stereo:
"Inv" (inversion) or
"Ret" (retention).
- rgroupRef - R-group reference. Currently, only
positive integer values are accepted.
- sgroupRef - S-group reference.
- attachmentPoint - R-group attachment point:
"1" (on first site), "2" (on second
site) or "both" (on both sites).
- mrvValence - Valence.
- mrvQueryProps - Query atom properties.
Format:
[atom Type:][Query properties][;]
[str:query string],
where atom type can be
"A" (any atom), "Q" (heteroatom),
"L,element1,..." (inclusive atom list)
or "L!element1!..." (exclusive atom
list).
Query properties is a semicolon-separated list.
Each element of the list starts with one of the following prefixes:
- "H" - total hydrogen count,
see SMARTS H,
- "h" - implicit hydrogen count,
see SMARTS h,
- "X" - total connection count,
see SMARTS X,
- "D" - degree,
see SMARTS D,
- "R" - SSSR ring count,
see SMARTS R,
- "r" - smallest ring size in SSSR,
see SMARTS r,
- "a" - aromatic ("a1") or aliphatic ("a0"),
see SMARTS a/A,
- "u" - unsaturated atom,
see MDL M UNS,
- "s" - substitution count,
see MDL M SUB,
- "rb" - ring bond count,
see MDL M RBC.
The prefix is followed by digits representing an integer value;
for the s and rb query properties, "*" may be used instead of
digits.
The query string contains properties unknown by
Marvin or known properties in a logical relation that
cannot be represented by Marvin.
Examples: "A:" (any atom),
"L!O!S:H1,R1" (atom is not oxygen, not sulfur,
1 hydrogen is connected to it and it is inside a ring).
- mrvAlias - Atom alias.
- mrvExtraLabel - Atom extra label.
- mrvPseudo - Pseudoatom name.
- mrvStereoGroup, mrvMap
- mrvSetSeq - Atom set sequence number.
- mrvLinkNodeRep - Number of repetitions for a
link node in format "n" (maximum number
of repetitions)
or "m-n" (minimum and
maximum).
- mrvLinkNodeOut - Outer bond references for a link
node in comma separated list of bond indices (amongst
bonds of the link atom) leading to the outer atoms
(non-repeating neighbours) of the link nodes, "-" means
no outer bonds.
- mrvSpecIsotopeSymbolPreferred - Special symbols
are preferred for Hydrogen isotopes (D, T) if the value
is 1, normal element symbol (H) is used if the value is
0 (default).
An <atom> tag is recognized at import even if the
atomArray container is not present.
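The two link node attributes of an atom can be decoded as follows. This is a minimal sketch: the function names are hypothetical, and because the text does not say what the minimum is when only "n" is given, the minimum is returned as None in that case rather than guessed.

```python
def parse_link_node_rep(value):
    """Return (minimum, maximum) repetitions for an mrvLinkNodeRep value."""
    low, sep, high = value.partition("-")
    if sep:
        # "m-n" sets both the minimum and the maximum
        return int(low), int(high)
    # "n" gives only the maximum; the minimum is not stated here
    return None, int(low)


def parse_link_node_out(value):
    """Return the outer bond indices for an mrvLinkNodeOut value."""
    if value == "-":
        # "-" means no outer bonds
        return []
    return [int(index) for index in value.split(",")]


rep_max_only = parse_link_node_rep("3")
rep_range = parse_link_node_rep("2-4")
outer_bonds = parse_link_node_out("1,2")
```

The indices returned by parse_link_node_out count among the bonds of the link atom itself, not among all bonds of the molecule.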
- <bondArray>
- <bond atomRefs2="a1 a2"
order="order" >
The atom references a1 and a2 must be valid
atom ids.
The order value can be "1", "S"
(single), "2", "D" (double),
"3", "T" (triple) or "A"
(aromatic).
The following attributes can be present, which are not
included in CML:
- queryType - Query bond type: "SD"
(single or double), "SA" (single or aromatic),
"DA" (double or aromatic) or
"Any".
- mrvQueryProps - Query bond properties.
Format:
str:query string,
where query string contains query bond properties
unknown by Marvin or known properties in a logical
relation that cannot be represented by Marvin.
- mrvSetSeq - Bond set sequence number.
- <bond atomRefs2="a1 a2"
convention="convention" >
The atom references a1 and a2 must be valid
atom ids.
The convention value can be "cxn:coord" (coordinate bond).
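The two <bond> forms above can be read with a short routine that checks the atom references and normalizes the order codes. It is a sketch only: read_bond, the sample bondArray and the returned tuple layout are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Numeric and letter codes describe the same bond orders.
BOND_ORDERS = {
    "1": "single", "S": "single",
    "2": "double", "D": "double",
    "3": "triple", "T": "triple",
    "A": "aromatic",
}


def read_bond(bond, atom_ids):
    """Return (atom 1, atom 2, kind) for a <bond> element."""
    first, second = bond.get("atomRefs2").split()
    for ref in (first, second):
        if ref not in atom_ids:
            raise ValueError("unknown atom id: %s" % ref)
    if bond.get("convention") == "cxn:coord":
        kind = "coordinate"
    else:
        kind = BOND_ORDERS[bond.get("order")]
    return first, second, kind


# Hypothetical sample with one double bond and one coordinate bond.
BOND_ARRAY = ET.fromstring(
    '<bondArray>'
    '<bond atomRefs2="a1 a2" order="D"/>'
    '<bond atomRefs2="a2 a3" convention="cxn:coord"/>'
    '</bondArray>'
)
bonds = [read_bond(b, {"a1", "a2", "a3"}) for b in BOND_ARRAY.findall("bond")]
```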
- <molecule id="sg1"
role="SuperatomSgroup">
- contracted Superatom S-group
- <atomArray>
- <atom>
- <bondArray>
- <bond>
- <molecule id="sg2"
role="SuperatomSgroup" />
- expanded Superatom S-group
- <molecule id="sg3"
role="MultipleSgroup"
atomRefs="a1 a2 ... "/>
- Multiple S-group
- <molecule id="sg3"
role="DataSgroup"
fieldName="fieldName"
[fieldType="F|N|T"]
[units="unit"]
[x="x coordinate"]
[y="y coordinate"]
[dataDisplayed="displayed|not displayed"]
[placement="Relative|Absolute"]
[unitsDisplayed="Unit displayed|not displayed"]
[displayedChars="number of characters displayed per line"]
[displayedLines="number of lines to display"]
[tag="tag"]
[pos="0-9"]
[queryType="mQ|IQ|MQ|?Q"]
[queryOp="<|>|<>|<=|>=|=|like|between|contains"]
[fieldData="first line of data"]
[fieldData1="second line of data"]...
/>
- Data S-group
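Multi-line data of a Data S-group is spread over the fieldData, fieldData1, ... attributes. The sketch below collects them in order; it assumes the numbering simply continues (fieldData2, fieldData3, ...) and stops at the first missing attribute. The sample element and the name field_data_lines are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical Data S-group with two lines of data.
SGROUP = ET.fromstring(
    '<molecule id="sg3" role="DataSgroup" fieldName="purity" '
    'units="%" fieldData="98.5" fieldData1="batch 7"/>'
)


def field_data_lines(sgroup):
    """Collect the data lines of a Data S-group in display order."""
    lines = []
    line = sgroup.get("fieldData")  # first line has no numeric suffix
    index = 1
    while line is not None:
        lines.append(line)
        # Assumption: later lines are named fieldData1, fieldData2, ...
        line = sgroup.get("fieldData%d" % index)
        index += 1
    return lines


data_lines = field_data_lines(SGROUP)
```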
- <molecule id="sg1"
role="ComponentSgroup"
title="c"
molID="m1"
atomRefs="a1 a2 ... ">
- Component S-group
- <bracket
coordinates="x1 y1 z1 x2 y2 z2"/>
- coordinates of left bracket-endpoints
- <bracket
coordinates="x1 y1 z1 x2 y2 z2"/>
- coordinates of right bracket-endpoints
- <molecule id="sg1"
role="MixtureSgroup"
title="mix"
molID="m1"
atomRefs="a1 a2 ... ">
- Mixture (unordered mixture) S-group
- <bracket
coordinates="x1 y1 z1 x2 y2 z2"/>
- coordinates of left bracket-endpoints
- <bracket
coordinates="x1 y1 z1 x2 y2 z2"/>
- coordinates of right bracket-endpoints
- <molecule id="sg1"
role="FormulationSgroup"
title="f"
molID="m1"
atomRefs="a1 a2 ... ">
- Formulation (ordered mixture) S-group
- <bracket
coordinates="x1 y1 z1 x2 y2 z2"/>
- coordinates of left bracket-endpoints
- <bracket
coordinates="x1 y1 z1 x2 y2 z2"/>
- coordinates of right bracket-endpoints
- <molecule id="sg1"
role="SruSgroup"
title="name"
molID="m1"
atomRefs="a1 a2 ... "
correspondence="b1 b2 ... "
bondList="b1 b2 ... "
connect="hh|ht|eu "
>
- SRU S-group, where
- name is single letter: a-z or A-Z.
- for correspondence see MDL M CRS
- for bondlist see MDL M SBL
- for connect see MDL M SCN
- <molecule id="sg1"
role="GenericSgroup"
molID="m1"
atomRefs="a1 a2 ... ">
- generic S-group
- <molecule id="sg1"
role="MerSgroup"
title="mer"
molID="m1"
atomRefs="a1 a2 ... ">
- mer S-group
- <molecule id="sg1"
role="MonomerSgroup"
title="mon"
molID="m1"
atomRefs="a1 a2 ... ">
- monomer S-group
- <molecule id="sg1"
role="AnyPolymerSgroup"
title="any"
molID="m1"
atomRefs="a1 a2 ... ">
- anypolymer S-group
- <molecule id="sg1"
role="AlternatingCopolymerSgroup"
title="alt"
molID="m1"
atomRefs="a1 a2 ... ">
- alternating copolymer S-group
- <molecule id="sg1"
role="BlockCopolymerSgroup"
title="blk"
molID="m1"
atomRefs="a1 a2 ... ">
- block copolymer S-group
- <molecule id="sg1"
role="RandomCopolymerSgroup"
title="ran"
molID="m1"
atomRefs="a1 a2 ... ">
- random copolymer S-group
- <molecule id="sg1"
role="CopolymerSgroup"
title="co"
molID="m1"
atomRefs="a1 a2 ... ">
- copolymer S-group
- <molecule id="sg1"
role="CrosslinkSgroup"
title="xl"
molID="m1"
atomRefs="a1 a2 ... ">
- crosslink S-group
- <molecule id="sg1"
role="GraftSgroup"
title="grf"
molID="m1"
atomRefs="a1 a2 ... ">
- grafted S-group
- <molecule id="sg1"
role="ModificationSgroup"
title="mod"
molID="m1"
atomRefs="a1 a2 ... ">
- modification S-group
- <molecule id="sg1"
role="MulticenterSgroup"
molID="m1"
atomRefs="a1 a2 ... "
center="a "
>
- multicenter S-group to represent coordination compounds and
Markush structures (depending on the bond type connecting to the center)
- <propertyList>
- <property>
- <reaction> - It can contain both
CML
and Marvin attributes. Marvin attributes are explained below.
Its children can be those defined in
CML as well as the new ones described below.
- absStereo
- arrowType - reaction arrow type: "DEFAULT" (->),
"RESONANCE" (<->),
"RETROSYNTHETIC" (=>),
"EQUILIBRIUM" (=)
Children:
- <agentList> - List of agents in this reaction.
- <Rgroup rgroupID="rgroupID">
- R-group.
- <molecule> - an R-group member in
CML
- <MMoleculeMovie>
- animation of a chemical process
Children:
- <MPolyline>
- line, arc, polyline and/or arrow.
Attributes:
- headSkip, tailSkip - Distance of (visible) head
or tail from the corresponding line end point.
- headWidth, tailWidth - Arrow head/tail width.
- headLength, tailLength - Arrow head/tail length.
- headFlags, tailFlags - Arrow head/tail options.
- arcAngle - Arc central angle or 0.
Children:
- <MPoint x="x"
y="y"
[z="z"]>
- Represents a location in space
- <MAtomSetPoint atomRefs="..."
[weights="..."]>
- Represents an atom or atom pair
(bond or incipient bond). The atomRefs argument is a space
separated list of atoms, weights is a space separated list
of floating point numbers. The atom set's location is the
weighted average of the atom locations.
- <MMidPoint lineRef="lineRef"
[pos="position"]>
- Middle point of a line or a section of a polyline.
Attributes:
- lineRef - Reference to the MPolyline object.
- pos - Polyline section index (0, ..., n-1),
default: 0.
- <MRectanglePoint pos="position"
rectRef="rectRef">
- A corner of a rectangle or a middle point of one of its edges.
Attributes:
- pos - Integer value between 0 and 7.
Top left corner=0, top right corner=1,
bottom right corner=2, bottom left corner=3,
top middle point=4, right middle point=5,
bottom middle point=6, left middle point=7.
- rectRef - Reference to the MRectangle object.
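The location of an MAtomSetPoint, described above as the weighted average of the atom locations, can be computed as in this sketch. Two assumptions are made that the text does not confirm: when weights is omitted all atoms count equally, and the weighted sum is divided by the total weight. The function name and the coordinate table are hypothetical.

```python
def atom_set_location(atom_refs, coords, weights=None):
    """Weighted average of atom locations for an MAtomSetPoint.

    atom_refs -- space separated atom ids, as in the atomRefs attribute
    coords    -- mapping from atom id to an (x, y, z) tuple
    weights   -- space separated floats, as in the weights attribute
    """
    refs = atom_refs.split()
    if weights is None:
        # Assumption: equal weights when the attribute is absent
        values = [1.0] * len(refs)
    else:
        values = [float(w) for w in weights.split()]
    if len(values) != len(refs):
        raise ValueError("atomRefs and weights differ in length")
    total = sum(values)
    return tuple(
        sum(w * coords[ref][axis] for w, ref in zip(values, refs)) / total
        for axis in range(3)
    )


positions = {"a1": (0.0, 0.0, 0.0), "a2": (2.0, 4.0, 0.0)}
bond_middle = atom_set_location("a1 a2", positions)
near_a2 = atom_set_location("a1 a2", positions, "0.25 0.75")
```

With equal weights an atom pair maps to the middle of the bond; unequal weights move the point toward the heavier atom.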
- <MEFlow> - curved electron flow arrow.
MEFlow is a subclass of MPolyline thus it has the same attributes,
but it can only contain two points.
Children:
- <MRectangle [toption="toption"]
[tcenter="tcenter"]> - rectangle object
Optional attributes:
- toption
- Transformation option:
ALLOW_ALL (default, all transformations are allowed),
NOROT (only scaling is allowed, the rectangle is not rotatable).
- tcenter - Central point:
NW (top left corner), NE (top right corner),
SE (bottom right corner), SW (bottom left corner),
CENTER (geometrical center),
N (top middle point), E (right middle point),
S (bottom middle point), W (left middle point).
Children:
- <MPoint x="x1"
y="y1"
[z="z1"] />
- Top left corner.
- <MPoint x="x2"
y="y2"
[z="z2"] />
- Top right corner.
- <MPoint x="x3"
y="y3"
[z="z3"] />
- Bottom right corner.
- <MPoint x="x4"
y="y4"
[z="z4"] />
- Bottom left corner.
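The corner order of MRectangle (top left, top right, bottom right, bottom left) lines up with the MRectanglePoint position codes 0 to 7, so a position can be resolved from the four MPoint children. The following sketch uses a hypothetical helper, rectangle_point, and plain coordinate tuples instead of parsed elements.

```python
def rectangle_point(corners, pos):
    """Resolve an MRectanglePoint position against four rectangle corners.

    corners -- [top left, top right, bottom right, bottom left]
    pos     -- 0..3 select a corner, 4..7 select the middle point of the
               top, right, bottom and left edge respectively
    """
    if not 0 <= pos <= 7:
        raise ValueError("position must be between 0 and 7")
    if pos < 4:
        return corners[pos]
    # Edge k (k = pos - 4) runs from corner k to the next corner, wrapping around
    start = corners[pos - 4]
    end = corners[(pos - 3) % 4]
    return tuple((s + e) / 2 for s, e in zip(start, end))


corners = [(0, 2), (4, 2), (4, 0), (0, 0)]
top_middle = rectangle_point(corners, 4)
left_middle = rectangle_point(corners, 7)
```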
- <MTextBox [text="The text"]
[fontScale="fontScale"]>
- text object.
It is derived from MRectangle, thus it inherits the
toption and
tcenter attributes.
Extra attributes:
- text - The text. This attribute can only be used to define a
single text line. In case of multiple lines the
<Field name="text"> tag must be used.
The string can contain \uXXXX Unicode escapes. Backslash characters
are always escaped with another backslash.
- fontScale - Base font size (default: 10)
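The escaping rules of the text attribute can be undone with a single left-to-right pass, as in this sketch. The name decode_text is hypothetical, and the sketch assumes that only the two sequences named above (a doubled backslash and \uXXXX) occur; any other text is passed through unchanged.

```python
import re

# Matches either an escaped backslash or a \uXXXX escape.
_ESCAPE = re.compile(r"\\(\\|u[0-9a-fA-F]{4})")


def decode_text(value):
    """Decode the backslash escapes of an MTextBox text attribute."""
    def replace(match):
        token = match.group(1)
        if token == "\\":
            return "\\"                  # doubled backslash -> one backslash
        return chr(int(token[1:], 16))   # uXXXX -> the character itself

    return _ESCAPE.sub(replace, value)


degrees = decode_text("50 \\u00b0C")
path = decode_text("C:\\\\temp")
```

Scanning from the left matters: a doubled backslash followed by "u0041" decodes to a literal backslash and the text "u0041", not to the letter "A".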
Children:
- <Field name="text">The text
</Field>
- <MPoint x="x1"
y="y1"
[z="z1"] />
- Top left corner.
- <MPoint x="x2"
y="y2"
[z="z2"] />
- Top right corner.
- <MPoint x="x3"
y="y3"
[z="z3"] />
- Bottom right corner.
- <MPoint x="x4"
y="y4"
[z="z4"] />
- Bottom left corner.
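The element hierarchy described in this section can be sketched by parsing a small hand-written document. The sample below is illustrative only: the atom and text values are invented, the array form of atomArray assumes CML's whitespace-separated columns, and placing MTextBox beside MChemicalStruct under MDocument is an assumption about the nesting.

```python
import xml.etree.ElementTree as ET

MRV = """<cml>
  <MDocument>
    <MChemicalStruct>
      <molecule>
        <atomArray atomID="a1 a2" elementType="C O"/>
        <bondArray>
          <bond atomRefs2="a1 a2" order="1"/>
        </bondArray>
      </molecule>
    </MChemicalStruct>
    <MTextBox fontScale="10">
      <Field name="text">methanol</Field>
      <MPoint x="-2" y="3"/>
      <MPoint x="2" y="3"/>
      <MPoint x="2" y="2"/>
      <MPoint x="-2" y="2"/>
    </MTextBox>
  </MDocument>
</cml>"""


def outline(element, depth=0):
    """Return (depth, tag) pairs for an element and all its descendants."""
    rows = [(depth, element.tag)]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows


root = ET.fromstring(MRV)
rows = outline(root)
label = root.find("MDocument/MTextBox/Field").text
```

Printing each row indented by its depth gives a quick view of how a record is nested.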
Export options
See the basic export options.