Keyword or Symbol
|
Description
|
/BOU
|
When this keyword occurs in a context string or cluster, it identifies the
beginning of the utterance. It is important to identify /BOU as the left
context of every state that may begin the grammar. In most cases this is
easily done by putting the /BOU keyword in the same cluster as the silence
model.
|
/EOU
|
When this keyword occurs in a context string or cluster, it identifies the
end of the utterance. It is important to identify /EOU as the right
context of every state that may end the grammar. In most cases this is
easily done by putting the /EOU keyword in the same cluster as the silence
model.
|
lexicon
|
The lexicon keyword at the beginning of a line, followed by a space and a
URI surrounded by angle brackets, is reserved for defining a pronunciation
dictionary of words. If this keyword is encountered, the pronunciations
for the words specified in the grammar are obtained from the file
indicated by the URI.
|
oneLevelExpansion
|
The oneLevelExpansion keyword at the beginning of a line, followed by a
space and either 0 or 1, is reserved for specifying how many levels of
rule expansion are applied: 1 for one level, 0 for many levels. The
default for oneLevelExpansion is 0, meaning that rules are applied until
expansion is no longer possible. See the comments in the description of
the variable oneLevelValue, above.
|
=
|
The
equals sign defines the boundary between a token name and its rule.
|
;
|
A semicolon defines the end of a rule in the file-based form of a
grammar. In the Tcl-list-based form of a grammar, the end of a rule is
defined by the end of the list item.
|
< ... >
|
Angle brackets have two special uses. First, if a URI is expected (based
on a preceding keyword), the brackets surround the URI. Second, in a
rule, brackets may specify that the preceding item is to be repeated a
number of times. (In this case, the brackets and their contents are
called a "repeat operator".) If the symbol inside the brackets is "*" or
"0-", the repetition occurs zero or more times. If the symbol inside the
brackets is "+" or "1-", the repetition occurs one or more times. Other
types of repeat operators in ABNF form are not yet supported. If the
symbols < ... > are not consistent with a repeat operator, the entire
token is treated as any other token. If they are consistent with a repeat
operator, a space does not have to separate the operator from the
preceding item. (For example, "$word<+>" and "$word <+>" are equivalent.)
|
//
|
When these characters occur in sequence in a grammar file, they mark the
beginning of a comment. The comment ends at the end of the line. Anything
within the comment is ignored by the Statenet package. If the grammar is
specified using Tcl lists, comments are not allowed.
|
/* ... */
|
These characters define a comment in a grammar file. The comment begins
with /* and ends with */. Anything within the comment is ignored by the
Statenet package. If the grammar is specified using Tcl lists, comments
are not allowed.
|
:=
|
These
symbols in sequence define the separation between a cluster name and a
sequence of tokens that are to be clustered.
|
->
|
These
symbols in sequence define the boundary between a token name and a
context-dependent rule.
|
::
|
These
symbols in sequence define the boundary between a context-dependent
rule and the context in which the rule is applied.
|
__
|
These symbols in sequence define the boundary between the left and right
context in a context-dependent rule.
|
( ... )
|
Parentheses define a grouping of tokens within a rule. They are useful
if more than one word is to be repeated using the repeat operator <*> or
<+>, or for grouping tokens with the "or" operator.
|
|
|
The
"or" operator specifies that the token (or grouping) to the left of the
| and the token (or grouping) to the right of the | have a
parallel structure in the state network. As an example of
grouping, the grammar "one two | three" is equivalent to "one (two |
three)", and different from "(one two) | three".
|
[ ... ]
|
Square
brackets identify whatever is within the brackets as optional.
|
\
|
The backslash may be used in front of any special character to prevent
the parser from interpreting it. For example, the SAMPA symbol for the
vowel in the word "bat" is {. This symbol is reserved in ABNF form. To
ensure that it is treated as a token and not parsed, precede it with a
backslash, e.g. \{.
|
$
|
The
dollar sign is NOT reserved. However, it is considered
good form to use token names that begin
with a dollar sign at the level of the top grammar.
|
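The repeat-operator rules in the < ... > entry can be illustrated with a short sketch. The Python below is not the Statenet package's parser. It is a minimal illustration, under the assumption that a repeat operator is a trailing <...> group whose contents are exactly one of the four supported symbols. The names split_repeat and REPEAT_MIN are hypothetical and exist only for this example.

```python
import re

# Repeat operators the table lists as supported, mapped to the minimum
# number of repetitions each implies (assumption: no upper bound).
REPEAT_MIN = {"*": 0, "0-": 0, "+": 1, "1-": 1}

# A trailing <...> group, optionally separated from the item by spaces.
_TRAILING_ANGLE = re.compile(r"^(?P<item>.*?)\s*<(?P<body>[^<>]*)>$")


def split_repeat(text):
    """Split 'item<op>' or 'item <op>' into (item, minimum_count).

    If the bracket contents are not a supported repeat operator, or
    there is no preceding item, return (text, None) so the caller can
    treat the whole string as an ordinary token.
    """
    text = text.strip()
    m = _TRAILING_ANGLE.match(text)
    if m and m.group("item") and m.group("body") in REPEAT_MIN:
        return m.group("item"), REPEAT_MIN[m.group("body")]
    return text, None
```

With this sketch, split_repeat("$word<+>") and split_repeat("$word <+>") both return ("$word", 1), matching the statement that the space is optional. An unsupported form such as "$word<2-5>" comes back unchanged with None, mirroring the rule that such text is treated as any other token.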