The Format object provides the concept of output formatting for dynamic content contained in Variable, Expression, Script, Collection, and Layer objects, which are presented to the caller in an Output object. It offers various formatting capabilities for voice and video applications based on text-to-speech (TTS) synthesis as well as text-to-audio (TTA) processing using prerecorded audio files.
Custom algorithms, including ones for formatting in video, text, and Web applications, can be added using the Formatting bus. Examples are custom phone number or date formatting, or account number formatting in a banking application. For more information on how to embed custom formatting algorithms in a VoiceObjects installation, refer to Appendix B – How to Use the Formatting Bus in the Administration Guide.
The Object Definition below covers the configuration of the Format object with VoiceObjects Desktop. For information on how to define this object type using VoiceObjectsXML, refer to the VoiceObjectsXML Definition paragraph.
The Format object belongs to the object category Resources.
The following dialog flow example as part of a flight information service demonstrates an Output object containing two prerecorded Audio snippets and two Variable objects to announce the terminal and gate information for a particular flight. The two Variable objects containing the terminal and gate data are played back with text-to-audio processing.
Object | Dialog Flow
Audio | [ Your flight will leave on time out of terminal: ]
Variable | [ B ]
Audio | [ gate: ]
Variable | [ fifty five ]
The second dialog flow example uses an Output object to read back a phone number (123 456 7890), which has been provided by the caller in an earlier dialog step and is stored in a Variable object. The assigned Format object reads back the phone number to the caller digit by digit in four blocks with a short pause in between, using text-to-speech synthesis.
Object | Dialog Flow
Text | I have the following phone number:
 | one two three
The Definition of the Format object provides one section:
· Format

For further details regarding additional object configuration, refer to Properties in this Object Reference.
In the Format section, define how the formatted content is played back within an Output object. To be able to specify different formatting depending on channel and potentially language, define separate Format Items and set the layers correspondingly, i.e. Language, Channel, and Layer.
The Format object natively supports formatting for the voice channel only. Two different categories of formatting types are available:
· Text-To-Speech – (TTS) means that the media platform will process the formatted content with text-to-speech synthesis.
· Text-To-Audio – (TTA) means that the server will transform the formatted content into a sequence of references to audio files.
Note: Using the special formatting type TTA-XML you can define your own markup to be rendered by the server. This way you could instruct the server to also render corresponding formatting code for the video, text or Web channel, e.g. a <video> element for video applications, or a <b> element for Web applications.
The drop-down list of the Type field offers various choices for both formatting categories. Each list entry has either the prefix TTA or TTS. The default formatting type is set to plain text. In the voice and video channel, the text is played back through TTS; in the text and Web channel, the text is displayed as is, with no special formatting.
A description of the selected format type is shown in a text field right below the Type field.

Refer to Text-To-Speech Formatting Types and Text-To-Audio Formatting Types for further details on the provided algorithms.
The optional Options field provides more precise TTS formatting for content types that can be expressed in diverse formats (e.g. date information). For example when using the formatting type TTS – Date the options string dmy specifies that the relevant date value will be presented in the order day, month and year. Refer to Formatting options for further details.
In case the object to be formatted is a Variable object and the type is one of the predefined TTA types, the pronunciation value of the variable will be taken as its value, not the value itself. Only if no pronunciation pattern is defined for the variable is the value taken. For more information on associating pronunciation patterns with a Variable object, refer to Input in this Object Reference.
The following five fields are only required when using a TTA formatting type to specify how and where the relevant audio files are stored and can be fetched from at call time.

The Audio location specifies the location of the relevant audio files (using a Resource Locator object).
The File prefix specifies a prefix that is applied to the filename. The file prefix can also be provided dynamically during call time by assigning a Variable, Expression, Layer, or Script object.
The File suffix specifies a suffix that is applied to the filename. The file suffix can also be provided dynamically during call time by assigning a Variable, Expression, Layer, or Script object.
The Random variations field defines whether randomizing is enabled for the audio files. To enable randomizing, select a number from the drop-down list. If randomizing is enabled and multiple audio files are provided whose filenames differ only by an appended number, the server randomly picks one of the available files by appending a random number between 1 and the selected number whenever the Audio object is used. True randomizing ensures that all available audio files are played before a repetition occurs. By default, the field is set to Disabled, indicating that no randomizing is used. For more information on randomizing, see the Output object in this Object Reference.
The File extension specifies the audio file extension. Instead of selecting one of the provided file extensions from the drop-down list, the extension can also be supplied by using a Variable, Expression, Layer, or Script object.
The complete filename for each audio file is generated at call time according to the following formula:
[prefix] + file + [suffix] + [random number] + [extension]
The specified TTA algorithm generates the parameter file in the formula.
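As a rough illustration of the formula above, the following Python sketch assembles a filename from its parts and picks random variation numbers so that every number is used once before any repetition occurs. The names `build_tta_filename` and `random_variations` and the file name `welcome` are hypothetical and not part of VoiceObjects; the extension is assumed to be passed with its leading dot, and the shuffling strategy is only one possible way to obtain the described behavior.

```python
import random
from itertools import islice
from typing import Iterator, Optional


def build_tta_filename(file: str, prefix: str = "", suffix: str = "",
                       random_number: Optional[int] = None,
                       extension: str = ".wav") -> str:
    """Concatenate [prefix] + file + [suffix] + [random number] + [extension]."""
    # Optional parts are simply left out when they are not configured.
    number = "" if random_number is None else str(random_number)
    return f"{prefix}{file}{suffix}{number}{extension}"


def random_variations(count: int,
                      rng: Optional[random.Random] = None) -> Iterator[int]:
    """Yield numbers 1..count; each appears once before any number repeats."""
    rng = rng or random.Random()
    while True:
        bag = list(range(1, count + 1))
        rng.shuffle(bag)          # new random order for every full cycle
        yield from bag


# Three variations of a hypothetical recording, played in random order.
for number in islice(random_variations(3), 3):
    print(build_tta_filename("welcome", random_number=number))
```

In this sketch the random number is placed between the suffix and the extension, matching the order given in the formula.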
The three fields at the bottom of the editor specify the following properties:

The Intermediate silence field lets you play back a silence phase of a specified duration between consecutive audio files when using TTA processing. For example, when playing back a lengthy confirmation number, a silence phase of 250 ms can be added between the digits.
The drop-down list offers a set of predefined silence durations. Alternatively, a Variable, Expression, Layer, or Script object can be used to specify the silence phase at call time. The value must be numeric and is interpreted in seconds, so 250 ms is given as 0.25.
The optional Value substitution field lets you specify a Collection object, which substitutes the formatted value with another value before presenting it to the caller. This mechanism can be used to transform an internal code or abbreviation into a complete name that makes sense to the caller, e.g. to substitute the state abbreviation CA with the full name California. Value substitution can be used for all TTS and TTA formatting types, including Default (Text). In both cases, the substitution is performed first and then subsequent formatting is applied to the resulting substituted value.
Refer to Value substitution collection for further details.
Finally the Custom separators field is used to overwrite the default set of separator tokens, which are used to split the formatted content into snippets (e.g. words and/or numbers when using TTA – Words/Numbers). The list of custom separator tokens needs to be concatenated without any separating character, e.g. use “_,” to split the formatted content at each occurrence of underscore and comma. The default set of separator tokens contains: White space, dot, comma, semicolon, colon, dash, underscore, forward slash and backslash.
The formatting type TTS without any specific instruction is the standard formatting setting. Several choices are available if the kind of content to be formatted is known at design time (e.g. a time designation) and can therefore be mapped onto one of the listed formatting capabilities for the best pronunciation and quality using text-to-speech synthesis:
· TTS – Acronym The formatted value is an acronym like IBM and is spoken as eye bee em.
· TTS – Currency Amount The formatted value is a currency amount. For example $22 is spoken as twenty two dollars.
· TTS – Date The formatted value is a partial or complete date. For example, the value 10/2001 is spoken as October, two thousand and one when using the TTS format my. The supported date formats indicate the form of the value and are usually dmy, mdy, ymd, ym, my, md, y, m and d. The desired output format can be specified in the Options field (see further below).
· TTS – Digits The formatted value is a sequence of digits. For example 12 is spoken as one two.
· TTS – Duration The formatted value is a temporal duration. For example the value 10 is spoken as ten hours when using the TTS format h. The supported duration formats indicate the form of the value and are usually hms, hm, ms, h, m and s. The desired output format can be specified in the Options field (see further below).
· TTS – E-mail Address The formatted value is an Internet e-mail address. For example name@company.com is spoken as name at company dot com.
· TTS – Literal The formatted value is a sequence of characters. For example Smith is spoken as individual characters ess em eye tea aitch.
· TTS – Measurement The formatted value is a measurement. For example 10ft is spoken as ten feet.
· TTS – Name The formatted value is a name. For example VoiceObjects is spoken as voice objects.
· TTS – Cardinal Number The formatted value is a cardinal number. For example 12 is spoken as twelve.
· TTS – Ordinal Number The formatted value is an ordinal number. For example 12 is spoken as twelfth.
· TTS – Postal Address The formatted value is a text describing a postal address with street, city, state, and zip code separated by commas. For example 8080 Leesburg Pike, Reston, VA, 20190 is spoken as eighty eighty <pause> Leesburg Pike <pause> Reston <pause> Virginia <pause> two zero one nine zero.
· TTS – Telephone Number The formatted value is a telephone number. For example the phone number 123 456 7890 is spoken as one two three <pause> four five six <pause> seven eight <pause> nine zero.
· TTS – Time The formatted value is a partial or complete time designation. For example the value 8 is spoken as eight o'clock when using the TTS format h. The supported time formats indicate the form of the value and are usually hms, hm and h. The desired output format can be specified in the Options field (see further below).
· TTS – URI The formatted value is an Internet identifier such as www.company.com. It will be spoken out as double u double u double u dot company dot com.
Note: Some of the certified media platforms may not support all of the listed TTS types. The support can also differ depending on the integrated TTS engine or on the language used. Further details can be found in the corresponding platform documentation. A few of the certified media platforms support various additional TTS formatting capabilities, e.g.:
· Airport The formatted value is an airport code. For example CGN is spoken as Cologne.
· Airline The formatted value is an airline code. For example LH is spoken as Lufthansa.
· Equity The formatted value is an equity symbol. For example CSCO is spoken as Cisco Systems.
The output type can also be defined by referencing a Variable, Expression, Layer, or Script object instead of selecting a predefined formatting type. The provided string will be passed into the <say-as> element of the generated markup code (e.g. VoiceXML). Here is an example of providing the string MyOutputType; the rendering depends on the supported VoiceXML standard as defined in the media platform driver of VoiceObjects Server:
· <sayas> element with the class attribute (VoiceXML 1.0 Specification - March 2000)
<sayas class="MyOutputType">
[Formatted value]
</sayas>
· <say-as> element with the type attribute (VoiceXML 2.0 Specification - October 2001)
<say-as type="MyOutputType">
[Formatted value]
</say-as>
· <say-as> element with the interpret-as attribute (VoiceXML 2.0 Specification - March 2004)
<say-as interpret-as="MyOutputType">
[Formatted value]
</say-as>
If required, an additional TTS options string can be supplied (refer to Formatting options below). This string is usually concatenated with the formatting type, using a colon in between. For example, the TTS options string MyFormat together with the formatting type string MyOutputType will be rendered in the following way:
· <sayas> element with the class attribute (VoiceXML 1.0 Specification - March 2000)
<sayas class="MyOutputType:MyFormat">
[Formatted value]
</sayas>
· <say-as> element with the type attribute (VoiceXML 2.0 Specification - October 2001)
<say-as type="MyOutputType:MyFormat">
[Formatted value]
</say-as>
· <say-as> element with the interpret-as attribute but not supporting the format attribute (VoiceXML 2.0 Specification - March 2004)
<say-as interpret-as="MyOutputType:MyFormat">
[Formatted value]
</say-as>
When the media platform supports the format attribute within the VoiceXML <say-as> element, the TTS options string will be assigned to the format attribute and not concatenated with the string for the output type.
· <say-as> element with the interpret-as attribute and support for the format attribute (VoiceXML 2.0 Specification - March 2004)
<say-as interpret-as="MyOutputType" format="MyFormat">
[Formatted value]
</say-as>
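The rendering variants described above can be summarized in a short Python sketch. This is not the media platform driver code of VoiceObjects Server; the function `render_say_as`, the dialect keys (`vxml10`, `vxml20-2001`, `vxml20-2004`) and the `supports_format` flag are hypothetical names chosen for illustration only.

```python
from xml.sax.saxutils import escape, quoteattr


def render_say_as(value: str, output_type: str, options: str = "",
                  dialect: str = "vxml20-2004",
                  supports_format: bool = False) -> str:
    """Render the say-as markup for one of the variants listed above."""
    # Element and attribute name per specification version.
    element, attribute = {
        "vxml10": ("sayas", "class"),
        "vxml20-2001": ("say-as", "type"),
        "vxml20-2004": ("say-as", "interpret-as"),
    }[dialect]

    attrs = {attribute: output_type}
    if options:
        if dialect == "vxml20-2004" and supports_format:
            # The platform understands the format attribute: keep options separate.
            attrs["format"] = options
        else:
            # Otherwise the options are appended to the type, separated by a colon.
            attrs[attribute] = f"{output_type}:{options}"

    rendered = " ".join(f"{name}={quoteattr(val)}" for name, val in attrs.items())
    return f"<{element} {rendered}>{escape(value)}</{element}>"


print(render_say_as("12", "MyOutputType", dialect="vxml10"))
# <sayas class="MyOutputType">12</sayas>
print(render_say_as("12", "MyOutputType", "MyFormat", supports_format=True))
# <say-as interpret-as="MyOutputType" format="MyFormat">12</say-as>
```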
The following three formatting types offer additional formatting options:
TTS – Date
Date information can be provided as a partial date (e.g. only year and month) or as a complete date (containing year, month, and day). In either case the TTS processor needs to know what to do with each part of the information to produce a suitable output to the caller. For example the date information 01/02/03 is ambiguous. By giving the TTS processor the formatting hint dmy, which translates into day, month, and year, the date can be synthesized to first of February two thousand and three. The following formatting options are provided when using a date designation:
· dmy day, month and year
· mdy month, day and year
· ymd year, month and day
· ym year and month
· my month and year
· md month and day
· y year only
· m month only
· d day only
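To make the role of the options string more concrete, the following Python sketch shows how a hint such as dmy or mdy assigns a meaning to each numeric part of an otherwise ambiguous date. It only illustrates the idea; the actual interpretation is performed by the TTS engine of the media platform, and the function name `label_date_parts` is hypothetical. Expanding a two-digit year such as 03 into 2003 is left to the TTS engine and is not modeled here.

```python
import re

FIELD_NAMES = {"d": "day", "m": "month", "y": "year"}


def label_date_parts(value: str, options: str) -> dict:
    """Assign each numeric part of value to the field named by the options string."""
    parts = re.findall(r"\d+", value)
    if len(parts) != len(options):
        raise ValueError(f"{value!r} does not have {len(options)} parts for {options!r}")
    # The n-th letter of the options string names the n-th part of the value.
    return {FIELD_NAMES[letter]: int(part) for letter, part in zip(options, parts)}


print(label_date_parts("01/02/03", "dmy"))  # {'day': 1, 'month': 2, 'year': 3}
print(label_date_parts("01/02/03", "mdy"))  # {'month': 1, 'day': 2, 'year': 3}
print(label_date_parts("10/2001", "my"))    # {'month': 10, 'year': 2001}
```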
TTS – Duration
If the formatted content is a duration value, it can be provided in several ways such as the number of hours (5) or the number of minutes and seconds (6:21). By giving the TTS processor the formatting hint ms, which translates into minutes and seconds, the duration 6:21 can be synthesized to six minutes and twenty-one seconds. When providing the formatting hint hm, it is synthesized to six hours and twenty-one minutes. The following formatting options are provided when using a duration value:
· hms hours, minutes and seconds
· hm hours and minutes
· ms minutes and seconds
· h hours only
· m minutes only
· s seconds only
TTS – Time
A time value can also be provided as partial or complete information. The following formatting options are available when using a time designation:
· hms hours, minutes and seconds
· hm hours and minutes
· h hours only
The Options field can either be specified as a constant string or via the value of a Variable, Expression, Layer, or Script object, which gets evaluated at call time.
i8 Note: In all cases the content of the Options field should be provided in lower case notation.
Text-to-audio (TTA) is an integrated formatting mechanism of VoiceObjects Server that converts data in its raw format to a sequence of references to prerecorded audio resources. The formatting type TTA provides six different ways to convert the formatted data content:
· TTA - Literals/Digits The formatted content is a sequence of characters and/or digits. Each character or digit is played via a separate audio file. The audio files are named with the corresponding character or digit (e.g. a.wav or 1.wav).
· TTA - Literals/Digits (ASCII) Each literal or digit of the formatted content is played back to the caller as a separate audio resource. The audio files are named with the corresponding 3-digit decimal ASCII code (e.g. 097.wav for character a or 049.wav for digit 1).
· TTA - Words/Numbers The formatted content is chopped into words and number blocks where each of them is played back as a separate audio resource. The files are named according to word or number block (e.g. airbus.wav or 380.wav).
· TTA - Complete Value The complete formatted content is played back as a single audio resource. The file is named according to content itself where white spaces are replaced with an underscore (e.g. airbus_a_380.wav).
· TTA - Files The formatted content is expected to be a single audio file or a white space separated list of audio files (e.g. airbus_a_380.vox boeing_747.vox boeing_777.vox).
· TTA - XML The formatted content is expected to be an XML string containing a set of audio files with optional alternative text representations. Each audio resource will be played to the caller in the order of the XML structure.
This special formatting type also allows defining markup to be rendered, so it can also be used to define formatting for the video, text and Web channel.
Tip: You can create your own formatting algorithms and embed them in VoiceObjects Desktop and VoiceObjects Server. See Appendix B – How to Use the Formatting Bus in the Administration Guide for more information on how to do this.
If you are using sites and have set up a site prefix (Resource Locator URL) for external resources in your installation, this prefix will be applied to all TTA types. Custom algorithms embedded through the Formatting bus will not use the site prefix, though, unless the Format object defines an audio location. For more information on the site settings, refer to Chapter 3 – User Management - Managing Sites in the Administration Guide.
Note: All audio files for TTA processing must be of a single audio type. Mixing wav and vox files in one playback sequence, for example, is not supported.
The TTA processing uses the rendered audio filename as the alternative text definition for TTS processing. If a missing audio file is requested by the media platform the corresponding filename will be played back via TTS. When using the TTA functionality it is very helpful during the development phase to specify a maintainer e-mail address in the corresponding Service object. In this case the media platform will send an e-mail to this address indicating the missing audio file.
In case of Variable objects, all predefined TTA algorithms use the pronunciation value of the variable, not the real value. Only if no pronunciation pattern is defined for the variable is the value taken. For more information on this, refer to Input in this Object Reference.
The following rules are applied when generating audio filenames character by character or digit by digit:
· Variable content is converted to lower case representation.
· Leading and trailing white spaces are removed.
· All white spaces and special characters not in [a-z] or [0-9] are removed.
· For each literal or digit a corresponding filename is generated according to the following formula:
[prefix] + [literal/digit] + [suffix] + [extension]
Examples for generated filename sequences are:
Formatted Content | Prefix | Suffix | Extension | Filename Sequence
Hello world! | | | .wav | h.wav
- Hello world! - | 8kHz_ | | .wav | 8kHz_h.wav
S F O | | | .vox | s.vox
Müller | | | .wav | m.wav
(0)171-549-0087 | | en | .wav | 0en.wav
+49 2205 845 100 | de/ | | .wav | de/4.wav
Alternative text for each individual audio file is the corresponding original literal or digit.
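The character-by-character rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the server implementation: the function name `tta_literals_digits` is hypothetical, and the extension is assumed to include its leading dot.

```python
import re


def tta_literals_digits(content: str, prefix: str = "", suffix: str = "",
                        extension: str = ".wav") -> list:
    """Return one filename per remaining character or digit."""
    # Lower-case the content, then drop everything that is not a-z or 0-9.
    # This also removes leading, trailing and inner white space.
    cleaned = re.sub(r"[^a-z0-9]", "", content.lower())
    return [f"{prefix}{char}{suffix}{extension}" for char in cleaned]


print(tta_literals_digits("S F O", extension=".vox"))
# ['s.vox', 'f.vox', 'o.vox']
print(tta_literals_digits("Müller"))
# ['m.wav', 'l.wav', 'l.wav', 'e.wav', 'r.wav']
```

Note that characters outside [a-z] and [0-9], such as the ü in Müller, produce no audio file at all under these rules.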
Use cases for this processing include:
· Phone number
· Credit card number
· Account number
· Customer ID
· PIN
· Zip code
· Social security number
· Digits
· Spelling
The following rules are applied when generating audio filenames literal by literal or digit by digit by converting them to ASCII decimal code (3-digits):
· Variable content is converted to lower case representation.
· All white spaces are removed.
· For each literal or digit a corresponding filename is generated according to the following formula:
[prefix] + [ASCII code] + [suffix] + [extension]
Examples for generated filename sequences are:
Formatted Content | Prefix | Suffix | Extension | Filename Sequence
Hello world! | | | .wav | 104.wav [h]
S F O | ascii_ | | | ascii_115 [s]
Müller | | | .wav | 109.wav [m]
+49-(0)171-5490087 | | en | .wav | 043en.wav [+]
Alternative text for each individual audio file is the corresponding original literal or digit.
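A minimal Python sketch of the ASCII variant follows. The function name `tta_ascii` is hypothetical, and the sketch does not treat characters outside the ASCII range (such as ü) specially, since the rules above do not say how they are handled.

```python
def tta_ascii(content: str, prefix: str = "", suffix: str = "",
              extension: str = ".wav") -> list:
    """Return one filename per character, named by its 3-digit decimal code."""
    # Lower-case the content and remove every white-space character.
    cleaned = "".join(content.lower().split())
    # ord() yields the character code; :03d pads it to three digits (a -> 097).
    return [f"{prefix}{ord(char):03d}{suffix}{extension}" for char in cleaned]


print(tta_ascii("S F O", prefix="ascii_", extension=""))
# ['ascii_115', 'ascii_102', 'ascii_111']
print(tta_ascii("a1"))
# ['097.wav', '049.wav']
```

Unlike the plain Literals/Digits variant, special characters are kept here, so a leading + in a phone number is mapped to the code 043.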
Use cases for this processing include:
· Phone number
· Credit card number
· Account number
· Customer ID
· PIN
· Zip code
· Social security number
· Digits
· Spelling
The following rules are applied when generating audio filenames for each single word or number:
· Variable content is converted to lower case representation.
· Leading and trailing white spaces are removed.
· White space, dot, comma, semicolon, colon, dash, underscore, forward slash and back slash are used by default as separators to split the content into words and numbers. The set of separators can be customized using the Custom separators field.
· All remaining special characters not in [a-z] or [0-9] are removed.
· For each word or number a corresponding filename is generated according to the following formula:
[prefix] + [word / number] + [suffix] + [extension]
Examples for generated filename sequences are:
Formatted Content | Prefix | Suffix | Extension | Filename Sequence
Hello world! | | | .wav | hello.wav
“ Hello world! ” | | _DE | .mp3 | hello_DE.mp3
- Hello world! - | en/ | | .vox | en/hello.vox
gate D::47 | | | .wav | gate.wav
http://www.company.com | us. | | .au | us.http.au
“ Airbus: A-380 ” | | | .wav | airbus.wav
Munich or Hamburg | | | .wav | munich.wav
CA, FL, VA and TX | | | .wav | ca.wav
Joe, Tom or optionally Ben | | | .wav | joe.wav
(0)171-264-5597 | | | .wav | 0171.wav
IP: 255.255.255.0 | | | .wav | ip.wav
Alternative text for each individual audio file is the corresponding original word or number.
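The word-by-word rules, including the effect of the Custom separators field, can be sketched as follows. The function name `tta_words_numbers` and the constant `DEFAULT_SEPARATORS` are hypothetical; the sketch assumes that runs of consecutive separators are treated as a single split point and that tokens left empty after cleaning are skipped.

```python
import re

# Default separator tokens: white space, dot, comma, semicolon, colon, dash,
# underscore, forward slash and backslash.
DEFAULT_SEPARATORS = " \t\r\n.,;:-_/\\"


def tta_words_numbers(content: str, prefix: str = "", suffix: str = "",
                      extension: str = ".wav",
                      separators: str = DEFAULT_SEPARATORS) -> list:
    """Return one filename per word or number block."""
    text = content.lower().strip()
    # Split at every run of separator characters.
    tokens = re.split("[" + re.escape(separators) + "]+", text)
    # Remove remaining special characters; skip tokens that end up empty.
    words = [re.sub(r"[^a-z0-9]", "", token) for token in tokens]
    return [f"{prefix}{word}{suffix}{extension}" for word in words if word]


print(tta_words_numbers("gate D::47"))
# ['gate.wav', 'd.wav', '47.wav']
print(tta_words_numbers("a_b,c d", separators="_,"))
# ['a.wav', 'b.wav', 'cd.wav']
```

In the second call the custom separator string replaces the default set, so the space no longer splits the content and is instead removed as a special character.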
Use cases for this processing include:
· Person, company, product and property names
· Country, city and state names
· Flight gate or railway track number
· All kinds of codes (state, equity, currency, airline, airport, etc.)
· Predefined set of Internet addresses
· Predefined set of phone numbers
The following rules are applied when generating audio filenames for the complete content:
· Variable content is converted to lower case representation.
· Leading and trailing white spaces are removed.
· White space, dot, comma, semicolon, colon, dash, underscore, forward slash and back slash are used by default as separators to split the content into words and numbers. The set of separators can be customized using the Custom separators field.
· All remaining special characters not in [a-z] or [0-9] are removed.
· All separators are replaced by an underscore, where multiple separators are replaced by a single underscore.
· A corresponding filename is generated according to the following formula:
[prefix] + [modified content] + [suffix] + [extension]
Examples for generated filename sequences are:
Formatted Content | Prefix | Suffix | Extension | Filename Sequence
Hello world! | 8kHz_ | | .wav | 8kHz_hello_world.wav
“ Hello world! ” | | _DE | .mp3 | hello_world_DE.mp3
- Hello world! - | en/ | | .vox | en/hello_world.vox
New York City | | | .wav | new_york_city.wav
gate D::47 | | | .wav | gate_d_47.wav
Airbus: A-380 | | | .wav | airbus_a_380.wav
S F O | us. | | .au | us.s_f_o.au
Müller | | | .wav | mller.wav
Alternative text for the audio file is the original unmodified content.
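As a sketch of the complete-value rules, the following Python function produces a single filename for the whole content. The name `tta_complete_value` is hypothetical, and the handling of leading and trailing separators (they are dropped rather than turned into underscores) is an assumption that happens to reproduce the examples above.

```python
import re

# Default separator tokens: white space, dot, comma, semicolon, colon, dash,
# underscore, forward slash and backslash.
DEFAULT_SEPARATORS = " \t\r\n.,;:-_/\\"


def tta_complete_value(content: str, prefix: str = "", suffix: str = "",
                       extension: str = ".wav",
                       separators: str = DEFAULT_SEPARATORS) -> str:
    """Return a single filename for the complete formatted content."""
    text = content.lower().strip()
    tokens = re.split("[" + re.escape(separators) + "]+", text)
    # Remove remaining special characters from every token.
    words = [re.sub(r"[^a-z0-9]", "", token) for token in tokens]
    # Joining the non-empty words with "_" replaces each run of separators
    # by exactly one underscore.
    return prefix + "_".join(word for word in words if word) + suffix + extension


print(tta_complete_value("Airbus: A-380"))                         # airbus_a_380.wav
print(tta_complete_value("S F O", prefix="us.", extension=".au"))  # us.s_f_o.au
print(tta_complete_value("Müller"))                                # mller.wav
```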
Use cases for this processing include:
· Person, company, product and property names
· Country, city and state names
· Flight gate or railway track number
· All kinds of codes (state, equity, currency, airline, airport, etc.)
The following rules are applied when the content already provides the reference of one or multiple audio files:
· Leading and trailing white spaces are removed.
· White space is used as separator to split the content into separate filenames or references.
· Each filename is processed as provided. Any specified TTA prefix, suffix or extension settings are ignored. Only the TTA locator (if provided) is used to generate each complete file reference.
Examples for generated filename sequences are:
Formatted Content | Filename Sequence
Hello.wav | Hello.wav
Hello-en-US.wav World-en-US.wav | Hello-en-US.wav
1.vox 7.vox 15.vox and.vox 22.vox | 1.vox 22.vox
de/CA.wav de/or.wav de/VA.wav | de/CA.wav
http://255.255.255.0/jingle.mp3 | http://255.255.255.0/jingle.mp3
file://MusicBox/track-14.v1.au | file://MusicBox/track-14.v1.au
Alternative text for each individual audio file is the corresponding filename.
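The following Python sketch illustrates the rules for content that already consists of file references. The function name `tta_files` is hypothetical, as is the locator URL used in the example. How the server combines the TTA locator with each reference is not specified here, so the sketch simply assumes that the locator is prepended as a plain string.

```python
def tta_files(content: str, locator: str = "") -> list:
    """Split the content into file references and prepend the locator, if any."""
    # str.split() without arguments trims both ends and splits on white space.
    # Prefix, suffix and extension settings are deliberately not applied.
    return [locator + name for name in content.split()]


print(tta_files("  Hello-en-US.wav World-en-US.wav "))
# ['Hello-en-US.wav', 'World-en-US.wav']
print(tta_files("de/CA.wav de/or.wav", locator="http://audio.example.com/"))
# ['http://audio.example.com/de/CA.wav', 'http://audio.example.com/de/or.wav']
```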
Use cases for this processing include:
· Ring tones
· Voice mail messages
· Music tracks
· All kinds of prerecorded news (business, weather, traffic, sports, etc.)
· Dating personals
· Recordings
When using TTA-XML, a custom sequence of audio files with optional alternative texts can be provided. Alternatively, custom markup code can be provided that is used as-is inside the code rendered by the server. This option can be used to play video instead of audio, or to make use of specific formatting capabilities provided by certain media platforms or browsers. It can also be used to embed Web formatting, like <b> or <center>, into the XHTML code produced by the server in the Web channel.
The XML format is as shown below.
Note: When using special characters as part of the filename or the alternative text definitions the provided content must be enclosed in CDATA elements.
|
<root> |
Any number of rows may be provided within the <root> element. Each row may contain a column named “markup”, a column named “objectReference”, or columns named “file” and (optionally) “text”.
In either case it is also allowed to provide a column named “status”, which contains a text message that is written to the service's error log file (see Chapter 2 – Configuring Servers and Services in the Deployment Guide for more details). It may be used to indicate error conditions, e.g. a wrongly formatted input string.
If a column named “markup” is provided, then its content is embedded into the code generated by the server. No processing or checking takes place on the markup code that is provided, so broken code inside a “markup” column may lead to invalid code being sent to the media platform or browser.
By using "markup" in the voice or video channel, proprietary code can be embedded into the rendered VoiceXML. Media platform vendors such as Genesys, Intervoice, Nortel, and Nuance, for instance, provide their own TTA-like algorithms to use through proprietary VoiceXML element extensions of <value> or <audio> (see, e.g., http://cafe.bevocal.com/docs/vxml/voices.html#279348). Another scenario is to embed video elements in this column, to play back content through a concatenation of video files.
If a column named “objectReference” is provided, its content is interpreted as the Reference ID of an object. This object is evaluated and the resulting markup code is inserted. Note that the referenced object must be one of Output, Audio, Video, Silence, Script, Variable, Collection, Expression, or Layer. Attempts to reference an object that cannot be evaluated lead to an Error-Internal. If the referenced object cannot be found, an error is logged but processing continues.
If a column named “file” and optionally a column named “text” is provided, then the following rule will be applied to the XML string containing files and optional alternative text representations:
· Each filename and alternative text is processed as provided. Any specified TTA prefix, suffix or extension settings are ignored. Only the TTA locator (if provided) is used to generate each complete file reference.
Note that the “file” column, when present, must not be empty, and that a “text” column may only be used in addition to a “file” column to define alternative text for that file.
Examples for generated filename sequences are:
|
Formatted Content |
Filename Sequence |
|
<root> |
|
|
<root> |
|
|
<root> |
|
|
<root> |
|
Examples for native markup code are:
|
Formatted Content |
|
|
|
|
|
|
|
|
The mechanism of value substitution is based on a Collection object, which is used as a substitution dictionary. The XML format of the collection requires a column named key that defines the lookup key, which must match the formatted value. A column named default can be provided to specify the corresponding substitution values. Alternatively (or in addition), multiple columns with language codes such as en-UK or fr-FR can be added to define the corresponding substitution values in a localized version.
The first example provides a substitution collection to replace all US state abbreviations with the full name. This example uses the column name default to define the substitution values:
|
<root> <row> |
The second example outlines a substitution collection that maps the numerical representation of a month (1-12) to the localized name in US English (using the column name en-US) and German (using the column name de-DE):
|
<root> . . . <row> |
Note: If a substitution collection is specified and the formatted value cannot be found in the key column, no substitution will take place. In this case the original value will be used when playing it back to the user via TTS or TTA processing. The same will happen when the dialog is processed in a particular language for which no column with the corresponding language code is defined.
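The lookup behavior can be illustrated with a small Python sketch in which the substitution collection is modeled as a list of rows, each row being a dictionary of column names to values. The function name `substitute` and the sample data are hypothetical. One point is an assumption rather than documented behavior: when a language column is missing but a default column exists, the sketch falls back to the default column.

```python
from typing import Dict, List


def substitute(value: str, rows: List[Dict[str, str]],
               language: str = "default") -> str:
    """Return the substitution for value, or value itself if none applies."""
    for row in rows:
        if row.get("key") == value:
            # Assumption: prefer the language column, then fall back to "default".
            for column in (language, "default"):
                if row.get(column):
                    return row[column]
            break
    # No matching key or no usable column: the original value is kept.
    return value


STATES = [{"key": "CA", "default": "California"},
          {"key": "VA", "default": "Virginia"}]
MONTHS = [{"key": "1", "en-US": "January", "de-DE": "Januar"}]

print(substitute("CA", STATES))            # California
print(substitute("TX", STATES))            # TX
print(substitute("1", MONTHS, "de-DE"))    # Januar
print(substitute("1", MONTHS, "fr-FR"))    # 1
```

The substituted value is then passed on to the selected TTS or TTA formatting type, since substitution is applied before formatting.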
The Format object is represented by the VoiceObjectsXML element <format>. It has no special attributes and one possible child <formatItem>.
In addition, the element has the standard attributes described in the XDK Guide.
· +<formatItem>
Defines the list of available Format items.
<format name="Digits">
<formatItem channel="voice" type="tts-digits"/>
<formatItem channel="textWeb" type="tts"/>
</format>
· label
A text string providing a name for the Format item.
· language
Defines the language for which this Format item is valid. Can be default or a valid language code (e.g. de-DE, en-US, etc.). If not specified, defaults to default.
Appendix A – Language Codes contains a list of all language codes available in VoiceObjects together with the respective language they represent.
· channel
Defines the channel(s) for which this Format item is valid. Can be default, voice, video, text, web, voiceVideo, or textWeb. If not specified, defaults to default.
· layer
Defines the layer for the Format item. Can either be a reference to a Collection, Expression, Script, or Variable object; or a layer state reference of the form “Layer=State” or “Layer!=State” where “State” is the label of a state for the layer “Layer”.
· type [required]
Defines the type of output formatting to be used. Legal values are tts, tts-acronym, tts-cardinal, tts-currency, tts-date, tts-digits, tts-duration, tts-email, tts-literals, tts-measurement, tts-name, tts-ordinal, tts-address, tts-phone, tts-time, tts-uri, tta-complete, tta-words, tta-files, tta-xml, tta-literals, tta-ascii. If not specified, defaults to tts.
If you want to use a custom TTA algorithm, set the type attribute to the name of the desired algorithm, which is set in the TTA definition file.
· options
Specifies options to be applied when processing the output formatting.
· audioLocation
Defines the location from where audio files are to be retrieved. Must be a reference to a resource locator.
· prefix
Defines a prefix that is attached to all audio files referenced in the formatted output. May be static text or a reference to a Variable, Expression, or Script object.
· suffix
Defines a suffix that is attached to all audio files referenced in the formatted output. May be static text or a reference to a Variable, Expression, or Script object.
· random
Indicates whether randomization should be used. Can be either disabled or an integer >= 2 specifying the number of versions in which the file is available. If not specified, defaults to disabled.
· extension
Specifies the file extension for audio files as either a constant value or a reference to a Variable, Expression, or Script object. Legal constant values are none, wav, aif, aiff, dwd, mp3, snd, au, voc, vox. If not specified, defaults to wav.
· silence
Defines the duration of a pause between every two consecutive audio files. Either disabled or a numerical value interpreted as seconds. If not specified, defaults to disabled. May be static text or a reference to a Variable, Expression, or Script object.
When importing an XDK application into the VoiceObjects Metadata Repository via VoiceObjects Desktop, this value must be one of those presented in the Desktop drop-down list (0.1, 0.25, 0.5, 0.75, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, 100, 110, 120), or an object reference.
· valueSubstitution
Reference to a Collection object defining a substitution table. This substitution is applied before processing the output formatting.
· separators
Defines a list of custom separators used when splitting data. May be static text or a reference to a Variable, Expression, or Script object.
<format name="Digits">
<formatItem channel="voice" type="tts-digits"/>
<formatItem channel="textWeb" type="tts"/>
</format>
The Format object is compatible with the following other objects which can contain dynamic content:
Object Name | Use Case Example
Collection | A Format object can be used within a Collection object to format the collection values.
Expression | A Format object can be used within an Expression object to format the evaluated expression result.
Script | A Format object can be used within a Script object to format the evaluated script result.
Variable | A Format object can be used within a Variable object to format the variable value.
Layer | A Format object can be used within a Layer object to format the active layer state(s).
In order to leverage the capabilities of the integrated documentation of VoiceObjects it is important to provide intuitive and self-explanatory object names and descriptions.
The name of a Format object should be as short as possible and at the same time precise enough that other users will know immediately what the formatting definition is good for.
The table below lists two examples as a general guideline:
Name | Description
 | Say content as phone number with text-to-speech processing.
 | Say content as phone number with text-to-audio processing. A silence of 250 ms is added between each phone number digit.