2010-05-29 by Takuya Nishimoto (nishimotz atmark gmail.com)

Brief description of this tutorial

In this tutorial we build a small voice interaction system (for Japanese) with Galatea Toolkit for Linux. We use the following tools.

  • Ubuntu Linux 8.10 / 9.04 / (9.10) / 10.04
    • Ubuntu Desktop 32-bit CD-ROM
    • The Japanese Remix was used for development, but it is not required.
    • The Japanese EUC-JP locale is required (see below).
    • We tested in a dual-boot environment with Windows, installed via Wubi. A VMware guest environment has not been tested.
  • Galatea Toolkit for Linux (2009-10)

Installation

On Ubuntu Linux, installation is simple.

Ubuntu 10.04

This procedure was tested on Ubuntu 10.04 LTS, installed from the CD-ROM image (not the Japanese Remix version).

# for Ubuntu 10.04
$ echo "ja_JP.EUC-JP EUC-JP" | sudo tee -a /var/lib/locales/supported.d/local
$ sudo locale-gen
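
Before going on, it may help to confirm that the EUC-JP locale was actually generated. The following is a minimal sketch, not part of the toolkit: the helper name has_eucjp_locale is hypothetical, and it assumes that `locale -a` prints the locale in a form such as ja_JP.eucjp, as glibc usually does.

```shell
# Hypothetical helper: reads a locale list on stdin (as printed by `locale -a`)
# and succeeds if a Japanese EUC-JP entry is present.
# The match ignores case and an optional hyphen, because glibc usually
# prints the name as ja_JP.eucjp rather than ja_JP.EUC-JP.
has_eucjp_locale() {
    grep -Eqi '^ja_JP\.euc-?jp$'
}

if locale -a 2>/dev/null | has_eucjp_locale; then
    echo "ja_JP.EUC-JP is available"
else
    echo "ja_JP.EUC-JP is missing; check supported.d/local and re-run locale-gen"
fi
```

If the second message appears, the line added to /var/lib/locales/supported.d/local is the first thing to check.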

$ sudo aptitude install ruby
$ sudo aptitude install freeglut3
$ sudo aptitude install openjdk-6-jre
$ sudo aptitude install chasen
$ sudo aptitude install rhino
$ sudo aptitude install libreadline5
$ sudo aptitude install ttf-sazanami-gothic

$ sudo dpkg -i galatea-ja-chaone_1.3.2-1_i386.deb
$ sudo dpkg -i galatea-ja-unidic_20090604-1_i386.deb
$ sudo dpkg -i galatea-engine_20090604-1_i386.deb
$ sudo dpkg -i galatea-dialog_20100529-1_i386.deb

To set up a better Japanese desktop environment, proceed as follows:

$ wget -q https://www.ubuntulinux.jp/ubuntu-ja-archive-keyring.gpg -O- | sudo apt-key add -
$ wget -q https://www.ubuntulinux.jp/ubuntu-jp-ppa-keyring.gpg -O- | sudo apt-key add -
$ sudo wget https://www.ubuntulinux.jp/sources.list.d/lucid.list -O /etc/apt/sources.list.d/ubuntu-ja.list
$ sudo apt-get update
$ sudo apt-get upgrade
$ sudo apt-get install ubuntu-desktop-ja

Alternatively, visit http://www.ubuntulinux.jp/products/JA-Localized

Ubuntu 8.10 / 9.04 (JA-remix)

If either of the following functions does not work, please configure your hardware first.

  • Audio input and audio output (16 kHz, 16-bit, mono)
  • 3D graphics with OpenGL (if the 3D desktop effects work, it is OK)
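
The audio format above can be checked with a short test recording. This is only a sketch under assumptions: it presumes the ALSA command-line tools (arecord, aplay) are installed, and the function names pcm_bytes and mic_test and the path /tmp/mic-test.wav are made up for illustration.

```shell
# Hypothetical helper: number of raw PCM bytes for $1 seconds of audio
# at 16 kHz, 16-bit (2 bytes per sample), mono (1 channel).
pcm_bytes() {
    echo $(( 16000 * 2 * 1 * $1 ))
}

# Hypothetical helper: record $1 seconds in that format and play it back.
# Assumes the ALSA tools are installed; it is defined here but not called.
mic_test() {
    command -v arecord >/dev/null 2>&1 || { echo "arecord not found" >&2; return 1; }
    arecord -f S16_LE -r 16000 -c 1 -d "$1" /tmp/mic-test.wav &&
        aplay /tmp/mic-test.wav
}

echo "A 3-second clip should hold about $(pcm_bytes 3) bytes of audio data"
```

Running `mic_test 3` and hearing your own voice back suggests that both input and output work in the required format.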

$ echo "ja_JP.EUC-JP EUC-JP" | sudo tee -a /var/lib/locales/supported.d/local
$ sudo locale-gen

# for Ubuntu 9.04
$ sudo aptitude install libstdc++5
$ sudo aptitude install ruby
$ sudo aptitude install freeglut3
$ sudo aptitude install openjdk-6-jre
$ sudo aptitude install chasen
$ sudo aptitude install rhino

If the face image presentation module (FSM) is not compatible with the 3D desktop effects (the display is garbled), please set the following.

 System >> Appearance >> Visual Effects >> No effects(N)

Next, install Galatea Toolkit. (The version numbers in the filenames may differ.)

$ sudo dpkg -i galatea-ja-chaone_1.3.2-1_i386.deb
$ sudo dpkg -i galatea-ja-unidic_20090604-1_i386.deb
$ sudo dpkg -i galatea-engine_20090604-1_i386.deb
$ sudo dpkg -i galatea-dialog_20091005-1_i386.deb
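
Because the version numbers may differ, the package files can be located by name instead of being typed out. The sketch below is an illustration only: it assumes the four .deb files are in the current directory, takes the package base names from the filenames above, and uses a hypothetical helper called find_deb. It prints the install commands rather than running them.

```shell
# Package base names in the order used above (taken from the file names).
GALATEA_PKGS="galatea-ja-chaone galatea-ja-unidic galatea-engine galatea-dialog"

# Hypothetical helper: print the .deb file in directory $2 that belongs to
# package $1, whatever its version string is. Fails if no file matches.
find_deb() {
    for f in "$2"/"$1"_*_i386.deb; do
        [ -e "$f" ] && echo "$f" && return 0
    done
    return 1
}

# Dry run: show the install commands instead of executing them.
for pkg in $GALATEA_PKGS; do
    if deb=$(find_deb "$pkg" .); then
        echo "sudo dpkg -i $deb"
    else
        echo "# $pkg: no matching .deb in the current directory" >&2
    fi
done
```

If several versions of one package are present, this sketch simply picks the first match in glob order, so it is safer to keep only one version per package in the directory.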

First run (galatea-runner)

The galatea-runner and galatea-generate commands are installed in /usr/local/bin.
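
If the shell cannot find these commands, /usr/local/bin may be missing from PATH. The following sketch checks for them; missing_commands is a hypothetical helper name, not something the toolkit provides.

```shell
# Hypothetical helper: print each command name given as an argument
# that cannot be found on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

missing=$(missing_commands galatea-runner galatea-generate)
if [ -z "$missing" ]; then
    echo "Galatea commands found"
else
    echo "Not on PATH: $missing"
    echo "Check that /usr/local/bin is in PATH: $PATH"
fi
```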

When you run galatea-runner with no options, the default dialog file is started.

$ galatea-runner

Two windows are displayed.

When you say either "kon-nichi-wa" (hello) or "sayou-nara" (good-bye) into the microphone, the agent replies "You said hello" or "You said good-bye" in Japanese.

The Run/Pause button of the Galatea Dialog Studio window suspends the dialog manager. (It does not directly stop speech recognition or speech synthesis.)

When you say "maiku-tesuto" (microphone test), the speech recognizer accepts the utterance, but the agent does not reply.

To quit, press Ctrl-C in the terminal where galatea-runner is running.

A closer look at the dialog file

Dialog files are written in VoiceXML. When you select the Source tab, you can see the default dialog file.

<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xml:lang="ja">

<form id="init">
 <var expr="'to @AM-MCL set Mask = woman01 HAPPY 10 0 0 0'" name="mask_woman"/>
 <block>
  <native expr="mask_woman"/>
  <native>to @FS-MCL set HeadMoveRatio = 1.0 0.9</native>
  <native>to @FS-MCL set AutoMove = 1</native>
  <native>to @DIM set AutoGaze = 1</native>
  <goto next="#form1"/>
 </block>
</form>

<form id="form1">

 <grammar root="#greeting" version="1.0">
 <rule id="greeting">
 <one-of>
  <item> <token sym="まいくてすと">マイクテスト</token> </item>
  <item> <token slot="field1" sym="こんにちは">こんにちは</token> </item>
  <item> <token slot="field1" sym="さようなら">さようなら</token> </item>
 </one-of>
 </rule>
 </grammar>

 <field name="field1">
  <prompt>
    <native>to @FS-MCL set Emotion = HAPPY 10</native>
    <break time="600s"/>
  </prompt>

  <filled>
    あなたは<value expr="field1"/>と言いました。
    <clear namelist="field1"/>
  </filled>
 </field>

</form>

</vxml>

The Logger tab shows the activity of speech recognition. For example, when you say "maiku-tesuto" (microphone test) in the example above, the utterance-acceptance event is displayed. In the initial setup, only the speech recognition log is shown. The components whose logs are displayed can be changed for each internal module of the dialog system.

In addition, the Grammar tab shows how the grammar element of the VoiceXML above was converted into a grammar for the speech recognition engine (Julius).

/home/nishi/.galatea/vxml_rule2.dfa


[voca]
% NS_B
silB: silB 
% NS_E
silE: silE 
% grm1_greeting
マイクテスト m a i k u t e s u t o 
% grm2_greeting
こんにちは@field1=こんにちは k o N n i ch i h a 
% grm3_greeting
さようなら@field1=さようなら s a y o: n a r a 


[grammar]
S : NS_B root NS_E 
root : greeting 
greeting : grm1_greeting 
greeting : grm2_greeting 
greeting : grm3_greeting 
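
To see at a glance which words ended up in which category, a listing like the one above can be summarized with awk. This is a rough sketch: list_voca_words is a hypothetical helper, it assumes the "% CATEGORY" and "word phonemes" layout shown above, and it assumes that the files under ~/.galatea are EUC-JP encoded (the runner uses an EUC-JP locale), so they are converted for a UTF-8 terminal.

```shell
# Hypothetical helper: print "category<TAB>word" for every vocabulary entry.
# Reads the files given as arguments, or stdin if there are none.
list_voca_words() {
    awk '
        BEGIN { in_voca = 1 }                         # plain .voca files have no section markers
        /^\[/ { in_voca = ($0 ~ /^\[voca\]/); next }  # in Grammar tab text, only [voca] holds words
        !in_voca { next }                             # skip the [grammar] rules
        /^%/  { category = $2; next }                 # "% NAME" opens a new category
        NF >= 2 { print category "\t" $1 }            # first column of a word line is the word
    ' "$@"
}

voca="$HOME/.galatea/vxml_rule2.voca"
if [ -f "$voca" ]; then
    # Assumption: the file is EUC-JP encoded.
    iconv -f EUC-JP -t UTF-8 "$voca" | list_voca_words
fi
```

Entries with a suffix such as @field1=... are the ones that carry a slot value back to the VoiceXML field.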

Generate a project (galatea-generate)

All the settings needed for a dialog system can be managed in a single project directory.

The command that creates a project directory is galatea-generate.

$ galatea-generate myproject
mkdir -p myproject
mkdir -p myproject/config
mkdir -p myproject/script
myproject/script/runner generated.
myproject/config/project.yml generated.
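
Judging from the output above, a generated project contains script/runner and config/project.yml. A small check like the following can tell whether a directory looks like such a project; is_galatea_project is a hypothetical helper name, and the check covers only these two files.

```shell
# Hypothetical helper: succeeds if directory $1 contains the two files
# that galatea-generate reported creating.
is_galatea_project() {
    [ -f "$1/config/project.yml" ] && [ -f "$1/script/runner" ]
}

if is_galatea_project .; then
    echo "This directory looks like a Galatea project"
else
    echo "config/project.yml or script/runner is missing here"
fi
```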

The command that runs the dialog system in a project is script/runner.

$ cd myproject
$ script/runner 
config script/../config/project.yml /usr/local/istc-galatea-dialog/files/galatea.yml
tmppath /home/nishi/.galatea
set broadcast = AM-MCL
set broadcast = DM
set broadcast = FS-MCL
set broadcast = PAR
LOG: START FS-MCL
LOG: START AM-MCL
LOG: START DIM
LOG: START PAR
LOG: START SSM
LOG: START DM
LOG: START FSM
Galatea Dialog Studio 2.2.4b2 (090214)
(c)2003-2009 Takuya NISHIMOTO (nishimoto [atmark] m.ieice.org)
Uses Mozilla Rhino from mozilla.org.
See http://www.mozilla.org/rhino/.
Rhino 1.7 release 1 2008 10 20

The --dry-run (-n) option of runner generates the configuration files for each engine and displays the commands.

$ script/runner --dry-run
config script/../config/project.yml /usr/local/istc-galatea-dialog/files/galatea.yml
tmppath /home/nishi/.galatea
[runner] export LANG=ja_JP.eucJP;export LC_ALL=ja_JP.eucJP;export PERL_BADLANG=0;export AUDIODEV=/dev/dsp; 
cd /usr/local/istc-galatea-dialog/files/Modules; 
/usr/bin/perl ./AgentManager-gdm.pl -C /home/nishi/.galatea/am.conf

The runner generates the configuration file of each engine dynamically. By default, these files are placed in the user's ~/.galatea directory. Temporary files for the speech recognition grammar are also created there.

$ ls ~/.galatea
am.conf          fsm.conf     ssm.conf         vxml_rule2.grammar  vxml_rule3.dfa      vxml_rule3.term
am_mcl.conf      gdm.conf     vxml_rule2.dfa   vxml_rule2.term     vxml_rule3.dict     vxml_rule3.voca
chasenrc-euc-jp  julius.conf  vxml_rule2.dict  vxml_rule2.voca     vxml_rule3.grammar

The project-specific configuration file is config/project.yml, which galatea-generate creates. In future releases, editing this file will allow various elements to be customized.

$ cat config/project.yml 
enginepath: /usr/local/istc-galatea-engine
dmpath: /usr/local/istc-galatea-dialog/files
tmppath: /home/nishi/.galatea
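
Since project.yml, as shown, is a flat list of "key: value" lines, a single value can be read from a script with sed. This is a sketch for that simple layout only, not a general YAML parser, and yml_get is a hypothetical helper name.

```shell
# Hypothetical helper: print the value of top-level key $1 from the flat
# "key: value" file $2. Nested YAML structures are not handled.
yml_get() {
    sed -n "s/^$1:[[:space:]]*//p" "$2"
}

if [ -f config/project.yml ]; then
    tmp=$(yml_get tmppath config/project.yml)
    echo "Generated engine configs are expected in: $tmp"
fi
```

Run inside a project directory, this prints the directory where the runner writes its generated files.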
