Julius is an open-source, high-performance large vocabulary continuous speech recognition (LVCSR) engine for speech-related research and development. With an HMM acoustic model and a language model, you can construct your own speech recognition system.
Copyright (c) 1991-2005 Kawahara Lab., Kyoto University
Copyright (c) 1997-2000 Information-technology Promotion Agency, Japan
Copyright (c) 2000-2005 Shikano Lab., Nara Institute of Science and Technology
Copyright (c) 2005 Julius project team, Nagoya Institute of Technology
All rights reserved
What's New in Julius-3.5?
Julius/Julian rev.3.5 is a major update that incorporates several new
functions that may be useful for building a speech interface.
Memory efficiency is also improved. Another major improvement for system
developers is that the source code comments have been fully rewritten
so that cross-referenced documentation can be generated in HTML format.
Summary of changes in 3.5:
- New features
  - Input verification / rejection concurrently with recognition process
    based on one-state GMM scores
  - Word graph output
  - Arbitrary character set conversion for tty/module output
  - Improved multi grammar support on Julian
  - EsounD audio server support on Linux
- Linux and Windows versions have been integrated into one source
- Multi-path version has been integrated into the original
  (enabled with "--enable-multipath" at compilation time)
- Migrated from VC++ to MinGW on Windows
- Reduced memory usage
  - Removed redundant parts of tree lexicon and beam work area for 1st pass.
  - Compaction of word N-gram index (reduced from 32 bit to 24 bit).
  - New N-gram binary format (can still read old binary).
- Many bug fixes
  - Spectral subtraction now works.
  - Fixed newline code problems on Windows and Mac (grammar files and -filelist).
  - Fixed USB audio input on Linux.
  - Many other fixes.
- Doxygen support (you can generate full source documents in English!)
- Removed old Japanese documents in doc.
All the changes are listed in "Release.txt".
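As a rough illustration of the GMM-based input rejection listed above, the sketch below scores an utterance against several GMMs and rejects it when the best-scoring model is one marked for rejection. The model names, parameters, and the classify() helper are hypothetical and are not taken from the Julius sources; it is a minimal sketch, assuming diagonal-covariance Gaussians and a decision based on the log-likelihood accumulated over all frames.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def log_gmm(x, mixtures):
    """Log likelihood of x under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gauss(x, m, v) for w, m, v in mixtures]
    top = max(terms)
    # log-sum-exp keeps the sum numerically stable
    return top + math.log(sum(math.exp(t - top) for t in terms))

def classify(frames, gmms, reject_names):
    """Return (best_model_name, rejected) for a sequence of feature frames."""
    totals = {
        name: sum(log_gmm(f, mix) for f in frames)
        for name, mix in gmms.items()
    }
    best = max(totals, key=totals.get)
    return best, best in reject_names

# Hypothetical 2-dimensional models (illustration only).
GMMS = {
    "speech": [(0.5, (1.0, 1.0), (1.0, 1.0)), (0.5, (2.0, 2.0), (1.0, 1.0))],
    "noise": [(1.0, (-1.0, -1.0), (1.0, 1.0))],
}
REJECT = {"noise"}  # names of models whose win means "reject the input"

speech_like = [(1.2, 0.9), (1.8, 2.1), (1.0, 1.5)]
noise_like = [(-1.1, -0.8), (-0.9, -1.3)]

print(classify(speech_like, GMMS, REJECT))
print(classify(noise_like, GMMS, REJECT))
```

In Julius itself the relevant options are -gmm, -gmmnum and -gmmreject; how they map onto the steps sketched here should be checked against the documentation.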
Contents of Julius-3.5
(Files with "ja" are written in Japanese)
00readme.txt            ReadMe (This file)
LICENSE.txt             Terms and conditions of use
Release.txt             Release note / ChangeLog
configure               configure script
Sample.jconf            Sample configuration file for Julius-3.5
Sample-julian.jconf     Sample configuration file for Julian-3.5
julius/                 Julius/Julian 3.5 sources
libsent/                Julius/Julian 3.5 library sources
adinrec/                Record one sentence utterance to a file
adintool/               Record/split/send/receive speech data
gramtools/              Tools to build and test recognition grammar
jcontrol/               A sample network client module
mkbingram/              Convert N-gram to binary format
mkbinhmm/               Convert ascii hmmdefs to binary format
mkgshmm/                Model conversion for Gaussian Mixture Selection
mkss/                   Estimate noise spectrum from mic input
support/                some tools to compile julius/julian from source
olddoc/                 ChangeLogs before 3.2
Since rev.3.4, a grammar-based recognizer called "Julian" has also been
included. Julian can be compiled from the Julius sources by
specifying the configure option "--enable-julian". The grammar format
Julian uses is an original one based on BNF. A grammar compiler that
converts the written BNF to a finite state grammar and several test
tools are included in this archive.
The full documentation, which covers the installation procedure,
tutorial, model formats and more, is available at:
o New features:
- Input verification / rejection using GMM (-gmm, -gmmnum, -gmmreject)
- Word graph output (--enable-graphout, --enable-graphout-nbest)
- Pruning on 2nd pass based on local posterior CM (--enable-cmthres)
- Multiple/per-grammar recognition (-gram, -gramlist, -multigramout)
- Multiple grammars can be specified at startup: "-gram prefix1,prefix2,..."
or "-gramlist listfile" where listfile contains a list of prefixes.
- General output character set conversion "-charconv from to"
based on iconv (Linux) or Win32API+libjcode (Windows)
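The two ways of naming multiple grammars described above can be sketched as follows. The helper names and the example prefixes are hypothetical. The sketch assumes that each prefix stands for a "prefix.dfa" / "prefix.dict" file pair and that the list file holds one prefix per line; both assumptions should be checked against the Julian documentation.

```python
import os
import tempfile

def prefixes_from_gram(arg):
    """Split the value of "-gram prefix1,prefix2,..." into prefixes."""
    return [p.strip() for p in arg.split(",") if p.strip()]

def prefixes_from_gramlist(path):
    """Read one grammar prefix per line, skipping blank lines (assumed layout)."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

def grammar_files(prefix):
    """Assumed file pair belonging to one grammar prefix."""
    return prefix + ".dfa", prefix + ".dict"

# Equivalent of: -gram fruit,vehicle
from_gram = prefixes_from_gram("fruit,vehicle")

# Equivalent of: -gramlist grammars.txt
with tempfile.TemporaryDirectory() as tmp:
    listfile = os.path.join(tmp, "grammars.txt")
    with open(listfile, "w", encoding="utf-8") as f:
        f.write("fruit\nvehicle\n\n")
    from_list = prefixes_from_gramlist(listfile)

for prefix in from_gram:
    print(prefix, "->", grammar_files(prefix))
```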
o Improved audio inputs on Linux:
- ALSA-1.x support. (--with-mictype=alsa)
- EsounD daemon input support. (--with-mictype=esd)
- Fixed some bugs on USB audio input.
- Audio capturing device can be specified via the environment variable "AUDIODEV".
- Extra microphone API support using portaudio and spLib API.
o Performance improvements:
- Reduced memory size for beam operation on the 1st pass.
- Slightly optimized tree lexicon by removing redundant data.
- Reduced size of word N-gram index (reduced from 32 bit to 24 bit).
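To illustrate why a 24-bit index saves memory, the sketch below stores each index in three bytes instead of four. It only shows the general idea; the actual in-memory layout used by Julius is not described here, and the function names are hypothetical. Note that a 24-bit field can only hold values from 0 to 16,777,215.

```python
def pack24(values):
    """Pack non-negative integers below 2**24 into 3 bytes each (little-endian)."""
    out = bytearray()
    for v in values:
        if not 0 <= v < 1 << 24:
            raise ValueError("value %d does not fit in 24 bits" % v)
        out += v.to_bytes(3, "little")
    return bytes(out)

def unpack24(data):
    """Inverse of pack24: read consecutive 3-byte little-endian integers."""
    if len(data) % 3:
        raise ValueError("length must be a multiple of 3")
    return [int.from_bytes(data[i:i + 3], "little") for i in range(0, len(data), 3)]

indices = [0, 1, 65000, (1 << 24) - 1]
packed = pack24(indices)
print(len(packed), "bytes instead of", 4 * len(indices))
```

The saving is one byte in four, at the cost of a smaller maximum index value.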
o Fixed bugs:
- Spectral subtraction did not work.
- Memory leak when stack exhausted ("stack empty") on 2nd pass.
- Segmentation fault on a very short input of 1 to 4 frames.
- AM trained with no CMN cannot be used with waveform/mic input.
- Wrong short-pause word handling on successive decoding mode.
- No output of "maxcodebooksize" at startup.
- No output of the number of sentences found when stack exhausted.
- No output of "-separatescore" on module mode.
- Beam width was not adjusted when the grammar was changed and
the full beam option (-b 0) was specified in Julian.
- Wrong update of category-aware cross-word triphones when
dynamically switching grammar on Julian.
- No output of grammar to stdout on multiple grammar mode.
- Unable to send/receive audio data between different endian machines.
- (Linux) crash when compiled with icc.
- (Linux) some strange behavior on USB audio.
- (Windows) confusion caused by CR/LF newlines in several text inputs.
- (Windows) mkdfa.pl did not work on Cygwin.
- (Windows) sometimes fails to read a file when not using zlib.
- (Windows) wrong file suffix when recording with "-record" (.raw->.wav)
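One of the fixes above concerns exchanging audio data between machines with different native byte orders. A common remedy is to fix the byte order on the wire regardless of the host, which is also what this release does for the network audio stream (little endian). The sketch below assumes 16-bit signed samples and uses hypothetical helper names; it is not the Julius network code.

```python
import struct

def encode_samples(samples):
    """Serialize 16-bit signed samples as little-endian bytes, whatever the host order."""
    return struct.pack("<%dh" % len(samples), *samples)

def decode_samples(data):
    """Parse little-endian 16-bit signed samples back into a list of ints."""
    count = len(data) // 2
    return list(struct.unpack("<%dh" % count, data[:count * 2]))

samples = [0, 1, -2, 32767, -32768]
wire = encode_samples(samples)
print(wire.hex())
print(decode_samples(wire))
```

Because the "<" prefix forces little-endian layout, the same bytes are produced and understood on both big-endian and little-endian hosts.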
o Unified source code:
- Linux and Windows versions are integrated into one source.
- Multi-path version has been integrated with the normal version
into one source. The multi-path version of Julius/Julian, which
allows arbitrary HMM transitions including model skip transitions,
can be compiled with the "--enable-multipath" option. The parts of
the source code specific to the multi-path version can be identified
by the definition "MULTIPATH_VERSION".
o Other improvements:
- Can now be compiled with MinGW/MSYS on Windows.
- Completely rewrote the comments throughout the source in Doxygen format.
You can generate fully browsable source documents in English.
Try "make doxygen" in the top directory (you need doxygen installed).
- "make install" now also installs additional julius/julian executables
named with the version and setting, such as "julius-3.5-fast".
- Updated LICENSE.txt with English translation for reference.
o Changed behaviors:
- Binary N-gram file format has been changed for smaller size.
The old files can still be read directly by julius, in which
case on-line conversion will be performed at startup.
You can convert the old files (3.4.2 and earlier) to the new
format with the new mkbingram by invoking the command below:
"mkbingram -d oldbinary newbinary"
Please note that since mkbingram now outputs the new format,
its files cannot be read by older versions of Julius.
The binary N-gram file version can be detected from the first 17
bytes of the file: the old format has "julius_bingram_v3" and
the new format has "julius_bingram_v4".
- Byte order of the audio stream over TCP/IP is now fixed to little endian.
- Now use built-in zlib by default for compressed files. This may
make the engine startup slower, and if you prefer, you can still
use the previous method using external gzip command by specifying
- (Windows) Changed the compilation procedure on VC++. You can build
Julian just by specifying "-DBUILD_JULIAN" as a compiler option,
without needing to alter "julius.h".
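The binary N-gram version check described above (comparing the first 17 bytes of the file) can be sketched as follows. The function name and the file names are hypothetical; only the two identifier strings come from the text.

```python
import os
import tempfile

OLD_MAGIC = b"julius_bingram_v3"  # 3.4.2 and earlier
NEW_MAGIC = b"julius_bingram_v4"  # 3.5

def bingram_format(path):
    """Return "old", "new", or "unknown" from the first 17 bytes of the file."""
    with open(path, "rb") as f:
        head = f.read(17)
    if head == OLD_MAGIC:
        return "old"
    if head == NEW_MAGIC:
        return "new"
    return "unknown"

# Demonstration with dummy files that carry only a header and padding.
results = {}
with tempfile.TemporaryDirectory() as tmp:
    for name, head in (("a.bingram", OLD_MAGIC), ("b.bingram", NEW_MAGIC), ("c.bin", b"RIFF")):
        path = os.path.join(tmp, name)
        with open(path, "wb") as f:
            f.write(head + b"\x00" * 8)
        results[name] = bingram_format(path)
print(results)
```

A file reported as "old" is a candidate for conversion with "mkbingram -d oldbinary newbinary" as described above.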