Commit MetaInfo

Revision: b042492ba42d5532181dcd00bb3f84961bd16e6c (tree)
Time: 2011-11-22 20:05:25
Author: Hironori Kitagawa <h_kitagawa2001@yaho...>
Committer: Hironori Kitagawa

Log Message

Changed the reference for ptexenc.

Change Summary

Diff

--- a/doc/ajt-devel-ltja.tex
+++ b/doc/ajt-devel-ltja.tex
@@ -1,1235 +1,1235 @@
1-%#!lualatex ajt-devel-ltja
2-\documentclass{ajt}
3-
4-%%% Packages used in this paper
5-
6-%%% Font setting for \LuaTeX; this is extracted from ajt.cls
7-\makeatletter
8- \if@print
9- \RequirePackage{fontspec,xunicode}
10- \RequirePackage{luatextra}
11- \setmainfont[Mapping=tex-text]{Palatino LT Std}
12- \setsansfont[Mapping=tex-text]{Optima LT Std}
13- \else
14- \RequirePackage{fontspec,luatextra}
15- \setmainfont[Mapping=tex-text]{TeX Gyre Pagella} % \simeq Palatino
16- \fi
17-
18-%%% LuaTeX-ja
19-\usepackage{luatexja,luatexja-fontspec}
20-\ltjsetparameter{jacharrange={-3,-8}}
21-\DeclareFontShape{JY3}{mc}{m}{n}{<-> s*[0.92489] file:ipam.ttf:jfm=ujis}{}
22-\DeclareFontShape{JY3}{gt}{m}{n}{<-> s*[0.92489] file:ipag.ttf:jfm=ujis}{}
23-% quick hack: monospaced Japanese font by \ttfamily
24-\DeclareKanjiFamily{JY3}{\ttdefault}{}{}
25-\DeclareFontShape{JY3}{\ttdefault}{m}{n}{<-> s*[0.92489] file:ipag.ttf:jfm=mono}{}
26-
27-
28-%%% LTXexample environment
29-\usepackage{showexpl,lltjlisting}
30-\lstset{basicstyle=\ttfamily\small, width=0.3\textwidth, basewidth=.5em}
31-
32-%%% Verbatim environment
33-\usepackage{fancyvrb}
34-\CustomVerbatimEnvironment{code}{Verbatim}%
35-{numbers=left,xleftmargin=1.5em,baselinestretch=1.069,fontsize=\small}
36-\CustomVerbatimEnvironment{codewithoutnum}{Verbatim}%
37-{xleftmargin=1.5em,baselinestretch=1.069,fontsize=\small}
38-\CustomVerbatimEnvironment{codewithoutnumsmall}{Verbatim}%
39-{xleftmargin=1.5em,baselinestretch=1.0,fontsize=\footnotesize}
40-\DefineShortVerb{\|}
41-
42-%%% Others
43-\usepackage{mflogo,booktabs}
44-\definecolor{grayx}{gray}{0.85}
45-\hyphenation{
46- kanjiskip
47- xkanjiskip
48-}
49-
50-%%% Mandatory article metadata %%%
51-\title{Development of \LuaTeX-ja package}
52-\author[北川 弘典]{Hironori Kitagawa}
53-\address{\LuaTeX-ja project team}
54-\email{h\_kitagawa2001@yahoo.co.jp}
55-
56-\keywords{\TeX, p\TeX, \LuaTeX, \LuaTeX-ja, Japanese}
57-\abstract{%
58-The \LuaTeX-ja package is a macro package for typesetting Japanese
59-documents under \LuaTeX. The package offers more typesetting
60-flexibility than \pTeX, a widely used Japanese extension of \TeX,
61-and corrects some unwanted features of \pTeX.
62-In this paper, we describe specifications, the current status and some
63-internal processing methods of \LuaTeX-ja.
64-}
65-
66-\newcommand{\parname}[1]{\textsf{#1}}
67-\newcommand{\jstrut}{\vrule width0pt height\cht depth\cdp}
68-\newcommand{\imagfm}[1]{\ifvmode\leavevmode\fi%
69- \hbox{\fboxsep=0pt\fbox{\setbox0=\hbox{#1}\copy0\kern-\wd0
70- \smash{\vrule width \wd0 height 0.4pt depth0.4pt}}}}
71-\begin{document}
72-
73-%%% Do not forget to start with \maketitle!
74-\maketitle
75-
76-\section{Introduction}
77-\subsection{History}
78-To typeset Japanese documents with \TeX, ASCII \pTeX~\cite{ptex} has
79-been widely used in Japan. There are other methods, for example using
80-Omega and OTP~\cite{omega} or the CJK package, but these alternatives
81-never became mainstream. The author thinks
82-that this is because \pTeX\ enables us to produce high-quality documents
83-(e.g.,~supporting vertical typesetting), and because \pTeX\ appeared
84-earlier than the alternatives described above.
85-
86-However, \pTeX\ has been left behind by extensions of \TeX\ such
87-as \eTeX\ and \pdfTeX, and by the spread of UTF-8 encoding. In recent
88-years, the situation has improved thanks to the development of
89-|ptexenc|~\cite{ptexenc} by Nobuyuki Tsuchimura (\hbox{土村展之}),
90-$\varepsilon$-\pTeX~\cite{eptex} by the author,~and u\pTeX~\cite{uptex}
91-by Takuji Tanaka (田中琢爾). However, continuing this approach, namely
92-developing an engine extension localized for Japanese, is not wise. This
93-approach needs lots of work for \emph{each} engine. In addition, if we
94-use \LuaTeX, the need for an engine extension becomes smaller,
95-because \LuaTeX\ can hook into \TeX's internal processing by using
96-Lua callbacks.
97-
98-
99-There were several experimental attempts to typeset
100-Japanese documents with \LuaTeX\ before. Here we cite three examples:
101-\begin{itemize}
102-\item |luaums.sty|~\cite{luaums} developed by the author. This
103- experimental package is for creating a certain Japanese-based presentation
104- with \LuaTeX.
105-\item the \emph{luajalayout} package~\cite{luajalayout}, formerly known as the
106- \emph{jafontspec} package, by Kazuki Maeda (前田一貴). This package is based on
107- \LaTeXe\ and \emph{fontspec} package.
108-\item the \emph{luajp-test} package~\cite{luajp-test}, a test package made by
109- Atsuhito Kohda (香田温人), based on articles on the web page~\cite{joylua}.
110-\end{itemize}
111-However, these packages are based on \LaTeXe, and do not offer much
112-control over the typesetting rules. It is also inefficient for
113-several people to develop similar packages separately. For these
114-reasons, development of the \LuaTeX-ja package was started by the
115-author and Kazuki Maeda.
116-
117-\subsection{Development policy of \LuaTeX-ja}
118-\label{ssec-pol}
119-The first aim of the \LuaTeX-ja project was to implement features (from the
120-`primitive' level) of \pTeX\ as macros under \LuaTeX; therefore \LuaTeX-ja is
121-heavily influenced by \pTeX. However, as development proceeded, some
122-technical/conceptual difficulties arose. Hence we changed the aim
123-of the project as follows:
124-\begin{itemize}
125-\item\emph{\LuaTeX-ja offers at least the same flexibility of
126- typesetting that p\TeX\ has.}
127-
128- We are not satisfied with merely being able to produce output conforming to
129- JIS~X~4051~\cite{jisx4051}, the Japanese Industrial Standard for
130- typesetting, or to a technical note~\cite{w3c} by W3C;
131- if one wants to produce output that deviates greatly from these rules for some reason, it
132- should be possible.
133-In this respect, the previous attempts at Japanese typesetting with \LuaTeX\
134- that we cited in the previous subsection are inadequate.
135-
136-\pTeX\ has some flexibility of typesetting, by changing internal
137- parameters such as |\kanjiskip| or |\prebreakpenalty|, and by using
138- custom JFM (Japanese TFM). Therefore we decided to include this
139- functionality in \LuaTeX-ja.
140-
141-\item\emph{\LuaTeX-ja is not a mere re-implementation or port of \pTeX;
142- some (technically and/or conceptually) inconvenient features of
143- \pTeX\ are modified.}
144-
145- We describe this point in more detail in the next section.
146-\end{itemize}
147-
148-
149-\subsection{Overview of the processes}
150-\label{ssec-over}
151-We outline the stages of \LuaTeX-ja's processing in order.
152-
153-\begin{itemize}
154-\item In the |process_input_buffer| callback: treatment of breaking
155- lines after a Japanese character (in Subsection~\ref{ssec-line}).
156-
157-\item In the |hyphenate| callback: font replacement.
158-
159-\LuaTeX-ja examines each \textit{glyph\_node}~$p$ in the horizontal list. If
160- the character represented by $p$ is considered a Japanese
161- character, the font used at $p$ is replaced by the value of
162- |\ltj@curjfnt|, an attribute for `the current Japanese font'
163- at~$p$.
164-
165-Furthermore, the subtype of $p$ is decreased by 1 to suppress
166- hyphenation around $p$ by \LuaTeX, because later processes of
167- \LuaTeX-ja take care of everything concerning Japanese characters.
168-
169-\item In |pre_linebreak_filter| and |hpack_filter| callbacks:
170-
171-\begin{enumerate}
172-\item \LuaTeX-ja has its own stack system, and the current horizontal
173- list is traversed in this stage to determine what the level of
174- \LuaTeX-ja's internal stack at the end of the list is. We will
175- discuss it in Subsection~\ref{ssec-stack}.
176-
177-\item In this stage, \LuaTeX-ja inserts glues/kerns for Japanese
178- typesetting in the list. This is the core routine of \LuaTeX-ja.
179- We will discuss it in Subsections
180- \ref{ssec-jglue}~and~\ref{ssec-jspec}.
181-
182-\item To match a metric with a real font, the positions of
183- (Japanese) glyphs are sometimes adjusted.
184- We will discuss it in Subsection~\ref{ssec-width}.
185-\end{enumerate}
186-\item In the |mlist_to_hlist| callback: treatment of Japanese characters
187- in math formulas. This stage is similar to the adjustment of the
188- position of glyphs (see above), so we do not describe it
189- in this paper.
190-\end{itemize}
191-
192-In this paper, an \emph{alphabetic character} means a non-Japanese
193-character. Similarly, we use the term \emph{alphabetic font} as the
194-counterpart of a Japanese font.
195-
196-\subsection{Contents of this paper}
197-Here we describe the contents of the rest of this paper briefly. In
198-Section~\ref{sec:differences_with_ptex}, we describe major differences
199-between \pTeX\ and \LuaTeX-ja. The next section,
200-Section~\ref{sec:distinction_of_characters}, concentrates on the
201-problem of how to distinguish between Japanese characters and alphabetic
202-characters. In Section~\ref{sec:current_status}, we show the current
203-development status of the package. Finally, in
204-Section~\ref{sec:implementation}, we describe some internal routines of
205-\LuaTeX-ja.
206-
207-\subsection{General information on the project}
208-The \LuaTeX-ja project is hosted by SourceForge.jp. The official wiki
209-is located at
210-\url{http://sourceforge.jp/projects/luatex-ja/wiki/}. There is
211-no stable version as of October 22, 2011; however, a set of developer sources can be
212-obtained from the git repository. Members of the project team are as follows
213-(in random order): Hironori Kitagawa, Kazuki Maeda, Takayuki Yato,
214-Yusuke Kuroki, Noriyuki Abe, Munehiro Yamamoto, Tomoaki Honda,
215-and~Shuzaburo Saito.
216-
217-
218-\section{Major differences with \pTeX}
219-\label{sec:differences_with_ptex}
220-In this section, we explain several major differences between \pTeX\
221-and our \LuaTeX-ja. For general information on Japanese typesetting and an
222-overview of \pTeX, please see Okumura~\cite{ptexjp}.
223-
224-
225-\subsection{Names of control sequences}
226-\label{ssec-csname} Because \pTeX\ is an engine modification of Knuth's
227-original \TeX82 engine, some of the additional primitives take a form that is
228-very difficult to simulate with a macro. For example, the additional
229-primitive |\prebreakpenalty|$\langle\hbox{\it
230-char\_code}\rangle$|[=]|$\langle\hbox{\it penalty}\rangle$ in \pTeX\
231-sets the amount of penalty inserted before a character whose code is
232-$\langle\hbox{\it char\_code}\rangle$ to $\langle\hbox{\it
233-penalty}\rangle$, and the form |\prebreakpenalty|$\langle\hbox{\it
234-char\_code}\rangle$ can also be used for retrieving the value.
235-
236-Moreover, there are some internal parameters of \pTeX\ whose values at the end of a
237-horizontal box or of a paragraph apply to the whole box or
238-paragraph. The implementation of these parameters in
239-\LuaTeX-ja is not so easy; we will discuss it in Subsection~\ref{ssec-stack}.
240-
241-Because of the two problems discussed above, the assignment and retrieval
242-of most parameters in \LuaTeX-ja are consolidated into the following
243-three control sequences:
244-\begin{itemize}
245-\item |\ltjsetparameter{|$\langle\hbox{\it
246- name}\rangle$|=|$\langle\hbox{\it value}\rangle$|,...}|: for local
247- assignment.
248-\item |\ltjglobalsetparameter|: for global assignment. Note that these two control
249- sequences obey the value of the |\globaldefs| primitive.
250-\item |\ltjgetparameter{|$\langle\hbox{\it
251- name}\rangle$|}[{|$\langle\hbox{\it optional
252- argument}\rangle$|}]|: for retrieval. The returned value is always
253- a string.
254-\end{itemize}
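
A minimal usage sketch of these three control sequences follows. The
parameter names \textsf{kanjiskip} and \textsf{xkanjiskip} are those
discussed in this paper; the glue values are arbitrary illustrations
chosen for this sketch, not recommended settings.

\begin{verbatim}
% local assignment; several name=value pairs may be given at once
\ltjsetparameter{kanjiskip=0pt plus 0.4pt minus 0.4pt,
                 xkanjiskip=0.25\zw plus 1pt minus 1pt}
% global assignment (illustrative value)
\ltjglobalsetparameter{kanjiskip=0pt plus 0.2pt minus 0.2pt}
% retrieval; the result is a string, so it can be written to the log
\message{kanjiskip: \ltjgetparameter{kanjiskip}}
\end{verbatim}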
255-
256-\subsection{Line-break after a Japanese character}
257-\label{ssec-line}
258-
259-Japanese texts can break lines almost everywhere, in contrast with
260-alphabetic texts, which can break lines only between words (or by
261-hyphenation). Hence, \pTeX's input processor is modified so that a
262-line-break after a Japanese character doesn't emit a space. However,
263-there is no way to customize the input processor of \LuaTeX, other than
264-to hack its CWEB source. All a macro package can do is to modify an input line
265-before \LuaTeX\ begins to process it, inside the |process_input_buffer|
266-callback.
267-
268-Hence, in \LuaTeX-ja, a comment letter (we reserve U+FFFFF for this
269-purpose) will be appended to an input line, if this line ends with a Japanese
270-character.\footnote{Strictly speaking, it also requires that the catcode
271-of the end-line character is 5~(\emph{end-of-line}). This condition is
272-useful under the verbatim environment.} One might jump to the conclusion
273-that the treatment of a line-break by \pTeX\ and that of \LuaTeX-ja are
274-exactly the same; however, they differ in that \LuaTeX-ja's
275-decision whether to append a comment letter to the line is made
276-\emph{before} the line is actually processed by \LuaTeX.
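
The mechanism can be sketched roughly as follows. This is not
\LuaTeX-ja's actual code: the predicate |is_japanese| and its code-point
range are placeholders invented for illustration (the real decision
depends on the current character-range settings), the callback
description string is arbitrary, and the sketch assumes a \LuaTeX\ whose
Lua provides the |utf8| library, UTF-8 encoded input, and the
\emph{luacode} package. It also ignores the catcode condition mentioned
in the footnote.

\begin{verbatim}
\usepackage{luacode}  % in the preamble
\begin{luacode*}
-- Placeholder test: a crude stand-in for the real range lookup.
local function is_japanese(c)
  return c >= 0x3000 and c <= 0x9FFF
end

-- Append U+FFFFF when the last character of the line is "Japanese".
local function append_comment_letter(line)
  local last
  for _, c in utf8.codes(line) do last = c end
  if last and is_japanese(last) then
    return line .. utf8.char(0xFFFFF)
  end
  return line
end

luatexbase.add_to_callback('process_input_buffer',
  append_comment_letter, 'sketch.append_comment_letter')
\end{luacode*}
\catcode"FFFFF=14  % make U+FFFFF a comment character
\end{verbatim}

Because such a callback sees only the raw input line, its decision is
necessarily made before any assignment on that line takes effect.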
277-
278-Figure~\ref{fig-linebreak} shows an example of this situation; the
279-command at the first line marks most of Japanese characters as
280-`non-Japanese characters'. In other words, from that command onward, the
281-letter `あ' will be treated as an alphabetic character by
282-\LuaTeX-ja. One would therefore expect a space between `あ' and `y' in
283-the output, but the actual output in the figure has none. This is
284-because `あ' is still considered a Japanese character by \LuaTeX-ja
285-when it decides whether U+FFFFF will be added to
286-input line~2.
287-
288-\begin{figure}
289-\begin{LTXexample}
290-\font\x=IPAMincho \x
291-\ltjsetparameter{jacharrange={-6}}xあ
292-y
293-\end{LTXexample}
294-\caption{A notable sample showing the treatment of a line-break after a
295-Japanese character.}\label{fig-linebreak}
296-\end{figure}
297-
298-\subsection{Separation between `real' fonts and metrics}
299-\label{ssec-sepmet}
300-
301-Traditionally, most Japanese fonts used in typesetting are not
302-proportional, that is, most glyphs have the same size (in most cases,
303-square-shaped). Hence, it is not rare that the contents of different
304-JFMs are essentially the same, and differ only in their names. For example,
305-|min10.tfm| and |goth10.tfm|, which are JFMs shipped with \pTeX\ for
306-seriffed \emph{mincho} family and sans-seriffed \emph{gothic} family,
307-differ only in their |FAMILY| and |FACE|. Moreover, |jis.tfm| and
308-|jisg.tfm|, which are included in the \emph{jis} font metric
309-used in \emph{jsclasses}~\cite{jsclasses} by Haruhiko Okumura (奥村晴彦),
310-are completely identical as binary files. Considering this situation, we
311-decided to separate `real' fonts and metrics used for them in
312-\LuaTeX-ja. Typical declarations of Japanese fonts in the style of plain
313-\TeX\ are shown in Figure~\ref{fig-jfdef}. We would like to add several
314-remarks:
315-\begin{itemize}
316-\item A control sequence |\jfont| must be used for Japanese fonts, instead of |\font|.
317-\item \LuaTeX-ja automatically loads the \emph{luaotfload} package, so
318- \hbox{\tt file:} and \hbox{\tt name:} prefixes, and various font features can be
319- used as in the first line of Figure~\ref{fig-jfdef}.
320-\item The |jfm| key specifies the metric for the font. In
321- Figure~\ref{fig-jfdef}, both fonts will use a metric stored in a
322- Lua script named |jfm-ujis.lua|. This metric is the standard
323- metric in \LuaTeX-ja, and is based on JFMs used in the \emph{otf}
324- package~\cite{otf}.
325-\item The \hbox{\tt psft:} prefix can be used to specify name-only, non-embedded
326- fonts. When one displays a PDF with these fonts, the actual fonts
327- used for them depend on the PDF reader.
328-\end{itemize}
329-The specification of a metric for \LuaTeX-ja is similar to that of a JFM
330-(see \cite{ptexjp}); characters are grouped into several classes, the
331-size information of characters is specified for each class, and
332-glue/kern insertions are specified for each pair of classes. Although
333-the author has not tried it, it may be possible to develop a program that
334-`converts' a JFM to a metric for \LuaTeX-ja. \LuaTeX-ja offers three
335-metrics by default: |jfm-ujis.lua|, |jfm-jis.lua| based on the
336-\emph{jis} font metric, and |jfm-min.lua| based on old |min10.tfm|.
337-
338- Note that |-kern| in the font features
339-is important, because otherwise kerning information from the real font itself would
340-clash with glue/kern information from the metric.
341-
342-\begin{figure}
343-\begin{verbatim}
344-\jfont\foo=file:ipam.ttf:jfm=ujis;script=latn;-kern;+jp04 at 12pt
345-\jfont\bar=psft:Ryumin-Light:jfm=ujis at 10pt
346-\end{verbatim}
347-\caption{Typical declarations of Japanese fonts.}
348-\label{fig-jfdef}
349-\end{figure}
350-
351-\subsection{Insertion of glues/kerns for Japanese typesetting: timing}
352-\label{ssec-jglue}
353-
354-As described in \cite{luatexref}, \LuaTeX's kerning and ligaturing
355-processes are totally different from those of \TeX82. \TeX82's process is
356-done just when a (sequence of) character is appended to the current
357-list. Thus we can interrupt this process by writing
358-|f{}irm|. However, \LuaTeX's process is \emph{node-based}, that is, the
359-process is done when a horizontal box or a paragraph ends, so
360-|f{}irm| and |firm| yield the same output under \LuaTeX.
361-
362-The situation for Japanese characters is more complicated.
363-Glues (and kerns) which are needed for Japanese
364-typesetting are divided into the following three categories:
365-\begin{itemize}
366-\item Glue (or kern) from the metric of Japanese fonts (\emph{JFM glue},
367- for short).
368-
369-\item Default glue between a Japanese character and an alphabetic
370- character (\emph{xkanjiskip}, for short), usually 1/4 of
371- full-width (\emph{shibuaki}) with some stretch and shrink for
372- justifying each line.
373-\item Default glue between two consecutive Japanese characters
374- (\emph{kanjiskip}, for short). The main reason for this glue is to
375- enable breaking lines almost everywhere in Japanese texts. In most
376- cases, its natural width is zero, with some stretch/shrink for
377- justifying each line.
378-\end{itemize}
379-In \pTeX, these three kinds of glues are treated differently. A JFM glue
380-is inserted when a (sequence of) Japanese character is appended to the
381-current list, as in the case of alphabetic characters in \TeX82. This
382-means that one can interrupt the insertion process by saying |{}|. An
383-\emph{xkanjiskip} is inserted just before `hpack' or line-breaking of a
384-paragraph; this timing is somewhat similar to that of \LuaTeX's kerning
385-process. Finally, a \emph{kanjiskip} does not appear as a node anywhere;
386-it appears only implicitly in the calculation of the width of a horizontal box,
387-in line breaking, and in the actual output process to a DVI
388-file. These specifications have made \pTeX's behavior very hard to
389-understand.
390-
391-\LuaTeX-ja inserts glues in all three categories simultaneously inside
392-|hpack_filter| and |pre_linebreak_filter| callbacks. The reasons for
393-this specification are to make Japanese characters behave like alphabetic characters in \LuaTeX\
394-(as described in the first paragraph of this subsection), and to clarify
395-the specification of \LuaTeX-ja's process.
396-
397-\subsection{Insertion of glues/kerns for Japanese typesetting: specification}
398-\label{ssec-jspec}
399-
400-\begin{table}
401-\caption{Examples of differences between \pTeX\ and \LuaTeX-ja.}
402-\label{tab-jfmglue}
403-\begin{center}
404-\begin{tabular}{llllllll}
405-\toprule
406-&\multicolumn{1}{c}{(1)}&\multicolumn{1}{c}{(2)}&\multicolumn{1}{c}{(3)}&\multicolumn{1}{c}{(4)}\\
407-Input &|あ】{}【〕\/〔| &|い』\/a| &|う)\hbox{}(| &|え]\special{}[|\\\midrule
408-\pTeX &あ】\hbox{}【〕\hbox{}〔&い』\/a &う)\hbox{}( &え]\hbox{}[\\
409-\LuaTeX-ja &あ】{}【〕\/〔 &い』\/a &う)\hbox{}( &え]\special{}[\\
410-\bottomrule
411-\end{tabular}
412-\end{center}
413-\end{table}
414-
415-\begin{figure}
416-\begin{center}
417-\fontsize{40}{40}\selectfont
418-\imagfm{\jstrut あ}%
419-\imagfm{\jstrut 】\inhibitglue}%
420-\imagfm{\jstrut\kern.5\zw}%
421-\imagfm{\jstrut\kern.5\zw}%
422-\imagfm{\jstrut\inhibitglue【}%
423-\imagfm{\jstrut 〕\inhibitglue}%
424-\imagfm{\jstrut\kern.5\zw}%
425-\imagfm{\jstrut\kern.5\zw}%
426-\imagfm{\jstrut\inhibitglue〔}%
427-\end{center}
428-\caption{Detail of the output of \pTeX\ in the input~(1) in Table~\ref{tab-jfmglue}.}
429-\label{fig-ptexjfm}
430-\end{figure}
431-
432-Now we will take a look at the insertion process itself through four points.
433-
434-\begin{description}
435-\item[Ignored nodes]
436-As noted in the previous subsection, the insertion process in \pTeX\ can
437- be interrupted by saying |{}| or anything else.\footnote{This
438- is why some tricks like \texttt{ちょ\char`\{\char`\}っと} for
439- \texttt{min10.tfm} and other `old' JFMs work.} This leads to the
440- second row in Table~\ref{tab-jfmglue}, or
441- Figure~\ref{fig-ptexjfm}. Here `the process is interrupted'
442- means that \pTeX\ does not think the letter `】\inhibitglue'
443- is followed by `\inhibitglue【', hence two half-width glues
444- are inserted between `】\inhibitglue' and `\inhibitglue【',
445- where the left one is from `】\inhibitglue' and the right one
446- is from `\inhibitglue【'.
447-
448- On the other hand, in \LuaTeX-ja, the process is done inside
449- |hpack_filter| and |pre_linebreak_filter| callbacks. Hence,
450- \emph{anything that does not make any node will be
451- ignored}\ in \LuaTeX-ja, as shown in (1) in
452- Table~\ref{tab-jfmglue}. \LuaTeX-ja also ignores any nodes
453- which do not make any contribution to the current horizontal
454- list (\emph{ins\_node}, \emph{adjust\_node},
455- \emph{mark\_node}, \emph{whatsit\_node} and
456- \emph{penalty\_node}), as shown in (4).
457-
458-
459-By the way, around a \emph{glyph\_node} $p$ there may be some nodes
460- attached to~$p$. These are an accent and kerns for
461- moving it to the right place, and a kern from the italic
462- correction\footnote{\TeX82 (and \LuaTeX) does not distinguish
463- between explicit kern and a kern for italic correction. To
464- distinguish them, an additional subtype for a kern is introduced
465- in \pTeX. On the other hand, \LuaTeX-ja uses an additional attribute and
466- redefines \texttt{\char`\\/} to set this attribute.} for $p$. It is natural that
467- these attachments should be ignored inside the process. Hence
468- \LuaTeX-ja takes this approach, as does the latest version of
469- \pTeX\ (version~p3.2). This explains (2) in Table~\ref{tab-jfmglue}.
470-
471-To summarize, one should put an empty horizontal box |\hbox{}|
472- where one wants to interrupt the insertion process in
473- \LuaTeX-ja, as in (3) in Table~\ref{tab-jfmglue}.
474-
475-\item[Fonts with the same metric]
476-Recall that \LuaTeX-ja separates `real' fonts and metrics, as in Subsection~\ref{ssec-sepmet}.
477-Consider the following input, where all Japanese fonts use the same metric
478- (in \LuaTeX-ja), and |\gt| selects \emph{gothic} family for
479- the current Japanese font family:
480-\begin{quote}
481-\begin{verbatim}
482-明朝)\gt (ゴシック
483-\end{verbatim}
484-\end{quote}
485-If the above input is processed by \pTeX, because the insertion process is
486- interrupted by |\gt|, the result looks like
487-\begin{quote}
488-\mc 明朝)\hbox{}\gt (ゴシック
489-\end{quote}
490-However this seems to be unnatural, since two Japanese fonts in the
491- output use the same metric, i.e.,~the same
492- typesetting rule. Hence, we decided that Japanese fonts with
493- the same metric are treated as one font in the insertion
494- process of \LuaTeX-ja. Thus, the output from the above input
495- in \LuaTeX-ja looks like:
496-\begin{quote}
497-\mc 明朝)\gt (ゴシック
498-\end{quote}
499-There may be situations where this default behavior is not
500- suitable. \LuaTeX-ja offers a way to handle them, but
501- we leave it to the manual~\cite{man}.
502-
503-\item[Fonts with different metrics]
504-The case where two consecutive Japanese characters use different metrics and/or
505- different sizes is similar. Consider the following input where
506- the \emph{mincho} family and the \emph{gothic} family use
507- different metrics:
508-\begin{quote}
509-\begin{verbatim}
510-漢)\gt (漢)\large (大
511-\end{verbatim}
512-\end{quote}
513-As in the previous item, \pTeX\ yields the following from this input:
514-\begin{quote}
515-\mc 漢)\hbox{}\gt (漢)\hbox{}\large (大
516-\end{quote}
517-We thought that the amount of space between parentheses in the above output
518- is too large. Hence we have changed the default behavior of
519- \LuaTeX-ja, so that the amount of glue between two Japanese
520- characters with different metrics is the \emph{average} of the glue
521- from the left character and that from the right
522- character. For example, Figure~\ref{fig-diffmet} shows the
523- output from the above input. The width of the glue marked `(1)' is
524- $(a/2 + a/2)/2 = 0.5a$, and the width of the glue marked `(2)'
525- is $(a/2 + 1.2a/2)/2 = 0.55a$. This default behavior can be
526- changed by the \textsf{diffrentmet} parameter of \LuaTeX-ja.
527-
528-\begin{figure}
529-\begin{center}
530-\fontsize{40}{40}\selectfont
531-\imagfm{\jstrut\smash{%
532- \vtop{\lineskiplimit=\maxdimen\lineskip2pt\halign{#\cr漢\cr
533- \small\vrule height .5ex depth .5ex\hrulefill\ \lower.5ex\hbox{$a$}\
534- \hrulefill\vrule height .5ex depth .5ex\cr}}}}%
535-\imagfm{\jstrut )\inhibitglue}%
536-\hbox to .5\zw{\hss\normalsize (1)\hss}%
537-\imagfm{\jstrut\inhibitglue\gt (}%
538-\imagfm{\jstrut\gt 漢}%
539-\imagfm{\jstrut\gt )\inhibitglue}%
540-\hbox to .55\zw{\hss\normalsize (2)\hss}%
541-\imagfm{\fontsize{48}{48}\selectfont\jstrut\gt\inhibitglue (}%
542-\imagfm{\fontsize{48}{48}\selectfont\jstrut\smash{%
543- \vtop{\lineskiplimit=\maxdimen\lineskip2pt\halign{#\cr\gt 大\cr
544- \small\vrule height .5ex depth .5ex\hrulefill\ \lower.5ex\hbox{$1.2a$}\
545- \hrulefill\vrule height .5ex depth .5ex\cr}}}}
546-\end{center}
547-\caption{Fonts with different metrics.}
548-\label{fig-diffmet}
549-\end{figure}
550-
551-\item[\emph{kanjiskip} and \emph{xkanjiskip}]
552-In \pTeX, the value of \emph{xkanjiskip} is controlled by a skip named
553- |\xkanjiskip|. A well-known defect of this implementation is
554- that the value of \emph{xkanjiskip} is not connected with the
555- size of the current Japanese font. It seems that the |EXTRASPACE|,
556- |EXTRASTRETCH|, |EXTRASHRINK| parameters in a JFM are
557- reserved for specifying the default value of
558- \emph{xkanjiskip} in units of the design size, but \pTeX\
559- does not actually use these parameters.
560-
561-Considering this situation of p\TeX, \LuaTeX-ja can use the value of
562- \emph{xkanjiskip} that is specified in a metric. If the value of
563- \emph{xkanjiskip} on the user side (this is the value of
564- the \textsf{xkanjiskip} parameter of |\ltjsetparameter|) is
565- |\maxdimen|, then \LuaTeX-ja uses the specification from
566- the metric currently in use as the actual value of
567- \emph{xkanjiskip}. This description also applies to \emph{kanjiskip}.
568-\end{description}
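
As a hedged illustration of the last item, the following assignment asks
\LuaTeX-ja to take both glues from the metric currently in use rather
than from user-side values. That a bare |\maxdimen| is accepted as the
glue specification is an assumption based on the description above.

\begin{verbatim}
% user-side values set to \maxdimen: fall back to the metric's values
\ltjsetparameter{kanjiskip=\maxdimen, xkanjiskip=\maxdimen}
\end{verbatim}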
569-
570-\section{Distinction of characters}
571-\label{sec:distinction_of_characters} Since \LuaTeX\ can handle Unicode
572-characters natively, a major problem is how to distinguish
573-Japanese characters from alphabetic characters. For example, the
574-multiplication sign (U+00D7) exists both in ISO-8859-1 (hence in Latin-1
575-Supplement in Unicode) and in the basic Japanese character set
576-JIS~X~0208. It is not desirable that this character is always treated as
577-an alphabetic character, because this symbol is often used in the sense
578-of `negative' in Japan.
579-
580-\subsection{Character ranges}
581-Before we describe the approach taken in \LuaTeX-ja, we review the
582-approach taken by u\pTeX. u\pTeX\ extends the |\kcatcode| primitive in
583-\pTeX, to use this primitive for setting how a character is treated
584-among alphabetic characters~(15), \emph{kanji}~(16), \emph{kana}~(17),
585-\emph{other CJK characters}~(18), or~\emph{Hangul}~(19).
586-The assignment to |\kcatcode| can be done by a Unicode
587-block.\footnote{There are some exceptions. For example, U+FF00--FFEF
588-(Halfwidth and Fullwidth Forms) are divided into three blocks in recent
589-u\pTeX.}
590-
591-\LuaTeX-ja adopted a different approach. There are many Unicode blocks
592- in the Basic Multilingual Plane which are not included in
593- Japanese fonts, therefore it is inconvenient to process characters by Unicode
594- block. Furthermore, JIS~X~0208 is not just a union of Unicode
595- blocks; for example, the intersection of JIS~X~0208 and
596- Latin-1 Supplement is shown in
597- Table~\ref{tab-inter}. Considering these two points, to
598- customize the range of Japanese characters in \LuaTeX-ja, one
599- has to define ranges of character codes in the source in advance.
600-
601-
602-\begin{table}
603-\caption{Intersection of JIS~X~0208 and Latin-1 Supplement.}
604-\label{tab-inter}
605-\begin{center}
606-\begin{tabular}{llll}
607-\ltjjachar"A7 (U+00A7),&
608-\ltjjachar"A8 (U+00A8),&
609-\ltjjachar"B0 (U+00B0),&
610-\ltjjachar"B1 (U+00B1),\\
611-\ltjjachar"B4 (U+00B4),&
612-\ltjjachar"B6 (U+00B6),&
613-\ltjjachar"D7 (U+00D7),&
614-\ltjjachar"F7 (U+00F7)
615-\end{tabular}
616-\end{center}
617-\end{table}
618-
619-
620-We note that \LuaTeX-ja offers two additional control sequences,
621- |\ltjjachar| and |\ltjalchar|. They are similar to |\char|
622- primitive; however, |\ltjjachar| always yields a Japanese character, provided that
623- the argument is greater than or equal to 128, and |\ltjalchar| always
624- yields an alphabetic character, regardless of the argument.
625-
626-\subsection{Default setting of ranges}
627-Patches for plain \TeX\ and \LaTeXe\ of \LuaTeX-ja predefine 8~character
628-ranges, as shown in Table~\ref{tab-chrrng}. Almost all of these ranges are
629-just unions of Unicode blocks, and are determined from the Adobe-Japan1-6
630-character collection~\cite{aj16}, and JIS~X~0208. Among these 8~ranges,
631-the ranges~2, 3, 6, 7, and~8 are considered ranges of Japanese
632-characters, and others are considered ranges of alphabetic
633-characters.\footnote{Note that ranges 3~and~8 are considered ranges of
634-alphabetic characters in this paper.} We remark on ranges 2~and~8:
635-\begin{description}
636-\item[The range~2]
637-JIS~X~0208 includes Greek letters and Cyrillic letters; however, these
638- letters cannot be used for typesetting Greek or Russian, of
639- course. Hence it is reasonable that Greek and
640- Cyrillic letters form a separate character range.
641-\item[The range~8]
642-If one wants to use 8-bit TFMs, such as T1 or TS1 encodings, one should
643- mark this range~8 as a range of alphabetic characters by
644-\begin{quote}
645-|\ltjsetparameter{jacharrange={-8}}|
646-\end{quote}
647-This is because some 8-bit TFMs have a glyph in this range; for example,
648- the character `\OE' is located at |"D7| in the T1 encoding. %"
649-\end{description}
650-
651-
652-\begin{table}
653-\caption{Predefined ranges in \LuaTeX-ja.}
654-\label{tab-chrrng}
655-\begin{center}
656-\begin{tabular}{@{\bf}rl}
657-1&(Additional) Latin characters which do not belong to the range~8.\\
658-2&Greek and Cyrillic letters.\\
659-3&Punctuations and miscellaneous symbols.\\
660-4&Unicode blocks which do not intersect with Adobe-Japan1-6.\\
661-5&Surrogates and supplementary private use areas.\\
662-6&Characters used in Japanese typesetting.\\
663-7&Characters possibly used in CJK typesetting, but not in Japanese.\\
664-8&Characters in Table~\ref{tab-inter}.
665-\end{tabular}
666-\end{center}
667-\end{table}
668-
669-\subsection{Control sequences producing Unicode characters}
670-\label{ssec-unichar}
671-
672-The \emph{fontspec} package\footnote{Strictly speaking, it is the
673-\emph{xunicode} package, originally a package for \XeTeX\ and
674-automatically loaded by the \emph{fontspec} package.} offers various
675-control sequences that produce Unicode characters. However, these
676-control sequences, as they stand, do not work correctly with the default
677-range setting of \LuaTeX-ja. For example, |\textquotedblleft| is just
678-an abbreviation of |\char"201C\relax|, and the character U+201C (LEFT %"
679-DOUBLE QUOTATION MARK) is treated as a Japanese character, because it
680-belongs to the range~3. This problem is resolved by using |\ltjalchar|
681-instead of the |\char| primitive. This fix is included in an optional package
682-named \texttt{luatexja-\penalty0fontspec.sty}. Figure~\ref{fig-unitxt}
683-shows several ways to typeset a character, both as a Japanese character
684-and as an alphabetic character.
685-
686-\begin{figure}
687-\begin{LTXexample}
688-×, \char`×, % depend on range setting
689-\ltjalchar`×, % alphabetic char
690-\ltjjachar`×, % Japanese char
691-\texttimes % alph. char (by fontspec)
692-\end{LTXexample}
693-\caption{Control sequences producing a Unicode character.}
694-\label{fig-unitxt}
695-\end{figure}
696-
697-The situation looks similar in math formulas, but in fact it differs.
698-Each control sequence that represents an ordinary symbol defined by the
699-\emph{unicode-math} package is just a synonym for a character. For example,
700-the meaning of |\otimes| is just the character U+2297 (CIRCLED TIMES),
701-which is included in the range~3. However, it is difficult to define a
702-control sequence like |\ltjalUmathchar| as a counterpart of
703-|\Umathchar|, since an input like `|\sum^\ltjalUmathchar ...|' has to be
704-permitted.
705-
706-We could not develop a satisfactory solution to this problem in
707-time for this paper. We are currently testing the
708-following approach:
709-\begin{itemize}
710-\item \LuaTeX-ja has a list of character codes which are always treated as
711- alphabetic characters in math mode. Considering 8-bit TFMs for
712- math symbols, this list includes natural numbers between |"80| and
713- |"FF| by default.
714-\item Redefine internal commands defined in the \emph{unicode-math}
715- package so that
716-codes of characters which are mentioned in the \emph{unicode-math}
717- package will be included in the list.
718-\end{itemize}
719-
720-
721-We would like to extend treatments described in this subsection to 8-bit
722-font encodings, but we leave this to future development as well.
723-
724-\section{Current status of development}
725-\label{sec:current_status}
726-At the moment, \LuaTeX-ja can be used under plain \TeX, and under
727-\LaTeXe. Generally speaking, one only has to load |luatexja.sty|, with the
728-|\input| command or |\usepackage| (in~\LaTeXe), if one merely wants to
729-typeset Japanese characters. We look at each part in more detail below.
730-
731-\subsection{`Engine extension'}
732-The lowest part of \LuaTeX-ja corresponds to the \pTeX\ extension as
733-\emph{an engine extension of \TeX}. We, the project members, think that
734-this part is almost done. There is one more feature of \LuaTeX-ja which
735-we are going to explain:
736-
737-\begin{description}
738-\item[Shifting baseline]
739-In order to make a match between Japanese fonts and alphabetic fonts,
740- sometimes shifting the baseline of alphabetic characters may
741- be needed. \pTeX\ has a dimension |\ybaselineshift|, which
742- corresponds to the amount of shifting down the baseline of alphabetic
743- characters. This is useful for Japanese-based documents, but
744- not for documents mainly in languages with alphabetic
745- characters.
746-
747-Hence, \LuaTeX-ja extends \pTeX's |\ybaselineshift| to Japanese
748- characters. Namely, \LuaTeX-ja offers two parameters,
749- \textsf{yjabaselineshift} and \textsf{yalbaselineshift}, for the
750- amount of shifting the baseline of Japanese characters and
751- that of alphabetic characters, respectively.
752-\begin{figure}
753-\begin{center}
754-\fontsize{40}{40}\selectfont\fboxsep0mm
755-\vrule width 0.9\textwidth height0.4pt depth0.4pt\kern-0.9\textwidth
756-\hbox to 0.9\linewidth{%
757-\hfil
758-\raise-10pt\imagfm{\jstrut 漢}%
759-\raise-10pt\imagfm{\jstrut 字}\hskip.25\zw%
760-\imagfm{p}%
761-\imagfm{h}%
762-\hfil\hfil
763-\imagfm{\jstrut 漢}%
764-\imagfm{\jstrut 字}\hskip.25\zw%
765-\raise-10pt\imagfm{p}%
766-\raise-10pt\imagfm{h}%
767-\hfil
768-}
769-\end{center}
770-
771-\caption{First example of shifting baseline.}
772-\label{fig-bls}
773-\end{figure}
774-
775-\begin{figure}
776-\begin{center}
777-\fontsize{30}{30}\selectfont\fboxsep0mm
778-\vrule width 0.9\textwidth height0.4pt depth0.4pt\kern-0.9\textwidth
779-\hbox to 0.9\linewidth{%
780-\hfil
781-\imagfm{a}%
782-\imagfm{b}\hskip.25\zw%
783-\imagfm{\jstrut 本}%
784-\imagfm{\jstrut 文}\hskip.33333\zw%
785-\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut\inhibitglue (}%
786-\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut 注}%
787-\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut 釈}\hskip.1666667\zw%
788-\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont c}%
789-\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont o}%
790-\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont m}%
791-\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont m}%
792-\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont e}%
793-\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont n}%
794-\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont t}%
795-\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut )\inhibitglue}%
796-\hskip.33333\zw%
797-\imagfm{\jstrut 本}%
798-\imagfm{\jstrut 文}%
799-\hfil
800-}
801-\end{center}
802-
803-\caption{Second example of shifting baseline.}
804-\label{fig-small}
805-\end{figure}
806-
807-An example output is shown in Figure~\ref{fig-bls}. The left half is the
808- output when \textsf{yjabaselineshift} is positive, hence the
809- baseline of Japanese characters is shifted down. On the other
810- hand, the right half is the output when
811- \textsf{yalbaselineshift} is positive, hence the baseline of
812- alphabetic characters is shifted down. Figure~\ref{fig-small}
813- shows an interesting use of these parameters.
814-
815-\end{description}
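As a rough sketch of how these two parameters might be set, assuming that they are given through |\ltjsetparameter| in the same way as \textsf{jacharrange} (the value 10pt is chosen only to mirror the shift drawn in Figure~\ref{fig-bls}):
\begin{verbatim}
% Left half of the figure: shift Japanese characters down by 10pt.
\ltjsetparameter{yjabaselineshift=10pt}
% Right half of the figure: shift alphabetic characters down instead.
\ltjsetparameter{yjabaselineshift=0pt, yalbaselineshift=10pt}
\end{verbatim}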
816-Note that \LuaTeX-ja doesn't support vertical typesetting, \emph{tategaki}, for now.
817-
818-\subsection{Patches for plain \TeX\ and \LaTeXe}
819-\pTeX\ has a patch for plain \TeX, namely |ptex.tex|, a patch for the \LaTeXe\
820-macros (this patch and \LaTeXe\ together constitute \emph{p\LaTeXe}), and
821-|kinsoku.tex|, which includes the default setting of \emph{kinsoku
822-shori}, the Japanese hyphenation. We ported them to \LuaTeX-ja, except
823-the code related to vertical typesetting, because \LuaTeX-ja doesn't
824-support vertical typesetting yet. We remark on one point related to the
825-porting:
826-\begin{description}
827-
828-\item[Behavior of\/ {\tt\char92fontfamily\/}]
829-The control sequence |\fontfamily| in p\LaTeXe\ changes the current alphabetic
830- font family and/or the current Japanese font family,
831- depending on the argument. More concretely,
832- |\fontfamily{|$\langle\hbox{\it arg\/}\rangle$|}| changes the
833- current alphabetic font family to $\langle\hbox{\it
834- arg\/}\rangle$, if and only if one of the following
835- conditions is satisfied:
836-\begin{itemize}
837-\item An alphabetic font family named $\langle\hbox{\it arg\/}\rangle$ in
838- \emph{some} alphabetic encoding is already defined in the document.
839-\item There exists an alphabetic encoding $\langle\hbox{\it
840- enc\/}\rangle$ already defined in the document such that a font
841- definition file $\langle\hbox{\it enc\/}\rangle\langle\hbox{\it
842- arg\/}\rangle$|.fd| (all lowercase) exists.
843-\end{itemize}
844-The same criterion is used for changing Japanese font family.
845-
846-For this behavior to work well, a list of all (alphabetic) encodings defined
847- already in the document is needed. However, since \LuaTeX-ja
848- is loaded as a package, \LuaTeX-ja cannot have this list.
849- Hence \LuaTeX-ja adopted a different approach, namely
850- |\fontfamily{|$\langle\hbox{\it arg\/}\rangle$|}| changes the
851- current alphabetic font family to $\langle\hbox{\it
852- arg\/}\rangle$, if and only if:
853-\begin{itemize}
854-\item An alphabetic font family named $\langle\hbox{\it arg\/}\rangle$
855- in the current alphabetic encoding $\langle\hbox{\it
856- enc\/}\rangle$ is already defined in the document.
857-\item A font definition file $\langle\hbox{\it enc\/}\rangle\langle\hbox{\it
858- arg\/}\rangle$|.fd| (all lowercase) exists.
859-\end{itemize}
860-
861-
862-\end{description}
863-
864-
865-
866-\subsection{Classes for Japanese documents}
867-To produce `high-quality' Japanese documents, we need not only that
868-Japanese characters are correctly placed, but also class files for
869-Japanese documents. Two major families of classes are widely used in Japan:
870-\emph{jclasses} which is distributed with the official p\LaTeXe\ macros,
871-and \emph{jsclasses}. At present, \LuaTeX-ja
872-simply contains their counterparts: \emph{ltjclasses} and
873-\emph{ltjsclasses}. However, the policy on classes has not been determined
874-yet, and we hope to have another family of classes which are useful for
875-commercial printing. In the author's opinion, \emph{ltjclasses} should
876-remain as an example of porting class files for \pTeX\ to
877-\LuaTeX-ja.
878-
879-\subsection{Patches for packages}
880-Apart from patches for the \LaTeXe~kernel and classes for Japanese
881-documents, we need to make patches for several packages. At present,
882-we have considered the following packages, and made patches or ports for
883-the first two.
884-
885-\begin{description}
886-\item[The \emph{fontspec} package] The \emph{fontspec} package is built
887- on NFSS2, hence control sequences offered by the
888- \emph{fontspec} package, such as |\setmainfont|, are only
889- effective for alphabetic fonts if \LuaTeX-ja is loaded.
890- \texttt{luatexja-\penalty0fontspec.sty} (not automatically
891- loaded) offers their counterparts for Japanese fonts, with
892- additional `j' in the name of control sequences, such as
893- |\setmainjfont|. As described in
894- Subsection~\ref{ssec-unichar}, it also includes a patch for
895- control sequences producing Unicode characters.
896-
897-\item[The \emph{otf} package]
898-This package is widely used in \pTeX\ for typesetting characters which are
899-not in JIS~X~0208, and for using more than one weight in \emph{mincho}
900-and \emph{gothic} font families. Therefore \LuaTeX-ja supports features
901-in the \emph{otf} package, by loading \texttt{luatexja-\penalty0otf.sty}
902- manually. Note that characters by |\UTF{xxxx}| and
903- |\CID{xxxx}| are not appended to the current list as a
904- \emph{glyph\_node}, to avoid callbacks by the
905- \emph{luaotfload} package. We have another remark: |\CID|
906- does not work with TrueType fonts, since |\CID| uses the
907- conversion table between CID and the glyph order of the
908- current Japanese font.
909-
910-\item[The \emph{listings} package]
911-It is known for users of \pTeX\ that there is a patch |jlisting.sty| for
912- the \emph{listings} package, to use Japanese characters in
913- the |lstlisting| environment. Generally speaking, it also can
914- be used in \LuaTeX-ja. However, it seems that a
915- Japanese character after a space is not processed
916- by the \emph{listings} package; this is inconvenient when we
917- use the \emph{showexpl} package.
918-
919-There is another way to use characters above 256 with the
920- \emph{listings} package (described in~\cite{apl}). However,
921- this method is not suitable for Japanese, since the number of
922- Japanese characters is very large. We hope that the
923- \emph{listings} package will be able to handle all characters above
924- 256 without any patch, in the future.
925-
926-
927-\end{description}
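As an illustrative sketch of the \emph{fontspec} counterpart described above, a preamble might look as follows. The font names are assumptions made only for this example; any fonts installed on the system could be substituted.
\begin{verbatim}
\usepackage{luatexja,luatexja-fontspec} % the latter is not loaded automatically
\setmainfont{TeX Gyre Pagella} % alphabetic font, set through fontspec
\setmainjfont{IPAMincho}       % Japanese font (hypothetical font name)
\end{verbatim}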
928-
929-
930-
931-\section{Implementation}
932-\label{sec:implementation}
933-\subsection{Handling of Japanese fonts}
934-In \pTeX, there are three slots for maintaining current fonts, namely
935-|\font| for alphabetic fonts, |\jfont| for Japanese fonts (in horizontal
936-direction) and |\tfont| for Japanese fonts (in vertical direction). With
937-these slots, we can manage the current font for alphabetic characters
938-and that for Japanese characters separately in \pTeX. However, \LuaTeX\
939-has only one slot for maintaining the current font, like \TeX82. This
940-situation leads to a problem: how can we maintain the `current Japanese
941-font'?
942-
943-There are three approaches for this problem. One approach is to make a
944-mapping table from alphabetic fonts to corresponding Japanese fonts
945-(here we don't assume that NFSS2 is available). Another approach is
946-that we always use composite fonts with alphabetic fonts and Japanese
947-fonts. The third approach is that the information of the current
948-Japanese font is stored in an attribute. We adopted the third approach,
949-since \LuaTeX-ja is much affected by \pTeX\ as we noted in
950-Subsection~\ref{ssec-pol}.
951-
952-As in Figure~\ref{fig-jfdef}, \LuaTeX-ja uses |\jfont| for defining
953-Japanese fonts, like \pTeX. However, because the information of the current
954-Japanese font is stored in an attribute, control sequences defined by
955-|\jfont| (e.g.,~|\foo| and |\bar| in Figure~\ref{fig-jfdef}) do
956-not represent a font in the sense of \TeX82. In other words, each of
957-these control sequences is just an assignment to an attribute, therefore
958-they cannot be an argument of |\the|, |\fontname|, or |\textfont|.
959-
960-
961-Callbacks by the \emph{luaotfload} package, e.g.,~replacement of glyphs
962-according to OpenType font features, are performed just after `Examination of
963-stack level' (see Subsections
964-\ref{ssec-over}~and~\ref{ssec-stack}). Also note that calculation of
965-character classes for each Japanese character is done \emph{after}
966-these callbacks for now.
967-
968-\subsection{Stack management}
969-\label{ssec-stack}
970-
971-As we noted in Subsection~\ref{ssec-csname}, parameters whose values
972-at the end of a horizontal box or of a paragraph are valid throughout the
973-whole box or paragraph, such as \emph{kanjiskip}, cannot be implemented
974-by internal integers or registers of other types in \TeX. We explain why
975-in this subsection.
976-
977-\begin{figure}
978-\begin{lstlisting}
979-void package(int c)
980-{
981- ...
982- d = box_max_depth;
983- unsave();
984- save_ptr -= 4;
985- if (cur_list.mode_field == -hmode) {
986- cur_box = filtered_hpack(cur_list.head_field,
987- cur_list.tail_field, saved_value(1),
988- saved_level(1), grp, saved_level(2));
989- subtype(cur_box) = HLIST_SUBTYPE_HBOX;
990- } else {
991-\end{lstlisting}
992-\caption{An extract of a CWEB-source \texttt{tex/packaging.w} of \LuaTeX.}
993-\label{fig-ltsrc}
994-\end{figure}
995-
996-Figure~\ref{fig-ltsrc} is an extract of a CWEB-source
997-\texttt{tex/packaging.w} of \LuaTeX\ (SVN revision 4358). This function
998-is called just when an explicit |\hbox{...}| or |\vbox{...}| is ended, and
999-the function |filtered_hpack()| is where the |hpack_filter| and then the
1000-actual `hpack' process are performed. Notice that the |unsave()|
1001-function is called before |filtered_hpack()|. This is the problem;
1002-because of |unsave()|, we can retrieve only the values of registers
1003-\emph{outside} the box, even in the |hpack_filter| callback.
1004-
1005-To cope with this problem, \LuaTeX-ja has its own stack system, based on
1006-Lua codes in \cite{stack-mail}. Furthermore, \emph{whatsit} nodes whose
1007-\emph{user\_id} is 30112 (\emph{stack\_node}, for short) will be
1008-appended to the current horizontal list each time the current stack
1009-level is incremented, and their values are the values of
1010-|\currentgrouplevel| at that time. In the beginning of the |hpack_filter|
1011-callback, the list in question is traversed to determine whether the
1012-stack level at the end of the list and that outside the box coincide.
1013-
1014-Let $x$ be the value of |\currentgrouplevel|, and $y$ be the current
1015-stack level, both inside the |hpack_filter| callback, i.e.,~outside a
1016-horizontal box. Consider a list which represents the content of the box,
1017-then we have:
1018-\begin{itemize}
1019-\item A \emph{stack\_node} whose value is $x+1$ (because all materials
1020- in the box are included in a group |\hbox{...}|, the value of
1021- |\currentgrouplevel| inside the box is at least $x+1$) in the list
1022- corresponds to an assignment related to the stack system in just
1023- top-level of the list, like
1024-\begin{quote}
1025-\begin{verbatim}
1026-\hbox{...(assignment)...}
1027-\end{verbatim}
1028-\end{quote}
1029-In this case, the current stack level is incremented to $y+1$ after the assignment.
1030-\item A \emph{stack\_node} whose value is more than $x+1$ in the list corresponds
1031-to an assignment inside another group contained in the box. For example,
1032- the following input creates
1033-a \emph{stack\_node} whose value is $x+3=(x+1)+2$:
1034-\begin{quote}
1035-\begin{verbatim}
1036-\hbox{...{...{...(assignment)}...}...}
1037-\end{verbatim}
1038-\end{quote}
1039-\end{itemize}
1040-Thus, we can conclude that the stack level at the end of the list is
1041-$y+1$, if and only if there is a \emph{stack\_node} whose value is
1042-$x+1$. Otherwise, the stack level is just $y$.
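As a small worked instance of this rule (the numbers are chosen only for illustration and do not come from the implementation), suppose that $x=2$ and $y=5$ outside the box, and consider the two inputs
\begin{verbatim}
\hbox{...(assignment)...}        % stack_node with value 3 = x+1
\hbox{...{...(assignment)}...}   % stack_node with value 4 = x+2
\end{verbatim}
For the first box, the list contains a \emph{stack\_node} whose value is $x+1=3$, so the stack level at the end of the list is $y+1=6$. For the second box, the only \emph{stack\_node} has value $4>x+1$; the assignment was local to the inner group, so the stack level at the end of the list is still $y=5$.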
1043-
1044-\subsection{Adjustment of the position of Japanese characters}
1045-\label{ssec-width}
1046-
1047-The size of a glyph specified in a metric and that of a real font
1048-usually differ. For example, the letter `\inhibitglue【' is half-width
1049-in |jfm-ujis.lua| or |jis.tfm|, while this letter is full-width like `【'
1050-in most TrueType fonts used in Japanese typesetting, such as
1051-IPA~Mincho. Hence the adjustment of position of such glyphs is
1052-needed. In the context of \pTeX, this process was performed using virtual fonts.
1053-
1054-On the other hand, \LuaTeX-ja does the adjustment by encapsulating a glyph
1055-into a horizontal box. There are two main reasons why we adopted this
1056-method; one is that we feared Lua codes for coexisting with callbacks by
1057-the |luaotfload| package would be large if we use virtual fonts, and the
1058-other is to cope with shifting of the baseline of characters at the
1059-same time.
1060-
1061-\begin{figure}
1062-\begin{center}\unitlength=9pt\small
1063-\begin{picture}(15,12)(-1,-3)
1064-
1065-\color{grayx}% real glyph
1066-\put(-1,-1.5){\vrule width 6\unitlength height 7\unitlength depth 2.5\unitlength}
1067-
1068-\color{black}% real glyph :step1
1069-\thicklines
1070-\put(-1,-1.5){\line(0,1){7}\line(0,-1){2.5}}
1071-\put(5,-1.5){\line(0,1){7}\line(0,-1){2.5}}
1072-\put(-1,5.5){\line(1,0){6}}
1073-\put(-1,-4){\line(1,0){6}}
1074-\put(-1,0){\makebox(0,0)[r]{\strut$R$\,}}
1075-
1076-\thicklines
1077-\put(0,0){\vector(0,1){9}\line(0,-1){3}\vector(1,0){12}}
1078-\put(12,9){\makebox(0,0)[rt]{\strut$M$\,}}
1079-\put(12,0){\line(0,1){9}\vector(0,-1){3}}
1080-\put(0,9){\line(1,0){12}}
1081-\put(0,-3){\line(1,0){12}}
1082-\put(0.2,4.5){\makebox(0,0)[l]{\texttt{height}}}
1083-\put(12.2,-1.5){\makebox(0,0)[l]{\texttt{depth}}}
1084-\put(6,0.2){\makebox(0,0)[b]{\texttt{width}}}
1085-
1086-\thicklines
1087-\put(3,0){\line(0,1){7}\line(0,-1){2.5}\line(1,0){6}}
1088-\put(9,0){\line(0,1){7}\line(0,-1){2.5}}
1089-\put(3,7){\line(1,0){6}}
1090-\put(3,-2.5){\line(1,0){6}}
1091-\newsavebox{\eqdist}
1092-\savebox{\eqdist}(0,0)[c]{%
1093- \thinlines
1094- \put(-0.08,0.2){\line(0,-1){0.4}}%
1095- \put(0.08,0.2){\line(0,-1){0.4}}}
1096-\put(1.5,0){\usebox{\eqdist}}
1097-\put(10.5,0){\usebox{\eqdist}}
1098-
1099-\thicklines
1100-\put(3,-1.5){\vector(-1,0){4}}
1101-\put(1,-1.7){\makebox(0,0)[t]{\texttt{left}}}
1102-\put(3,0){\vector(0,-1){1.5}}
1103-\put(3.2,-0.75){\makebox(0,0)[l]{\texttt{down}}}
1104-\end{picture}
1105-\end{center}
1106-\caption{The position of the `real' glyph.}
1107-\label{fig-pos}
1108-\end{figure}
1109-
1110-Figure~\ref{fig-pos} shows the adjustment process. A large square $M$ is
1111-the imaginary body specified in the metric, and a vertical
1112-rectangle is the imaginary body of a real glyph. First, the real glyph
1113-is aligned with respect to the width of $M$. In the figure, the real
1114-glyph is aligned `middle'; this setting is useful for the full-width
1115-middle dot `・'. We have other settings, `left' and `right'.
1116-After that, it is shifted according to the value of |left| and |down|,
1117-which are specified in the metric, too. The final position of the real glyph
1118-is shown by the gray rectangle~$R$. If the amount of shifting the baseline is
1119-not zero, $M$ (and hence the real glyph) is shifted by that amount.
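
As a worked example, the numbers can be read off from the drawing in Figure~\ref{fig-pos} (in units of the picture, so they are purely illustrative). The body $M$ has width $12$ and the real glyph has width $6$, so the `middle' alignment places the left edge of the glyph at
\[
 \frac{12-6}{2}=3 .
\]
With $|left|=4$ and $|down|=1.5$, the left edge then moves to $3-4=-1$ and the baseline of the glyph moves from $0$ to $-1.5$, which is the position of the gray rectangle~$R$.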
1120-
1121-We would like to remark briefly on the vertical position of a real
1122-glyph. A JFM (or a metric used in \LuaTeX-ja) and a real font used for
1123-it may have different height or depth. In that case, it may look better
1124-if the real glyph is shifted vertically to match the height-depth ratio
1125-specified in the metric, while any vertical adjustment except the
1126-adjustment by the |down| value is not performed in the present
1127-implementation of \LuaTeX-ja. This situation is carefully studied by
1128-Otobe~\cite{min10}. The policy on this problem has not been determined
1129-yet; however, we would like to offer several solutions in future
1130-development.
1131-
1132-\section{Conclusion}
1133-We have discussed our \LuaTeX-ja package, which is much affected
1134-by \pTeX. For now, it can be used for experimental purposes; however, there
1135-are many refinements which are needed for regular use. The author hopes
1136-that this paper and the \LuaTeX-ja project contribute to the typesetting of Japanese,
1137-and possibly other Asian languages, under \LuaTeX.
1138-
1139-\section*{Acknowledgements}
1140-The author would like to thank Ken Nakano and Hideaki Togashi for their
1141-development of ASCII \pTeX. The author is very grateful to Haruhiko
1142-Okumura for his leadership in the Japanese \TeX\ community. The author
1143-is also very grateful to members of \LuaTeX-ja project team for their
1144-valuable cooperation in development.
1145-
1146-%%% The style of the bibliography is `amsplain'.
1147-\providecommand{\bysame}{\leavevmode\hbox to3em{\hrulefill}\thinspace}
1148-\providecommand{\href}[2]{#2}
1149-\begin{thebibliography}{99}
1150-
1151-\bibitem{aj16}
1152-Adobe Systems Incorporated, \emph{Adobe-Japan1-6 Character Collection
1153- for CID-Keyed Fonts}, Technical Note~\#5078, 2004.
1154-\url{http://partners.adobe.com/public/developer/en/font/5078.Adobe-Japan1-6.pdf}
1155-
1156-\bibitem{ptex}
1157-ASCII MEDIA WORKS,アスキー日本語\TeX\ (\pTeX).\url{http://ascii.asciimw.jp/pb/ptex/}
1158-
1159-\bibitem{apl}
1160-John Baker, \emph{Typesetting UTF8 APL code with the \LaTeX\ lstlisting package}.
1161-\url{http://bakerjd99.wordpress.com/2011/08/15/}
1162-
1163-\bibitem{omega}
1164-Jin-Hwan~Cho and Haruhiko Okumura, \emph{Typesetting CJK Languages with Omega},
1165-\TeX, XML, and Digital Typography, Lecture Notes in Computer Science, vol.~3130,
1166-Springer, 2004, 139--148.
1167-
1168-\bibitem{joylua}
1169-Yannis Haralambous. \emph{The Joy of \LuaTeX}. \url{http://luatex.bluwiki.com/}
1170-
1171-\bibitem{jisx4051}
1172-Japanese Industrial Standards Committee. \emph{JIS~X~4051: Formatting
1173- rules for Japanese documents}, 1993, 1995, 2004.
1174-
1175-\bibitem{eptex}
1176-北川弘典,$\varepsilon$-\pTeX についてのwiki.
1177-\url{http://sourceforge.jp/projects/eptex/wiki/FrontPage}
1178-
1179-\bibitem{luaums}
1180-北川弘典,\LuaTeX で日本語.
1181-\url{http://oku.edu.mie-u.ac.jp/tex/mod/forum/discuss.php?d=378}
1182-
1183-\bibitem{luatexref}
1184-\LuaTeX\ development team, \emph{The \LuaTeX\ reference}.
1185-\url{http://www.luatex.org/svn/trunk/manual/luatexref-t.pdf} (snapshot of SVN trunk)
1186-
1187-\bibitem{man}
1188-\LuaTeX-ja project team, \emph{The \LuaTeX-ja package}.
1189-Not completed for now. Available at |doc/man-en.pdf| (in English) or
1190- |doc/man-ja.pdf| (in Japanese)
1191-in the Git repository.
1192-
1193-\bibitem{luajp-test}
1194-香田温人,\LuaTeX と日本語.
1195-\url{http://www1.pm.tokushima-u.ac.jp/~kohda/tex/luatex-old.html}
1196-
1197-\bibitem{luajalayout}
1198-前田一貴,luajalayout パッケージ---Lua\LaTeX によ
1199- る日本語組版---.
1200-\url{http://www-is.amp.i.kyoto-u.ac.jp/lab/kmaeda/lualatex/luajalayout/}
1201-
1202-\bibitem{jsclasses}
1203-奥村晴彦,p\LaTeXe 新ドキュメントクラス.
1204-\url{http://oku.edu.mie-u.ac.jp/~okumura/jsclasses/}
1205-
1206-\bibitem{ptexjp}
1207-Haruhiko Okumura, \emph{\pTeX\ and Japanese Typesetting},
1208- The Asian Journal of \TeX\ \textbf{2}~(2008), 43--51.
1209-
1210-\bibitem{min10}
1211-乙部厳己,min10フォントについて.
1212-\url{http://argent.shinshu-u.ac.jp/~otobe/tex/files/min10.pdf}
1213-
1214-\bibitem{otf}
1215-齋藤修三郎,Open Type Font用VF.
1216-\url{http://psitau.kitunebi.com/otf.html}
1217-
1218-\bibitem{stack-mail}
1219-Jonathan Sauer, \emph{[Dev-luatex] tex.currentgrouplevel}.
1220-\url{http://www.ntg.nl/pipermail/dev-luatex/2008-August/001765.html}
1221-
1222-\bibitem{uptex}
1223-Takuji Tanaka, \emph{u\pTeX, up\LaTeX---unicode version of \pTeX, p\LaTeX}.
1224-\url{http://homepage3.nifty.com/ttk/comp/tex/uptex_en.html}
1225-
1226-\bibitem{ptexenc}
1227-Nobuyuki Tsuchimura, \emph{Development of a Japanese \TeX\ Distribution~`ptetex3'},
1228-Computer Software\ \textbf{24} (2007), no.~4, 40--50, (in Japanese).
1229-
1230-\bibitem{w3c}
1231-W3C Working Group, \emph{Requirements for Japanese Text Layout}.
1232-\url{http://www.w3.org/TR/jlreq/}
1233-\end{thebibliography}
1234-
1235-\end{document}
1+%#!lualatex ajt-devel-ltja
2+\documentclass{ajt}
3+
4+%%% Packages used in this paper
5+
6+%%% Font setting for \LuaTeX; this is extract from ajt.cls
7+\makeatletter
8+ \if@print
9+ \RequirePackage{fontspec,xunicode}
10+ \RequirePackage{luatextra}
11+ \setmainfont[Mapping=tex-text]{Palatino LT Std}
12+ \setsansfont[Mapping=tex-text]{Optima LT Std}
13+ \else
14+ \RequirePackage{fontspec,luatextra}
15+ \setmainfont[Mapping=tex-text]{TeX Gyre Pagella} % \simeq Palatino
16+ \fi
17+
18+%%% LuaTeX-ja
19+\usepackage{luatexja,luatexja-fontspec}
20+\ltjsetparameter{jacharrange={-3,-8}}
21+\DeclareFontShape{JY3}{mc}{m}{n}{<-> s*[0.92489] file:ipam.ttf:jfm=ujis}{}
22+\DeclareFontShape{JY3}{gt}{m}{n}{<-> s*[0.92489] file:ipag.ttf:jfm=ujis}{}
23+% quick hack: monospaced Japanese font by \ttfamily
24+\DeclareKanjiFamily{JY3}{\ttdefault}{}{}
25+\DeclareFontShape{JY3}{\ttdefault}{m}{n}{<-> s*[0.92489] file:ipag.ttf:jfm=mono}{}
26+
27+
28+%%% LTXexample environment
29+\usepackage{showexpl,lltjlisting}
30+\lstset{basicstyle=\ttfamily\small, width=0.3\textwidth, basewidth=.5em}
31+
32+%%% Verbatim environment
33+\usepackage{fancyvrb}
34+\CustomVerbatimEnvironment{code}{Verbatim}%
35+{numbers=left,xleftmargin=1.5em,baselinestretch=1.069,fontsize=\small}
36+\CustomVerbatimEnvironment{codewithoutnum}{Verbatim}%
37+{xleftmargin=1.5em,baselinestretch=1.069,fontsize=\small}
38+\CustomVerbatimEnvironment{codewithoutnumsmall}{Verbatim}%
39+{xleftmargin=1.5em,baselinestretch=1.0,fontsize=\footnotesize}
40+\DefineShortVerb{\|}
41+
42+%%% Others
43+\usepackage{mflogo,booktabs}
44+\definecolor{grayx}{gray}{0.85}
45+\hyphenation{
46+ kanjiskip
47+ xkanjiskip
48+}
49+
50+%%% Mandatory article metadata %%%
51+\title{Development of \LuaTeX-ja package}
52+\author[北川 弘典]{Hironori Kitagawa}
53+\address{\LuaTeX-ja project team}
54+\email{h\_kitagawa2001@yahoo.co.jp}
55+
56+\keywords{\TeX, p\TeX, \LuaTeX, \LuaTeX-ja, Japanese}
57+\abstract{%
58+The \LuaTeX-ja package is a macro package for typesetting Japanese
59+documents under \LuaTeX. The package has more flexibility of
60+typesetting than \pTeX, which is a widely used Japanese extension of \TeX,
61+and has corrected some unwanted features of \pTeX.
62+In this paper, we describe specifications, the current status and some
63+internal processing methods of \LuaTeX-ja.
64+}
65+
66+\newcommand{\parname}[1]{\textsf{#1}}
67+\newcommand{\jstrut}{\vrule width0pt height\cht depth\cdp}
68+\newcommand{\imagfm}[1]{\ifvmode\leavevmode\fi%
69+ \hbox{\fboxsep=0pt\fbox{\setbox0=\hbox{#1}\copy0\kern-\wd0
70+ \smash{\vrule width \wd0 height 0.4pt depth0.4pt}}}}
71+\begin{document}
72+
73+%%% Do not forget to start with \maketitle!
74+\maketitle
75+
76+\section{Introduction}
77+\subsection{History}
78+To typeset Japanese documents with \TeX, ASCII \pTeX~\cite{ptex} has
79+been widely used in Japan. There are other methods---for example, using
80+Omega and OTP~\cite{omega}, or with the CJK package---to do so, however,
81+these alternative methods did not become mainstream. The author thinks
82+that this is because \pTeX\ enables us to produce high-quality documents
83+(e.g.,~supporting vertical typesetting), and because \pTeX\ appeared
84+earlier than the alternatives described above.
85+
86+However, \pTeX\ has been left behind by the extensions of \TeX\ such
87+as \eTeX\ and \pdfTeX, and the diffusion of UTF-8 encoding. In recent
88+years, the situation has become better, by development of
89+|ptexenc|~\cite{ptexenc} by Nobuyuki Tsuchimura (\hbox{土村展之}),
90+$\varepsilon$-\pTeX~\cite{eptex} by the author,~and u\pTeX~\cite{uptex}
91+by Takuji Tanaka (田中琢爾). However, continuing this approach, namely,
92+to develop an engine extension localized for Japanese, is not wise. This
93+approach needs lots of work for \emph{each} engine. In addition, if we
94+use \LuaTeX, the necessity of an engine extension is getting smaller
95+because \LuaTeX\ has an ability to hook \TeX's internal process by using
96+Lua callbacks.
97+
98+
99+There were several experimental attempts to typeset
100+Japanese documents with \LuaTeX\ before. Here we cite three examples:
101+\begin{itemize}
102+\item |luaums.sty|~\cite{luaums} developed by the author. This
103+ experimental package is for creating a certain Japanese-based presentation
104+ with \LuaTeX.
105+\item the \emph{luajalayout} package~\cite{luajalayout}, formerly known as the
106+ \emph{jafontspec} package, by Kazuki Maeda (前田一貴). This package is based on
107+ \LaTeXe\ and \emph{fontspec} package.
108+\item the \emph{luajp-test} package~\cite{luajp-test}, a test package made by
109+ Atsuhito Kohda (香田温人), based on articles on the web page~\cite{joylua}.
110+\end{itemize}
111+However, these packages are based on \LaTeXe, and do not have much
112+ability to control the typesetting rule. And it is inefficient for more
113+than one person to develop similar packages separately. Development of the
114+\LuaTeX-ja package was started initially by the author and Kazuki Maeda, because of
115+these situations.
116+
117+\subsection{Development policy of \LuaTeX-ja}
118+\label{ssec-pol}
119+The first aim of \LuaTeX-ja project was to implement features (from the
120+`primitive' level) of \pTeX\ as macros under \LuaTeX, therefore \LuaTeX-ja is
121+much affected by \pTeX. However, as development proceeded, some
122+technical/conceptual difficulties arose. Hence we changed the aim
123+of the project as follows:
124+\begin{itemize}
125+\item\emph{\LuaTeX-ja offers at least the same flexibility of
126+ typesetting that p\TeX\ has.}
127+
128+ We are not satisfied with the ability to produce output conforming to
129+ JIS~X~4051~\cite{jisx4051}, the Japanese Industrial Standard for
130+ typesetting, or to a technical note~\cite{w3c} by W3C;
131+ if one wants to produce very incoherent outputs for some reason, it
132+ should be possible.
133+In this point, previous attempts of Japanese typesetting with \LuaTeX\
134+ which we cited in the previous subsection are inadequate.
135+
136+\pTeX\ has some flexibility of typesetting, by changing internal
137+ parameters such as |\kanjiskip| or |\prebreakpenalty|, and by using
138+ custom JFM (Japanese TFM). Therefore we decided to include this
139+ functionality in \LuaTeX-ja.
140+
141+\item\emph{\LuaTeX-ja is not a mere re-implementation or port of \pTeX;
142+ some (technically and/or conceptually) inconvenient features of
143+ \pTeX\ are modified.}
144+
145+ We describe this point in more detail in the next section.
146+\end{itemize}
147+
148+
149+\subsection{Overview of the processes}
150+\label{ssec-over}
151+We outline \LuaTeX-ja's processing, in order of execution.
152+
153+\begin{itemize}
154+\item In the |process_input_buffer| callback: treatment of breaking
155+ lines after a Japanese character (in Subsection~\ref{ssec-line}).
156+
157+\item In the |hyphenate| callback: font replacement.
158+
159+\LuaTeX-ja examines each \textit{glyph\_node}~$p$ in the horizontal list. If
160+ the character represented by $p$ is considered a Japanese
161+ character, the font used at $p$ is replaced by the value of
162+ |\ltj@curjfnt|, an attribute for `the current Japanese font'
163+ at~$p$.
164+
165+Furthermore, the subtype of $p$ is decreased by 1 to suppress
166+ hyphenation around $p$ by \LuaTeX, because later processes of
167+ \LuaTeX-ja handle everything concerning Japanese characters.
168+
169+\item In |pre_linebreak_filter| and |hpack_filter| callbacks:
170+
171+\begin{enumerate}
172+\item \LuaTeX-ja has its own stack system, and the current horizontal
173+ list is traversed in this stage to determine what the level of
174+ \LuaTeX-ja's internal stack at the end of the list is. We will
175+ discuss it in Subsection~\ref{ssec-stack}.
176+
177+\item In this stage, \LuaTeX-ja inserts glues/kerns for Japanese
178+ typesetting in the list. This is the core routine of \LuaTeX-ja.
179+ We will discuss it in Subsections
180+ \ref{ssec-jglue}~and~\ref{ssec-jspec}.
181+
182+\item To reconcile a metric with a real font, the position of
183+ (Japanese) glyphs is sometimes adjusted.
184+ We will discuss it in Subsection~\ref{ssec-width}.
185+\end{enumerate}
186+\item In the |mlist_to_hlist| callback: treatment of Japanese characters
187+ in math formulas. This stage is similar to the adjustment of the
188+ position of glyphs (see above), so we omit its description
189+ from this paper.
190+\end{itemize}
191+
192+In this paper, an \emph{alphabetic character} means a non-Japanese
193+character. Similarly, we use the term \emph{alphabetic font} as the
194+counterpart of a Japanese font.
195+
196+\subsection{Contents of this paper}
197+Here we briefly describe the contents of the rest of this paper. In
198+Section~\ref{sec:differences_with_ptex}, we describe major differences
199+between \pTeX\ and \LuaTeX-ja. The next section,
200+Section~\ref{sec:distinction_of_characters}, concentrates on the
201+problem of how to distinguish between Japanese characters and alphabetic
202+characters. In Section~\ref{sec:current_status}, we show the current
203+development status of the package. Finally, in
204+Section~\ref{sec:implementation}, we describe some internal routines of
205+\LuaTeX-ja.
206+
207+\subsection{General information of the project}
208+The \LuaTeX-ja project is hosted by SourceForge.jp. The official wiki
209+is located at
210+\url{http://sourceforge.jp/projects/luatex-ja/wiki/}. As of October 22, 2011,
211+there is no stable version; however, the development sources can be
212+obtained from the git repository. Members of the project team are as follows
213+(in random order): Hironori Kitagawa, Kazuki Maeda, Takayuki Yato,
214+Yusuke Kuroki, Noriyuki Abe, Munehiro Yamamoto, Tomoaki Honda,
215+and~Shuzaburo Saito.
216+
217+
218+\section{Major differences with \pTeX}
219+\label{sec:differences_with_ptex}
220+In this section, we explain several major differences between \pTeX\
221+and our \LuaTeX-ja. For general information on Japanese typesetting and an
222+overview of \pTeX, please see Okumura~\cite{ptexjp}.
223+
224+
225+\subsection{Names of control sequences}
226+\label{ssec-csname} Because \pTeX\ is an engine modification of Knuth's
227+original \TeX82 engine, some of the additional primitives take a form that is
228+very difficult to simulate with a macro. For example, an additional
229+primitive |\prebreakpenalty|$\langle\hbox{\it
230+char\_code}\rangle$|[=]|$\langle\hbox{\it penalty}\rangle$ in \pTeX\
231+sets the amount of penalty inserted before a character whose code is
232+$\langle\hbox{\it char\_code}\rangle$ to $\langle\hbox{\it
233+penalty}\rangle$, and this form |\prebreakpenalty|$\langle\hbox{\it
234+char\_code}\rangle$ can also be used for retrieving the value.
235+
236+Moreover, for some internal parameters of \pTeX, the value at the end of a
237+horizontal box or of a paragraph is the one that is valid for the whole box or
238+paragraph. However, the implementation of these parameters in
239+\LuaTeX-ja is not so easy; we will discuss it in Subsection~\ref{ssec-stack}.
240+
241+Because of the two problems discussed above, the assignment and retrieval
242+of most parameters in \LuaTeX-ja are consolidated into the following
243+three control sequences:
244+\begin{itemize}
245+\item |\ltjsetparameter{|$\langle\hbox{\it
246+ name}\rangle$|=|$\langle\hbox{\it value}\rangle$|,...}|: for local
247+ assignment.
248+\item |\ltjglobalsetparameter|: for global assignment. Note that these two control
249+ sequences obey the value of the |\globaldefs| primitive.
250+\item |\ltjgetparameter{|$\langle\hbox{\it
251+ name}\rangle$|}[{|$\langle\hbox{\it optional
252+ argument}\rangle$|}]|: for retrieval. The returned value is always
253+ a string.
254+\end{itemize}
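+
+As a minimal sketch of how these three control sequences might be used
+(the glue values below are illustrative assumptions, not recommended
+settings, and we assume that \textsf{kanjiskip} and \textsf{xkanjiskip}
+are accepted as parameter names):
+\begin{verbatim}
+% local assignment; several name=value pairs may be listed
+\ltjsetparameter{kanjiskip=0pt plus 0.4pt minus 0.4pt}
+\ltjsetparameter{xkanjiskip=0.25\zw plus 1pt minus 1pt}
+% global assignment, with the same name=value syntax
+\ltjglobalsetparameter{kanjiskip=0pt plus 0.4pt minus 0.4pt}
+% retrieval; the result is a string
+\ltjgetparameter{kanjiskip}
+\end{verbatim}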
255+
256+\subsection{Line-break after a Japanese character}
257+\label{ssec-line}
258+
259+Japanese texts can break lines almost everywhere, whereas
260+alphabetic texts can break lines only between words (or by
261+hyphenation). Hence, \pTeX's input processor is modified so that a
262+line-break after a Japanese character doesn't emit a space. However,
263+there is no way to customize the input processor of \LuaTeX, other than
264+to hack its CWEB source. All a macro package can do is to modify an input line
265+before \LuaTeX\ begins to process it, inside the |process_input_buffer|
266+callback.
267+
268+Hence, in \LuaTeX-ja, a comment letter (we reserve U+FFFFF for this
269+purpose) is appended to an input line if the line ends with a Japanese
270+character.\footnote{Strictly speaking, it also requires that the catcode
271+of the end-line character is 5~(\emph{end-of-line}). This condition is
272+useful under the verbatim environment.} One might jump to the conclusion
273+that the treatment of a line-break by \pTeX\ and that by \LuaTeX-ja are
274+exactly the same; however, they differ in that \LuaTeX-ja's
275+decision on whether to append a comment letter to the line is made
276+\emph{before} the line is actually processed by \LuaTeX.
277+
278+Figure~\ref{fig-linebreak} shows an example of this situation; the
279+|\ltjsetparameter| command in the example marks most Japanese characters as
280+`non-Japanese characters'. In other words, from that command onward, the
281+letter `あ' will be treated as an alphabetic character by
282+\LuaTeX-ja. One would then naturally expect a space between `あ' and `y' in
283+the output, but the actual output in the figure has none. This is
284+because `あ' is still considered a Japanese character by \LuaTeX-ja
285+when it decides whether U+FFFFF will be appended to
286+input line~2.
287+
288+\begin{figure}
289+\begin{LTXexample}
290+\font\x=IPAMincho \x
291+\ltjsetparameter{jacharrange={-6}}xあ
292+y
293+\end{LTXexample}
294+\caption{A notable sample showing the treatment of a line-break after a
295+Japanese character.}\label{fig-linebreak}
296+\end{figure}
297+
298+\subsection{Separation between `real' fonts and metrics}
299+\label{ssec-sepmet}
300+
301+Traditionally, most Japanese fonts used in typesetting are not
302+proportional, that is, most glyphs have the same size (in most cases,
303+square-shaped). Hence, it is not rare that the contents of different
304+JFMs are essentially the same, and differ only in their names. For example,
305+|min10.tfm| and |goth10.tfm|, which are JFMs shipped with \pTeX\ for the
306+seriffed \emph{mincho} family and the sans-serif \emph{gothic} family,
307+differ only in their |FAMILY| and |FACE|. Moreover, |jis.tfm| and
308+|jisg.tfm|, which are included in the \emph{jis} font metric
309+used in \emph{jsclasses}~\cite{jsclasses} by Haruhiko Okumura (奥村晴彦),
310+are identical as binary files. Considering this situation, we
311+decided to separate `real' fonts from the metrics used for them in
312+\LuaTeX-ja. Typical declarations of Japanese fonts in the style of plain
313+\TeX\ are shown in Figure~\ref{fig-jfdef}. We would like to add several
314+remarks:
315+\begin{itemize}
316+\item The control sequence |\jfont| must be used for Japanese fonts, instead of |\font|.
317+\item \LuaTeX-ja automatically loads the \emph{luaotfload} package, so the
318+ \hbox{\tt file:} and \hbox{\tt name:} prefixes, and various font features, can be
319+ used as in the first line of Figure~\ref{fig-jfdef}.
320+\item The |jfm| key specifies the metric for the font. In
321+ Figure~\ref{fig-jfdef}, both fonts will use a metric stored in a
322+ Lua script named |jfm-ujis.lua|. This metric is the standard
323+ metric in \LuaTeX-ja, and is based on JFMs used in the \emph{otf}
324+ package~\cite{otf}.
325+\item The \hbox{\tt psft:} prefix can be used to specify name-only, non-embedded
326+ fonts. When one displays a PDF with these fonts, the actual fonts
327+ used for them depend on the PDF reader.
328+\end{itemize}
329+The specification of a metric for \LuaTeX-ja is similar to that of a JFM
330+(see \cite{ptexjp}); characters are grouped into several classes, the
331+size information of characters is specified for each class, and
332+glue/kern insertions are specified for each pair of classes. Although
333+the author has not tried it, it may be possible to develop a program that
334+`converts' a JFM to a metric for \LuaTeX-ja. \LuaTeX-ja offers three
335+metrics by default: |jfm-ujis.lua|, |jfm-jis.lua| based on the
336+\emph{jis} font metric, and |jfm-min.lua| based on the old |min10.tfm|.
337+
338+ Note that |-kern| in the font features
339+is important, because kerning information from the real font itself would
340+clash with glue/kern information from the metric.
341+
342+\begin{figure}
343+\begin{verbatim}
344+\jfont\foo=file:ipam.ttf:jfm=ujis;script=latn;-kern;+jp04 at 12pt
345+\jfont\bar=psft:Ryumin-Light:jfm=ujis at 10pt
346+\end{verbatim}
347+\caption{Typical declarations of Japanese fonts.}
348+\label{fig-jfdef}
349+\end{figure}
350+
351+\subsection{Insertion of glues/kerns for Japanese typesetting: timing}
352+\label{ssec-jglue}
353+
354+As described in \cite{luatexref}, \LuaTeX's kerning and ligaturing
355+processes are totally different from those of \TeX82. \TeX82's process is
356+done just when a (sequence of) character is appended to the current
357+list. Thus we can interrupt this process by writing
358+|f{}irm|. However, \LuaTeX's process is \emph{node-based}, that is, the
359+process is done when a horizontal box or a paragraph ends, so
360+|f{}irm| and |firm| yield the same output under \LuaTeX.
361+
362+The situation for Japanese characters is more complicated.
363+Glues (and kerns) which are needed for Japanese
364+typesetting are divided into the following three categories:
365+\begin{itemize}
366+\item Glue (or kern) from the metric of Japanese fonts (\emph{JFM glue},
367+ for short).
368+
369+\item Default glue between a Japanese character and an alphabetic
370+ character (\emph{xkanjiskip}, for short), usually 1/4 of
371+ full-width (\emph{shibuaki}) with some stretch and shrink for
372+ justifying each line.
373+\item Default glue between two consecutive Japanese characters
374+ (\emph{kanjiskip}, for short). The main reason for this glue is to
375+ enable breaking lines almost everywhere in Japanese texts. In most
376+ cases, its natural width is zero, with some stretch/shrink for
377+ justifying each line.
378+\end{itemize}
379+In \pTeX, these three kinds of glues are treated differently. A JFM glue
380+is inserted when a (sequence of) Japanese character is appended to the
381+current list, as in the case of alphabetic characters in \TeX82. This
382+means that one can interrupt the insertion process by saying |{}|. An
383+\emph{xkanjiskip} is inserted just before `hpack' or line-breaking of a
384+paragraph; this timing is somewhat similar to that of \LuaTeX's kerning
385+process. Finally, a \emph{kanjiskip} does not appear as a node anywhere;
386+it appears only implicitly in the calculation of the width of a horizontal box,
387+in line breaking, and in the actual output process to a DVI
388+file. These specifications have made \pTeX's behavior very hard to
389+understand.
390+
391+\LuaTeX-ja inserts glues in all three categories simultaneously inside the
392+|hpack_filter| and |pre_linebreak_filter| callbacks. The reasons for
393+this specification are to make Japanese characters behave like alphabetic characters in \LuaTeX\
394+(as described in the first paragraph of this subsection), and to clarify
395+the specification of \LuaTeX-ja's process.
396+
397+\subsection{Insertion of glues/kerns for Japanese typesetting: specification}
398+\label{ssec-jspec}
399+
400+\begin{table}
401+\caption{Examples of differences between \pTeX\ and \LuaTeX-ja.}
402+\label{tab-jfmglue}
403+\begin{center}
404+\begin{tabular}{llllllll}
405+\toprule
406+&\multicolumn{1}{c}{(1)}&\multicolumn{1}{c}{(2)}&\multicolumn{1}{c}{(3)}&\multicolumn{1}{c}{(4)}\\
407+Input &|あ】{}【〕\/〔| &|い』\/a| &|う)\hbox{}(| &|え]\special{}[|\\\midrule
408+\pTeX &あ】\hbox{}【〕\hbox{}〔&い』\/a &う)\hbox{}( &え]\hbox{}[\\
409+\LuaTeX-ja &あ】{}【〕\/〔 &い』\/a &う)\hbox{}( &え]\special{}[\\
410+\bottomrule
411+\end{tabular}
412+\end{center}
413+\end{table}
414+
415+\begin{figure}
416+\begin{center}
417+\fontsize{40}{40}\selectfont
418+\imagfm{\jstrut あ}%
419+\imagfm{\jstrut 】\inhibitglue}%
420+\imagfm{\jstrut\kern.5\zw}%
421+\imagfm{\jstrut\kern.5\zw}%
422+\imagfm{\jstrut\inhibitglue【}%
423+\imagfm{\jstrut 〕\inhibitglue}%
424+\imagfm{\jstrut\kern.5\zw}%
425+\imagfm{\jstrut\kern.5\zw}%
426+\imagfm{\jstrut\inhibitglue〔}%
427+\end{center}
428+\caption{Detail of the output of \pTeX\ in the input~(1) in Table~\ref{tab-jfmglue}.}
429+\label{fig-ptexjfm}
430+\end{figure}
431+
432+Now we will take a look at the insertion process itself through four points.
433+
434+\begin{description}
435+\item[Ignored nodes]
436+As noted in the previous subsection, the insertion process in \pTeX\ can
437+ be interrupted by saying |{}| or anything else.\footnote{This
438+ is why some tricks like \texttt{ちょ\char`\{\char`\}っと} for
439+ \texttt{min10.tfm} and other `old' JFMs work.} This leads to the
440+ second row in Table~\ref{tab-jfmglue}, or
441+ Figure~\ref{fig-ptexjfm}. Here `the process is interrupted'
442+ means that \pTeX\ does not think the letter `】\inhibitglue'
443+ is followed by `\inhibitglue【', hence two half-width glues
444+ are inserted between `】\inhibitglue' and `\inhibitglue【',
445+ where the left one is from `】\inhibitglue' and the right one
446+ is from `\inhibitglue【'.
447+
448+ On the other hand, in \LuaTeX-ja, the process is done inside the
449+ |hpack_filter| and |pre_linebreak_filter| callbacks. Hence,
450+ \emph{anything that does not make any node will be
451+ ignored}\ in \LuaTeX-ja, as shown in (1) in
452+ Table~\ref{tab-jfmglue}. \LuaTeX-ja also ignores any node
453+ which does not make any contribution to the current horizontal
454+ list (\emph{ins\_node}, \emph{adjust\_node},
455+ \emph{mark\_node}, \emph{whatsit\_node} and
456+ \emph{penalty\_node}), as shown in (4).
457+
458+
459+By the way, around a \emph{glyph\_node} $p$ there may be some nodes
460+ attached to~$p$. These are an accent and kerns for
461+ moving it to the right place, and a kern from the italic
462+ correction\footnote{\TeX82 (and \LuaTeX) does not distinguish
463+ between an explicit kern and a kern for italic correction. To
464+ distinguish them, an additional subtype for a kern is introduced
465+ in \pTeX. On the other hand, \LuaTeX-ja uses an additional attribute and
466+ redefines \texttt{\char`\\/} to set this attribute.} for $p$. It is natural that
467+ these attachments should be ignored inside the process. Hence
468+ \LuaTeX-ja takes this approach, as does the latest version of
469+ \pTeX\ (version~p3.2). This explains (2) in Table~\ref{tab-jfmglue}.
470+
471+To summarize, one should put an empty horizontal box |\hbox{}|
472+ where one wants to interrupt the insertion process in
473+ \LuaTeX-ja, as in (3) in Table~\ref{tab-jfmglue}.
474+
475+\item[Fonts with the same metric]
476+Recall that \LuaTeX-ja separates `real' fonts and metrics, as in Subsection~\ref{ssec-sepmet}.
477+Consider the following input, where all Japanese fonts use the same metric
478+ (in \LuaTeX-ja), and |\gt| selects the \emph{gothic} family as
479+ the current Japanese font family:
480+\begin{quote}
481+\begin{verbatim}
482+明朝)\gt (ゴシック
483+\end{verbatim}
484+\end{quote}
485+If the above input is processed by \pTeX, then, because the insertion process is
486+ interrupted by |\gt|, the result looks like
487+\begin{quote}
488+\mc 明朝)\hbox{}\gt (ゴシック
489+\end{quote}
490+However, this seems unnatural, since two Japanese fonts in the
491+ output use the same metric, i.e.,~the same
492+ typesetting rule. Hence, we decided that Japanese fonts with
493+ the same metric are treated as one font in the insertion
494+ process of \LuaTeX-ja. Thus, the output from the above input
495+ in \LuaTeX-ja looks like:
496+\begin{quote}
497+\mc 明朝)\gt (ゴシック
498+\end{quote}
499+There may be situations in which this default behavior is not
500+ suitable. \LuaTeX-ja offers a way to handle such situations, but
501+ we leave it to the manual~\cite{man}.
502+
503+\item[Fonts with different metrics]
504+The case where two consecutive Japanese characters use different metrics and/or
505+ different sizes is similar. Consider the following input where
506+ the \emph{mincho} family and the \emph{gothic} family use
507+ different metrics:
508+\begin{quote}
509+\begin{verbatim}
510+漢)\gt (漢)\large (大
511+\end{verbatim}
512+\end{quote}
513+As in the previous item, \pTeX\ yields the following from this input:
514+\begin{quote}
515+\mc 漢)\hbox{}\gt (漢)\hbox{}\large (大
516+\end{quote}
517+We thought that the spaces between parentheses in the above output
518+ are too wide. Hence we have changed the default behavior of
519+ \LuaTeX-ja, so that the amount of glue between two Japanese
520+ characters with different metrics is the \emph{average} of the glue
521+ from the left character and that from the right
522+ character. For example, Figure~\ref{fig-diffmet} shows the
523+ output from the above input. The width of the glue marked `(1)' is
524+ $(a/2 + a/2)/2 = 0.5a$, and the width of the glue marked `(2)'
525+ is $(a/2 + 1.2a/2)/2 = 0.55a$. This default behavior can be
526+ changed by the \textsf{diffrentmet} parameter of \LuaTeX-ja.
527+
528+\begin{figure}
529+\begin{center}
530+\fontsize{40}{40}\selectfont
531+\imagfm{\jstrut\smash{%
532+ \vtop{\lineskiplimit=\maxdimen\lineskip2pt\halign{#\cr漢\cr
533+ \small\vrule height .5ex depth .5ex\hrulefill\ \lower.5ex\hbox{$a$}\
534+ \hrulefill\vrule height .5ex depth .5ex\cr}}}}%
535+\imagfm{\jstrut )\inhibitglue}%
536+\hbox to .5\zw{\hss\normalsize (1)\hss}%
537+\imagfm{\jstrut\inhibitglue\gt (}%
538+\imagfm{\jstrut\gt 漢}%
539+\imagfm{\jstrut\gt )\inhibitglue}%
540+\hbox to .55\zw{\hss\normalsize (2)\hss}%
541+\imagfm{\fontsize{48}{48}\selectfont\jstrut\gt\inhibitglue (}%
542+\imagfm{\fontsize{48}{48}\selectfont\jstrut\smash{%
543+ \vtop{\lineskiplimit=\maxdimen\lineskip2pt\halign{#\cr\gt 大\cr
544+ \small\vrule height .5ex depth .5ex\hrulefill\ \lower.5ex\hbox{$1.2a$}\
545+ \hrulefill\vrule height .5ex depth .5ex\cr}}}}
546+\end{center}
547+\caption{Fonts with different metrics.}
548+\label{fig-diffmet}
549+\end{figure}
550+
551+\item[\emph{kanjiskip} and \emph{xkanjiskip}]
552+In \pTeX, the value of \emph{xkanjiskip} is controlled by a skip named
553+ |\xkanjiskip|. A well-known defect of this implementation is
554+ that the value of \emph{xkanjiskip} is not connected with the
555+ size of the current Japanese font. It seems that the |EXTRASPACE|,
556+ |EXTRASTRETCH|, and |EXTRASHRINK| parameters in a JFM are
557+ reserved for specifying the default value of
558+ \emph{xkanjiskip} in units of the design size, but \pTeX\
559+ does not actually use these parameters.
560+
561+Considering this situation of \pTeX, \LuaTeX-ja can use the value of
562+ \emph{xkanjiskip} that is specified in a metric. If the value of
563+ \emph{xkanjiskip} on the user side (this is the value of the
564+ \textsf{xkanjiskip} parameter of |\ltjsetparameter|) is
565+ |\maxdimen|, then \LuaTeX-ja uses the specification from
566+ the metric currently in use as the actual value of
567+ \emph{xkanjiskip}. This description also applies to \emph{kanjiskip}.
568+\end{description}
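+
+As a minimal sketch of the last item (the explicit glue value is only an
+illustrative assumption, and we assume that \textsf{kanjiskip} is accepted
+as a parameter name in the same way as \textsf{xkanjiskip}), one might write:
+\begin{verbatim}
+% let the metric currently in use decide both glues
+\ltjsetparameter{kanjiskip=\maxdimen}
+\ltjsetparameter{xkanjiskip=\maxdimen}
+% or override the metric with an explicit user-side value
+\ltjsetparameter{xkanjiskip=0.25\zw plus 1pt minus 1pt}
+\end{verbatim}
+With the first two assignments, a change of the Japanese font size changes
+both glues accordingly, provided that the metric specifies them relative to
+the font size.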
569+
570+\section{Distinction of characters}
571+\label{sec:distinction_of_characters} Since \LuaTeX\ can handle Unicode
572+characters natively, a major problem is how to distinguish
573+Japanese characters from alphabetic characters. For example, the
574+multiplication sign (U+00D7) exists both in ISO-8859-1 (hence in Latin-1
575+Supplement in Unicode) and in the basic Japanese character set
576+JIS~X~0208. It is not desirable that this character is always treated as
577+an alphabetic character, because this symbol is often used in the sense
578+of `negative' in Japan.
579+
580+\subsection{Character ranges}
581+Before we describe the approach taken in \LuaTeX-ja, we review the
582+approach taken by u\pTeX. u\pTeX\ extends the |\kcatcode| primitive in
583+\pTeX, to use this primitive for setting how a character is treated
584+among alphabetic characters~(15), \emph{kanji}~(16), \emph{kana}~(17),
585+\emph{Hangul}~(19), or~\emph{other CJK characters}~(18).
586+The assignment to |\kcatcode| is done per Unicode
587+block.\footnote{There are some exceptions. For example, U+FF00--FFEF
588+(Halfwidth and Fullwidth Forms) are divided into three blocks in recent
589+u\pTeX.}
590+
591+\LuaTeX-ja adopted a different approach. There are many Unicode blocks
592+ in the Basic Multilingual Plane which are not included in
593+ Japanese fonts, therefore it is inconvenient to process by
594+ Unicode block. Furthermore, JIS~X~0208 is not just a union of Unicode
595+ blocks; for example, the intersection of JIS~X~0208 and
596+ Latin-1 Supplement is shown in
597+ Table~\ref{tab-inter}. Considering these two points, to
598+ customize the range of Japanese characters in \LuaTeX-ja, one
599+ has to define ranges of character codes in the source in advance.
600+
601+
602+\begin{table}
603+\caption{Intersection of JIS~X~0208 and Latin-1 Supplement.}
604+\label{tab-inter}
605+\begin{center}
606+\begin{tabular}{llll}
607+\ltjjachar"A7 (U+00A7),&
608+\ltjjachar"A8 (U+00A8),&
609+\ltjjachar"B0 (U+00B0),&
610+\ltjjachar"B1 (U+00B1),\\
611+\ltjjachar"B4 (U+00B4),&
612+\ltjjachar"B6 (U+00B6),&
613+\ltjjachar"D7 (U+00D7),&
614+\ltjjachar"F7 (U+00F7)
615+\end{tabular}
616+\end{center}
617+\end{table}
618+
619+
620+We note that \LuaTeX-ja offers two additional control sequences,
621+ |\ltjjachar| and |\ltjalchar|. They are similar to the |\char|
622+ primitive; however, |\ltjjachar| always yields a Japanese character, provided that
623+ the argument is at least 128, and |\ltjalchar| always
624+ yields an alphabetic character, regardless of the argument.
625+
626+\subsection{Default setting of ranges}
627+Patches for plain \TeX\ and \LaTeXe\ of \LuaTeX-ja predefine 8~character
628+ranges, as shown in Table~\ref{tab-chrrng}. Most of these ranges are
629+just unions of Unicode blocks, and are determined from the Adobe-Japan1-6
630+character collection~\cite{aj16}, and JIS~X~0208. Among these 8~ranges,
631+the ranges~2, 3, 6, 7, and~8 are considered ranges of Japanese
632+characters, and others are considered ranges of alphabetic
633+characters.\footnote{Note that ranges 3~and~8 are considered ranges of
634+alphabetic characters in this paper.} We remark on ranges 2~and~8:
635+\begin{description}
636+\item[The range~2]
637+JIS~X~0208 includes Greek letters and Cyrillic letters; however, these
638+ letters cannot be used for typesetting Greek or Russian, of
639+ course. Hence it is reasonable that Greek and
640+ Cyrillic letters form a separate character range.
641+\item[The range~8]
642+If one wants to use 8-bit TFMs, such as T1 or TS1 encodings, one should
643+ mark this range~8 as a range of alphabetic characters by
644+\begin{quote}
645+|\ltjsetparameter{jacharrange={-8}}|
646+\end{quote}
647+This is because some 8-bit TFMs have a glyph in this range; for example,
648+ the character `\OE' is located at |"D7| in the T1 encoding. %"
649+\end{description}
650+
651+
652+\begin{table}
653+\caption{Predefined ranges in \LuaTeX-ja.}
654+\label{tab-chrrng}
655+\begin{center}
656+\begin{tabular}{@{\bf}rl}
657+1&(Additional) Latin characters which do not belong to the range~8.\\
658+2&Greek and Cyrillic letters.\\
659+3&Punctuation and miscellaneous symbols.\\
660+4&Unicode blocks which do not intersect with Adobe-Japan1-6.\\
661+5&Surrogates and Supplementary Private Use Areas.\\
662+6&Characters used in Japanese typesetting.\\
663+7&Characters possibly used in CJK typesetting, but not in Japanese.\\
664+8&Characters in Table~\ref{tab-inter}.
665+\end{tabular}
666+\end{center}
667+\end{table}
668+
669+\subsection{Control sequences producing Unicode characters}
670+\label{ssec-unichar}
671+
672+The \emph{fontspec} package\footnote{Precisely speaking, it is the
673+\emph{xunicode} package, originally a package for \XeTeX\ and
674+automatically loaded by the \emph{fontspec} package.} offers various
675+control sequences that produce Unicode characters. However, these
676+control sequences as they stand cannot work correctly with the default
677+range setting of \LuaTeX-ja. For example, |\textquotedblleft| is just
678+an abbreviation of |\char"201C\relax|, and the character U+201C (LEFT %"
679+DOUBLE QUOTATION MARK) is treated as a Japanese character, because it
680+belongs to the range~3. This problem is resolved by using |\ltjalchar|
681+instead of the |\char| primitive. This fix is included in an optional package
682+named \texttt{luatexja-\penalty0fontspec.sty}. Figure~\ref{fig-unitxt}
683+shows several ways to typeset a character, both as a Japanese character
684+and as an alphabetic character.
685+
686+\begin{figure}
687+\begin{LTXexample}
688+×, \char`×, % depend on range setting
689+\ltjalchar`×, % alphabetic char
690+\ltjjachar`×, % Japanese char
691+\texttimes % alph. char (by fontspec)
692+\end{LTXexample}
693+\caption{Control sequences producing a Unicode character.}
694+\label{fig-unitxt}
695+\end{figure}
696+
697+The situation looks similar in math formulas, but in fact it differs.
698+Each control sequence that represents an ordinary symbol defined by the
699+\emph{unicode-math} package is just a synonym of a character. For example,
700+the meaning of |\otimes| is just the character U+2297 (CIRCLED TIMES),
701+which is included in the range~3. However, it is difficult to define a
702+control sequence like |\ltjalUmathchar| as a counterpart of
703+|\Umathchar|, since an input like `|\sum^\ltjalUmathchar ...|' has to be
704+permitted.
705+
706+However, we could not develop a satisfactory solution to this problem in
707+time for this paper. We are currently testing the following
708+solution:
709+\begin{itemize}
710+\item \LuaTeX-ja has a list of character codes which will always be treated as
711+ alphabetic characters in math mode. Considering 8-bit TFMs for
712+ math symbols, this list includes natural numbers between |"80| and
713+ |"FF| by default.
714+\item Redefine internal commands defined in the \emph{unicode-math}
715+ package so that
716+codes of characters which are mentioned in the \emph{unicode-math}
717+ package will be included in the list.
718+\end{itemize}
719+
720+
721+We would like to extend the treatments described in this subsection to 8-bit
722+font encodings, but we leave this to further development too.
723+
724+\section{Current status of development}
725+\label{sec:current_status}
726+At the moment, \LuaTeX-ja can be used under plain \TeX\ and under
727+\LaTeXe. Generally speaking, one only has to load |luatexja.sty|, with the
728+|\input| command or |\usepackage| (in~\LaTeXe), if one merely wants to
729+typeset Japanese characters. We now look at each part in more detail.
730+
731+\subsection{`Engine extension'}
732+The lowest part of \LuaTeX-ja corresponds to the \pTeX\ extension as
733+\emph{an engine extension of \TeX}. We, the project members, think that
734+this part is almost done. There is one more feature of \LuaTeX-ja which
735+we are going to explain:
736+
737+\begin{description}
738+\item[Shifting baseline]
739+In order to make a match between Japanese fonts and alphabetic fonts,
740+ sometimes shifting the baseline of alphabetic characters may
741+ be needed. \pTeX\ has a dimension |\ybaselineshift|, which
742+ specifies the amount by which the baseline of alphabetic
743+ characters is shifted down. This is useful for Japanese-based documents, but
744+ not for documents mainly in languages with alphabetic
745+ characters.
746+
747+Hence, \LuaTeX-ja extends \pTeX's |\ybaselineshift| to Japanese
748+ characters. Namely, \LuaTeX-ja offers two parameters,
749+ \textsf{yjabaselineshift} and \textsf{yalbaselineshift}, for the
750+ amount of shifting the baseline of Japanese characters and
751+ that of alphabetic characters, respectively.
752+\begin{figure}
753+\begin{center}
754+\fontsize{40}{40}\selectfont\fboxsep0mm
755+\vrule width 0.9\textwidth height0.4pt depth0.4pt\kern-0.9\textwidth
756+\hbox to 0.9\linewidth{%
757+\hfil
758+\raise-10pt\imagfm{\jstrut 漢}%
759+\raise-10pt\imagfm{\jstrut 字}\hskip.25\zw%
760+\imagfm{p}%
761+\imagfm{h}%
762+\hfil\hfil
763+\imagfm{\jstrut 漢}%
764+\imagfm{\jstrut 字}\hskip.25\zw%
765+\raise-10pt\imagfm{p}%
766+\raise-10pt\imagfm{h}%
767+\hfil
768+}
769+\end{center}
770+
771+\caption{First example of shifting baseline.}
772+\label{fig-bls}
773+\end{figure}
774+
775+\begin{figure}
776+\begin{center}
777+\fontsize{30}{30}\selectfont\fboxsep0mm
778+\vrule width 0.9\textwidth height0.4pt depth0.4pt\kern-0.9\textwidth
779+\hbox to 0.9\linewidth{%
780+\hfil
781+\imagfm{a}%
782+\imagfm{b}\hskip.25\zw%
783+\imagfm{\jstrut 本}%
784+\imagfm{\jstrut 文}\hskip.33333\zw%
785+\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut\inhibitglue (}%
786+\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut 注}%
787+\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut 釈}\hskip.1666667\zw%
788+\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont c}%
789+\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont o}%
790+\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont m}%
791+\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont m}%
792+\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont e}%
793+\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont n}%
794+\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont t}%
795+\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut )\inhibitglue}%
796+\hskip.33333\zw%
797+\imagfm{\jstrut 本}%
798+\imagfm{\jstrut 文}%
799+\hfil
800+}
801+\end{center}
802+
803+\caption{Second example of shifting baseline.}
804+\label{fig-small}
805+\end{figure}
806+
807+An example output is shown in Figure~\ref{fig-bls}. The left half is the
808+ output when \textsf{yjabaselineshift} is positive, hence the
809+ baseline of Japanese characters is shifted down. On the other
810+ hand, the right half is the output when
811+ \textsf{yalbaselineshift} is positive, hence the baseline of
812+ alphabetic characters is shifted down. Figure~\ref{fig-small}
813+ shows an interesting use of these parameters.
814+
815+\end{description}
816+Note that \LuaTeX-ja does not yet support vertical typesetting (\emph{tategaki}).
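
The two baseline parameters described above can be set with |\ltjsetparameter|.
The following is only a minimal sketch, not taken from the \LuaTeX-ja sources:
the amount 10pt is chosen to mirror the |\raise-10pt| used to draw
Figure~\ref{fig-bls}, and we assume that these settings obey \TeX's grouping.
\begin{quote}
\begin{verbatim}
% Left half of the figure: Japanese characters are shifted down.
{\ltjsetparameter{yjabaselineshift=10pt}漢字ph}
% Right half of the figure: alphabetic characters are shifted down.
{\ltjsetparameter{yalbaselineshift=10pt}漢字ph}
\end{verbatim}
\end{quote}
Since a positive value shifts the baseline down, a negative value shifts it
up, which is what the smaller text in Figure~\ref{fig-small} needs.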
817+
818+\subsection{Patches for plain \TeX\ and \LaTeXe}
819+\pTeX\ comes with a patch for plain \TeX, namely |ptex.tex|, a patch for the \LaTeXe\
820+macros (this patch and \LaTeXe\ together constitute \emph{p\LaTeXe}), and
821+|kinsoku.tex|, which contains the default settings for \emph{kinsoku
822+shori}, the Japanese line-breaking rules. We ported them to \LuaTeX-ja, except
823+for the code related to vertical typesetting, because \LuaTeX-ja does not
824+support vertical typesetting yet. We remark on one point related to the
825+porting:
826+\begin{description}
827+
828+\item[Behavior of\/ {\tt\char92fontfamily\/}]
829+The control sequence |\fontfamily| in p\LaTeXe\ changes the current alphabetic
830+ font family and/or the current Japanese font family,
831+ depending on the argument. More concretely,
832+ |\fontfamily{|$\langle\hbox{\it arg\/}\rangle$|}| changes the
833+ current alphabetic font family to $\langle\hbox{\it
834+ arg\/}\rangle$, if and only if one of the following
835+ conditions is satisfied:
836+\begin{itemize}
837+\item An alphabetic font family named $\langle\hbox{\it arg\/}\rangle$ in
838+ \emph{some} alphabetic encoding is already defined in the document.
839+\item There exists an alphabetic encoding $\langle\hbox{\it
840+ enc\/}\rangle$ already defined in the document such that a font
841+ definition file $\langle\hbox{\it enc\/}\rangle\langle\hbox{\it
842+ arg\/}\rangle$|.fd| (all lowercase) exists.
843+\end{itemize}
844+The same criterion is used for changing the Japanese font family.
845+
846+For this behavior to work, a list of all (alphabetic) encodings already
847+ defined in the document is needed. However, since \LuaTeX-ja
848+ is loaded as a package, it cannot obtain this list.
849+ Hence \LuaTeX-ja adopts a different approach, namely
850+ |\fontfamily{|$\langle\hbox{\it arg\/}\rangle$|}| changes the
851+ current alphabetic font family to $\langle\hbox{\it
852+ arg\/}\rangle$, if and only if one of the following conditions is satisfied:
853+\begin{itemize}
854+\item An alphabetic font family named $\langle\hbox{\it arg\/}\rangle$
855+ in the current alphabetic encoding $\langle\hbox{\it
856+ enc\/}\rangle$ is already defined in the document.
857+\item A font definition file $\langle\hbox{\it enc\/}\rangle\langle\hbox{\it
858+ arg\/}\rangle$|.fd| (all lowercase) exists.
859+\end{itemize}
860+
861+
862+\end{description}
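
The criterion adopted by \LuaTeX-ja for alphabetic font families can be
sketched in \LaTeXe\ as follows. This is only an illustration under our own
assumptions, not the code used in \LuaTeX-ja: |\IfAlphFamilyUsable| is a
hypothetical name, and we rely on the fact that NFSS2 records a declared
family in the control sequence named
$\langle\hbox{\it enc\/}\rangle$|+|$\langle\hbox{\it arg\/}\rangle$.
\begin{quote}
\begin{verbatim}
\makeatletter
% #1 = family name, #2 = code if usable, #3 = code otherwise
\newcommand*\IfAlphFamilyUsable[3]{%
  \@ifundefined{\f@encoding+#1}%
    {% not declared yet: look for <enc><arg>.fd, all lowercase
     \edef\@tempa{\lowercase{%
       \noexpand\IfFileExists{\f@encoding#1.fd}}}%
     \@tempa{#2}{#3}}%
    {#2}}% already declared in the current encoding
\makeatother
\end{verbatim}
\end{quote}
For example, |\IfAlphFamilyUsable{ptm}{\typeout{yes}}{\typeout{no}}| reports
whether the family |ptm| would be accepted in the current alphabetic encoding.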
863+
864+
865+
866+\subsection{Classes for Japanese documents}
867+To produce `high-quality' Japanese documents, it is not enough that
868+Japanese characters are placed correctly; we also need class files for
869+Japanese documents. Two major families of classes are widely used in Japan:
870+\emph{jclasses}, which is distributed with the official p\LaTeXe\ macros,
871+and \emph{jsclasses}. At present, \LuaTeX-ja
872+simply contains their counterparts: \emph{ltjclasses} and
873+\emph{ltjsclasses}. However, the policy on classes has not been determined
874+yet, and we hope to have another family of classes that is useful for
875+commercial printing. In the author's opinion, \emph{ltjclasses} should
876+remain as an example of porting class files for \pTeX\ to
877+\LuaTeX-ja.
878+
879+\subsection{Patches for packages}
880+Apart from patches for the \LaTeXe~kernel and classes for Japanese
881+documents, we need to make patches for several packages. At present,
882+we have considered the following packages, and have made patches or ports for
883+the first two.
884+
885+\begin{description}
886+\item[The \emph{fontspec} package] The \emph{fontspec} package is built
887+ on NFSS2, hence control sequences offered by the
888+ \emph{fontspec} package, such as |\setmainfont|, are only
889+ effective for alphabetic fonts if \LuaTeX-ja is loaded.
890+ \texttt{luatexja-\penalty0fontspec.sty} (not automatically
891+ loaded) offers their counterparts for Japanese fonts, with
892+ an additional `j' in the names of the control sequences, such as
893+ |\setmainjfont|. As described in
894+ Subsection~\ref{ssec-unichar}, it also includes a patch for
895+ control sequences producing Unicode characters.
896+
897+\item[The \emph{otf} package]
898+This package is widely used in \pTeX\ for typesetting characters which are
899+not in JIS~X~0208, and for using more than one weight in \emph{mincho}
900+and \emph{gothic} font families. Therefore \LuaTeX-ja supports the features
901+of the \emph{otf} package when \texttt{luatexja-\penalty0otf.sty} is loaded
902+ manually. Note that characters produced by |\UTF{xxxx}| and
903+ |\CID{xxxx}| are not appended to the current list as a
904+ \emph{glyph\_node}, so that they are not processed by the callbacks of the
905+ \emph{luaotfload} package. One more remark: |\CID|
906+ does not work with TrueType fonts, since |\CID| uses the
907+ conversion table between CID and the glyph order of the
908+ current Japanese font.
909+
910+\item[The \emph{listings} package]
911+Users of \pTeX\ know the patch |jlisting.sty| for
912+ the \emph{listings} package, which allows Japanese characters in
913+ the |lstlisting| environment. Generally speaking, it can also
914+ be used in \LuaTeX-ja. However, it seems that a
915+ Japanese character after a space is not processed
916+ by the \emph{listings} package; this is inconvenient when we
917+ use the \emph{showexpl} package.
918+
919+There is another way to use characters above 256 with the
920+ \emph{listings} package (described in~\cite{apl}). However,
921+ this method is not suitable for Japanese, since the number of
922+ Japanese characters is very large. We hope that the
923+ \emph{listings} package will be able to handle all characters above
924+ 256 without any patch, in the future.
925+
926+
927+\end{description}
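
As a usage sketch of the first two items, a minimal document could look as
follows. The class and the font names are only examples (the fonts are assumed
to be installed and to be found by name), and the code point passed to |\UTF|
is arbitrary. |\CID| is left out on purpose, because IPA~Mincho is a TrueType
font.
\begin{quote}
\begin{verbatim}
\documentclass{article}
\usepackage{luatexja}
\usepackage{luatexja-fontspec}  % not loaded automatically
\usepackage{luatexja-otf}       % not loaded automatically
\setmainfont{TeX Gyre Pagella}  % alphabetic font (fontspec)
\setmainjfont{IPAMincho}        % Japanese font ('j' counterpart)
\begin{document}
% U+9DD7 is not in JIS X 0208, so it is given by its code point.
森\UTF{9DD7}外
\end{document}
\end{verbatim}
\end{quote}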
928+
929+
930+
931+\section{Implementation}
932+\label{sec:implementation}
933+\subsection{Handling of Japanese fonts}
934+In \pTeX, there are three slots for maintaining current fonts, namely
935+|\font| for alphabetic fonts, |\jfont| for Japanese fonts (in horizontal
936+direction) and |\tfont| for Japanese fonts (in vertical direction). With
937+these slots, we can manage the current font for alphabetic characters
938+and that for Japanese characters separately in \pTeX. However, \LuaTeX\
939+has only one slot for maintaining the current font, like \TeX82. This
940+situation leads to a problem: how can we maintain the `current Japanese
941+font'?
942+
943+There are three approaches to this problem. One approach is to make a
944+mapping table from alphabetic fonts to corresponding Japanese fonts
945+(here we don't assume that NFSS2 is available). Another approach is
946+that we always use composite fonts with alphabetic fonts and Japanese
947+fonts. The third approach is that the information of the current
948+Japanese font is stored in an attribute. We adopted the third approach,
949+since \LuaTeX-ja is strongly influenced by \pTeX, as we noted in
950+Subsection~\ref{ssec-pol}.
951+
952+As in Figure~\ref{fig-jfdef}, \LuaTeX-ja uses |\jfont| for defining
953+Japanese fonts, as \pTeX\ does. However, because the information of the current
954+Japanese font is stored in an attribute, control sequences defined by
955+|\jfont| (e.g.,~|\foo| and |\bar| in Figure~\ref{fig-jfdef}) do
956+not represent fonts in the sense of \TeX82. In other words, each of
957+these control sequences is just an assignment to an attribute, therefore
958+they cannot be an argument of |\the|, |\fontname|, or |\textfont|.
959+
960+
961+Callbacks by the \emph{luaotfload} package, e.g.,~replacement of glyphs
962+according to OpenType font features, are performed just after `Examination of
963+stack level' (see Subsections
964+\ref{ssec-over}~and~\ref{ssec-stack}). Also note that calculation of
965+character classes for each Japanese character is done \emph{after}
966+these callbacks for now.
967+
968+\subsection{Stack management}
969+\label{ssec-stack}
970+
971+As we noted in Subsection~\ref{ssec-csname}, parameters whose values
972+at the end of a horizontal box or of a paragraph are valid for the
973+whole box or paragraph, such as \emph{kanjiskip}, cannot be implemented
974+by internal integers or registers of other types in \TeX. We explain why
975+in this subsection.
976+
977+\begin{figure}
978+\begin{lstlisting}
979+void package(int c)
980+{
981+ ...
982+ d = box_max_depth;
983+ unsave();
984+ save_ptr -= 4;
985+ if (cur_list.mode_field == -hmode) {
986+ cur_box = filtered_hpack(cur_list.head_field,
987+ cur_list.tail_field, saved_value(1),
988+ saved_level(1), grp, saved_level(2));
989+ subtype(cur_box) = HLIST_SUBTYPE_HBOX;
990+ } else {
991+\end{lstlisting}
992+\caption{An extract from the CWEB source \texttt{tex/packaging.w} of \LuaTeX.}
993+\label{fig-ltsrc}
994+\end{figure}
995+
996+Figure~\ref{fig-ltsrc} is an extract from the CWEB source
997+\texttt{tex/packaging.w} of \LuaTeX\ (SVN revision 4358). This function
998+is called when an explicit |\hbox{...}| or |\vbox{...}| ends, and
999+the function |filtered_hpack()| is where the |hpack_filter| callback and then the
1000+actual `hpack' process are performed. Notice that the |unsave()|
1001+function is called before |filtered_hpack()|. This is the problem:
1002+because of |unsave()|, we can retrieve only the values of registers
1003+\emph{outside} the box, even in the |hpack_filter| callback.
1004+
1005+To cope with this problem, \LuaTeX-ja has its own stack system, based on
1006+the Lua code in \cite{stack-mail}. Furthermore, \emph{whatsit} nodes whose
1007+\emph{user\_id} is 30112 (\emph{stack\_node}, for short) are
1008+appended to the current horizontal list each time the current stack
1009+level is incremented, and their values are the values of
1010+|\currentgrouplevel| at that time. At the beginning of the |hpack_filter|
1011+callback, the list in question is traversed to determine whether the
1012+stack level at the end of the list and that outside the box coincide.
1013+
1014+Let $x$ be the value of |\currentgrouplevel|, and $y$ be the current
1015+stack level, both inside the |hpack_filter| callback, i.e.,~outside a
1016+horizontal box. Consider the list which represents the content of the box.
1017+Then we have:
1018+\begin{itemize}
1019+\item A \emph{stack\_node} whose value is $x+1$ (because all materials
1020+ in the box are included in a group |\hbox{...}|, the value of
1021+ |\currentgrouplevel| inside the box is at least $x+1$) in the list
1022+ corresponds to an assignment related to the stack system at the
1023+ top level of the list, like
1024+\begin{quote}
1025+\begin{verbatim}
1026+\hbox{...(assignment)...}
1027+\end{verbatim}
1028+\end{quote}
1029+In this case, the current stack level is incremented to $y+1$ after the assignment.
1030+\item A \emph{stack\_node} whose value is more than $x+1$ in the list corresponds
1031+to an assignment inside another group contained in the box. For example,
1032+ the following input creates
1033+a \emph{stack\_node} whose value is $x+3=(x+1)+2$:
1034+\begin{quote}
1035+\begin{verbatim}
1036+\hbox{...{...{...(assignment)}...}...}
1037+\end{verbatim}
1038+\end{quote}
1039+\end{itemize}
1040+Thus, we can conclude that the stack level at the end of the list is
1041+$y+1$, if and only if there is a \emph{stack\_node} whose value is
1042+$x+1$. Otherwise, the stack level is just $y$.
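
The two cases can be illustrated by the following input. This is a sketch
under the assumption that \emph{kanjiskip} is set through |\ltjsetparameter|;
the value 3pt is arbitrary.
\begin{quote}
\begin{verbatim}
% Assignment at the top level of the box: a stack_node whose value
% is x+1 is created, so the stack level at the end of the box is
% y+1, and 3pt is the value that is valid for the whole box.
\hbox{漢字\ltjsetparameter{kanjiskip=3pt}漢字}
% Assignment inside an inner group: the stack_node has the value
% x+2, so the stack level at the end of the box is still y, and
% the value outside the box is used.
\hbox{漢字{\ltjsetparameter{kanjiskip=3pt}漢字}漢字}
\end{verbatim}
\end{quote}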
1043+
1044+\subsection{Adjustment of the position of Japanese characters}
1045+\label{ssec-width}
1046+
1047+The size of a glyph specified in a metric and that of a real font
1048+usually differ. For example, the letter `\inhibitglue【' is half-width
1049+in |jfm-ujis.lua| or |jis.tfm|, while this letter is full-width like `【'
1050+in most TrueType fonts used in Japanese typesetting, such as
1051+IPA~Mincho. Hence the adjustment of position of such glyphs is
1052+needed. In the context of \pTeX, this process was performed using virtual fonts.
1053+
1054+On the other hand, \LuaTeX-ja does the adjustment by encapsulating a glyph
1055+into a horizontal box. There are two main reasons why we adopted this
1056+method: one is that we feared the Lua code for coexisting with callbacks by
1057+the |luaotfload| package would be large if we used virtual fonts, and the
1058+other is to cope with shifting of the baseline of characters at the
1059+same time.
1060+
1061+\begin{figure}
1062+\begin{center}\unitlength=9pt\small
1063+\begin{picture}(15,12)(-1,-3)
1064+
1065+\color{grayx}% real glyph
1066+\put(-1,-1.5){\vrule width 6\unitlength height 7\unitlength depth 2.5\unitlength}
1067+
1068+\color{black}% real glyph :step1
1069+\thicklines
1070+\put(-1,-1.5){\line(0,1){7}\line(0,-1){2.5}}
1071+\put(5,-1.5){\line(0,1){7}\line(0,-1){2.5}}
1072+\put(-1,5.5){\line(1,0){6}}
1073+\put(-1,-4){\line(1,0){6}}
1074+\put(-1,0){\makebox(0,0)[r]{\strut$R$\,}}
1075+
1076+\thicklines
1077+\put(0,0){\vector(0,1){9}\line(0,-1){3}\vector(1,0){12}}
1078+\put(12,9){\makebox(0,0)[rt]{\strut$M$\,}}
1079+\put(12,0){\line(0,1){9}\vector(0,-1){3}}
1080+\put(0,9){\line(1,0){12}}
1081+\put(0,-3){\line(1,0){12}}
1082+\put(0.2,4.5){\makebox(0,0)[l]{\texttt{height}}}
1083+\put(12.2,-1.5){\makebox(0,0)[l]{\texttt{depth}}}
1084+\put(6,0.2){\makebox(0,0)[b]{\texttt{width}}}
1085+
1086+\thicklines
1087+\put(3,0){\line(0,1){7}\line(0,-1){2.5}\line(1,0){6}}
1088+\put(9,0){\line(0,1){7}\line(0,-1){2.5}}
1089+\put(3,7){\line(1,0){6}}
1090+\put(3,-2.5){\line(1,0){6}}
1091+\newsavebox{\eqdist}
1092+\savebox{\eqdist}(0,0)[c]{%
1093+ \thinlines
1094+ \put(-0.08,0.2){\line(0,-1){0.4}}%
1095+ \put(0.08,0.2){\line(0,-1){0.4}}}
1096+\put(1.5,0){\usebox{\eqdist}}
1097+\put(10.5,0){\usebox{\eqdist}}
1098+
1099+\thicklines
1100+\put(3,-1.5){\vector(-1,0){4}}
1101+\put(1,-1.7){\makebox(0,0)[t]{\texttt{left}}}
1102+\put(3,0){\vector(0,-1){1.5}}
1103+\put(3.2,-0.75){\makebox(0,0)[l]{\texttt{down}}}
1104+\end{picture}
1105+\end{center}
1106+\caption{The position of the `real' glyph.}
1107+\label{fig-pos}
1108+\end{figure}
1109+
1110+Figure~\ref{fig-pos} shows the adjustment process. A large square $M$ is
1111+the imaginary body specified in the metric, and a vertical
1112+rectangle is the imaginary body of a real glyph. First, the real glyph
1113+is aligned with respect to the width of $M$. In the figure, the real
1114+glyph is aligned `middle'; this setting is useful for the full-width
1115+middle dot `・'. We have other settings, `left' and `right'.
1116+After that, it is shifted according to the values of |left| and |down|,
1117+which are specified in the metric, too. The final position of the real glyph
1118+is shown by the gray rectangle~$R$. If the amount of shifting the baseline is
1119+not zero, $M$ (and hence the real glyph) is shifted by that amount.
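
The horizontal part of this process can be written down explicitly. As an
illustration only (the symbols $w_M$, $w_R$ and $x_R$ are ours, and the formula
for the `middle' alignment is read off from Figure~\ref{fig-pos}, not taken
from the implementation), let $w_M$ be the width of $M$, $w_R$ the width of
the real glyph, and $x_R$ the position of the left edge of $R$ relative to the
left edge of $M$. Then
\[
 x_R = \frac{w_M - w_R}{2} - \hbox{\tt left}.
\]
In the units of the figure, $w_M=12$, $w_R=6$ and $\hbox{\tt left}=4$, hence
$x_R = (12-6)/2 - 4 = -1$, that is, $R$ begins one unit to the left of $M$.
Similarly, $\hbox{\tt down}=1.5$ lowers the real glyph by $1.5$ units.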
1120+
1121+We would like to remark briefly on the vertical position of a real
1122+glyph. A JFM (or a metric used in \LuaTeX-ja) and a real font used for
1123+it may have different height or depth. In that case, it may look better
1124+if the real glyph is shifted vertically to match the height-depth ratio
1125+specified in the metric, but no vertical adjustment other than the
1126+adjustment by the |down| value is performed in the present
1127+implementation of \LuaTeX-ja. This situation is carefully studied by
1128+Otobe~\cite{min10}. The policy on this problem has not been determined
1129+yet; however, we would like to offer several solutions in future
1130+development.
1131+
1132+\section{Conclusion}
1133+We have discussed our \LuaTeX-ja package, which is strongly influenced
1134+by \pTeX. For now, it is suitable for experimental use; however, many
1135+refinements are still needed for regular use. The author hopes
1136+that this paper and the \LuaTeX-ja project contribute to the typesetting of Japanese,
1137+and possibly other Asian languages, under \LuaTeX.
1138+
1139+\section*{Acknowledgements}
1140+The author would like to thank Ken Nakano and Hideaki Togashi for their
1141+development of ASCII \pTeX. The author is very grateful to Haruhiko
1142+Okumura for his leadership in the Japanese \TeX\ community. The author
1143+is also very grateful to the members of the \LuaTeX-ja project team for their
1144+valuable cooperation in development.
1145+
1146+%%% The style of the bibliography is `amsplain'.
1147+\providecommand{\bysame}{\leavevmode\hbox to3em{\hrulefill}\thinspace}
1148+\providecommand{\href}[2]{#2}
1149+\begin{thebibliography}{99}
1150+
1151+\bibitem{aj16}
1152+Adobe Systems Incorporated, \emph{Adobe-Japan1-6 Character Collection
1153+ for CID-Keyed Fonts}, Technical Note~\#5078, 2004.
1154+\url{http://partners.adobe.com/public/developer/en/font/5078.Adobe-Japan1-6.pdf}
1155+
1156+\bibitem{ptex}
1157+ASCII MEDIA WORKS,アスキー日本語\TeX\ (\pTeX). \url{http://ascii.asciimw.jp/pb/ptex/}
1158+
1159+\bibitem{apl}
1160+John Baker, \emph{Typesetting UTF8 APL code with the \LaTeX\ lstlisting package}.
1161+\url{http://bakerjd99.wordpress.com/2011/08/15/}
1162+
1163+\bibitem{omega}
1164+Jin-Hwan~Cho and Haruhiko Okumura, \emph{Typesetting CJK Languages with Omega},
1165+\TeX, XML, and Digital Typography, Lecture Notes in Computer Science, vol.~3130,
1166+Springer, 2004, 139--148.
1167+
1168+\bibitem{joylua}
1169+Yannis Haralambous. \emph{The Joy of \LuaTeX}. \url{http://luatex.bluwiki.com/}
1170+
1171+\bibitem{jisx4051}
1172+Japanese Industrial Standards Committee. \emph{JIS~X~4051: Formatting
1173+ rules for Japanese documents}, 1993, 1995, 2004.
1174+
1175+\bibitem{eptex}
1176+北川弘典,$\varepsilon$-\pTeX についてのwiki.
1177+\url{http://sourceforge.jp/projects/eptex/wiki/FrontPage}
1178+
1179+\bibitem{luaums}
1180+北川弘典,\LuaTeX で日本語.
1181+\url{http://oku.edu.mie-u.ac.jp/tex/mod/forum/discuss.php?d=378}
1182+
1183+\bibitem{luatexref}
1184+\LuaTeX\ development team, \emph{The \LuaTeX\ reference}.
1185+\url{http://www.luatex.org/svn/trunk/manual/luatexref-t.pdf} (snapshot of SVN trunk)
1186+
1187+\bibitem{man}
1188+\LuaTeX-ja project team, \emph{The \LuaTeX-ja package}.
1189+Not completed for now. Available at |doc/man-en.pdf| (in English) or
1190+ |doc/man-ja.pdf| (in Japanese)
1191+in the Git repository.
1192+
1193+\bibitem{luajp-test}
1194+香田温人,\LuaTeX と日本語.
1195+\url{http://www1.pm.tokushima-u.ac.jp/~kohda/tex/luatex-old.html}
1196+
1197+\bibitem{luajalayout}
1198+前田一貴,luajalayout パッケージ---Lua\LaTeX によ
1199+ る日本語組版---.
1200+\url{http://www-is.amp.i.kyoto-u.ac.jp/lab/kmaeda/lualatex/luajalayout/}
1201+
1202+\bibitem{jsclasses}
1203+奥村晴彦,p\LaTeXe 新ドキュメントクラス.
1204+\url{http://oku.edu.mie-u.ac.jp/~okumura/jsclasses/}
1205+
1206+\bibitem{ptexjp}
1207+Haruhiko Okumura, \emph{\pTeX\ and Japanese Typesetting},
1208+ The Asian Journal of \TeX\ \textbf{2}~(2008), 43--51.
1209+
1210+\bibitem{min10}
1211+乙部厳己,min10フォントについて.
1212+\url{http://argent.shinshu-u.ac.jp/~otobe/tex/files/min10.pdf}
1213+
1214+\bibitem{otf}
1215+齋藤修三郎,Open Type Font用VF.
1216+\url{http://psitau.kitunebi.com/otf.html}
1217+
1218+\bibitem{stack-mail}
1219+Jonathan Sauer, \emph{[Dev-luatex] tex.currentgrouplevel}.
1220+\url{http://www.ntg.nl/pipermail/dev-luatex/2008-August/001765.html}
1221+
1222+\bibitem{uptex}
1223+Takuji Tanaka, \emph{u\pTeX, up\LaTeX---unicode version of \pTeX, p\LaTeX}.
1224+\url{http://homepage3.nifty.com/ttk/comp/tex/uptex_en.html}
1225+
1226+\bibitem{ptexenc}
1227+Nobuyuki Tsuchimura and Yusuke Kuroki, \emph{Development of Japanese \TeX\ Environment},
1228+ The Asian Journal of \TeX\ \textbf{2}~(2008), 53--62.
1229+
1230+\bibitem{w3c}
1231+W3C Working Group, \emph{Requirements for Japanese Text Layout}.
1232+\url{http://www.w3.org/TR/jlreq/}
1233+\end{thebibliography}
1234+
1235+\end{document}