Source code repository
Revision | b042492ba42d5532181dcd00bb3f84961bd16e6c (tree) |
---|---|
Time | 2011-11-22 20:05:25 |
Author | Hironori Kitagawa <h_kitagawa2001@yaho...> |
Committer | Hironori Kitagawa |
Changed the reference for ptexenc.
@@ -1,1235 +1,1235 @@ | ||
1 | -%#!lualatex ajt-devel-ltja | |
2 | -\documentclass{ajt} | |
3 | - | |
4 | -%%% Packages used in this paper | |
5 | - | |
6 | -%%% Font setting for \LuaTeX; this is extracted from ajt.cls | |
7 | -\makeatletter | |
8 | - \if@print | |
9 | - \RequirePackage{fontspec,xunicode} | |
10 | - \RequirePackage{luatextra} | |
11 | - \setmainfont[Mapping=tex-text]{Palatino LT Std} | |
12 | - \setsansfont[Mapping=tex-text]{Optima LT Std} | |
13 | - \else | |
14 | - \RequirePackage{fontspec,luatextra} | |
15 | - \setmainfont[Mapping=tex-text]{TeX Gyre Pagella} % \simeq Palatino | |
16 | - \fi | |
17 | - | |
18 | -%%% LuaTeX-ja | |
19 | -\usepackage{luatexja,luatexja-fontspec} | |
20 | -\ltjsetparameter{jacharrange={-3,-8}} | |
21 | -\DeclareFontShape{JY3}{mc}{m}{n}{<-> s*[0.92489] file:ipam.ttf:jfm=ujis}{} | |
22 | -\DeclareFontShape{JY3}{gt}{m}{n}{<-> s*[0.92489] file:ipag.ttf:jfm=ujis}{} | |
23 | -% quick hack: monospaced Japanese font by \ttfamily | |
24 | -\DeclareKanjiFamily{JY3}{\ttdefault}{}{} | |
25 | -\DeclareFontShape{JY3}{\ttdefault}{m}{n}{<-> s*[0.92489] file:ipag.ttf:jfm=mono}{} | |
26 | - | |
27 | - | |
28 | -%%% LTXexample environment | |
29 | -\usepackage{showexpl,lltjlisting} | |
30 | -\lstset{basicstyle=\ttfamily\small, width=0.3\textwidth, basewidth=.5em} | |
31 | - | |
32 | -%%% Verbatim environment | |
33 | -\usepackage{fancyvrb} | |
34 | -\CustomVerbatimEnvironment{code}{Verbatim}% | |
35 | -{numbers=left,xleftmargin=1.5em,baselinestretch=1.069,fontsize=\small} | |
36 | -\CustomVerbatimEnvironment{codewithoutnum}{Verbatim}% | |
37 | -{xleftmargin=1.5em,baselinestretch=1.069,fontsize=\small} | |
38 | -\CustomVerbatimEnvironment{codewithoutnumsmall}{Verbatim}% | |
39 | -{xleftmargin=1.5em,baselinestretch=1.0,fontsize=\footnotesize} | |
40 | -\DefineShortVerb{\|} | |
41 | - | |
42 | -%%% Others | |
43 | -\usepackage{mflogo,booktabs} | |
44 | -\definecolor{grayx}{gray}{0.85} | |
45 | -\hyphenation{ | |
46 | - kanjiskip | |
47 | - xkanjiskip | |
48 | -} | |
49 | - | |
50 | -%%% Mandatory article metadata %%% | |
51 | -\title{Development of \LuaTeX-ja package} | |
52 | -\author[北川 弘典]{Hironori Kitagawa} | |
53 | -\address{\LuaTeX-ja project team} | |
54 | -\email{h\_kitagawa2001@yahoo.co.jp} | |
55 | - | |
56 | -\keywords{\TeX, p\TeX, \LuaTeX, \LuaTeX-ja, Japanese} | |
57 | -\abstract{% | |
58 | -The \LuaTeX-ja package is a macro package for typesetting Japanese | |
59 | -documents under \LuaTeX. The package offers more typesetting | |
60 | -flexibility than \pTeX, a widely used Japanese extension of \TeX, | |
61 | -and corrects some unwanted features of \pTeX. | |
62 | -In this paper, we describe the specifications, the current status, and some | |
63 | -internal processing methods of \LuaTeX-ja. | |
64 | -} | |
65 | - | |
66 | -\newcommand{\parname}[1]{\textsf{#1}} | |
67 | -\newcommand{\jstrut}{\vrule width0pt height\cht depth\cdp} | |
68 | -\newcommand{\imagfm}[1]{\ifvmode\leavevmode\fi% | |
69 | - \hbox{\fboxsep=0pt\fbox{\setbox0=\hbox{#1}\copy0\kern-\wd0 | |
70 | - \smash{\vrule width \wd0 height 0.4pt depth0.4pt}}}} | |
71 | -\begin{document} | |
72 | - | |
73 | -%%% Do not forget to start with \maketitle! | |
74 | -\maketitle | |
75 | - | |
76 | -\section{Introduction} | |
77 | -\subsection{History} | |
78 | -To typeset Japanese documents with \TeX, ASCII \pTeX~\cite{ptex} has | |
79 | -been widely used in Japan. There are other methods (for example, using | |
80 | -Omega and OTP~\cite{omega}, or the CJK package), but | |
81 | -these alternatives did not become mainstream. The author thinks | |
82 | -that this is because \pTeX\ enables us to produce high-quality documents | |
83 | -(e.g.,~supporting vertical typesetting), and because \pTeX\ appeared | |
84 | -earlier than the alternatives described above. | |
85 | - | |
86 | -However, \pTeX\ has been left behind by extensions of \TeX\ such | |
87 | -as \eTeX\ and \pdfTeX, and by the spread of the UTF-8 encoding. In recent | |
88 | -years, the situation has improved through the development of | |
89 | -|ptexenc|~\cite{ptexenc} by Nobuyuki Tsuchimura (\hbox{土村展之}), | |
90 | -$\varepsilon$-\pTeX~\cite{eptex} by the author,~and u\pTeX~\cite{uptex} | |
91 | -by Takuji Tanaka (田中琢爾). However, continuing this approach, namely | |
92 | -developing an engine extension localized for Japanese, is not wise. This | |
93 | -approach needs a lot of work for \emph{each} engine. In addition, if we | |
94 | -use \LuaTeX, an engine extension becomes less necessary | |
95 | -because \LuaTeX\ can hook into \TeX's internal processes by using | |
96 | -Lua callbacks. | |
97 | - | |
98 | - | |
99 | -There were several experimental attempts to typeset | |
100 | -Japanese documents with \LuaTeX\ before. Here we cite three examples: | |
101 | -\begin{itemize} | |
102 | -\item |luaums.sty|~\cite{luaums} developed by the author. This | |
103 | - experimental package is for creating a certain Japanese-based presentation | |
104 | - with \LuaTeX. | |
105 | -\item the \emph{luajalayout} package~\cite{luajalayout}, formerly known as the | |
106 | - \emph{jafontspec} package, by Kazuki Maeda (前田一貴). This package is based on | |
107 | - \LaTeXe\ and \emph{fontspec} package. | |
108 | -\item the \emph{luajp-test} package~\cite{luajp-test}, a test package made by | |
109 | - Atsuhito Kohda (香田温人), based on articles on the web page~\cite{joylua}. | |
110 | -\end{itemize} | |
111 | -However, these packages are based on \LaTeXe, and offer little | |
112 | -control over the typesetting rules. It is also inefficient for several | |
113 | -people to develop similar packages separately. For these reasons, | |
114 | -development of the \LuaTeX-ja package was started, initially by the | |
115 | -author and Kazuki Maeda. | |
116 | - | |
117 | -\subsection{Development policy of \LuaTeX-ja} | |
118 | -\label{ssec-pol} | |
119 | -The first aim of the \LuaTeX-ja project was to implement features (from the | |
120 | -`primitive' level) of \pTeX\ as macros under \LuaTeX; therefore \LuaTeX-ja is | |
121 | -strongly influenced by \pTeX. However, as development proceeded, some | |
122 | -technical/conceptual difficulties arose. Hence we changed the aim | |
123 | -of the project as follows: | |
124 | -\begin{itemize} | |
125 | -\item\emph{\LuaTeX-ja offers at least the same flexibility of | |
126 | - typesetting that p\TeX\ has.} | |
127 | - | |
128 | - The ability to produce output conforming to | |
129 | - JIS~X~4051~\cite{jisx4051}, the Japanese Industrial Standard for | |
130 | - typesetting, or to a technical note~\cite{w3c} by W3C, is not enough for us; | |
131 | - if one wants to produce very incoherent output for some reason, it | |
132 | - should be possible. | |
133 | -In this respect, the previous attempts at Japanese typesetting with \LuaTeX\ | |
134 | - cited in the previous subsection are inadequate. | |
135 | - | |
136 | -\pTeX\ offers some typesetting flexibility, through internal | |
137 | - parameters such as |\kanjiskip| or |\prebreakpenalty|, and through | |
138 | - custom JFMs (Japanese TFMs). Therefore we decided to include this | |
139 | - functionality in \LuaTeX-ja. | |
140 | - | |
141 | -\item\emph{\LuaTeX-ja is not a mere reimplementation or port of \pTeX; | |
142 | - some (technically and/or conceptually) inconvenient features of | |
143 | - \pTeX\ are modified.} | |
144 | - | |
145 | - We describe this point in more detail in the next section. | |
146 | -\end{itemize} | |
147 | - | |
148 | - | |
149 | -\subsection{Overview of the processes} | |
150 | -\label{ssec-over} | |
151 | -We outline the stages of \LuaTeX-ja's processing in order. | |
152 | - | |
153 | -\begin{itemize} | |
154 | -\item In the |process_input_buffer| callback: treatment of breaking | |
155 | - lines after a Japanese character (in Subsection~\ref{ssec-line}). | |
156 | - | |
157 | -\item In the |hyphenate| callback: font replacement. | |
158 | - | |
159 | -\LuaTeX-ja examines each \textit{glyph\_node}~$p$ in the horizontal list. If | |
160 | - the character represented by $p$ is considered a Japanese | |
161 | - character, the font used at $p$ is replaced by the value of | |
162 | - |\ltj@curjfnt|, an attribute for `the current Japanese font' | |
163 | - at~$p$. | |
164 | - | |
165 | -Furthermore, the subtype of $p$ is decreased by 1 to suppress | |
166 | - hyphenation around $p$ by \LuaTeX, because later processes of | |
167 | - \LuaTeX-ja take care of all things about Japanese characters. | |
168 | - | |
169 | -\item In |pre_linebreak_filter| and |hpack_filter| callbacks: | |
170 | - | |
171 | -\begin{enumerate} | |
172 | -\item \LuaTeX-ja has its own stack system, and the current horizontal | |
173 | - list is traversed in this stage to determine what the level of | |
174 | - \LuaTeX-ja's internal stack at the end of the list is. We will | |
175 | - discuss it in Subsection~\ref{ssec-stack}. | |
176 | - | |
177 | -\item In this stage, \LuaTeX-ja inserts glues/kerns for Japanese | |
178 | - typesetting in the list. This is the core routine of \LuaTeX-ja. | |
179 | - We will discuss it in Subsections | |
180 | - \ref{ssec-jglue}~and~\ref{ssec-jspec} . | |
181 | - | |
182 | -\item To match a metric with a real font, the positions of | |
183 | - (Japanese) glyphs are sometimes adjusted. | |
184 | - We will discuss it in Subsection~\ref{ssec-width}. | |
185 | -\end{enumerate} | |
186 | -\item In the |mlist_to_hlist| callback: treatment of Japanese characters | |
187 | - in math formulas. This stage is similar to the adjustment of the | |
188 | - position of glyphs (see above), so we omit its description | |
189 | - from this paper. | |
190 | -\end{itemize} | |
191 | - | |
192 | -In this paper, an \emph{alphabetic character} means a non-Japanese | |
193 | -character. Similarly, we use the term \emph{alphabetic font} for the | |
194 | -counterpart of a Japanese font. | |
195 | - | |
196 | -\subsection{Contents of this paper} | |
197 | -Here we describe the contents of the rest of this paper briefly. In | |
198 | -Section~\ref{sec:differences_with_ptex}, we describe major differences | |
199 | -between \pTeX\ and \LuaTeX-ja. The next section, | |
200 | -Section~\ref{sec:distinction_of_characters}, concentrates on the | |
201 | -problem of how we distinguish between Japanese characters and alphabetic | |
202 | -characters. In Section~\ref{sec:current_status}, we show the current | |
203 | -development status of the package. Finally, in | |
204 | -Section~\ref{sec:implementation}, we describe some internal routines of | |
205 | -\LuaTeX-ja. | |
206 | - | |
207 | -\subsection{General information of the project} | |
208 | -The \LuaTeX-ja project is hosted by SourceForge.jp. The official wiki | |
209 | -is located at | |
210 | -\url{http://sourceforge.jp/projects/luatex-ja/wiki/}. As of | |
211 | -October 22, 2011, there is no stable version; however, a set of development sources can be | |
212 | -obtained from the git repository. Members of the project team are as follows | |
213 | -(in random order): Hironori Kitagawa, Kazuki Maeda, Takayuki Yato, | |
214 | -Yusuke Kuroki, Noriyuki Abe, Munehiro Yamamoto, Tomoaki Honda, | |
215 | -and~Shuzaburo Saito. | |
216 | - | |
217 | - | |
218 | -\section{Major differences with \pTeX} | |
219 | -\label{sec:differences_with_ptex} | |
220 | -In this section, we explain several major differences between \pTeX\ | |
221 | -and our \LuaTeX-ja. For general information on Japanese typesetting and an | |
222 | -overview of \pTeX, please see Okumura~\cite{ptexjp}. | |
223 | - | |
224 | - | |
225 | -\subsection{Names of control sequences} | |
226 | -\label{ssec-csname} Because \pTeX\ is an engine modification of Knuth's | |
227 | -original \TeX82 engine, some of the additional primitives take a form that is | |
228 | -very difficult to simulate with a macro. For example, an additional | |
229 | -primitive |\prebreakpenalty|$\langle\hbox{\it | |
230 | -char\_code}\rangle$|[=]|$\langle\hbox{\it penalty}\rangle$ in \pTeX\ | |
231 | -sets the amount of penalty inserted before a character whose code is | |
232 | -$\langle\hbox{\it char\_code}\rangle$ to $\langle\hbox{\it | |
233 | -penalty}\rangle$, and this form |\prebreakpenalty|$\langle\hbox{\it | |
234 | -char\_code}\rangle$ can also be used for retrieving the value. | |
235 | - | |
236 | -Moreover, for some internal parameters of \pTeX, the value at the end of a | |
237 | -horizontal box or of a paragraph is valid throughout the whole box or | |
238 | -paragraph. However, implementing these parameters in | |
239 | -\LuaTeX-ja is not so easy; we will discuss it in Subsection~\ref{ssec-stack}. | |
240 | - | |
241 | -Because of the two problems discussed above, the assignment and retrieval | |
242 | -of most parameters in \LuaTeX-ja are consolidated into the following | |
243 | -three control sequences: | |
244 | -\begin{itemize} | |
245 | -\item |\ltjsetparameter{|$\langle\hbox{\it | |
246 | - name}\rangle$|=|$\langle\hbox{\it value}\rangle$|,...}|: for local | |
247 | - assignment. | |
248 | -\item |\ltjglobalsetparameter|: for global assignment. Note that these two control | |
249 | - sequences obey the value of the |\globaldefs| primitive. | |
250 | -\item |\ltjgetparameter{|$\langle\hbox{\it | |
251 | - name}\rangle$|}[{|$\langle\hbox{\it optional | |
252 | - argument}\rangle$|}]|: for retrieval. The returned value is always | |
253 | - a string. | |
254 | -\end{itemize} | |
255 | - | |
256 | -\subsection{Line-break after a Japanese character} | |
257 | -\label{ssec-line} | |
258 | - | |
259 | -Japanese text can be broken into lines almost anywhere, whereas | |
260 | -alphabetic text can be broken only between words (or by | |
261 | -hyphenation). Hence, \pTeX's input processor is modified so that a | |
262 | -line-break after a Japanese character does not produce a space. However, | |
263 | -there is no way to customize the input processor of \LuaTeX, other than | |
264 | -to hack its CWEB source. All a macro package can do is modify an input line before | |
265 | -\LuaTeX\ begins to process it, inside the |process_input_buffer| | |
266 | -callback. | |
267 | - | |
268 | -Hence, in \LuaTeX-ja, a comment letter (we reserve U+FFFFF for this | |
269 | -purpose) is appended to an input line if the line ends with a Japanese | |
270 | -character.\footnote{Strictly speaking, it is also required that the catcode | |
271 | -of the end-line character is 5~(\emph{end-of-line}). This condition is | |
272 | -useful in the verbatim environment.} One might jump to the conclusion | |
273 | -that the treatment of a line-break by \pTeX\ and that by \LuaTeX-ja are | |
274 | -exactly the same; however, they differ in that \LuaTeX-ja | |
275 | -decides whether to append a comment letter to the line | |
276 | -\emph{before} the line is actually processed by \LuaTeX. | |
277 | - | |
278 | -Figure~\ref{fig-linebreak} shows an example of this situation; the | |
279 | -|\ltjsetparameter| command marks most Japanese characters as | |
280 | -`non-Japanese characters'. In other words, from that command onward, the | |
281 | -letter `あ' will be treated as an alphabetic character by | |
282 | -\LuaTeX-ja. One would then expect a space between `あ' and `y' in | |
283 | -the output, but the actual output in the figure has none. This is | |
284 | -because `あ' is still considered a Japanese character by \LuaTeX-ja | |
285 | -when it decides whether U+FFFFF will be appended to | |
286 | -input line~2. | |
287 | - | |
288 | -\begin{figure} | |
289 | -\begin{LTXexample} | |
290 | -\font\x=IPAMincho \x | |
291 | -\ltjsetparameter{jacharrange={-6}}xあ | |
292 | -y | |
293 | -\end{LTXexample} | |
294 | -\caption{A notable sample showing the treatment of a line-break after a | |
295 | -Japanese character.}\label{fig-linebreak} | |
296 | -\end{figure} | |
297 | - | |
298 | -\subsection{Separation between `real' fonts and metrics} | |
299 | -\label{ssec-sepmet} | |
300 | - | |
301 | -Traditionally, most Japanese fonts used in typesetting are not | |
302 | -proportional, that is, most glyphs have the same size (in most cases, | |
303 | -square-shaped). Hence, it is not rare that the contents of different | |
304 | -JFMs are essentially the same, and differ only in their names. For example, | |
305 | -|min10.tfm| and |goth10.tfm|, which are JFMs shipped with \pTeX\ for the | |
306 | -seriffed \emph{mincho} family and the sans-seriffed \emph{gothic} family, | |
307 | -differ only in their |FAMILY| and |FACE|. Moreover, |jis.tfm| and | |
308 | -|jisg.tfm|, which are included in the \emph{jis} font metric | |
309 | -used in \emph{jsclasses}~\cite{jsclasses} by Haruhiko Okumura (奥村晴彦), | |
310 | -are identical as binary files. Considering this situation, we | |
311 | -decided to separate `real' fonts from the metrics used for them in | |
312 | -\LuaTeX-ja. Typical declarations of Japanese fonts in the style of plain | |
313 | -\TeX\ are shown in Figure~\ref{fig-jfdef}. We would like to add several | |
314 | -remarks: | |
315 | -\begin{itemize} | |
316 | -\item A control sequence |\jfont| must be used for Japanese fonts, instead of |\font|. | |
317 | -\item \LuaTeX-ja automatically loads the \emph{luaotfload} package, so | |
318 | - the \hbox{\tt file:} and \hbox{\tt name:} prefixes and various font features can be | |
319 | - used, as in the first line of Figure~\ref{fig-jfdef}. | |
320 | -\item The |jfm| key specifies the metric for the font. In | |
321 | - Figure~\ref{fig-jfdef}, both fonts will use a metric stored in a | |
322 | - Lua script named |jfm-ujis.lua|. This metric is the standard | |
323 | - metric in \LuaTeX-ja, and is based on JFMs used in the \emph{otf} | |
324 | - package~\cite{otf}. | |
325 | -\item The \hbox{\tt psft:} prefix can be used to specify name-only, non-embedded | |
326 | - fonts. When a PDF with these fonts is displayed, the actual fonts | |
327 | - used for them depend on the PDF reader. | |
328 | -\end{itemize} | |
329 | -The specification of a metric for \LuaTeX-ja is similar to that of a JFM | |
330 | -(see \cite{ptexjp}); characters are grouped into several classes, the | |
331 | -size information of characters is specified for each class, and | |
332 | -glue/kern insertions are specified for each pair of classes. Although | |
333 | -the author has not tried it, it may be possible to develop a program that | |
334 | -`converts' a JFM to a metric for \LuaTeX-ja. \LuaTeX-ja offers three | |
335 | -metrics by default: |jfm-ujis.lua|, |jfm-jis.lua| based on the | |
336 | -\emph{jis} font metric, and |jfm-min.lua| based on the old |min10.tfm|. | |
337 | - | |
338 | - Note that |-kern| in the font features | |
339 | -is important, because kerning information from the real font itself would | |
340 | -clash with glue/kern information from the metric. | |
341 | - | |
342 | -\begin{figure} | |
343 | -\begin{verbatim} | |
344 | -\jfont\foo=file:ipam.ttf:jfm=ujis;script=latn;-kern;+jp04 at 12pt | |
345 | -\jfont\bar=psft:Ryumin-Light:jfm=ujis at 10pt | |
346 | -\end{verbatim} | |
347 | -\caption{Typical declarations of Japanese fonts.} | |
348 | -\label{fig-jfdef} | |
349 | -\end{figure} | |
350 | - | |
351 | -\subsection{Insertion of glues/kerns for Japanese typesetting: timing} | |
352 | -\label{ssec-jglue} | |
353 | - | |
354 | -As described in \cite{luatexref}, \LuaTeX's kerning and ligaturing | |
355 | -processes are totally different from those of \TeX82. \TeX82's process is | |
356 | -performed as soon as a (sequence of) character is appended to the current | |
357 | -list. Thus we can interrupt this process by writing | |
358 | -|f{}irm|. However, \LuaTeX's process is \emph{node-based}, that is, the | |
359 | -process is performed when a horizontal box or a paragraph ends, so | |
360 | -|f{}irm| and |firm| yield the same output under \LuaTeX. | |
361 | - | |
362 | -The situation for Japanese characters is more complicated. | |
363 | -Glues (and kerns) which are needed for Japanese | |
364 | -typesetting are divided into the following three categories: | |
365 | -\begin{itemize} | |
366 | -\item Glue (or kern) from the metric of Japanese fonts (\emph{JFM glue}, | |
367 | - for short). | |
368 | - | |
369 | -\item Default glue between a Japanese character and an alphabetic | |
370 | - character (\emph{xkanjiskip}, for short), usually 1/4 of | |
371 | - full-width (\emph{shibuaki}) with some stretch and shrink for | |
372 | - justifying each line. | |
373 | -\item Default glue between two consecutive Japanese characters | |
374 | - (\emph{kanjiskip}, for short). The main purpose of this glue is to | |
375 | - enable breaking lines almost everywhere in Japanese texts. In most | |
376 | - cases, its natural width is zero, and it has some stretch/shrink for | |
377 | - justifying each line. | |
378 | -\end{itemize} | |
379 | -In \pTeX, these three kinds of glue are treated differently. A JFM glue | |
380 | -is inserted when a (sequence of) Japanese character is appended to the | |
381 | -current list, as in the case of alphabetic characters in \TeX82. This | |
382 | -means that one can interrupt the insertion process by saying |{}|. An | |
383 | -\emph{xkanjiskip} is inserted just before `hpack' or line-breaking of a | |
384 | -paragraph; this timing is somewhat similar to that of \LuaTeX's kerning | |
385 | -process. Finally, a \emph{kanjiskip} does not appear as a node anywhere; | |
386 | -it appears only implicitly in the calculation of the width of a horizontal box, | |
387 | -in line breaking, and in the actual output process to a DVI | |
388 | -file. These specifications have made \pTeX's behavior very hard to | |
389 | -understand. | |
390 | - | |
391 | -\LuaTeX-ja inserts glues in all three categories simultaneously inside | |
392 | -the |hpack_filter| and |pre_linebreak_filter| callbacks. The reasons for | |
393 | -this design are to make Japanese characters behave like alphabetic characters in \LuaTeX\ | |
394 | -(as described in the first paragraph of this subsection), and to clarify | |
395 | -the specification of \LuaTeX-ja's process. | |
396 | - | |
397 | -\subsection{Insertion of glues/kerns for Japanese typesetting: specification} | |
398 | -\label{ssec-jspec} | |
399 | - | |
400 | -\begin{table} | |
401 | -\caption{Examples of differences between \pTeX\ and \LuaTeX-ja.} | |
402 | -\label{tab-jfmglue} | |
403 | -\begin{center} | |
404 | -\begin{tabular}{llllllll} | |
405 | -\toprule | |
406 | -&\multicolumn{1}{c}{(1)}&\multicolumn{1}{c}{(2)}&\multicolumn{1}{c}{(3)}&\multicolumn{1}{c}{(4)}\\ | |
407 | -Input &|あ】{}【〕\/〔| &|い』\/a| &|う)\hbox{}(| &|え]\special{}[|\\\midrule | |
408 | -\pTeX &あ】\hbox{}【〕\hbox{}〔&い』\/a &う)\hbox{}( &え]\hbox{}[\\ | |
409 | -\LuaTeX-ja &あ】{}【〕\/〔 &い』\/a &う)\hbox{}( &え]\special{}[\\ | |
410 | -\bottomrule | |
411 | -\end{tabular} | |
412 | -\end{center} | |
413 | -\end{table} | |
414 | - | |
415 | -\begin{figure} | |
416 | -\begin{center} | |
417 | -\fontsize{40}{40}\selectfont | |
418 | -\imagfm{\jstrut あ}% | |
419 | -\imagfm{\jstrut 】\inhibitglue}% | |
420 | -\imagfm{\jstrut\kern.5\zw}% | |
421 | -\imagfm{\jstrut\kern.5\zw}% | |
422 | -\imagfm{\jstrut\inhibitglue【}% | |
423 | -\imagfm{\jstrut 〕\inhibitglue}% | |
424 | -\imagfm{\jstrut\kern.5\zw}% | |
425 | -\imagfm{\jstrut\kern.5\zw}% | |
426 | -\imagfm{\jstrut\inhibitglue〔}% | |
427 | -\end{center} | |
428 | -\caption{Detail of the output of \pTeX\ for the input~(1) in Table~\ref{tab-jfmglue}.} | |
429 | -\label{fig-ptexjfm} | |
430 | -\end{figure} | |
431 | - | |
432 | -Now we will take a look at the insertion process itself through four points. | |
433 | - | |
434 | -\begin{description} | |
435 | -\item[Ignored nodes] | |
436 | -As noted in the previous subsection, the insertion process in \pTeX\ can | |
437 | - be interrupted by saying |{}| or anything else.\footnote{This | |
438 | - is why some tricks like \texttt{ちょ\char`\{\char`\}っと} for | |
439 | - \texttt{min10.tfm} and other `old' JFMs work.} This leads to the | |
440 | - second row in Table~\ref{tab-jfmglue}, or | |
441 | - Figure~\ref{fig-ptexjfm}. Here `the process is interrupted' | |
442 | - means that \pTeX\ does not think the letter `】\inhibitglue' | |
443 | - is followed by `\inhibitglue【', hence two half-width glues | |
444 | - are inserted between `】\inhibitglue' and `\inhibitglue【', | |
445 | - where the left one is from `】\inhibitglue' and the right one | |
446 | - is from `\inhibitglue【'. | |
447 | - | |
448 | - On the other hand, in \LuaTeX-ja, the process is done inside | |
449 | - |hpack_filter| and |pre_linebreak_filter| callbacks. Hence, | |
450 | - \emph{anything that does not make any node will be | |
451 | - ignored}\ in \LuaTeX-ja, as shown in (1) in | |
452 | - Table~\ref{tab-jfmglue}. \LuaTeX-ja also ignores any node | |
453 | - which does not make any contribution to the current horizontal | |
454 | - list (\emph{ins\_node}, \emph{adjust\_node}, | |
455 | - \emph{mark\_node}, \emph{whatsit\_node} and | |
456 | - \emph{penalty\_node}), as shown in (4). | |
457 | - | |
458 | - | |
459 | -By the way, around a \emph{glyph\_node} $p$ there may be some nodes | |
460 | - attached to~$p$. These are an accent and kerns for | |
461 | - moving it to the right place, and a kern from the italic | |
462 | - correction\footnote{\TeX82 (and \LuaTeX) does not distinguish | |
463 | - between explicit kern and a kern for italic correction. To | |
464 | - distinguish them, an additional subtype for a kern is introduced | |
465 | - in \pTeX. On the other hand, \LuaTeX-ja uses an additional attribute and | |
466 | - redefines \texttt{\char`\\/} to set this attribute.} for $p$. It is natural that | |
467 | - these attachments should be ignored inside the process. Hence | |
468 | - \LuaTeX-ja takes this approach, as does the latest version of | |
469 | - \pTeX\ (version~p3.2). This explains (2) in Table~\ref{tab-jfmglue}. | |
470 | - | |
471 | -To summarize, one should put an empty horizontal box |\hbox{}| | |
472 | - wherever one wants to interrupt the insertion process in | |
473 | - \LuaTeX-ja, as in (3) in Table~\ref{tab-jfmglue}. | |
474 | - | |
475 | -\item[Fonts with the same metric] | |
476 | -Recall that \LuaTeX-ja separates `real' fonts and metrics, as in Subsection~\ref{ssec-sepmet}. | |
477 | -Consider the following input, where all Japanese fonts use the same metric | |
478 | - (in \LuaTeX-ja), and |\gt| selects the \emph{gothic} family as | |
479 | - the current Japanese font family: | |
480 | -\begin{quote} | |
481 | -\begin{verbatim} | |
482 | -明朝)\gt (ゴシック | |
483 | -\end{verbatim} | |
484 | -\end{quote} | |
485 | -If the above input is processed by \pTeX, because the insertion process is | |
486 | - interrupted by |\gt|, the result looks like | |
487 | -\begin{quote} | |
488 | -\mc 明朝)\hbox{}\gt (ゴシック | |
489 | -\end{quote} | |
490 | -However this seems to be unnatural, since two Japanese fonts in the | |
491 | - output use the same metric, i.e.,~the same | |
492 | - typesetting rule. Hence, we decided that Japanese fonts with | |
493 | - the same metric are treated as one font in the insertion | |
494 | - process of \LuaTeX-ja. Thus, the output from the above input | |
495 | - in \LuaTeX-ja looks like: | |
496 | -\begin{quote} | |
497 | -\mc 明朝)\gt (ゴシック | |
498 | -\end{quote} | |
499 | -There may be situations where this default behavior is not | |
500 | - suitable. \LuaTeX-ja offers a way to handle them, but | |
501 | - we leave it to the manual~\cite{man}. | |
502 | - | |
503 | -\item[Fonts with different metrics] | |
504 | -The case where two consecutive Japanese characters use different metrics and/or | |
505 | - different sizes is similar. Consider the following input where | |
506 | - the \emph{mincho} family and the \emph{gothic} family use | |
507 | - different metrics: | |
508 | -\begin{quote} | |
509 | -\begin{verbatim} | |
510 | -漢)\gt (漢)\large (大 | |
511 | -\end{verbatim} | |
512 | -\end{quote} | |
513 | -As in the previous paragraph, \pTeX\ yields the following from this input: | |
514 | -\begin{quote} | |
515 | -\mc 漢)\hbox{}\gt (漢)\hbox{}\large (大 | |
516 | -\end{quote} | |
517 | -We thought that the amount of space between parentheses in the above output | |
518 | - is too large. Hence we changed the default behavior of | |
519 | - \LuaTeX-ja, so that the amount of glue between two Japanese | |
520 | - characters with different metrics is the \emph{average} of the glue | |
521 | - from the left character and that from the right | |
522 | - character. For example, Figure~\ref{fig-diffmet} shows the | |
523 | - output from the above input. The width of the glue marked `(1)' is | |
524 | - $(a/2 + a/2)/2 = 0.5a$, and the width of the glue marked `(2)' | |
525 | - is $(a/2 + 1.2a/2)/2 = 0.55a$. This default behavior can be | |
526 | - changed by the \textsf{diffrentmet} parameter of \LuaTeX-ja. | |
527 | - | |
528 | -\begin{figure} | |
529 | -\begin{center} | |
530 | -\fontsize{40}{40}\selectfont | |
531 | -\imagfm{\jstrut\smash{% | |
532 | - \vtop{\lineskiplimit=\maxdimen\lineskip2pt\halign{#\cr漢\cr | |
533 | - \small\vrule height .5ex depth .5ex\hrulefill\ \lower.5ex\hbox{$a$}\ | |
534 | - \hrulefill\vrule height .5ex depth .5ex\cr}}}}% | |
535 | -\imagfm{\jstrut )\inhibitglue}% | |
536 | -\hbox to .5\zw{\hss\normalsize (1)\hss}% | |
537 | -\imagfm{\jstrut\inhibitglue\gt (}% | |
538 | -\imagfm{\jstrut\gt 漢}% | |
539 | -\imagfm{\jstrut\gt )\inhibitglue}% | |
540 | -\hbox to .55\zw{\hss\normalsize (2)\hss}% | |
541 | -\imagfm{\fontsize{48}{48}\selectfont\jstrut\gt\inhibitglue (}% | |
542 | -\imagfm{\fontsize{48}{48}\selectfont\jstrut\smash{% | |
543 | - \vtop{\lineskiplimit=\maxdimen\lineskip2pt\halign{#\cr\gt 大\cr | |
544 | - \small\vrule height .5ex depth .5ex\hrulefill\ \lower.5ex\hbox{$1.2a$}\ | |
545 | - \hrulefill\vrule height .5ex depth .5ex\cr}}}} | |
546 | -\end{center} | |
547 | -\caption{Fonts with different metrics.} | |
548 | -\label{fig-diffmet} | |
549 | -\end{figure} | |
550 | - | |
551 | -\item[\emph{kanjiskip} and \emph{xkanjiskip}] | |
552 | -In \pTeX, the value of \emph{xkanjiskip} is controlled by a skip named | |
553 | - |\xkanjiskip|. A well-known defect of this implementation is | |
554 | - that the value of \emph{xkanjiskip} is not connected with the | |
555 | - size of the current Japanese font. It seems that the |EXTRASPACE|, | |
556 | - |EXTRASTRETCH|, and |EXTRASHRINK| parameters in a JFM are | |
557 | - reserved for specifying the default value of | |
558 | - \emph{xkanjiskip} in units of the design size, but \pTeX\ | |
559 | - did not actually use these parameters. | |
560 | - | |
561 | -Considering this situation in p\TeX, \LuaTeX-ja can use the value of | |
562 | - \emph{xkanjiskip} that is specified in a metric. If the value of | |
563 | - \emph{xkanjiskip} on the user side (this is the value of the | |
564 | - \textsf{xkanjiskip} parameter of |\ltjsetparameter|) is | |
565 | - |\maxdimen|, then \LuaTeX-ja uses the specification from | |
566 | - the metric currently in use as the actual value of | |
567 | - \emph{xkanjiskip}. The same applies to \emph{kanjiskip}. | |
568 | -\end{description} | |
569 | - | |
570 | -\section{Distinction of characters} | |
571 | -\label{sec:distinction_of_characters} Since \LuaTeX\ can handle Unicode | |
572 | -characters natively, how we distinguish | |
573 | -Japanese characters from alphabetic characters is a major problem. For example, the | |
574 | -multiplication sign (U+00D7) exists both in ISO-8859-1 (hence in Latin-1 | |
575 | -Supplement in Unicode) and in the basic Japanese character set | |
576 | -JIS~X~0208. It is not desirable that this character is always treated as | |
577 | -an alphabetic character, because this symbol is often used in the sense | |
578 | -of `negative' in Japan. | |
579 | - | |
580 | -\subsection{Character ranges} | |
581 | -Before we describe the approach taken in \LuaTeX-ja, we review the | |
582 | -approach taken by u\pTeX. u\pTeX\ extends the |\kcatcode| primitive of | |
583 | -\pTeX, so that this primitive sets how a character is treated: | |
584 | -as an alphabetic character~(15), \emph{kanji}~(16), \emph{kana}~(17), | |
585 | -\emph{Hangul}~(17), or one of the \emph{other CJK characters}~(18). | |
586 | -An assignment to |\kcatcode| applies to a whole Unicode | |
587 | -block.\footnote{There are some exceptions. For example, U+FF00--FFEF | |
588 | -(Halfwidth and Fullwidth Forms) are divided into three blocks in recent | |
589 | -u\pTeX.} | |
590 | - | |
591 | -\LuaTeX-ja adopted a different approach. There are many Unicode blocks | |
592 | - in the Basic Multilingual Plane which are not included in | |
593 | - Japanese fonts, so it is inconvenient to process characters by Unicode | |
594 | - block. Furthermore, JIS~X~0208 is not just a union of Unicode | |
595 | - blocks; for example, the intersection of JIS~X~0208 and | |
596 | - Latin-1 Supplement is shown in | |
597 | - Table~\ref{tab-inter}. Considering these two points, to | |
598 | - customize the range of Japanese characters in \LuaTeX-ja, one | |
599 | - has to define ranges of character codes in the source in advance. | |
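 | - | |
 | -A minimal sketch of such a definition follows. It assumes the | |
 | -|\ltjdefcharrange| command described in the package manual~\cite{man}; | |
 | -the range number 100 and the chosen block (Enclosed Alphanumerics, | |
 | -U+2460--24FF) are only illustrative and are not taken from the default setting. | |
 | -\begin{quote} | |
 | -\begin{verbatim} | |
 | -\ltjdefcharrange{100}{"2460-"24FF}   % define a new range, number 100 | |
 | -\ltjsetparameter{jacharrange={100}}  % treat it as Japanese characters | |
 | -\ltjsetparameter{jacharrange={-100}} % or: treat it as alphabetic characters | |
 | -\end{verbatim} | |
 | -\end{quote} | |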
600 | - | |
601 | - | |
602 | -\begin{table} | |
603 | -\caption{Intersection of JIS~X~0208 and Latin-1 Supplement.} | |
604 | -\label{tab-inter} | |
605 | -\begin{center} | |
606 | -\begin{tabular}{llll} | |
607 | -\ltjjachar"A7 (U+00A7),& | |
608 | -\ltjjachar"A8 (U+00A8),& | |
609 | -\ltjjachar"B0 (U+00B0),& | |
610 | -\ltjjachar"B1 (U+00B1),\\ | |
611 | -\ltjjachar"B4 (U+00B4),& | |
612 | -\ltjjachar"B6 (U+00B6),& | |
613 | -\ltjjachar"D7 (U+00D7),& | |
614 | -\ltjjachar"F7 (U+00F7) | |
615 | -\end{tabular} | |
616 | -\end{center} | |
617 | -\end{table} | |
618 | - | |
619 | - | |
620 | -We note that \LuaTeX-ja offers two additional control sequences, | |
621 | - |\ltjjachar| and |\ltjalchar|. They are similar to the |\char| | |
622 | - primitive; however, |\ltjjachar| always yields a Japanese character, provided that | |
623 | - the argument is greater than or equal to 128, and |\ltjalchar| always | |
624 | - yields an alphabetic character, regardless of the argument. | |
625 | - | |
626 | -\subsection{Default setting of ranges} | |
627 | -Patches for plain \TeX\ and \LaTeXe\ of \LuaTeX-ja predefine 8~character | |
628 | -ranges, as shown in Table~\ref{tab-chrrng}. Most of these ranges are | |
629 | -just unions of Unicode blocks, and were determined from the Adobe-Japan1-6 | |
630 | -character collection~\cite{aj16}, and JIS~X~0208. Among these 8~ranges, | |
631 | -the ranges~2, 3, 6, 7, and~8 are considered ranges of Japanese | |
632 | -characters, and others are considered ranges of alphabetic | |
633 | -characters.\footnote{Note that ranges 3~and~8 are considered ranges of | |
634 | -alphabetic characters in this paper.} We remark on ranges 2~and~8: | |
635 | -\begin{description} | |
636 | -\item[The range~2] | |
637 | -JIS~X~0208 includes Greek and Cyrillic letters; however, these | |
638 | - letters cannot be used for typesetting Greek or Russian, of | |
639 | - course. Hence it is reasonable that Greek and | |
640 | - Cyrillic letters form a separate character range. | |
641 | -\item[The range~8] | |
642 | -To use 8-bit TFMs, such as those for the T1 or TS1 encodings, one should | |
643 | - mark this range~8 as a range of alphabetic characters by | |
644 | -\begin{quote} | |
645 | -|\ltjsetparameter{jacharrange={-8}}| | |
646 | -\end{quote} | |
647 | -This is because some 8-bit TFMs have a glyph in this range; for example, | |
648 | - the character `\OE' is located at |"D7| in the T1 encoding. %" | |
649 | -\end{description} | |
650 | - | |
651 | - | |
652 | -\begin{table} | |
653 | -\caption{Predefined ranges in \LuaTeX-ja.} | |
654 | -\label{tab-chrrng} | |
655 | -\begin{center} | |
656 | -\begin{tabular}{@{\bf}rl} | |
657 | -1&(Additional) Latin characters which do not belong to the range~8.\\ | |
658 | -2&Greek and Cyrillic letters.\\ | |
659 | -3&Punctuations and miscellaneous symbols.\\ | |
660 | -4&Unicode blocks which do not intersect with Adobe-Japan1-6.\\ | |
661 | -5&Surrogates and Supplementary Private Use Areas.\\ | |
662 | -6&Characters used in Japanese typesetting.\\ | |
663 | -7&Characters possibly used in CJK typesetting, but not in Japanese.\\ | |
664 | -8&Characters in Table~\ref{tab-inter}. | |
665 | -\end{tabular} | |
666 | -\end{center} | |
667 | -\end{table} | |
668 | - | |
669 | -\subsection{Control sequences producing Unicode characters} | |
670 | -\label{ssec-unichar} | |
671 | - | |
672 | -The \emph{fontspec} package\footnote{Precisely speaking, it is the | |
673 | -\emph{xunicode} package, originally a package for \XeTeX\ and | |
674 | -automatically loaded by the \emph{fontspec} package.} offers various | |
675 | -control sequences that produce Unicode characters. However, these | |
676 | -control sequences as they stand cannot work correctly with the default | |
677 | -range setting of \LuaTeX-ja. For example, |\textquotedblleft| is just | |
678 | -an abbreviation of |\char"201C\relax|, and the character U+201C (LEFT %" | |
679 | -DOUBLE QUOTATION MARK) is treated as a Japanese character, because it | |
680 | -belongs to the range~3. This problem is resolved by using |\ltjalchar| | |
681 | -instead of the |\char| primitive. This fix is included in an optional package | |
682 | -named \texttt{luatexja-\penalty0fontspec.sty}. Figure~\ref{fig-unitxt} | |
683 | -shows several ways to typeset a character, both as a Japanese character | |
684 | -and as an alphabetic character. | |
685 | - | |
686 | -\begin{figure} | |
687 | -\begin{LTXexample} | |
688 | -×, \char`×, % depend on range setting | |
689 | -\ltjalchar`×, % alphabetic char | |
690 | -\ltjjachar`×, % Japanese char | |
691 | -\texttimes % alph. char (by fontspec) | |
692 | -\end{LTXexample} | |
693 | -\caption{Control sequences producing a Unicode character.} | |
694 | -\label{fig-unitxt} | |
695 | -\end{figure} | |
696 | - | |
697 | -The situation looks similar in math formulas, but in fact it differs. | |
698 | -Each control sequence that represents an ordinary symbol defined by the | |
699 | -\emph{unicode-math} package is just synonym of a character. For example, | |
700 | -the meaning of |\otimes| is just the character U+2297 (CIRCLED TIMES), | |
701 | -which is included in the range~3. However, it is difficult to define a | |
702 | -control sequence like |\ltjalUmathchar| as a counterpart of | |
703 | -|\Umathchar|, since an input like `|\sum^\ltjalUmathchar ...|' has to be | |
704 | -permitted. | |
705 | - | |
706 | -However, we could not develop a satisfactory solution to this problem in | |
707 | -time for this paper. We are currently testing the | |
708 | -following approach: | |
709 | -\begin{itemize} | |
710 | -\item \LuaTeX-ja has a list of character codes which will always be treated as | |
711 | - alphabetic characters in math mode. Considering 8-bit TFMs for | |
712 | - math symbols, this list includes natural numbers between |"80| and | |
713 | - |"FF| by default. | |
714 | -\item Internal commands defined in the \emph{unicode-math} | |
715 | - package are redefined so that | |
716 | -the codes of characters which are mentioned in the \emph{unicode-math} | |
717 | - package are included in the list. | |
718 | -\end{itemize} | |
719 | - | |
720 | - | |
721 | -We would like to extend the treatment described in this subsection to 8-bit | |
722 | -font encodings, but we also leave this to future development. | |
723 | - | |
724 | -\section{Current status of development} | |
725 | -\label{sec:current_status} | |
726 | -At the moment, \LuaTeX-ja can be used under plain \TeX\ and under | |
727 | -\LaTeXe. Generally speaking, one only has to load |luatexja.sty|, by the | |
728 | -|\input| command or by |\usepackage| (in~\LaTeXe), to | |
729 | -typeset Japanese characters. We now look at each part in more detail. | |
730 | - | |
731 | -\subsection{`Engine extension'} | |
732 | -The lowest part of \LuaTeX-ja corresponds to the \pTeX\ extension as | |
733 | -\emph{an engine extension of \TeX}. We, the project members, think that | |
734 | -this part is almost done. There is one more feature of \LuaTeX-ja which | |
735 | -we are going to explain: | |
736 | - | |
737 | -\begin{description} | |
738 | -\item[Shifting baseline] | |
739 | -In order to make a match between Japanese fonts and alphabetic fonts, | |
740 | - sometimes shifting the baseline of alphabetic characters may | |
741 | - be needed. \pTeX\ has a dimension |\ybaselineshift|, which | |
742 | - corresponds to the amount of shifting down the baseline of alphabetic | |
743 | - characters. This is useful for Japanese-based documents, but | |
744 | - not for documents mainly in languages with alphabetic | |
745 | - characters. | |
746 | - | |
747 | -Hence, \LuaTeX-ja extends \pTeX's |\ybaselineshift| to Japanese | |
748 | - characters. Namely, \LuaTeX-ja offers two parameters, | |
749 | - \textsf{yjabaselineshift} and \textsf{yalbaselineshift}, for the | |
750 | - amount of shifting the baseline of Japanese characters and | |
751 | - that of alphabetic characters, respectively. | |
752 | -\begin{figure} | |
753 | -\begin{center} | |
754 | -\fontsize{40}{40}\selectfont\fboxsep0mm | |
755 | -\vrule width 0.9\textwidth height0.4pt depth0.4pt\kern-0.9\textwidth | |
756 | -\hbox to 0.9\linewidth{% | |
757 | -\hfil | |
758 | -\raise-10pt\imagfm{\jstrut 漢}% | |
759 | -\raise-10pt\imagfm{\jstrut 字}\hskip.25\zw% | |
760 | -\imagfm{p}% | |
761 | -\imagfm{h}% | |
762 | -\hfil\hfil | |
763 | -\imagfm{\jstrut 漢}% | |
764 | -\imagfm{\jstrut 字}\hskip.25\zw% | |
765 | -\raise-10pt\imagfm{p}% | |
766 | -\raise-10pt\imagfm{h}% | |
767 | -\hfil | |
768 | -} | |
769 | -\end{center} | |
770 | - | |
771 | -\caption{First example of shifting baseline.} | |
772 | -\label{fig-bls} | |
773 | -\end{figure} | |
774 | - | |
775 | -\begin{figure} | |
776 | -\begin{center} | |
777 | -\fontsize{30}{30}\selectfont\fboxsep0mm | |
778 | -\vrule width 0.9\textwidth height0.4pt depth0.4pt\kern-0.9\textwidth | |
779 | -\hbox to 0.9\linewidth{% | |
780 | -\hfil | |
781 | -\imagfm{a}% | |
782 | -\imagfm{b}\hskip.25\zw% | |
783 | -\imagfm{\jstrut 本}% | |
784 | -\imagfm{\jstrut 文}\hskip.33333\zw% | |
785 | -\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut\inhibitglue (}% | |
786 | -\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut 注}% | |
787 | -\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut 釈}\hskip.1666667\zw% | |
788 | -\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont c}% | |
789 | -\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont o}% | |
790 | -\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont m}% | |
791 | -\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont m}% | |
792 | -\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont e}% | |
793 | -\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont n}% | |
794 | -\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont t}% | |
795 | -\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut )\inhibitglue}% | |
796 | -\hskip.33333\zw% | |
797 | -\imagfm{\jstrut 本}% | |
798 | -\imagfm{\jstrut 文}% | |
799 | -\hfil | |
800 | -} | |
801 | -\end{center} | |
802 | - | |
803 | -\caption{Second example of shifting baseline.} | |
804 | -\label{fig-small} | |
805 | -\end{figure} | |
806 | - | |
807 | -An example output is shown in Figure~\ref{fig-bls}. The left half is the | |
808 | - output when \textsf{yjabaselineshift} is positive, hence the | |
809 | - baseline of Japanese characters is shifted down. On the other | |
810 | - hand, the right half is the output when | |
811 | - \textsf{yalbaselineshift} is positive, hence the baseline of | |
812 | - alphabetic characters is shifted down. Figure~\ref{fig-small} | |
813 | - shows an interesting use of these parameters. | |
814 | - | |
815 | -\end{description} | |
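 | - | |
 | -As a sketch of how the two parameters might be set, each line below is meant | |
 | -to be used on its own; the amount |10pt| is only an illustrative value | |
 | -(the one used for drawing Figure~\ref{fig-bls}), not a recommended setting. | |
 | -\begin{quote} | |
 | -\begin{verbatim} | |
 | -\ltjsetparameter{yjabaselineshift=10pt} % lower Japanese characters | |
 | -\ltjsetparameter{yalbaselineshift=10pt} % lower alphabetic characters | |
 | -\end{verbatim} | |
 | -\end{quote} | |
 | - | |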
816 | -Note that \LuaTeX-ja doesn't support vertical typesetting, \emph{tategaki}, for now. | |
817 | - | |
818 | -\subsection{Patches for plain \TeX\ and \LaTeXe} | |
819 | -\pTeX\ comes with a patch for plain \TeX, namely |ptex.tex|, a patch for the \LaTeXe\ | |
820 | -macros (this patch and \LaTeXe\ together constitute \emph{p\LaTeXe}), and | |
821 | -|kinsoku.tex|, which contains the default settings of \emph{kinsoku | |
822 | -shori}, the Japanese line-breaking rules. We ported them to \LuaTeX-ja, except | |
823 | -for the code related to vertical typesetting, because \LuaTeX-ja doesn't | |
824 | -support vertical typesetting yet. We remark on one point related to the | |
825 | -porting: | |
826 | -\begin{description} | |
827 | - | |
828 | -\item[Behavior of\/ {\tt\char92fontfamily\/}] | |
829 | -The control sequence |\fontfamily| in p\LaTeXe\ changes the current alphabetic | |
830 | - font family and/or the current Japanese font family, | |
831 | - depending on the argument. More concretely, | |
832 | - |\fontfamily{|$\langle\hbox{\it arg\/}\rangle$|}| changes the | |
833 | - current alphabetic font family to $\langle\hbox{\it | |
834 | - arg\/}\rangle$, if and only if one of the following | |
835 | - conditions is satisfied: | |
836 | -\begin{itemize} | |
837 | -\item An alphabetic font family named $\langle\hbox{\it arg\/}\rangle$ in | |
838 | - \emph{some} alphabetic encoding is already defined in the document. | |
839 | -\item There exists an alphabetic encoding $\langle\hbox{\it | |
840 | - enc\/}\rangle$ already defined in the document such that a font | |
841 | - definition file $\langle\hbox{\it enc\/}\rangle\langle\hbox{\it | |
842 | - arg\/}\rangle$|.fd| (all lowercase) exists. | |
843 | -\end{itemize} | |
844 | -The same criterion is used for changing Japanese font family. | |
845 | - | |
846 | -For this behavior to work well, a list of all (alphabetic) encodings already | |
847 | - defined in the document is needed. However, since \LuaTeX-ja | |
848 | - is loaded as a package, \LuaTeX-ja cannot obtain this list. | |
849 | - Hence \LuaTeX-ja adopted a different approach, namely | |
850 | - |\fontfamily{|$\langle\hbox{\it arg\/}\rangle$|}| changes the | |
851 | - current alphabetic font family to $\langle\hbox{\it | |
852 | - arg\/}\rangle$, if and only if one of the following conditions is satisfied: | |
853 | -\begin{itemize} | |
854 | -\item An alphabetic font family named $\langle\hbox{\it arg\/}\rangle$ | |
855 | - in the current alphabetic encoding $\langle\hbox{\it | |
856 | - enc\/}\rangle$ is already defined in the document. | |
857 | -\item A font definition file $\langle\hbox{\it enc\/}\rangle\langle\hbox{\it | |
858 | - arg\/}\rangle$|.fd| (all lowercase) exists. | |
859 | -\end{itemize} | |
860 | - | |
861 | - | |
862 | -\end{description} | |
863 | - | |
864 | - | |
865 | - | |
866 | -\subsection{Classes for Japanese documents} | |
867 | -To produce `high-quality' Japanese documents, we need not only correct | |
868 | -placement of Japanese characters, but also class files for | |
869 | -Japanese documents. Two major families of classes are widely used in Japan: | |
870 | -\emph{jclasses}, which is distributed with the official p\LaTeXe\ macros, | |
871 | -and \emph{jsclasses}. At present, \LuaTeX-ja | |
872 | -simply contains their counterparts: \emph{ltjclasses} and | |
873 | -\emph{ltjsclasses}. However, the policy on classes has not been determined | |
874 | -yet, and we hope to have another family of classes which are useful for | |
875 | -commercial printing. In the author's opinion, \emph{ltjclasses} | |
876 | -should remain as an example of porting class files for \pTeX\ to | |
877 | -\LuaTeX-ja. | |
878 | - | |
879 | -\subsection{Patches for packages} | |
880 | -Apart from patches for the \LaTeXe~kernel and classes for Japanese | |
881 | -documents, we need to make patches for several packages. At present, | |
882 | -we have considered the following packages, and made patches or ports for | |
883 | -the first two. | |
884 | - | |
885 | -\begin{description} | |
886 | -\item[The \emph{fontspec} package] The \emph{fontspec} package is built | |
887 | - on NFSS2, hence control sequences offered by the | |
888 | - \emph{fontspec} package, such as |\setmainfont|, are only | |
889 | - effective for alphabetic fonts if \LuaTeX-ja is loaded. | |
890 | - \texttt{luatexja-\penalty0fontspec.sty} (not automatically | |
891 | - loaded) offers their counterparts for Japanese fonts, with an | |
892 | - additional `j' in the names of the control sequences, such as | |
893 | - |\setmainjfont|. As described in | |
894 | - Subsection~\ref{ssec-unichar}, it also includes a patch for | |
895 | - control sequences producing Unicode characters. | |
896 | - | |
897 | -\item[The \emph{otf} package] | |
898 | -This package is widely used in \pTeX\ for typesetting characters which are | |
899 | -not in JIS~X~0208, and for using more than one weight in \emph{mincho} | |
900 | -and \emph{gothic} font families. Therefore \LuaTeX-ja supports the features | |
901 | -of the \emph{otf} package when \texttt{luatexja-\penalty0otf.sty} is loaded | |
902 | - manually. Note that characters produced by |\UTF{xxxx}| and | |
903 | - |\CID{xxxx}| are not appended to the current list as a | |
904 | - \emph{glyph\_node}, to avoid callbacks by the | |
905 | - \emph{luaotfload} package. We have another remark: |\CID| | |
906 | - does not work with TrueType fonts, since |\CID| uses the | |
907 | - conversion table between CID and the glyph order of the | |
908 | - current Japanese font. | |
909 | - | |
910 | -\item[The \emph{listings} package] | |
911 | -Users of \pTeX\ know that there is a patch, |jlisting.sty|, for | |
912 | - the \emph{listings} package, to use Japanese characters in | |
913 | - the |lstlisting| environment. Generally speaking, it can also | |
914 | - be used in \LuaTeX-ja. However, it seems that a | |
915 | - Japanese character after a space is not processed | |
916 | - by the \emph{listings} package; this is inconvenient when we | |
917 | - use the \emph{showexpl} package. | |
918 | - | |
919 | -There is another way to use characters above 256 with the | |
920 | - \emph{listings} package (described in~\cite{apl}). However, | |
921 | - this method is not suitable for Japanese, since the number of | |
922 | - Japanese characters is very large. We hope that the | |
923 | - \emph{listings} package will be able to handle all characters above | |
924 | - 256 without any patch, in the future. | |
925 | - | |
926 | - | |
927 | -\end{description} | |
928 | - | |
929 | - | |
930 | - | |
931 | -\section{Implementation} | |
932 | -\label{sec:implementation} | |
933 | -\subsection{Handling of Japanese fonts} | |
934 | -In \pTeX, there are three slots for maintaining current fonts, namely | |
935 | -|\font| for alphabetic fonts, |\jfont| for Japanese fonts (in horizontal | |
936 | -direction) and |\tfont| for Japanese fonts (in vertical direction). With | |
937 | -these slots, we can manage the current font for alphabetic characters | |
938 | -and that for Japanese characters separately in \pTeX. However, \LuaTeX\ | |
939 | -has only one slot for maintaining the current font, like \TeX82. This | |
940 | -situation leads to a problem: how can we maintain the `current Japanese | |
941 | -font'? | |
942 | - | |
943 | -There are three approaches to this problem. One approach is to make a | |
944 | -mapping table from alphabetic fonts to corresponding Japanese fonts | |
945 | -(here we don't assume that NFSS2 is available). Another approach is | |
946 | -that we always use composite fonts with alphabetic fonts and Japanese | |
947 | -fonts. The third approach is that the information of the current | |
948 | -Japanese font is stored in an attribute. We adopted the third approach, | |
949 | -since \LuaTeX-ja is strongly influenced by \pTeX\ as we noted in | |
950 | -Subsection~\ref{ssec-pol}. | |
951 | - | |
952 | -As in Figure~\ref{fig-jfdef}, \LuaTeX-ja uses |\jfont| for defining | |
953 | -Japanese fonts, as in \pTeX. However, because the information of the current | |
954 | -Japanese font is stored in an attribute, control sequences defined by | |
955 | -|\jfont| (e.g.,~|\foo| and |\bar| in Figure~\ref{fig-jfdef}) do | |
956 | -not represent a font in the sense of \TeX82. In other words, each of | |
957 | -these control sequences is just an assignment to an attribute, therefore | |
958 | -they cannot be an argument of |\the|, |\fontname|, or |\textfont|. | |
959 | - | |
960 | - | |
961 | -Callbacks by the \emph{luaotfload} package, e.g.,~replacement of glyphs | |
962 | -according to OpenType font features, are performed just after `Examination of | |
963 | -stack level' (see Subsections | |
964 | -\ref{ssec-over}~and~\ref{ssec-stack}). Also note that calculation of | |
965 | -character classes for each Japanese character is done \emph{after} | |
966 | -these callbacks for now. | |
967 | - | |
968 | -\subsection{Stack management} | |
969 | -\label{ssec-stack} | |
970 | - | |
971 | -As we noted in Subsection~\ref{ssec-csname}, parameters whose values | |
972 | -at the end of a horizontal box or of a paragraph are valid in the | |
973 | -whole box or paragraph, such as \emph{kanjiskip}, cannot be implemented | |
974 | -by internal integers or registers of other types in \TeX. We explain why | |
975 | -in this subsection. | |
976 | - | |
977 | -\begin{figure} | |
978 | -\begin{lstlisting} | |
979 | -void package(int c) | |
980 | -{ | |
981 | - ... | |
982 | - d = box_max_depth; | |
983 | - unsave(); | |
984 | - save_ptr -= 4; | |
985 | - if (cur_list.mode_field == -hmode) { | |
986 | - cur_box = filtered_hpack(cur_list.head_field, | |
987 | - cur_list.tail_field, saved_value(1), | |
988 | - saved_level(1), grp, saved_level(2)); | |
989 | - subtype(cur_box) = HLIST_SUBTYPE_HBOX; | |
990 | - } else { | |
991 | -\end{lstlisting} | |
992 | -\caption{An extract of a CWEB-source \texttt{tex/packaging.w} of \LuaTeX.} | |
993 | -\label{fig-ltsrc} | |
994 | -\end{figure} | |
995 | - | |
996 | -Figure~\ref{fig-ltsrc} is an extract of a CWEB-source | |
997 | -\texttt{tex/packaging.w} of \LuaTeX\ (SVN revision 4358). This function | |
998 | -is called just when an explicit |\hbox{...}| or |\vbox{...}| ends, and | |
999 | -the function |filtered_hpack()| is where the |hpack_filter| callback and then the | |
1000 | -actual `hpack' process are performed. Notice that the |unsave()| | |
1001 | -function is called before |filtered_hpack()|. This is the problem: | |
1002 | -because of |unsave()|, we can retrieve only the values of registers | |
1003 | -\emph{outside} the box, even in the |hpack_filter| callback. | |
1004 | - | |
1005 | -To cope with this problem, \LuaTeX-ja has its own stack system, based on | |
1006 | -the Lua code in~\cite{stack-mail}. Furthermore, \emph{whatsit} nodes whose | |
1007 | -\emph{user\_id} is 30112 (\emph{stack\_node}, for short) will be | |
1008 | -appended to the current horizontal list each time the current stack | |
1009 | -level is incremented, and their values are the values of | |
1010 | -|\currentgrouplevel| at that time. In the beginning of the |hpack_filter| | |
1011 | -callback, the list in question is traversed to determine whether the | |
1012 | -stack level at the end of the list and that outside the box coincide. | |
1013 | - | |
1014 | -Let $x$ be the value of |\currentgrouplevel|, and $y$ be the current | |
1015 | -stack level, both inside the |hpack_filter| callback, i.e.,~outside a | |
1016 | -horizontal box. Consider a list which represents the content of the box, | |
1017 | -then we have: | |
1018 | -\begin{itemize} | |
1019 | -\item A \emph{stack\_node} whose value is $x+1$ (because all materials | |
1020 | - in the box are included in a group |\hbox{...}|, the value of | |
1021 | - |\currentgrouplevel| inside the box is at least $x+1$) in the list | |
1022 | - corresponds to an assignment related to the stack system at the | |
1023 | - top level of the list, like | |
1024 | -\begin{quote} | |
1025 | -\begin{verbatim} | |
1026 | -\hbox{...(assignment)...} | |
1027 | -\end{verbatim} | |
1028 | -\end{quote} | |
1029 | -In this case, the current stack level is incremented to $y+1$ after the assignment. | |
1030 | -\item A \emph{stack\_node} whose value is more than $x+1$ in the list corresponds | |
1031 | -to an assignment inside another group contained in the box. For example, | |
1032 | - the following input creates | |
1033 | -a \emph{stack\_node} whose value is $x+3=(x+1)+2$: | |
1034 | -\begin{quote} | |
1035 | -\begin{verbatim} | |
1036 | -\hbox{...{...{...(assignment)}...}...} | |
1037 | -\end{verbatim} | |
1038 | -\end{quote} | |
1039 | -\end{itemize} | |
1040 | -Thus, we can conclude that the stack level at the end of the list is | |
1041 | -$y+1$, if and only if there is a \emph{stack\_node} whose value is | |
1042 | -$x+1$. Otherwise, the stack level is just $y$. | |
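 | - | |
 | -As a small worked illustration (a sketch, assuming that both boxes are built | |
 | -at the outermost level so that $x=0$, and that \textsf{kanjiskip} is set | |
 | -through the stack system), consider the following two inputs: | |
 | -\begin{quote} | |
 | -\begin{verbatim} | |
 | -\hbox{あ\ltjsetparameter{kanjiskip=1pt}い} | |
 | -\hbox{あ{\ltjsetparameter{kanjiskip=1pt}い}う} | |
 | -\end{verbatim} | |
 | -\end{quote} | |
 | -In the first box the assignment creates a \emph{stack\_node} whose value is | |
 | -$1=x+1$, so the stack level at the end of the list is $y+1$ and the new value | |
 | -is the one valid for the whole box. In the second box the value of the | |
 | -\emph{stack\_node} is $2>x+1$, so the stack level at the end of the list | |
 | -is just $y$ and the value from outside the box is used. | |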
1043 | - | |
1044 | -\subsection{Adjustment of the position of Japanese characters} | |
1045 | -\label{ssec-width} | |
1046 | - | |
1047 | -The size of a glyph specified in a metric and that of a real font | |
1048 | -usually differ. For example, the letter `\inhibitglue【' is half-width | |
1049 | -in |jfm-ujis.lua| or |jis.tfm|, while this letter is full-width like `【' | |
1050 | -in most TrueType fonts used in Japanese typesetting, such as | |
1051 | -IPA~Mincho. Hence adjustment of the position of such glyphs is | |
1052 | -needed. In the context of \pTeX, this process was performed using virtual fonts. | |
1053 | - | |
1054 | -On the other hand, \LuaTeX-ja does the adjustment by encapsulating a glyph | |
1055 | -in a horizontal box. There are two main reasons why we adopted this | |
1056 | -method: one is that we feared the Lua code for coexisting with callbacks by | |
1057 | -the |luaotfload| package would be large if we used virtual fonts, and the | |
1058 | -other is to cope with shifting of the baseline of characters at the | |
1059 | -same time. | |
1060 | - | |
1061 | -\begin{figure} | |
1062 | -\begin{center}\unitlength=9pt\small | |
1063 | -\begin{picture}(15,12)(-1,-3) | |
1064 | - | |
1065 | -\color{grayx}% real glyph | |
1066 | -\put(-1,-1.5){\vrule width 6\unitlength height 7\unitlength depth 2.5\unitlength} | |
1067 | - | |
1068 | -\color{black}% real glyph :step1 | |
1069 | -\thicklines | |
1070 | -\put(-1,-1.5){\line(0,1){7}\line(0,-1){2.5}} | |
1071 | -\put(5,-1.5){\line(0,1){7}\line(0,-1){2.5}} | |
1072 | -\put(-1,5.5){\line(1,0){6}} | |
1073 | -\put(-1,-4){\line(1,0){6}} | |
1074 | -\put(-1,0){\makebox(0,0)[r]{\strut$R$\,}} | |
1075 | - | |
1076 | -\thicklines | |
1077 | -\put(0,0){\vector(0,1){9}\line(0,-1){3}\vector(1,0){12}} | |
1078 | -\put(12,9){\makebox(0,0)[rt]{\strut$M$\,}} | |
1079 | -\put(12,0){\line(0,1){9}\vector(0,-1){3}} | |
1080 | -\put(0,9){\line(1,0){12}} | |
1081 | -\put(0,-3){\line(1,0){12}} | |
1082 | -\put(0.2,4.5){\makebox(0,0)[l]{\texttt{height}}} | |
1083 | -\put(12.2,-1.5){\makebox(0,0)[l]{\texttt{depth}}} | |
1084 | -\put(6,0.2){\makebox(0,0)[b]{\texttt{width}}} | |
1085 | - | |
1086 | -\thicklines | |
1087 | -\put(3,0){\line(0,1){7}\line(0,-1){2.5}\line(1,0){6}} | |
1088 | -\put(9,0){\line(0,1){7}\line(0,-1){2.5}} | |
1089 | -\put(3,7){\line(1,0){6}} | |
1090 | -\put(3,-2.5){\line(1,0){6}} | |
1091 | -\newsavebox{\eqdist} | |
1092 | -\savebox{\eqdist}(0,0)[c]{% | |
1093 | - \thinlines | |
1094 | - \put(-0.08,0.2){\line(0,-1){0.4}}% | |
1095 | - \put(0.08,0.2){\line(0,-1){0.4}}} | |
1096 | -\put(1.5,0){\usebox{\eqdist}} | |
1097 | -\put(10.5,0){\usebox{\eqdist}} | |
1098 | - | |
1099 | -\thicklines | |
1100 | -\put(3,-1.5){\vector(-1,0){4}} | |
1101 | -\put(1,-1.7){\makebox(0,0)[t]{\texttt{left}}} | |
1102 | -\put(3,0){\vector(0,-1){1.5}} | |
1103 | -\put(3.2,-0.75){\makebox(0,0)[l]{\texttt{down}}} | |
1104 | -\end{picture} | |
1105 | -\end{center} | |
1106 | -\caption{The position of the `real' glyph.} | |
1107 | -\label{fig-pos} | |
1108 | -\end{figure} | |
1109 | - | |
1110 | -Figure~\ref{fig-pos} shows the adjustment process. A large square $M$ is | |
1111 | -the imaginary body specified in the metric, and a vertical | |
1112 | -rectangle is the imaginary body of a real glyph. First, the real glyph | |
1113 | -is aligned with respect to the width of $M$. In the figure, the real | |
1114 | -glyph is aligned `middle'; this setting is useful for the full-width | |
1115 | -middle dot `・'. We have other settings, `left' and `right'. | |
1116 | -After that, it is shifted according to the value of |left| and |down|, | |
1117 | -which are specified in the metric, too. The final position of the real glyph | |
1118 | -is shown by the gray rectangle~$R$. If the amount of shifting the baseline is | |
1119 | -not zero, $M$ (and hence the real glyph) is shifted by that amount. | |
1120 | - | |
1121 | -We would like to remark briefly on the vertical position of a real | |
1122 | -glyph. A JFM (or a metric used in \LuaTeX-ja) and a real font used for | |
1123 | -it may have different height or depth. In that case, it may look better | |
1124 | -if the real glyph is shifted vertically to match the height-depth ratio | |
1125 | -specified in the metric, but no vertical adjustment other than the | |
1126 | -adjustment by the |down| value is performed in the present | |
1127 | -implementation of \LuaTeX-ja. This situation is carefully studied by | |
1128 | -Otobe~\cite{min10}. The policy on this problem has not been determined | |
1129 | -yet; however, we would like to offer several solutions in future | |
1130 | -development. | |
1131 | - | |
1132 | -\section{Conclusion} | |
1133 | -We have discussed our \LuaTeX-ja package, which is strongly influenced | |
1134 | -by \pTeX. For now, it can be used experimentally; however, | |
1135 | -many refinements are needed for regular use. The author hopes | |
1136 | -that this paper and the \LuaTeX-ja project contribute to the typesetting of Japanese, | |
1137 | -and possibly other Asian languages, under \LuaTeX. | |
1138 | - | |
1139 | -\section*{Acknowledgements} | |
1140 | -The author would like to thank Ken Nakano and Hideaki Togashi for their | |
1141 | -development of ASCII \pTeX. The author is very grateful to Haruhiko | |
1142 | -Okumura for his leadership in the Japanese \TeX\ community. The author | |
1143 | -is also very grateful to the members of the \LuaTeX-ja project team for their | |
1144 | -valuable cooperation in development. | |
1145 | - | |
1146 | -%%% The style of the bibliography is `amsplain'. | |
1147 | -\providecommand{\bysame}{\leavevmode\hbox to3em{\hrulefill}\thinspace} | |
1148 | -\providecommand{\href}[2]{#2} | |
1149 | -\begin{thebibliography}{99} | |
1150 | - | |
1151 | -\bibitem{aj16} | |
1152 | -Adobe Systems Incorporated, \emph{Adobe-Japan1-6 Character Collection | |
1153 | - for CID-Keyed Fonts}, Technical Note~\#5078, 2004. | |
1154 | -\url{http://partners.adobe.com/public/developer/en/font/5078.Adobe-Japan1-6.pdf} | |
1155 | - | |
1156 | -\bibitem{ptex} | |
1157 | -ASCII MEDIA WORKS,アスキー日本語\TeX\ (\pTeX). \url{http://ascii.asciimw.jp/pb/ptex/} | |
1158 | - | |
1159 | -\bibitem{apl} | |
1160 | -John Baker, \emph{Typesetting UTF8 APL code with the \LaTeX\ lstlisting package}. | |
1161 | -\url{http://bakerjd99.wordpress.com/2011/08/15/} | |
1162 | - | |
1163 | -\bibitem{omega} | |
1164 | -Jin-Hwan~Cho and Haruhiko Okumura, \emph{Typesetting CJK Languages with Omega}, | |
1165 | -\TeX, XML, and Digital Typography, Lecture Notes in Computer Science, vol.~3130, | |
1166 | -Springer, 2004, 139--148. | |
1167 | - | |
1168 | -\bibitem{joylua} | |
1169 | -Yannis Haralambous. \emph{The Joy of \LuaTeX}. \url{http://luatex.bluwiki.com/} | |
1170 | - | |
1171 | -\bibitem{jisx4051} | |
1172 | -Japanese Industrial Standards Committee. \emph{JIS~X~4051: Formatting | |
1173 | - rules for Japanese documents}, 1993, 1995, 2004. | |
1174 | - | |
1175 | -\bibitem{eptex} | |
1176 | -北川弘典,$\varepsilon$-\pTeX についてのwiki. | |
1177 | -\url{http://sourceforge.jp/projects/eptex/wiki/FrontPage} | |
1178 | - | |
1179 | -\bibitem{luaums} | |
1180 | -北川弘典,\LuaTeX で日本語. | |
1181 | -\url{http://oku.edu.mie-u.ac.jp/tex/mod/forum/discuss.php?d=378} | |
1182 | - | |
1183 | -\bibitem{luatexref} | |
1184 | -\LuaTeX\ development team, \emph{The \LuaTeX\ reference}. | |
1185 | -\url{http://www.luatex.org/svn/trunk/manual/luatexref-t.pdf} (snapshot of SVN trunk) | |
1186 | - | |
1187 | -\bibitem{man} | |
1188 | -\LuaTeX-ja project team, \emph{The \LuaTeX-ja package}. | |
1189 | -Not completed for now. Available at |doc/man-en.pdf| (in English) or | |
1190 | - |doc/man-ja.pdf| (in Japanese) | |
1191 | -in the Git repository. | |
1192 | - | |
1193 | -\bibitem{luajp-test} | |
1194 | -香田温人,\LuaTeX と日本語. | |
1195 | -\url{http://www1.pm.tokushima-u.ac.jp/~kohda/tex/luatex-old.html} | |
1196 | - | |
1197 | -\bibitem{luajalayout} | |
1198 | -前田一貴,luajalayout パッケージ---Lua\LaTeX によ | |
1199 | - る日本語組版---. | |
1200 | -\url{http://www-is.amp.i.kyoto-u.ac.jp/lab/kmaeda/lualatex/luajalayout/} | |
1201 | - | |
1202 | -\bibitem{jsclasses} | |
1203 | -奥村晴彦,p\LaTeXe 新ドキュメントクラス. | |
1204 | -\url{http://oku.edu.mie-u.ac.jp/~okumura/jsclasses/} | |
1205 | - | |
1206 | -\bibitem{ptexjp} | |
1207 | -Haruhiko Okumura, \emph{\pTeX\ and Japanese Typesetting}, | |
1208 | - The Asian Journal of \TeX\ \textbf{2}~(2008), 43--51. | |
1209 | - | |
1210 | -\bibitem{min10} | |
1211 | -乙部厳己,min10フォントについて. | |
1212 | -\url{http://argent.shinshu-u.ac.jp/~otobe/tex/files/min10.pdf} | |
1213 | - | |
1214 | -\bibitem{otf} | |
1215 | -齋藤修三郎,Open Type Font用VF. | |
1216 | -\url{http://psitau.kitunebi.com/otf.html} | |
1217 | - | |
1218 | -\bibitem{stack-mail} | |
1219 | -Jonathan Sauer, \emph{[Dev-luatex] tex.currentgrouplevel}. | |
1220 | -\url{http://www.ntg.nl/pipermail/dev-luatex/2008-August/001765.html} | |
1221 | - | |
1222 | -\bibitem{uptex} | |
1223 | -Takuji Tanaka, \emph{u\pTeX, up\LaTeX---unicode version of \pTeX, p\LaTeX}. | |
1224 | -\url{http://homepage3.nifty.com/ttk/comp/tex/uptex_en.html} | |
1225 | - | |
1226 | -\bibitem{ptexenc} | |
1227 | -Nobuyuki Tsuchimura, \emph{Development of a Japanese \TeX\ Distribution~`ptetex3'}, | |
1228 | -Computer Software\ \textbf{24} (2007), no.~4, 40--50, (in Japanese). | |
1229 | - | |
1230 | -\bibitem{w3c} | |
1231 | -W3C Working Group, \emph{Requirements for Japanese Text Layout}. | |
1232 | -\url{http://www.w3.org/TR/jlreq/} | |
1233 | -\end{thebibliography} | |
1234 | - | |
1235 | -\end{document} | |
1 | +%#!lualatex ajt-devel-ltja | |
2 | +\documentclass{ajt} | |
3 | + | |
4 | +%%% Packages used in this paper | |
5 | + | |
6 | +%%% Font setting for \LuaTeX; this is extracted from ajt.cls | |
7 | +\makeatletter | |
8 | + \if@print | |
9 | + \RequirePackage{fontspec,xunicode} | |
10 | + \RequirePackage{luatextra} | |
11 | + \setmainfont[Mapping=tex-text]{Palatino LT Std} | |
12 | + \setsansfont[Mapping=tex-text]{Optima LT Std} | |
13 | + \else | |
14 | + \RequirePackage{fontspec,luatextra} | |
15 | + \setmainfont[Mapping=tex-text]{TeX Gyre Pagella} % \simeq Palatino | |
16 | + \fi | |
17 | + | |
18 | +%%% LuaTeX-ja | |
19 | +\usepackage{luatexja,luatexja-fontspec} | |
20 | +\ltjsetparameter{jacharrange={-3,-8}} | |
21 | +\DeclareFontShape{JY3}{mc}{m}{n}{<-> s*[0.92489] file:ipam.ttf:jfm=ujis}{} | |
22 | +\DeclareFontShape{JY3}{gt}{m}{n}{<-> s*[0.92489] file:ipag.ttf:jfm=ujis}{} | |
23 | +% quick hack: monospaced Japanese font by \ttfamily | |
24 | +\DeclareKanjiFamily{JY3}{\ttdefault}{}{} | |
25 | +\DeclareFontShape{JY3}{\ttdefault}{m}{n}{<-> s*[0.92489] file:ipag.ttf:jfm=mono}{} | |
26 | + | |
27 | + | |
28 | +%%% LTXexample environment | |
29 | +\usepackage{showexpl,lltjlisting} | |
30 | +\lstset{basicstyle=\ttfamily\small, width=0.3\textwidth, basewidth=.5em} | |
31 | + | |
32 | +%%% Verbatim environment | |
33 | +\usepackage{fancyvrb} | |
34 | +\CustomVerbatimEnvironment{code}{Verbatim}% | |
35 | +{numbers=left,xleftmargin=1.5em,baselinestretch=1.069,fontsize=\small} | |
36 | +\CustomVerbatimEnvironment{codewithoutnum}{Verbatim}% | |
37 | +{xleftmargin=1.5em,baselinestretch=1.069,fontsize=\small} | |
38 | +\CustomVerbatimEnvironment{codewithoutnumsmall}{Verbatim}% | |
39 | +{xleftmargin=1.5em,baselinestretch=1.0,fontsize=\footnotesize} | |
40 | +\DefineShortVerb{\|} | |
41 | + | |
42 | +%%% Others | |
43 | +\usepackage{mflogo,booktabs} | |
44 | +\definecolor{grayx}{gray}{0.85} | |
45 | +\hyphenation{ | |
46 | + kanjiskip | |
47 | + xkanjiskip | |
48 | +} | |
49 | + | |
50 | +%%% Mandatory article metadata %%% | |
51 | +\title{Development of \LuaTeX-ja package} | |
52 | +\author[北川 弘典]{Hironori Kitagawa} | |
53 | +\address{\LuaTeX-ja project team} | |
54 | +\email{h\_kitagawa2001@yahoo.co.jp} | |
55 | + | |
56 | +\keywords{\TeX, p\TeX, \LuaTeX, \LuaTeX-ja, Japanese} | |
57 | +\abstract{% | |
58 | +The \LuaTeX-ja package is a macro package for typesetting Japanese | |
59 | +documents under \LuaTeX. The package offers more typesetting | |
60 | +flexibility than \pTeX, a widely used Japanese extension of \TeX, | |
61 | +and corrects some unwanted features of \pTeX. | |
62 | +In this paper, we describe the specifications, the current status, and some | |
63 | +internal processing methods of \LuaTeX-ja. | |
64 | +} | |
65 | + | |
66 | +\newcommand{\parname}[1]{\textsf{#1}} | |
67 | +\newcommand{\jstrut}{\vrule width0pt height\cht depth\cdp} | |
68 | +\newcommand{\imagfm}[1]{\ifvmode\leavevmode\fi% | |
69 | + \hbox{\fboxsep=0pt\fbox{\setbox0=\hbox{#1}\copy0\kern-\wd0 | |
70 | + \smash{\vrule width \wd0 height 0.4pt depth0.4pt}}}} | |
71 | +\begin{document} | |
72 | + | |
73 | +%%% Do not forget to start with \maketitle! | |
74 | +\maketitle | |
75 | + | |
76 | +\section{Introduction} | |
77 | +\subsection{History} | |
78 | +To typeset Japanese documents with \TeX, ASCII \pTeX~\cite{ptex} has | |
79 | +been widely used in Japan. There are other methods (for example, using | |
80 | +Omega and OTP~\cite{omega}, or the CJK package), but | |
81 | +these alternatives never became mainstream. The author thinks | |
82 | +that this is because \pTeX\ enables us to produce high-quality documents | |
83 | +(e.g.,~supporting vertical typesetting), and because \pTeX\ appeared | |
84 | +earlier than the alternatives described above. | |
85 | + | |
86 | +However, \pTeX\ has fallen behind extensions of \TeX\ such | |
87 | +as \eTeX\ and \pdfTeX, and behind the spread of UTF-8 encoding. In recent | |
88 | +years, the situation has improved through the development of | |
89 | +|ptexenc|~\cite{ptexenc} by Nobuyuki Tsuchimura (\hbox{土村展之}), | |
90 | +$\varepsilon$-\pTeX~\cite{eptex} by the author,~and u\pTeX~\cite{uptex} | |
91 | +by Takuji Tanaka (田中琢爾). However, continuing this approach, namely | |
92 | +developing an engine extension localized for Japanese, is not wise: it | |
93 | +requires a lot of work for \emph{each} engine. In addition, with | |
94 | +\LuaTeX\ the need for an engine extension becomes smaller, | |
95 | +because \LuaTeX\ can hook into \TeX's internal processes by using | |
96 | +Lua callbacks. | |
97 | + | |
98 | + | |
99 | +There were several experimental attempts to typeset | |
100 | +Japanese documents with \LuaTeX\ before. Here we cite three examples: | |
101 | +\begin{itemize} | |
102 | +\item |luaums.sty|~\cite{luaums} developed by the author. This | |
103 | + experimental package is for creating a certain Japanese-based presentation | |
104 | + with \LuaTeX. | |
105 | +\item the \emph{luajalayout} package~\cite{luajalayout}, formerly known as the | |
106 | + \emph{jafontspec} package, by Kazuki Maeda (前田一貴). This package is based on | |
107 | + \LaTeXe\ and the \emph{fontspec} package. | |
108 | +\item the \emph{luajp-test} package~\cite{luajp-test}, a test package made by | |
109 | + Atsuhito Kohda (香田温人), based on articles on the web page~\cite{joylua}. | |
110 | +\end{itemize} | |
111 | +However, these packages are based on \LaTeXe, and offer little | |
112 | +control over the typesetting rules. It is also inefficient for several | |
113 | +people to develop similar packages separately. For these | |
114 | +reasons, development of the \LuaTeX-ja package was started by the author and | |
115 | +Kazuki Maeda. | |
116 | + | |
117 | +\subsection{Development policy of \LuaTeX-ja} | |
118 | +\label{ssec-pol} | |
119 | +The first aim of the \LuaTeX-ja project was to implement features (from the | |
120 | +`primitive' level) of \pTeX\ as macros under \LuaTeX, so \LuaTeX-ja is | |
121 | +strongly influenced by \pTeX. However, as development proceeded, some | |
122 | +technical/conceptual difficulties arose. Hence we changed the aim | |
123 | +of the project as follows: | |
124 | +\begin{itemize} | |
125 | +\item\emph{\LuaTeX-ja offers at least the same flexibility of | |
126 | + typesetting that p\TeX\ has.} | |
127 | + | |
128 | + We are not satisfied with merely being able to produce output conforming to | |
129 | + JIS~X~4051~\cite{jisx4051}, the Japanese Industrial Standard for | |
130 | + typesetting, or to a technical note~\cite{w3c} by W3C; | |
131 | + if one wants to produce nonconforming output for some reason, it | |
132 | + should be possible. | |
133 | +In this respect, the previous attempts at Japanese typesetting with \LuaTeX\ | |
134 | + cited in the previous subsection are inadequate. | |
135 | + | |
136 | +\pTeX\ offers some typesetting flexibility through changing internal | |
137 | + parameters such as |\kanjiskip| or |\prebreakpenalty|, and through using | |
138 | + custom JFM (Japanese TFM). Therefore we decided to include this | |
139 | + functionality in \LuaTeX-ja. | |
140 | + | |
141 | +\item\emph{\LuaTeX-ja isn't mere re-implementation or porting of \pTeX; | |
142 | + some (technically and/or conceptually) inconvenient features of | |
143 | + \pTeX\ are modified.} | |
144 | + | |
145 | + We describe this point in more detail in the next section. | |
146 | +\end{itemize} | |
147 | + | |
148 | + | |
149 | +\subsection{Overview of the processes} | |
150 | +\label{ssec-over} | |
151 | +We describe an outline of \LuaTeX-ja's process in order. | |
152 | + | |
153 | +\begin{itemize} | |
154 | +\item In the |process_input_buffer| callback: treatment of breaking | |
155 | + lines after a Japanese character (in Subsection~\ref{ssec-line}). | |
156 | + | |
157 | +\item In the |hyphenate| callback: font replacement. | |
158 | + | |
159 | +\LuaTeX-ja examines each \textit{glyph\_node}~$p$ in the horizontal list. If | |
160 | + the character represented by $p$ is considered a Japanese | |
161 | + character, the font used at $p$ is replaced by the value of | |
162 | + |\ltj@curjfnt|, an attribute for `the current Japanese font' | |
163 | + at~$p$. | |
164 | + | |
165 | +Furthermore, the subtype of $p$ is decreased by 1 to suppress | |
166 | + hyphenation around $p$ by \LuaTeX, because later processes of | |
167 | + \LuaTeX-ja take care of everything concerning Japanese characters. | |
168 | + | |
169 | +\item In |pre_linebreak_filter| and |hpack_filter| callbacks: | |
170 | + | |
171 | +\begin{enumerate} | |
172 | +\item \LuaTeX-ja has its own stack system, and the current horizontal | |
173 | + list is traversed in this stage to determine what the level of | |
174 | + \LuaTeX-ja's internal stack at the end of the list is. We will | |
175 | + discuss it in Subsection~\ref{ssec-stack}. | |
176 | + | |
177 | +\item In this stage, \LuaTeX-ja inserts glues/kerns for Japanese | |
178 | + typesetting in the list. This is the core routine of \LuaTeX-ja. | |
179 | + We will discuss it in Subsections | |
180 | + \ref{ssec-jglue}~and~\ref{ssec-jspec}. | |
181 | + | |
182 | +\item To match a metric with a real font, the positions of | |
183 | + (Japanese) glyphs are sometimes adjusted. | |
184 | + We will discuss it in Subsection~\ref{ssec-width}. | |
185 | +\end{enumerate} | |
186 | +\item In the |mlist_to_hlist| callback: treatment of Japanese characters | |
187 | + in math formulas. This stage is similar to adjustment of the | |
188 | + position of glyphs (see above), so we omit its description | |
189 | + from this paper. | |
190 | +\end{itemize} | |
191 | + | |
192 | +In this paper, an \emph{alphabetic character} means a non-Japanese | |
193 | +character. Similarly, we use the term \emph{alphabetic font} for the | |
194 | +counterpart of a Japanese font. | |
195 | + | |
196 | +\subsection{Contents of this paper} | |
197 | +Here we describe the contents of the rest of this paper briefly. In | |
198 | +Section~\ref{sec:differences_with_ptex}, we describe major differences | |
199 | +between \pTeX\ and \LuaTeX-ja. The next section, | |
200 | +Section~\ref{sec:distinction_of_characters}, concentrates on the | |
201 | +problem of how we distinguish between Japanese characters and alphabetic | |
202 | +characters. In Section~\ref{sec:current_status}, we show the current | |
203 | +development status of the package. Finally, in | |
204 | +Section~\ref{sec:implementation}, we describe some internal routines of | |
205 | +\LuaTeX-ja. | |
206 | + | |
207 | +\subsection{General information of the project} | |
208 | +The \LuaTeX-ja project is hosted on SourceForge.jp. The official wiki | |
209 | +is located at | |
210 | +\url{http://sourceforge.jp/projects/luatex-ja/wiki/}. There is | |
211 | +no stable version as of October 22, 2011; however, the development sources can be | |
212 | +obtained from the Git repository. Members of the project team are as follows | |
213 | +(in random order): Hironori Kitagawa, Kazuki Maeda, Takayuki Yato, | |
214 | +Yusuke Kuroki, Noriyuki Abe, Munehiro Yamamoto, Tomoaki Honda, | |
215 | +and~Shuzaburo Saito. | |
216 | + | |
217 | + | |
218 | +\section{Major differences with \pTeX} | |
219 | +\label{sec:differences_with_ptex} | |
220 | +In this section, we explain several major differences between \pTeX\ | |
221 | +and our \LuaTeX-ja. For general information on Japanese typesetting and the | |
222 | +overview of \pTeX, please see Okumura~\cite{ptexjp}. | |
223 | + | |
224 | + | |
225 | +\subsection{Names of control sequences} | |
226 | +\label{ssec-csname} Because \pTeX\ is an engine modification of Knuth's | |
227 | +original \TeX82 engine, some of the additional primitives take a form that is | |
228 | +very difficult to simulate with a macro. For example, an additional | |
229 | +primitive |\prebreakpenalty|$\langle\hbox{\it | |
230 | +char\_code}\rangle$|[=]|$\langle\hbox{\it penalty}\rangle$ in \pTeX\ | |
231 | +sets the amount of penalty inserted before a character whose code is | |
232 | +$\langle\hbox{\it char\_code}\rangle$ to $\langle\hbox{\it | |
233 | +penalty}\rangle$, and this form |\prebreakpenalty|$\langle\hbox{\it | |
234 | +char\_code}\rangle$ can also be used for retrieving the value. | |
235 | + | |
236 | +Moreover, for some internal parameters of \pTeX, the value at the end of a | |
237 | +horizontal box or of a paragraph is valid throughout the whole box or | |
238 | +paragraph. However, implementing these parameters in | |
239 | +\LuaTeX-ja is not so easy; we will discuss it in Subsection~\ref{ssec-stack}. | |
240 | + | |
241 | +Because of the two problems discussed above, the assignment and retrieval | |
242 | +of most parameters in \LuaTeX-ja are consolidated into the following | |
243 | +three control sequences: | |
244 | +\begin{itemize} | |
245 | +\item |\ltjsetparameter{|$\langle\hbox{\it | |
246 | + name}\rangle$|=|$\langle\hbox{\it value}\rangle$|,...}|: for local | |
247 | + assignment. | |
248 | +\item |\ltjglobalsetparameter|: for global assignment. Note that these two control | |
249 | + sequences obey the value of |\globaldefs| primitive. | |
250 | +\item |\ltjgetparameter{|$\langle\hbox{\it | |
251 | + name}\rangle$|}[{|$\langle\hbox{\it optional | |
252 | + argument}\rangle$|}]|: for retrieval. The returned value is always | |
253 | + a string. | |
254 | +\end{itemize} | |
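| + | |
| +As a minimal sketch of these control sequences in use (the parameter | |
| +values below are illustrative assumptions, not recommended settings): | |
| +\begin{verbatim} | |
| +% local assignment: default glue between two Japanese characters | |
| +\ltjsetparameter{kanjiskip=0pt plus 0.4pt minus 0.4pt} | |
| +% global assignment: mark character ranges 3 and 8 as non-Japanese | |
| +\ltjglobalsetparameter{jacharrange={-3,-8}} | |
| +% retrieval: the returned value is a string | |
| +\ltjgetparameter{kanjiskip} | |
| +\end{verbatim} | |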
255 | + | |
256 | +\subsection{Line-break after a Japanese character} | |
257 | +\label{ssec-line} | |
258 | + | |
259 | +Japanese texts can break lines almost everywhere, whereas | |
260 | +alphabetic texts can break lines only between words (or by | |
261 | +hyphenation). Hence, \pTeX's input processor is modified so that a | |
262 | +line-break after a Japanese character doesn't emit a space. However, | |
263 | +there is no way to customize the input processor of \LuaTeX, other than | |
264 | +to hack its CWEB source. All a macro package can do is to modify an input line | |
265 | +before \LuaTeX\ begins to process it, inside the |process_input_buffer| | |
266 | +callback. | |
267 | + | |
268 | +Hence, in \LuaTeX-ja, a comment letter (we reserve U+FFFFF for this | |
269 | +purpose) will be appended to an input line, if this line ends with a Japanese | |
270 | +character.\footnote{Strictly speaking, it also requires that the catcode | |
271 | +of the end-line character is 5~(\emph{end-of-line}). This condition is | |
272 | +useful under the verbatim environment.} One might jump to the conclusion | |
273 | +that the treatment of a line-break by \pTeX\ and that by \LuaTeX-ja are | |
274 | +exactly the same; however, they differ in that \LuaTeX-ja's | |
275 | +decision whether a comment letter will be appended to the line is made | |
276 | +\emph{before} the line is actually processed by \LuaTeX. | |
277 | + | |
278 | +Figure~\ref{fig-linebreak} shows an example of this situation; the | |
279 | +command at the first line marks most of Japanese characters as | |
280 | +`non-Japanese characters'. In other words, from that command onward, the | |
281 | +letter `あ' will be treated as an alphabetic character by | |
282 | +\LuaTeX-ja. One would then expect a space between `あ' and `y' in | |
283 | +the output, but the actual output in the figure has none. This is | |
284 | +because `あ' is still considered a Japanese character by \LuaTeX-ja | |
285 | +when \LuaTeX-ja decides whether U+FFFFF will be added to | |
286 | +input line~2. | |
287 | + | |
288 | +\begin{figure} | |
289 | +\begin{LTXexample} | |
290 | +\font\x=IPAMincho \x | |
291 | +\ltjsetparameter{jacharrange={-6}}xあ | |
292 | +y | |
293 | +\end{LTXexample} | |
294 | +\caption{A notable sample showing the treatment of a line-break after a | |
295 | +Japanese character.}\label{fig-linebreak} | |
296 | +\end{figure} | |
297 | + | |
298 | +\subsection{Separation between `real' fonts and metrics} | |
299 | +\label{ssec-sepmet} | |
300 | + | |
301 | +Traditionally, most Japanese fonts used in typesetting are not | |
302 | +proportional, that is, most glyphs have the same size (in most cases, | |
303 | +square-shaped). Hence, it is not rare that the contents of different | |
304 | +JFMs are essentially the same, and differ only in their names. For example, | |
305 | +|min10.tfm| and |goth10.tfm|, which are JFMs shipped with \pTeX\ for | |
306 | +the seriffed \emph{mincho} family and the sans-seriffed \emph{gothic} family, | |
307 | +differ only in their |FAMILY| and |FACE|. Moreover, |jis.tfm| and | |
308 | +|jisg.tfm|, which are included in the \emph{jis} font metric | |
309 | +used in \emph{jsclasses}~\cite{jsclasses} by Haruhiko Okumura (奥村晴彦), | |
310 | +are identical as binary files. Considering this situation, we | |
311 | +decided to separate `real' fonts and metrics used for them in | |
312 | +\LuaTeX-ja. Typical declarations of Japanese fonts in the style of plain | |
313 | +\TeX\ are shown in Figure~\ref{fig-jfdef}. We would like to add several | |
314 | +remarks: | |
315 | +\begin{itemize} | |
316 | +\item A control sequence |\jfont| must be used for Japanese fonts, instead of |\font|. | |
317 | +\item \LuaTeX-ja automatically loads the \emph{luaotfload} package, so | |
318 | + \hbox{\tt file:} and \hbox{\tt name:} prefixes, and various font features can be | |
319 | + used, as in the first line of Figure~\ref{fig-jfdef}. | |
320 | +\item The |jfm| key specifies the metric for the font. In | |
321 | + Figure~\ref{fig-jfdef}, both fonts will use a metric stored in a | |
322 | + Lua script named |jfm-ujis.lua|. This metric is the standard | |
323 | + metric in \LuaTeX-ja, and is based on JFMs used in the \emph{otf} | |
324 | + package~\cite{otf}. | |
325 | +\item The \hbox{\tt psft:} prefix can be used to specify name-only, non-embedded | |
326 | + fonts. When a PDF with these fonts is displayed, the actual fonts | |
327 | + used for them depend on the PDF reader. | |
328 | +\end{itemize} | |
329 | +The specification of a metric for \LuaTeX-ja is similar to that of a JFM | |
330 | +(see \cite{ptexjp}); characters are grouped into several classes, the | |
331 | +size information of characters is specified for each class, and | |
332 | +glue/kern insertions are specified for each pair of classes. Although | |
333 | +the author has not tried it, it may be possible to develop a program that | |
334 | +`converts' a JFM to a metric for \LuaTeX-ja. \LuaTeX-ja offers three | |
335 | +metrics by default: |jfm-ujis.lua|, |jfm-jis.lua| based on the | |
336 | +\emph{jis} font metric, and |jfm-min.lua| based on old |min10.tfm|. | |
337 | + | |
338 | + Note that |-kern| in features | |
339 | +is important, because kerning information from a real font itself will | |
340 | +clash with glue/kern information from the metric. | |
341 | + | |
342 | +\begin{figure} | |
343 | +\begin{verbatim} | |
344 | +\jfont\foo=file:ipam.ttf:jfm=ujis;script=latn;-kern;+jp04 at 12pt | |
345 | +\jfont\bar=psft:Ryumin-Light:jfm=ujis at 10pt | |
346 | +\end{verbatim} | |
347 | +\caption{Typical declarations of Japanese fonts.} | |
348 | +\label{fig-jfdef} | |
349 | +\end{figure} | |
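| + | |
| +As a further sketch, the other default metrics could presumably be selected | |
| +in the same way. The names \verb+\baz+ and \verb+\qux+ are hypothetical, and | |
| +we assume that |jfm=jis| and |jfm=min| load |jfm-jis.lua| and |jfm-min.lua|, | |
| +by analogy with |jfm=ujis| in Figure~\ref{fig-jfdef}: | |
| +\begin{verbatim} | |
| +% gothic face with the metric based on the jis font metric | |
| +\jfont\baz=file:ipag.ttf:jfm=jis;-kern at 10pt | |
| +% mincho face with the metric based on the old min10.tfm | |
| +\jfont\qux=file:ipam.ttf:jfm=min;-kern at 10pt | |
| +\end{verbatim} | |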
350 | + | |
351 | +\subsection{Insertion of glues/kerns for Japanese typesetting: timing} | |
352 | +\label{ssec-jglue} | |
353 | + | |
354 | +As described in \cite{luatexref}, \LuaTeX's kerning and ligaturing | |
355 | +processes are totally different from those of \TeX82. \TeX82's process is | |
356 | +done just when a (sequence of) character is appended to the current | |
357 | +list. Thus we can interrupt this process by writing, e.g., | |
358 | +|f{}irm|. However, \LuaTeX's process is \emph{node-based}, that is, the | |
359 | +process will be done when a horizontal box or a paragraph is ended, so | |
360 | +|f{}irm| and |firm| yield the same output under \LuaTeX. | |
361 | + | |
362 | +The situation for Japanese characters is more complicated. | |
363 | +Glues (and kerns) which are needed for Japanese | |
364 | +typesetting are divided into the following three categories: | |
365 | +\begin{itemize} | |
366 | +\item Glue (or kern) from the metric of Japanese fonts (\emph{JFM glue}, | |
367 | + for short). | |
368 | + | |
369 | +\item Default glue between a Japanese character and an alphabetic | |
370 | + character (\emph{xkanjiskip}, for short), usually 1/4 of | |
371 | + full-width (\emph{shibuaki}) with some stretch and shrink for | |
372 | + justifying each line. | |
373 | +\item Default glue between two consecutive Japanese characters | |
374 | + (\emph{kanjiskip}, for short). The main reason for this glue is to | |
375 | + enable breaking lines almost everywhere in Japanese texts. In most | |
376 | + cases, its natural width is zero, with some stretch/shrink for | |
377 | + justifying each line. | |
378 | +\end{itemize} | |
379 | +In \pTeX, these three kinds of glues are treated differently. A JFM glue | |
380 | +is inserted when a (sequence of) Japanese character is appended to the | |
381 | +current list, as in the case of alphabetic characters in \TeX82. This | |
382 | +means that one can interrupt the insertion process by saying |{}|. An | |
383 | +\emph{xkanjiskip} is inserted just before `hpack' or line-breaking of a | |
384 | +paragraph; this timing is somewhat similar to that of \LuaTeX's kerning | |
385 | +process. Finally, a \emph{kanjiskip} does not appear as a node anywhere; | |
386 | +it appears only implicitly in the calculation of the width of a horizontal box, | |
387 | +in line breaking, and in the actual output process to a DVI | |
388 | +file. These specifications have made \pTeX's behavior very hard to | |
389 | +understand. | |
390 | + | |
391 | +\LuaTeX-ja inserts glues in all three categories simultaneously inside | |
392 | +|hpack_filter| and |pre_linebreak_filter| callbacks. The reasons for | |
393 | +this specification are to make Japanese characters behave like alphabetic characters in \LuaTeX\ | |
394 | +(as described in the first paragraph in this subsection), and to clarify | |
395 | +the specification for \LuaTeX-ja's process. | |
396 | + | |
397 | +\subsection{Insertion of glues/kerns for Japanese typesetting: specification} | |
398 | +\label{ssec-jspec} | |
399 | + | |
400 | +\begin{table} | |
401 | +\caption{Examples of differences between \pTeX\ and \LuaTeX-ja.} | |
402 | +\label{tab-jfmglue} | |
403 | +\begin{center} | |
404 | +\begin{tabular}{llllllll} | |
405 | +\toprule | |
406 | +&\multicolumn{1}{c}{(1)}&\multicolumn{1}{c}{(2)}&\multicolumn{1}{c}{(3)}&\multicolumn{1}{c}{(4)}\\ | |
407 | +Input &|あ】{}【〕\/〔| &|い』\/a| &|う)\hbox{}(| &|え]\special{}[|\\\midrule | |
408 | +\pTeX &あ】\hbox{}【〕\hbox{}〔&い』\/a &う)\hbox{}( &え]\hbox{}[\\ | |
409 | +\LuaTeX-ja &あ】{}【〕\/〔 &い』\/a &う)\hbox{}( &え]\special{}[\\ | |
410 | +\bottomrule | |
411 | +\end{tabular} | |
412 | +\end{center} | |
413 | +\end{table} | |
414 | + | |
415 | +\begin{figure} | |
416 | +\begin{center} | |
417 | +\fontsize{40}{40}\selectfont | |
418 | +\imagfm{\jstrut あ}% | |
419 | +\imagfm{\jstrut 】\inhibitglue}% | |
420 | +\imagfm{\jstrut\kern.5\zw}% | |
421 | +\imagfm{\jstrut\kern.5\zw}% | |
422 | +\imagfm{\jstrut\inhibitglue【}% | |
423 | +\imagfm{\jstrut 〕\inhibitglue}% | |
424 | +\imagfm{\jstrut\kern.5\zw}% | |
425 | +\imagfm{\jstrut\kern.5\zw}% | |
426 | +\imagfm{\jstrut\inhibitglue〔}% | |
427 | +\end{center} | |
428 | +\caption{Detail of the output of \pTeX\ in the input~(1) in Table~\ref{tab-jfmglue}.} | |
429 | +\label{fig-ptexjfm} | |
430 | +\end{figure} | |
431 | + | |
432 | +Now we will take a look at the insertion process itself through four points. | |
433 | + | |
434 | +\begin{description} | |
435 | +\item[Ignored nodes] | |
436 | +As noted in the previous subsection, the insertion process in \pTeX\ can | |
437 | + be interrupted by saying |{}| or anything else.\footnote{This | |
438 | + is why some tricks like \texttt{ちょ\char`\{\char`\}っと} for | |
439 | + \texttt{min10.tfm} and other `old' JFMs work.} This leads to the | |
440 | + second row in Table~\ref{tab-jfmglue}, or | |
441 | + Figure~\ref{fig-ptexjfm}. Here `the process is interrupted' | |
442 | + means that \pTeX\ does not think the letter `】\inhibitglue' | |
443 | + is followed by `\inhibitglue【', hence two half-width glues | |
444 | + are inserted between `】\inhibitglue' and `\inhibitglue【', | |
445 | + where the left one is from `】\inhibitglue' and the right one | |
446 | + is from `\inhibitglue【'. | |
447 | + | |
448 | + On the other hand, in \LuaTeX-ja, the process is done inside | |
449 | + |hpack_filter| and |pre_linebreak_filter| callbacks. Hence, | |
450 | + \emph{anything that does not make any node will be | |
451 | + ignored}\ in \LuaTeX-ja, as shown in (1) in | |
452 | + Table~\ref{tab-jfmglue}. \LuaTeX-ja also ignores any node | |
453 | + which does not make any contribution to the current horizontal | |
454 | + list (\emph{ins\_node}, \emph{adjust\_node}, | |
455 | + \emph{mark\_node}, \emph{whatsit\_node} and | |
456 | + \emph{penalty\_node}), as shown in (4). | |
457 | + | |
458 | + | |
459 | +By the way, around a \emph{glyph\_node} $p$ there may be some nodes | |
460 | + attached to~$p$. These are an accent and kerns for | |
461 | + moving it to the right place, and a kern from the italic | |
462 | + correction\footnote{\TeX82 (and \LuaTeX) does not distinguish | |
463 | + between an explicit kern and a kern for italic correction. To | |
464 | + distinguish them, an additional subtype for a kern is introduced | |
465 | + in \pTeX. On the other hand, \LuaTeX-ja uses an additional attribute and | |
466 | + redefines \texttt{\char`\\/} to set this attribute.} for $p$. It is natural that | |
467 | + these attachments should be ignored inside the process. Hence | |
468 | + \LuaTeX-ja takes this approach, as does the latest version of | |
469 | + \pTeX\ (version~p3.2). This explains (2) in Table~\ref{tab-jfmglue}. | |
470 | + | |
471 | +To summarize, one should put an empty horizontal box |\hbox{}| | |
472 | + where one wants to interrupt the insertion process in | |
473 | + \LuaTeX-ja, as in (3) in Table~\ref{tab-jfmglue}. | |
474 | + | |
475 | +\item[Fonts with the same metric] | |
476 | +Recall that \LuaTeX-ja separates `real' fonts and metrics, as in Subsection~\ref{ssec-sepmet}. | |
477 | +Consider the following input, where all Japanese fonts use the same metric | |
478 | + (in \LuaTeX-ja), and |\gt| selects the \emph{gothic} family for | |
479 | + the current Japanese font family: | |
480 | +\begin{quote} | |
481 | +\begin{verbatim} | |
482 | +明朝)\gt (ゴシック | |
483 | +\end{verbatim} | |
484 | +\end{quote} | |
485 | +If the above input is processed by \pTeX, because the insertion process is | |
486 | + interrupted by |\gt|, the result looks like | |
487 | +\begin{quote} | |
488 | +\mc 明朝)\hbox{}\gt (ゴシック | |
489 | +\end{quote} | |
490 | +However, this seems unnatural, since two Japanese fonts in the | |
491 | + output use the same metric, i.e.,~the same | |
492 | + typesetting rule. Hence, we decided that Japanese fonts with | |
493 | + the same metric are treated as one font in the insertion | |
494 | + process of \LuaTeX-ja. Thus, the output from the above input | |
495 | + in \LuaTeX-ja looks like: | |
496 | +\begin{quote} | |
497 | +\mc 明朝)\gt (ゴシック | |
498 | +\end{quote} | |
499 | +There may be situations where this default behavior is not | |
500 | + suitable. \LuaTeX-ja offers a way to handle this situation, but | |
501 | + we leave it to the manual~\cite{man}. | |
502 | + | |
503 | +\item[Fonts with different metrics] | |
504 | +The case where two consecutive Japanese characters use different metrics and/or | |
505 | + different sizes is similar. Consider the following input where | |
506 | + the \emph{mincho} family and the \emph{gothic} family use | |
507 | + different metrics: | |
508 | +\begin{quote} | |
509 | +\begin{verbatim} | |
510 | +漢)\gt (漢)\large (大 | |
511 | +\end{verbatim} | |
512 | +\end{quote} | |
513 | +As in the previous paragraph, this input yields the following under \pTeX: | |
514 | +\begin{quote} | |
515 | +\mc 漢)\hbox{}\gt (漢)\hbox{}\large (大 | |
516 | +\end{quote} | |
517 | +We thought that the amounts of space between parentheses in the above output | |
518 | + are too large. Hence we have changed the default behavior of | |
519 | + \LuaTeX-ja, so that the amount of glue between two Japanese | |
520 | + characters with different metrics is the \emph{average} of the glue | |
521 | + from the left character and that from the right | |
522 | + character. For example, Figure~\ref{fig-diffmet} shows the | |
523 | + output from the above input. The width of the glue marked `(1)' is | |
524 | + $(a/2 + a/2)/2 = 0.5a$, and the width of the glue marked `(2)' | |
525 | + is $(a/2 + 1.2a/2)/2 = 0.55a$. This default behavior can be | |
526 | + changed by the \textsf{diffrentmet} parameter of \LuaTeX-ja. | |
527 | + | |
528 | +\begin{figure} | |
529 | +\begin{center} | |
530 | +\fontsize{40}{40}\selectfont | |
531 | +\imagfm{\jstrut\smash{% | |
532 | + \vtop{\lineskiplimit=\maxdimen\lineskip2pt\halign{#\cr漢\cr | |
533 | + \small\vrule height .5ex depth .5ex\hrulefill\ \lower.5ex\hbox{$a$}\ | |
534 | + \hrulefill\vrule height .5ex depth .5ex\cr}}}}% | |
535 | +\imagfm{\jstrut )\inhibitglue}% | |
536 | +\hbox to .5\zw{\hss\normalsize (1)\hss}% | |
537 | +\imagfm{\jstrut\inhibitglue\gt (}% | |
538 | +\imagfm{\jstrut\gt 漢}% | |
539 | +\imagfm{\jstrut\gt )\inhibitglue}% | |
540 | +\hbox to .55\zw{\hss\normalsize (2)\hss}% | |
541 | +\imagfm{\fontsize{48}{48}\selectfont\jstrut\gt\inhibitglue (}% | |
542 | +\imagfm{\fontsize{48}{48}\selectfont\jstrut\smash{% | |
543 | + \vtop{\lineskiplimit=\maxdimen\lineskip2pt\halign{#\cr\gt 大\cr | |
544 | + \small\vrule height .5ex depth .5ex\hrulefill\ \lower.5ex\hbox{$1.2a$}\ | |
545 | + \hrulefill\vrule height .5ex depth .5ex\cr}}}} | |
546 | +\end{center} | |
547 | +\caption{Fonts with different metrics.} | |
548 | +\label{fig-diffmet} | |
549 | +\end{figure} | |
550 | + | |
551 | +\item[\emph{kanjiskip} and \emph{xkanjiskip}] | |
552 | +In \pTeX, the value of \emph{xkanjiskip} is controlled by a skip named | |
553 | + |\xkanjiskip|. A well-known defect of this implementation is | |
554 | + that the value of \emph{xkanjiskip} is not connected with the | |
555 | + size of the current Japanese font. It seems that the |EXTRASPACE|, | |
556 | + |EXTRASTRETCH|, and |EXTRASHRINK| parameters in a JFM are | |
557 | + reserved for specifying the default value of | |
558 | + \emph{xkanjiskip} in units of the design size, but \pTeX\ | |
559 | + does not actually use these parameters. | |
560 | + | |
561 | +In view of this situation in \pTeX, \LuaTeX-ja can use the value of | |
562 | + \emph{xkanjiskip} that is specified in a metric. If the value of | |
563 | + \emph{xkanjiskip} on the user side (this is the value of the | |
564 | + \textsf{xkanjiskip} parameter of |\ltjsetparameter|) is | |
565 | + |\maxdimen|, then \LuaTeX-ja uses the specification from | |
566 | + the metric currently in use as the actual value of | |
567 | + \emph{xkanjiskip}. The same applies to \emph{kanjiskip}. | |
568 | +\end{description} | |
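| + | |
| +As a worked instance of the averaging rule for fonts with different | |
| +metrics, write $g_L$ for the glue requested by the metric of the left | |
| +character and $g_R$ for that requested by the metric of the right | |
| +character (both symbols are introduced here only for illustration). The | |
| +inserted glue then has natural width | |
| +\[ | |
| + g = \frac{g_L + g_R}{2}, | |
| +\] | |
| +so that $g_L = g_R = a/2$ gives $g = 0.5a$ at `(1)', while $g_L = a/2$ | |
| +and $g_R = 1.2a/2 = 0.6a$ give $g = (0.5a + 0.6a)/2 = 0.55a$ at `(2)' in | |
| +Figure~\ref{fig-diffmet}. | |
| + | |
| +Similarly, the metric-supplied values of \emph{kanjiskip} and | |
| +\emph{xkanjiskip} could be requested as in the following minimal sketch. | |
| +It assumes that both parameters accept |\maxdimen| through | |
| +|\ltjsetparameter| as described above; the explicit skip in the last | |
| +line is only an illustrative value. | |
| +\begin{quote} | |
| +\begin{verbatim} | |
| +% use the values given in the current metric | |
| +\ltjsetparameter{kanjiskip=\maxdimen, xkanjiskip=\maxdimen} | |
| +% or fix xkanjiskip by hand (illustrative value) | |
| +\ltjsetparameter{xkanjiskip=0.25\zw plus 1pt minus 1pt} | |
| +\end{verbatim} | |
| +\end{quote} | |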
569 | + | |
570 | +\section{Distinction of characters} | |
571 | +\label{sec:distinction_of_characters} Since \LuaTeX\ can handle Unicode | |
572 | +characters natively, a major problem is how to distinguish | |
573 | +Japanese characters from alphabetic characters. For example, the | |
574 | +multiplication sign (U+00D7) exists both in ISO-8859-1 (hence in Latin-1 | |
575 | +Supplement in Unicode) and in the basic Japanese character set | |
576 | +JIS~X~0208. It is not desirable that this character is always treated as | |
577 | +an alphabetic character, because this symbol is often used in the sense | |
578 | +of `negative' in Japan. | |
579 | + | |
580 | +\subsection{Character ranges} | |
581 | +Before we describe the approach taken in \LuaTeX-ja, we review the | |
582 | +approach taken by u\pTeX. u\pTeX\ extends the |\kcatcode| primitive of | |
583 | +\pTeX, and uses it to set how a character is treated: | |
584 | +as an alphabetic character~(15), \emph{kanji}~(16), \emph{kana}~(17), | |
585 | +\emph{other CJK character}~(18), or \emph{Hangul}~(19). | |
586 | +The assignment to |\kcatcode| is done per Unicode | |
587 | +block.\footnote{There are some exceptions. For example, U+FF00--FFEF | |
588 | +(Halfwidth and Fullwidth Forms) is divided into three blocks in recent | |
589 | +u\pTeX.} | |
590 | + | |
591 | +\LuaTeX-ja adopted a different approach. There are many Unicode blocks | |
592 | + in the Basic Multilingual Plane which are not included in | |
593 | + Japanese fonts, so processing per Unicode | |
594 | + block is inconvenient. Furthermore, JIS~X~0208 is not just a union of Unicode | |
595 | + blocks; for example, the intersection of JIS~X~0208 and | |
596 | + Latin-1 Supplement is shown in | |
597 | + Table~\ref{tab-inter}. Considering these two points, to | |
598 | + customize the range of Japanese characters in \LuaTeX-ja, one | |
599 | + has to define ranges of character codes in the source in advance. | |
600 | + | |
601 | + | |
602 | +\begin{table} | |
603 | +\caption{Intersection of JIS~X~0208 and Latin-1 Supplement.} | |
604 | +\label{tab-inter} | |
605 | +\begin{center} | |
606 | +\begin{tabular}{llll} | |
607 | +\ltjjachar"A7 (U+00A7),& | |
608 | +\ltjjachar"A8 (U+00A8),& | |
609 | +\ltjjachar"B0 (U+00B0),& | |
610 | +\ltjjachar"B1 (U+00B1),\\ | |
611 | +\ltjjachar"B4 (U+00B4),& | |
612 | +\ltjjachar"B6 (U+00B6),& | |
613 | +\ltjjachar"D7 (U+00D7),& | |
614 | +\ltjjachar"F7 (U+00F7) | |
615 | +\end{tabular} | |
616 | +\end{center} | |
617 | +\end{table} | |
618 | + | |
619 | + | |
620 | +We note that \LuaTeX-ja offers two additional control sequences, | |
621 | + |\ltjjachar| and |\ltjalchar|. They are similar to the |\char| | |
622 | + primitive; however, |\ltjjachar| always yields a Japanese character, provided that | |
623 | + the argument is greater than or equal to 128, and |\ltjalchar| always | |
624 | + yields an alphabetic character, regardless of the argument. | |
625 | + | |
626 | +\subsection{Default setting of ranges} | |
627 | +Patches for plain \TeX\ and \LaTeXe\ of \LuaTeX-ja predefine 8~character | |
628 | +ranges, as shown in Table~\ref{tab-chrrng}. Most of these ranges are | |
629 | +just unions of Unicode blocks, and are determined from the Adobe-Japan1-6 | |
630 | +character collection~\cite{aj16}, and JIS~X~0208. Among these 8~ranges, | |
631 | +the ranges~2, 3, 6, 7, and~8 are considered ranges of Japanese | |
632 | +characters, and others are considered ranges of alphabetic | |
633 | +characters.\footnote{Note that ranges 3~and~8 are considered ranges of | |
634 | +alphabetic characters in this paper.} We remark on ranges 2~and~8: | |
635 | +\begin{description} | |
636 | +\item[The range~2] | |
637 | +JIS~X~0208 includes Greek letters and Cyrillic letters; however, these | |
638 | + letters cannot be used for typesetting Greek or Russian, of | |
639 | + course. Hence it is reasonable that Greek and | |
640 | + Cyrillic letters form a separate character range. | |
641 | +\item[The range~8] | |
642 | +If one wants to use 8-bit TFMs, such as those in the T1 or TS1 encodings, one should | |
643 | + mark this range~8 as a range of alphabetic characters by | |
644 | +\begin{quote} | |
645 | +|\ltjsetparameter{jacharrange={-8}}| | |
646 | +\end{quote} | |
647 | +This is because some 8-bit TFMs have a glyph in this range; for example, | |
648 | + the character `\OE' is located at |"D7| in the T1 encoding. %" | |
649 | +\end{description} | |
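| + | |
| +For instance, the preamble of this paper marks both range~3 and range~8 as | |
| +ranges of alphabetic characters. In the sketch below, the first line is | |
| +that setting; the second line assumes, following the \LuaTeX-ja | |
| +manual~\cite{man}, that a positive number marks a range as Japanese again. | |
| +\begin{quote} | |
| +\begin{verbatim} | |
| +\ltjsetparameter{jacharrange={-3,-8}}% ranges 3 and 8: alphabetic | |
| +\ltjsetparameter{jacharrange={+3}}%    range 3: Japanese again (assumed) | |
| +\end{verbatim} | |
| +\end{quote} | |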
650 | + | |
651 | + | |
652 | +\begin{table} | |
653 | +\caption{Predefined ranges in \LuaTeX-ja.} | |
654 | +\label{tab-chrrng} | |
655 | +\begin{center} | |
656 | +\begin{tabular}{@{\bf}rl} | |
657 | +1&(Additional) Latin characters which do not belong to the range~8.\\ | |
658 | +2&Greek and Cyrillic letters.\\ | |
659 | +3&Punctuation and miscellaneous symbols.\\ | |
660 | +4&Unicode blocks which do not intersect with Adobe-Japan1-6.\\ | |
661 | +5&Surrogates and Supplementary Private Use Areas.\\ | |
662 | +6&Characters used in Japanese typesetting.\\ | |
663 | +7&Characters possibly used in CJK typesetting, but not in Japanese.\\ | |
664 | +8&Characters in Table~\ref{tab-inter}. | |
665 | +\end{tabular} | |
666 | +\end{center} | |
667 | +\end{table} | |
668 | + | |
669 | +\subsection{Control sequences producing Unicode characters} | |
670 | +\label{ssec-unichar} | |
671 | + | |
672 | +The \emph{fontspec} package\footnote{Precisely speaking, it is the | |
673 | +\emph{xunicode} package, originally a package for \XeTeX\ and | |
674 | +automatically loaded by the \emph{fontspec} package.} offers various | |
675 | +control sequences that produce Unicode characters. However, these | |
676 | +control sequences as they stand do not work correctly with the default | |
677 | +range setting of \LuaTeX-ja. For example, |\textquotedblleft| is just | |
678 | +an abbreviation of |\char"201C\relax|, and the character U+201C (LEFT %" | |
679 | +DOUBLE QUOTATION MARK) is treated as a Japanese character, because it | |
680 | +belongs to the range~3. This problem is resolved by using |\ltjalchar| | |
681 | +instead of the |\char| primitive. This fix is included in an optional package | |
682 | +named \texttt{luatexja-\penalty0fontspec.sty}. Figure~\ref{fig-unitxt} | |
683 | +shows several ways to typeset a character, both as a Japanese character | |
684 | +and as an alphabetic character. | |
685 | + | |
686 | +\begin{figure} | |
687 | +\begin{LTXexample} | |
688 | +×, \char`×, % depend on range setting | |
689 | +\ltjalchar`×, % alphabetic char | |
690 | +\ltjjachar`×, % Japanese char | |
691 | +\texttimes % alph. char (by fontspec) | |
692 | +\end{LTXexample} | |
693 | +\caption{Control sequences producing a Unicode character.} | |
694 | +\label{fig-unitxt} | |
695 | +\end{figure} | |
696 | + | |
697 | +The situation looks similar in math formulas, but in fact it differs. | |
698 | +Each control sequence that represents an ordinary symbol defined by the | |
699 | +\emph{unicode-math} package is just a synonym for a character. For example, | |
700 | +the meaning of |\otimes| is just the character U+2297 (CIRCLED TIMES), | |
701 | +which is included in the range~3. However, it is difficult to define a | |
702 | +control sequence like |\ltjalUmathchar| as a counterpart of | |
703 | +|\Umathchar|, since an input like `|\sum^\ltjalUmathchar ...|' has to be | |
704 | +permitted. | |
705 | + | |
706 | +We could not develop a satisfactory solution to this problem in | |
707 | +time for this paper. We are currently testing the following | |
708 | +solution: | |
709 | +\begin{itemize} | |
710 | +\item \LuaTeX-ja has a list of character codes which will always be treated as | |
711 | + alphabetic characters in math mode. Considering 8-bit TFMs for | |
712 | + math symbols, this list includes natural numbers between |"80| and | |
713 | + |"FF| by default. | |
714 | +\item Internal commands defined in the \emph{unicode-math} | |
715 | + package are redefined so that | |
716 | +codes of characters which are mentioned in the \emph{unicode-math} | |
717 | + package will be included in the list. | |
718 | +\end{itemize} | |
719 | + | |
720 | + | |
721 | +We would like to extend the treatment described in this subsection to 8-bit | |
722 | +font encodings, but we leave this to future development as well. | |
723 | + | |
724 | +\section{Current status of development} | |
725 | +\label{sec:current_status} | |
726 | +At the moment, \LuaTeX-ja can be used under plain \TeX, and under | |
727 | +\LaTeXe. Generally speaking, one only has to load |luatexja.sty|, by the | |
728 | +|\input| command or by |\usepackage| (in~\LaTeXe), in order to | |
729 | +typeset Japanese characters. We now look at each part in more detail. | |
730 | + | |
731 | +\subsection{`Engine extension'} | |
732 | +The lowest part of \LuaTeX-ja corresponds to the \pTeX\ extension as | |
733 | +\emph{an engine extension of \TeX}. We, the project members, think that | |
734 | +this part is almost done. There is one more feature of \LuaTeX-ja which | |
735 | +we are going to explain: | |
736 | + | |
737 | +\begin{description} | |
738 | +\item[Shifting baseline] | |
739 | +In order to match Japanese fonts with alphabetic fonts, | |
740 | + shifting the baseline of alphabetic characters is sometimes | |
741 | + needed. \pTeX\ has a dimension |\ybaselineshift|, which | |
742 | + specifies the amount by which the baseline of alphabetic | |
743 | + characters is shifted down. This is useful for Japanese-based documents, but | |
744 | + not for documents mainly in languages with alphabetic | |
745 | + characters. | |
746 | + | |
747 | +Hence, \LuaTeX-ja extends \pTeX's |\ybaselineshift| to Japanese | |
748 | + characters. Namely, \LuaTeX-ja offers two parameters, | |
749 | + \textsf{yjabaselineshift} and \textsf{yalbaselineshift}, for the | |
750 | + amount of shifting the baseline of Japanese characters and | |
751 | + that of alphabetic characters, respectively. | |
752 | +\begin{figure} | |
753 | +\begin{center} | |
754 | +\fontsize{40}{40}\selectfont\fboxsep0mm | |
755 | +\vrule width 0.9\textwidth height0.4pt depth0.4pt\kern-0.9\textwidth | |
756 | +\hbox to 0.9\linewidth{% | |
757 | +\hfil | |
758 | +\raise-10pt\imagfm{\jstrut 漢}% | |
759 | +\raise-10pt\imagfm{\jstrut 字}\hskip.25\zw% | |
760 | +\imagfm{p}% | |
761 | +\imagfm{h}% | |
762 | +\hfil\hfil | |
763 | +\imagfm{\jstrut 漢}% | |
764 | +\imagfm{\jstrut 字}\hskip.25\zw% | |
765 | +\raise-10pt\imagfm{p}% | |
766 | +\raise-10pt\imagfm{h}% | |
767 | +\hfil | |
768 | +} | |
769 | +\end{center} | |
770 | + | |
771 | +\caption{First example of shifting baseline.} | |
772 | +\label{fig-bls} | |
773 | +\end{figure} | |
774 | + | |
775 | +\begin{figure} | |
776 | +\begin{center} | |
777 | +\fontsize{30}{30}\selectfont\fboxsep0mm | |
778 | +\vrule width 0.9\textwidth height0.4pt depth0.4pt\kern-0.9\textwidth | |
779 | +\hbox to 0.9\linewidth{% | |
780 | +\hfil | |
781 | +\imagfm{a}% | |
782 | +\imagfm{b}\hskip.25\zw% | |
783 | +\imagfm{\jstrut 本}% | |
784 | +\imagfm{\jstrut 文}\hskip.33333\zw% | |
785 | +\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut\inhibitglue (}% | |
786 | +\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut 注}% | |
787 | +\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut 釈}\hskip.1666667\zw% | |
788 | +\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont c}% | |
789 | +\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont o}% | |
790 | +\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont m}% | |
791 | +\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont m}% | |
792 | +\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont e}% | |
793 | +\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont n}% | |
794 | +\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont t}% | |
795 | +\raise3.514582pt\imagfm{\fontsize{20}{20}\selectfont\jstrut )\inhibitglue}% | |
796 | +\hskip.33333\zw% | |
797 | +\imagfm{\jstrut 本}% | |
798 | +\imagfm{\jstrut 文}% | |
799 | +\hfil | |
800 | +} | |
801 | +\end{center} | |
802 | + | |
803 | +\caption{Second example of shifting baseline.} | |
804 | +\label{fig-small} | |
805 | +\end{figure} | |
806 | + | |
807 | +An example output is shown in Figure~\ref{fig-bls}. The left half is the | |
808 | + output when \textsf{yjabaselineshift} is positive, hence the | |
809 | + baseline of Japanese characters is shifted down. On the other | |
810 | + hand, the right half is the output when | |
811 | + \textsf{yalbaselineshift} is positive, hence the baseline of | |
812 | + alphabetic characters is shifted down. Figure~\ref{fig-small} | |
813 | + shows an interesting use of these parameters. | |
814 | + | |
815 | +\end{description} | |
816 | +Note that \LuaTeX-ja doesn't support vertical typesetting, \emph{tategaki}, for now. | |
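| + | |
| +The two settings of Figure~\ref{fig-bls} could be requested as in the | |
| +following sketch. It assumes that both baseline parameters are set through | |
| +|\ltjsetparameter|, like \textsf{xkanjiskip}; the amount 10pt merely | |
| +mirrors the figure. | |
| +\begin{quote} | |
| +\begin{verbatim} | |
| +% left half: lower the baseline of Japanese characters | |
| +\ltjsetparameter{yjabaselineshift=10pt, yalbaselineshift=0pt} | |
| +漢字ph | |
| +% right half: lower the baseline of alphabetic characters | |
| +\ltjsetparameter{yjabaselineshift=0pt, yalbaselineshift=10pt} | |
| +漢字ph | |
| +\end{verbatim} | |
| +\end{quote} | |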
817 | + | |
818 | +\subsection{Patches for plain \TeX\ and \LaTeXe} | |
819 | +\pTeX\ has a patch for plain \TeX, namely |ptex.tex|, a patch for the \LaTeXe\ | |
820 | +macros (this patch and \LaTeXe\ together constitute \emph{p\LaTeXe}), and | |
821 | +|kinsoku.tex|, which includes the default setting of \emph{kinsoku | |
822 | +shori}, the Japanese line-breaking rules. We ported them to \LuaTeX-ja, except | |
823 | +the code related to vertical typesetting, because \LuaTeX-ja doesn't | |
824 | +support vertical typesetting yet. We remark on one point related to the | |
825 | +porting: | |
826 | +\begin{description} | |
827 | + | |
828 | +\item[Behavior of\/ {\tt\char92fontfamily\/}] | |
829 | +The control sequence |\fontfamily| in p\LaTeXe\ changes the current alphabetic | |
830 | + font family and/or the current Japanese font family, | |
831 | + depending on the argument. More concretely, | |
832 | + |\fontfamily{|$\langle\hbox{\it arg\/}\rangle$|}| changes the | |
833 | + current alphabetic font family to $\langle\hbox{\it | |
834 | + arg\/}\rangle$, if and only if one of the following | |
835 | + conditions is satisfied: | |
836 | +\begin{itemize} | |
837 | +\item An alphabetic font family named $\langle\hbox{\it arg\/}\rangle$ in | |
838 | + \emph{some} alphabetic encoding is already defined in the document. | |
839 | +\item There exists an alphabetic encoding $\langle\hbox{\it | |
840 | + enc\/}\rangle$ already defined in the document such that a font | |
841 | + definition file $\langle\hbox{\it enc\/}\rangle\langle\hbox{\it | |
842 | + arg\/}\rangle$|.fd| (all lowercase) exists. | |
843 | +\end{itemize} | |
844 | +The same criterion is used for changing the Japanese font family. | |
845 | + | |
846 | +For this behavior to work well, a list of all (alphabetic) encodings already | |
847 | + defined in the document is needed. However, since \LuaTeX-ja | |
848 | + is loaded as a package, \LuaTeX-ja cannot have this list. | |
849 | + Hence \LuaTeX-ja adopted a different approach, namely | |
850 | + |\fontfamily{|$\langle\hbox{\it arg\/}\rangle$|}| changes the | |
851 | + current alphabetic font family to $\langle\hbox{\it | |
852 | + arg\/}\rangle$, if and only if: | |
853 | +\begin{itemize} | |
854 | +\item An alphabetic font family named $\langle\hbox{\it arg\/}\rangle$ | |
855 | + in the current alphabetic encoding $\langle\hbox{\it | |
856 | + enc\/}\rangle$ is already defined in the document. | |
857 | +\item A font definition file $\langle\hbox{\it enc\/}\rangle\langle\hbox{\it | |
858 | + arg\/}\rangle$|.fd| (all lowercase) exists. | |
859 | +\end{itemize} | |
860 | + | |
861 | + | |
862 | +\end{description} | |
863 | + | |
864 | + | |
865 | + | |
866 | +\subsection{Classes for Japanese documents} | |
867 | +To produce `high-quality' Japanese documents, it is not enough that | |
868 | +Japanese characters are correctly placed; class files for | |
869 | +Japanese documents are also needed. Two major families of classes are widely used in Japan: | |
870 | +\emph{jclasses}, which is distributed with the official p\LaTeXe\ macros, | |
871 | +and \emph{jsclasses}. At present, \LuaTeX-ja | |
872 | +simply contains their counterparts: \emph{ltjclasses} and | |
873 | +\emph{ltjsclasses}. However, the policy on classes has not yet been | |
874 | +determined, and we hope to have another family of classes which are useful for | |
875 | +commercial printing. In the author's opinion, \emph{ltjclasses} | |
876 | +should remain as an example of porting class files for \pTeX\ to | |
877 | +\LuaTeX-ja. | |
878 | + | |
879 | +\subsection{Patches for packages} | |
880 | +Apart from patches for the \LaTeXe~kernel and classes for Japanese | |
881 | +documents, we need to make patches for several packages. At present, | |
882 | +we have considered the following packages, and have made patches or ports for | |
883 | +the first two. | |
884 | + | |
885 | +\begin{description} | |
886 | +\item[The \emph{fontspec} package] The \emph{fontspec} package is built | |
887 | + on NFSS2, hence control sequences offered by the | |
888 | + \emph{fontspec} package, such as |\setmainfont|, are only | |
889 | + effective for alphabetic fonts if \LuaTeX-ja is loaded. | |
890 | + \texttt{luatexja-\penalty0fontspec.sty} (not automatically | |
891 | + loaded) offers their counterparts for Japanese fonts, with | |
892 | + an additional `j' in the name of each control sequence, such as | |
893 | + |\setmainjfont|. As described in | |
894 | + Subsection~\ref{ssec-unichar}, it also includes a patch for | |
895 | + control sequences producing Unicode characters. | |
896 | + | |
897 | +\item[The \emph{otf} package] | |
898 | +This package is widely used in \pTeX\ for typesetting characters which are | |
899 | +not in JIS~X~0208, and for using more than one weight in \emph{mincho} | |
900 | +and \emph{gothic} font families. Therefore \LuaTeX-ja supports the features | |
901 | +of the \emph{otf} package when \texttt{luatexja-\penalty0otf.sty} is loaded | |
902 | + manually. Note that characters by |\UTF{xxxx}| and | |
903 | + |\CID{xxxx}| are not appended to the current list as a | |
904 | + \emph{glyph\_node}, to keep them away from callbacks by the | |
905 | + \emph{luaotfload} package. We have another remark: |\CID| | |
906 | + does not work with TrueType fonts, since |\CID| uses the | |
907 | + conversion table between CID and the glyph order of the | |
908 | + current Japanese font. | |
909 | + | |
910 | +\item[The \emph{listings} package] | |
911 | +It is well known among users of \pTeX\ that there is a patch |jlisting.sty| for | |
912 | + the \emph{listings} package, to use Japanese characters in | |
913 | + the |lstlisting| environment. Generally speaking, it can also | |
914 | + be used in \LuaTeX-ja. However, it seems that a | |
915 | + Japanese character after a space is not processed | |
916 | + by the \emph{listings} package; this is inconvenient when we | |
917 | + use the \emph{showexpl} package. | |
918 | + | |
919 | +There is another way to use characters above 256 with the | |
920 | + \emph{listings} package (described in~\cite{apl}). However, | |
921 | + this method is not suitable for Japanese, since the number of | |
922 | + Japanese characters is very large. We hope that the | |
923 | + \emph{listings} package will be able to handle all characters above | |
924 | + 256 without any patch, in the future. | |
925 | + | |
926 | + | |
927 | +\end{description} | |
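| + | |
| +A document using the first two patches might be set up as in the following | |
| +sketch. The font names are assumptions (any installed fonts will do), and | |
| +U+9DD7 is merely one example of a character outside JIS~X~0208. | |
| +\begin{quote} | |
| +\begin{verbatim} | |
| +\usepackage{luatexja,luatexja-fontspec,luatexja-otf} | |
| +\setmainfont{TeX Gyre Pagella}% alphabetic font | |
| +\setmainjfont{IPAMincho}%       Japanese font (assumed name) | |
| +% in the document body: | |
| +森\UTF{9DD7}外 | |
| +\end{verbatim} | |
| +\end{quote} | |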
928 | + | |
929 | + | |
930 | + | |
931 | +\section{Implementation} | |
932 | +\label{sec:implementation} | |
933 | +\subsection{Handling of Japanese fonts} | |
934 | +In \pTeX, there are three slots for maintaining current fonts, namely | |
935 | +|\font| for alphabetic fonts, |\jfont| for Japanese fonts (in horizontal | |
936 | +direction) and |\tfont| for Japanese fonts (in vertical direction). With | |
937 | +these slots, we can manage the current font for alphabetic characters | |
938 | +and that for Japanese characters separately in \pTeX. However, \LuaTeX\ | |
939 | +has only one slot for maintaining the current font, as in \TeX82. This | |
940 | +situation leads to a problem: how can we maintain the `current Japanese | |
941 | +font'? | |
942 | + | |
943 | +There are three approaches to this problem. One approach is to make a | |
944 | +mapping table from alphabetic fonts to corresponding Japanese fonts | |
945 | +(here we don't assume that NFSS2 is available). Another approach is | |
946 | +that we always use composite fonts with alphabetic fonts and Japanese | |
947 | +fonts. The third approach is that the information of the current | |
948 | +Japanese font is stored in an attribute. We adopted the third approach, | |
949 | +since \LuaTeX-ja is strongly influenced by \pTeX\ as we noted in | |
950 | +Subsection~\ref{ssec-pol}. | |
951 | + | |
952 | +As in Figure~\ref{fig-jfdef}, \LuaTeX-ja uses |\jfont| for defining | |
953 | +Japanese fonts, as in \pTeX. However, because the information of the current | |
954 | +Japanese font is stored in an attribute, control sequences defined by | |
955 | +|\jfont| (e.g.,~|\foo| and |\bar| in Figure~\ref{fig-jfdef}) do | |
956 | +not represent a font in the sense of \TeX82. In other words, each of | |
957 | +these control sequences is just an assignment to an attribute, therefore | |
958 | +they cannot be an argument of |\the|, |\fontname|, or |\textfont|. | |
959 | + | |
960 | + | |
961 | +Callbacks by the \emph{luaotfload} package, e.g.,~replacement of glyphs | |
962 | +according to OpenType font features, are performed just after `Examination of | |
963 | +stack level' (see Subsections | |
964 | +\ref{ssec-over}~and~\ref{ssec-stack}). Also note that calculation of | |
965 | +character classes for each Japanese character is done \emph{after} | |
966 | +these callbacks for now. | |
967 | + | |
968 | +\subsection{Stack management} | |
969 | +\label{ssec-stack} | |
970 | + | |
971 | +As we noted in Subsection~\ref{ssec-csname}, parameters whose values | |
972 | +at the end of a horizontal box or of a paragraph are valid throughout the | |
973 | +whole box or paragraph, such as \emph{kanjiskip}, cannot be implemented | |
974 | +by internal integers or registers of other types in \TeX. We explain the reason | |
975 | +in this subsection. | |
976 | + | |
977 | +\begin{figure} | |
978 | +\begin{lstlisting} | |
979 | +void package(int c) | |
980 | +{ | |
981 | + ... | |
982 | + d = box_max_depth; | |
983 | + unsave(); | |
984 | + save_ptr -= 4; | |
985 | + if (cur_list.mode_field == -hmode) { | |
986 | + cur_box = filtered_hpack(cur_list.head_field, | |
987 | + cur_list.tail_field, saved_value(1), | |
988 | + saved_level(1), grp, saved_level(2)); | |
989 | + subtype(cur_box) = HLIST_SUBTYPE_HBOX; | |
990 | + } else { | |
991 | +\end{lstlisting} | |
992 | +\caption{An extract of a CWEB-source \texttt{tex/packaging.w} of \LuaTeX.} | |
993 | +\label{fig-ltsrc} | |
994 | +\end{figure} | |
995 | + | |
996 | +Figure~\ref{fig-ltsrc} is an extract of a CWEB-source | |
997 | +\texttt{tex/packaging.w} of \LuaTeX\ (SVN revision 4358). This function | |
998 | +is called when an explicit |\hbox{...}| or |\vbox{...}| ends, and | |
999 | +the function |filtered_hpack()| is where the |hpack_filter| and then the | |
1000 | +actual `hpack' process are performed. Notice that the |unsave()| | |
1001 | +function is called before |filtered_hpack()|. This is the problem; | |
1002 | +because of |unsave()|, we can retrieve only the values of registers | |
1003 | +\emph{outside} the box, even in the |hpack_filter| callback. | |
1004 | + | |
1005 | +To cope with this problem, \LuaTeX-ja has its own stack system, based on | |
1006 | +Lua code in \cite{stack-mail}. Furthermore, \emph{whatsit} nodes whose | |
1007 | +\emph{user\_id} is 30112 (\emph{stack\_node}, for short) will be | |
1008 | +appended to the current horizontal list each time the current stack | |
1009 | +level is incremented, and their values are the values of | |
1010 | +|\currentgrouplevel| at that time. At the beginning of the |hpack_filter| | |
1011 | +callback, the list in question is traversed to determine whether the | |
1012 | +stack level at the end of the list and that outside the box coincide. | |
1013 | + | |
1014 | +Let $x$ be the value of |\currentgrouplevel|, and $y$ be the current | |
1015 | +stack level, both inside the |hpack_filter| callback, i.e.,~outside a | |
1016 | +horizontal box. Consider a list which represents the content of the box, | |
1017 | +then we have: | |
1018 | +\begin{itemize} | |
1019 | +\item A \emph{stack\_node} whose value is $x+1$ (because all materials | |
1020 | + in the box are included in a group |\hbox{...}|, the value of | |
1021 | + |\currentgrouplevel| inside the box is at least $x+1$) in the list | |
1022 | + corresponds to an assignment related to the stack system at the | |
1023 | + top level of the list, like | |
1024 | +\begin{quote} | |
1025 | +\begin{verbatim} | |
1026 | +\hbox{...(assignment)...} | |
1027 | +\end{verbatim} | |
1028 | +\end{quote} | |
1029 | +In this case, the current stack level is incremented to $y+1$ after the assignment. | |
1030 | +\item A \emph{stack\_node} whose value is more than $x+1$ in the list corresponds | |
1031 | +to an assignment inside another group contained in the box. For example, | |
1032 | + the following input creates | |
1033 | +a \emph{stack\_node} whose value is $x+3=(x+1)+2$: | |
1034 | +\begin{quote} | |
1035 | +\begin{verbatim} | |
1036 | +\hbox{...{...{...(assignment)}...}...} | |
1037 | +\end{verbatim} | |
1038 | +\end{quote} | |
1039 | +\end{itemize} | |
1040 | +Thus, we can conclude that the stack level at the end of the list is | |
1041 | +$y+1$, if and only if there is a \emph{stack\_node} whose value is | |
1042 | +$x+1$. Otherwise, the stack level is just $y$. | |
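| + | |
| +To illustrate, consider the following sketch, in which each | |
| +|\ltjsetparameter| stands for an arbitrary assignment handled by the stack | |
| +system (the concrete skips are only illustrative values). | |
| +\begin{quote} | |
| +\begin{verbatim} | |
| +\hbox{%  \currentgrouplevel is x+1 inside the box | |
| +  あ\ltjsetparameter{kanjiskip=0pt plus 1pt}%  stack_node, value x+1 | |
| +  {い\ltjsetparameter{kanjiskip=0pt}う}%        stack_node, value x+2 | |
| +  え} | |
| +\end{verbatim} | |
| +\end{quote} | |
| +The list contains a \emph{stack\_node} whose value is $x+1$, so the stack | |
| +level at the end of the list is $y+1$; the inner assignment, recorded with | |
| +the value $x+2$, does not affect this conclusion. | |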
1043 | + | |
1044 | +\subsection{Adjustment of the position of Japanese characters} | |
1045 | +\label{ssec-width} | |
1046 | + | |
1047 | +The size of a glyph specified in a metric and that of a real font | |
1048 | +usually differ. For example, the letter `\inhibitglue【' is half-width | |
1049 | +in |jfm-ujis.lua| or |jis.tfm|, while this letter is full-width like `【' | |
1050 | +in most TrueType fonts used in Japanese typesetting, such as | |
1051 | +IPA~Mincho. Hence the adjustment of position of such glyphs is | |
1052 | +needed. In the context of \pTeX, this process was performed using virtual fonts. | |
1053 | + | |
1054 | +On the other hand, \LuaTeX-ja does the adjustment by encapsulating a glyph | |
1055 | +in a horizontal box. There are two main reasons why we adopted this | |
1056 | +method: one is that we feared that the Lua code for coexisting with callbacks by | |
1057 | +the |luaotfload| package would become large if we used virtual fonts, and the | |
1058 | +other is to cope with shifting of the baseline of characters at the | |
1059 | +same time. | |
1060 | + | |
1061 | +\begin{figure} | |
1062 | +\begin{center}\unitlength=9pt\small | |
1063 | +\begin{picture}(15,12)(-1,-3) | |
1064 | + | |
1065 | +\color{grayx}% real glyph | |
1066 | +\put(-1,-1.5){\vrule width 6\unitlength height 7\unitlength depth 2.5\unitlength} | |
1067 | + | |
1068 | +\color{black}% real glyph :step1 | |
1069 | +\thicklines | |
1070 | +\put(-1,-1.5){\line(0,1){7}\line(0,-1){2.5}} | |
1071 | +\put(5,-1.5){\line(0,1){7}\line(0,-1){2.5}} | |
1072 | +\put(-1,5.5){\line(1,0){6}} | |
1073 | +\put(-1,-4){\line(1,0){6}} | |
1074 | +\put(-1,0){\makebox(0,0)[r]{\strut$R$\,}} | |
1075 | + | |
1076 | +\thicklines | |
1077 | +\put(0,0){\vector(0,1){9}\line(0,-1){3}\vector(1,0){12}} | |
1078 | +\put(12,9){\makebox(0,0)[rt]{\strut$M$\,}} | |
1079 | +\put(12,0){\line(0,1){9}\vector(0,-1){3}} | |
1080 | +\put(0,9){\line(1,0){12}} | |
1081 | +\put(0,-3){\line(1,0){12}} | |
1082 | +\put(0.2,4.5){\makebox(0,0)[l]{\texttt{height}}} | |
1083 | +\put(12.2,-1.5){\makebox(0,0)[l]{\texttt{depth}}} | |
1084 | +\put(6,0.2){\makebox(0,0)[b]{\texttt{width}}} | |
1085 | + | |
1086 | +\thicklines | |
1087 | +\put(3,0){\line(0,1){7}\line(0,-1){2.5}\line(1,0){6}} | |
1088 | +\put(9,0){\line(0,1){7}\line(0,-1){2.5}} | |
1089 | +\put(3,7){\line(1,0){6}} | |
1090 | +\put(3,-2.5){\line(1,0){6}} | |
1091 | +\newsavebox{\eqdist} | |
1092 | +\savebox{\eqdist}(0,0)[c]{% | |
1093 | + \thinlines | |
1094 | + \put(-0.08,0.2){\line(0,-1){0.4}}% | |
1095 | + \put(0.08,0.2){\line(0,-1){0.4}}} | |
1096 | +\put(1.5,0){\usebox{\eqdist}} | |
1097 | +\put(10.5,0){\usebox{\eqdist}} | |
1098 | + | |
1099 | +\thicklines | |
1100 | +\put(3,-1.5){\vector(-1,0){4}} | |
1101 | +\put(1,-1.7){\makebox(0,0)[t]{\texttt{left}}} | |
1102 | +\put(3,0){\vector(0,-1){1.5}} | |
1103 | +\put(3.2,-0.75){\makebox(0,0)[l]{\texttt{down}}} | |
1104 | +\end{picture} | |
1105 | +\end{center} | |
1106 | +\caption{The position of the `real' glyph.} | |
1107 | +\label{fig-pos} | |
1108 | +\end{figure} | |
1109 | + | |
1110 | +Figure~\ref{fig-pos} shows the adjustment process. A large square $M$ is | |
1111 | +the imaginary body specified in the metric, and a vertical | |
1112 | +rectangle is the imaginary body of a real glyph. First, the real glyph | |
1113 | +is aligned with respect to the width of $M$. In the figure, the real | |
1114 | +glyph is aligned `middle'; this setting is useful for the full-width | |
1115 | +middle dot `・'. We have other settings, `left' and `right'. | |
1116 | +After that, it is shifted according to the values of |left| and |down|, | |
1117 | +which are specified in the metric, too. The final position of the real glyph | |
1118 | +is shown by the gray rectangle~$R$. If the amount of shifting the baseline is | |
1119 | +not zero, $M$ (and hence the real glyph) is shifted by that amount. | |
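| + | |
| +For the `middle' setting, and ignoring any baseline shift, the adjustment | |
| +can be written out as follows, where $w_M$ and $w_R$ denote the widths of | |
| +$M$ and of the real glyph, and $(x, y)$ is the position of the reference | |
| +point of~$R$ relative to that of~$M$ (these symbols are introduced here | |
| +only for illustration): | |
| +\[ | |
| + x = \frac{w_M - w_R}{2} - \texttt{left}, \qquad y = -\texttt{down}. | |
| +\] | |
| +With the dimensions drawn in Figure~\ref{fig-pos} ($w_M = 12$, $w_R = 6$, | |
| +$\texttt{left} = 4$ and $\texttt{down} = 1.5$, all in units of the picture), | |
| +this gives $x = 3 - 4 = -1$ and $y = -1.5$. | |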
1120 | + | |
1121 | +We would like to remark briefly on the vertical position of a real | |
1122 | +glyph. A JFM (or a metric used in \LuaTeX-ja) and a real font used for | |
1123 | +it may have different height or depth. In that case, it may look better | |
1124 | +if the real glyph is shifted vertically to match the height-depth ratio | |
1125 | +specified in the metric. However, no vertical adjustment other than the | |
1126 | +adjustment by the |down| value is performed in the present | |
1127 | +implementation of \LuaTeX-ja. This situation is carefully studied by | |
1128 | +Otobe~\cite{min10}. The policy on this problem has not yet been | |
1129 | +determined, but we would like to offer several solutions in future | |
1130 | +development. | |
1131 | + | |
1132 | +\section{Conclusion} | |
1133 | +We have discussed our \LuaTeX-ja package, which is strongly influenced | |
1134 | +by \pTeX. For now, it can be used experimentally; however, | |
1135 | +many refinements are still needed for regular use. The author hopes | |
1136 | +that this paper and the \LuaTeX-ja project contribute to the typesetting of Japanese, | |
1137 | +and possibly other Asian languages, under \LuaTeX. | |
1138 | + | |
1139 | +\section*{Acknowledgements} | |
1140 | +The author would like to thank Ken Nakano and Hideaki Togashi for their | |
1141 | +development of ASCII \pTeX. The author is very grateful to Haruhiko | |
1142 | +Okumura for his leadership in the Japanese \TeX\ community. The author | |
1143 | +is also very grateful to the members of the \LuaTeX-ja project team for their | |
1144 | +valuable cooperation in development. | |
1145 | + | |
1146 | +%%% The style of the bibliography is `amsplain'. | |
1147 | +\providecommand{\bysame}{\leavevmode\hbox to3em{\hrulefill}\thinspace} | |
1148 | +\providecommand{\href}[2]{#2} | |
1149 | +\begin{thebibliography}{99} | |
1150 | + | |
1151 | +\bibitem{aj16} | |
1152 | +Adobe Systems Incorporated, \emph{Adobe-Japan1-6 Character Collection | |
1153 | + for CID-Keyed Fonts}, Technical Note~\#5078, 2004. | |
1154 | +\url{http://partners.adobe.com/public/developer/en/font/5078.Adobe-Japan1-6.pdf} | |
1155 | + | |
1156 | +\bibitem{ptex} | |
1157 | +ASCII MEDIA WORKS,アスキー日本語\TeX\ (\pTeX). \url{http://ascii.asciimw.jp/pb/ptex/} | |
1158 | + | |
1159 | +\bibitem{apl} | |
1160 | +John Baker, \emph{Typesetting UTF8 APL code with the \LaTeX\ lstlisting package}. | |
1161 | +\url{http://bakerjd99.wordpress.com/2011/08/15/} | |
1162 | + | |
1163 | +\bibitem{omega} | |
1164 | +Jin-Hwan~Cho and Haruhiko Okumura, \emph{Typesetting CJK Languages with Omega}, | |
1165 | +\TeX, XML, and Digital Typography, Lecture Notes in Computer Science, vol.~3130, | |
1166 | +Springer, 2004, 139--148. | |
1167 | + | |
1168 | +\bibitem{joylua} | |
1169 | +Yannis Haralambous, \emph{The Joy of \LuaTeX}. \url{http://luatex.bluwiki.com/} | |
1170 | + | |
1171 | +\bibitem{jisx4051} | |
1172 | +Japanese Industrial Standards Committee, \emph{JIS~X~4051: Formatting | |
1173 | + rules for Japanese documents}, 1993, 1995, 2004. | |
1174 | + | |
1175 | +\bibitem{eptex} | |
1176 | +北川弘典,$\varepsilon$-\pTeX についてのwiki. | |
1177 | +\url{http://sourceforge.jp/projects/eptex/wiki/FrontPage} | |
1178 | + | |
1179 | +\bibitem{luaums} | |
1180 | +北川弘典,\LuaTeX で日本語. | |
1181 | +\url{http://oku.edu.mie-u.ac.jp/tex/mod/forum/discuss.php?d=378} | |
1182 | + | |
1183 | +\bibitem{luatexref} | |
1184 | +\LuaTeX\ development team, \emph{The \LuaTeX\ reference}. | |
1185 | +\url{http://www.luatex.org/svn/trunk/manual/luatexref-t.pdf} (snapshot of SVN trunk) | |
1186 | + | |
1187 | +\bibitem{man} | |
1188 | +\LuaTeX-ja project team, \emph{The \LuaTeX-ja package}. | |
1189 | +Not yet complete. Available as |doc/man-en.pdf| (in English) or | |
1190 | + |doc/man-ja.pdf| (in Japanese) | |
1191 | +in the Git repository. | |
1192 | + | |
1193 | +\bibitem{luajp-test} | |
1194 | +香田温人,\LuaTeX と日本語. | |
1195 | +\url{http://www1.pm.tokushima-u.ac.jp/~kohda/tex/luatex-old.html} | |
1196 | + | |
1197 | +\bibitem{luajalayout} | |
1198 | +前田一貴,luajalayout パッケージ---Lua\LaTeX によ | |
1199 | + る日本語組版---. | |
1200 | +\url{http://www-is.amp.i.kyoto-u.ac.jp/lab/kmaeda/lualatex/luajalayout/} | |
1201 | + | |
1202 | +\bibitem{jsclasses} | |
1203 | +奥村晴彦,p\LaTeXe 新ドキュメントクラス. | |
1204 | +\url{http://oku.edu.mie-u.ac.jp/~okumura/jsclasses/} | |
1205 | + | |
1206 | +\bibitem{ptexjp} | |
1207 | +Haruhiko Okumura, \emph{\pTeX\ and Japanese Typesetting}, | |
1208 | + The Asian Journal of \TeX\ \textbf{2}~(2008), 43--51. | |
1209 | + | |
1210 | +\bibitem{min10} | |
1211 | +乙部厳己,min10フォントについて. | |
1212 | +\url{http://argent.shinshu-u.ac.jp/~otobe/tex/files/min10.pdf} | |
1213 | + | |
1214 | +\bibitem{otf} | |
1215 | +齋藤修三郎,Open Type Font用VF. | |
1216 | +\url{http://psitau.kitunebi.com/otf.html} | |
1217 | + | |
1218 | +\bibitem{stack-mail} | |
1219 | +Jonathan Sauer, \emph{[Dev-luatex] tex.currentgrouplevel}. | |
1220 | +\url{http://www.ntg.nl/pipermail/dev-luatex/2008-August/001765.html} | |
1221 | + | |
1222 | +\bibitem{uptex} | |
1223 | +Takuji Tanaka, \emph{u\pTeX, up\LaTeX---unicode version of \pTeX, p\LaTeX}. | |
1224 | +\url{http://homepage3.nifty.com/ttk/comp/tex/uptex_en.html} | |
1225 | + | |
1226 | +\bibitem{ptexenc} | |
1227 | +Nobuyuki Tsuchimura and Yusuke Kuroki, \emph{Development of Japanese \TeX\ Environment}, | |
1228 | + The Asian Journal of \TeX\ \textbf{2}~(2008), 53--62. | |
1229 | + | |
1230 | +\bibitem{w3c} | |
1231 | +W3C Working Group, \emph{Requirements for Japanese Text Layout}. | |
1232 | +\url{http://www.w3.org/TR/jlreq/} | |
1233 | +\end{thebibliography} | |
1234 | + | |
1235 | +\end{document} |