argra****@users*****
Thu, 23 Apr 2009 22:53:42 JST
Index: docs/modules/re-0.08/re.pod
diff -u /dev/null docs/modules/re-0.08/re.pod:1.1
--- /dev/null Thu Apr 23 22:53:42 2009
+++ docs/modules/re-0.08/re.pod Thu Apr 23 22:53:42 2009
@@ -0,0 +1,814 @@
+
+=encoding euc-jp
+
+=head1 NAME
+
+=begin original
+
+re - Perl pragma to alter regular expression behaviour
+
+=end original
+
+re - Perl pragma for changing the behaviour of regular expressions
+
+=head1 SYNOPSIS
+
+=begin original
+
+    use re 'taint';
+    ($x) = ($^X =~ /^(.*)$/s);     # $x is tainted here
+
+=end original
+
+    use re 'taint';
+    ($x) = ($^X =~ /^(.*)$/s);     # $x is tainted here
+
+=begin original
+
+    $pat = '(?{ $foo = 1 })';
+    use re 'eval';
+    /foo${pat}bar/;                # won't fail (when not under -T switch)
+
+=end original
+
+    $pat = '(?{ $foo = 1 })';
+    use re 'eval';
+    /foo${pat}bar/;                # does not fail (without the -T switch)
+
+=begin original
+
+    {
+        no re 'taint';             # the default
+        ($x) = ($^X =~ /^(.*)$/s); # $x is not tainted here
+
+=end original
+
+    {
+        no re 'taint';             # the default
+        ($x) = ($^X =~ /^(.*)$/s); # $x is not tainted here
+
+=begin original
+
+        no re 'eval';              # the default
+        /foo${pat}bar/;            # disallowed (with or without -T switch)
+    }
+
+=end original
+
+        no re 'eval';              # the default
+        /foo${pat}bar/;            # disallowed (with or without the -T switch)
+    }
+
+=begin original
+
+    use re 'debug';                # output debugging info during
+    /^(.*)$/s;                     # compile and run time
+
+=end original
+
+    use re 'debug';                # output debugging info at
+    /^(.*)$/s;                     # compile time and run time
+
+
+=begin original
+
+    use re 'debugcolor';           # same as 'debug', but with colored output
+    ...
+
+=end original
+
+    use re 'debugcolor';           # same as 'debug', but with colored output
+    ...
+
+    use re qw(Debug All);          # Finer tuned debugging options.
+    use re qw(Debug More);
+    no re qw(Debug ALL);           # Turn off all re debugging in this scope
+
+    use re qw(is_regexp regexp_pattern); # import utility functions
+    my ($pat,$mods)=regexp_pattern(qr/foo/i);
+    if (is_regexp($obj)) {
+        print "Got regexp: ",
+            scalar regexp_pattern($obj); # just as perl would stringify it
+    }                                    # but no hassle with blessed re's.
+
+=begin original
+
+(We use $^X in these examples because it's tainted by default.)
+
+=end original
+
+(These examples use $^X because it is tainted by default.)
+
+=head1 DESCRIPTION
+
+=head2 'taint' mode
+
+=begin original
+
+When C<use re 'taint'> is in effect, and a tainted string is the target
+of a regex, the regex memories (or values returned by the m// operator
+in list context) are tainted. This feature is useful when regex operations
+on tainted data aren't meant to extract safe substrings, but to perform
+other transformations.
+
+=end original
+
+When C<use re 'taint'> is in effect and a tainted string is the
+target of a regex, the regex memories (or the values returned by the
+m// operator in list context) are tainted.
+This feature is useful when regex operations on tainted data are not meant
+to extract safe substrings but to perform other transformations.
+
+=head2 'eval' mode
+
+=begin original
+
+When C<use re 'eval'> is in effect, a regex is allowed to contain
+C<(?{ ... })> zero-width assertions even if the regular expression contains
+variable interpolation. That is normally disallowed, since it is a
+potential security risk. Note that this pragma is ignored when the regular
+expression is obtained from tainted data, i.e. evaluation is always
+disallowed with tainted regular expressions. See L<perlre/(?{ code })>.
+
+=end original
+
+When C<use re 'eval'> is in effect, a regex may contain
+C<(?{ ... })> zero-width assertions even if it contains variable interpolation.
+This is normally disallowed because it is a potential security risk.
+Note that this pragma is ignored when the regular expression comes from
+tainted data.
+In other words, evaluating a tainted regular expression is never allowed.
+See L<perlre/(?{ code })>.
+
+=begin original
+
+For the purpose of this pragma, interpolation of precompiled regular
+expressions (i.e., the result of C<qr//>) is I<not> considered variable
+interpolation. Thus:
+
+=end original
+
+For the purposes of this pragma, interpolation of a precompiled regular
+expression (that is, the result of C<qr//>) is I<not> considered
+variable interpolation.
+Thus:
+
+    /foo${pat}bar/
+
+=begin original
+
+I<is> allowed if $pat is a precompiled regular expression, even
+if $pat contains C<(?{ ... })> assertions.
+
+=end original
+
+I<is> allowed if $pat is a precompiled regular expression, even if $pat
+contains C<(?{ ... })> assertions.
+
+=head2 'debug' mode
+
+=begin original
+
+When C<use re 'debug'> is in effect, perl emits debugging messages when
+compiling and using regular expressions. The output is the same as that
+obtained by running a C<-DDEBUGGING>-enabled perl interpreter with the
+B<-Dr> switch. It may be quite voluminous depending on the complexity
+of the match. Using C<debugcolor> instead of C<debug> enables a
+form of output that can be used to get a colorful display on terminals
+that understand termcap color sequences. Set C<$ENV{PERL_RE_TC}> to a
+comma-separated list of C<termcap> properties to use for highlighting
+strings on/off, pre-point part on/off.
+See L<perldebug/"Debugging regular expressions"> for additional info.
+
+=end original
+
+When C<use re 'debug'> is in effect, perl emits debugging messages when
+compiling and using regular expressions.
+The output is the same as that obtained by running a perl interpreter
+built with C<-DDEBUGGING> with the B<-Dr> switch.
+It can be very voluminous, depending on the complexity of the match.
+Using C<debugcolor> instead of C<debug> produces colorful output on
+terminals by means of termcap color sequences.
+Set C<$ENV{PERL_RE_TC}> to a comma-separated list of C<termcap> properties
+to use for highlighting strings on/off and the pre-point part on/off.
+See L<perldebug/"Debugging regular expressions"> for more
+information.
+
+=begin original
+
+As of 5.9.5 the directive C<use re 'debug'> and its equivalents are
+lexically scoped, as the other directives are. However they have both
+compile-time and run-time effects.
+
+=end original
+
+As of 5.9.5, the directive C<use re 'debug'> and its equivalents are
+lexically scoped, like the other directives.
+However, they have both compile-time and run-time effects.
+
+=begin original
+
+See L<perlmodlib/Pragmatic Modules>.
+
+=end original
+
+See L<perlmodlib/Pragmatic Modules>.
+
+=head2 'Debug' mode
+
+=begin original
+
+Similarly C<use re 'Debug'> produces debugging output, the difference
+being that it allows the fine tuning of what debugging output will be
+emitted. Options are divided into three groups, those related to
+compilation, those related to execution and those related to special
+purposes. The options are as follows:
+
+=end original
+
+Similarly C<use re 'Debug'> produces debugging output, the difference
+being that it allows the fine tuning of what debugging output will be
+emitted. Options are divided into three groups, those related to
+compilation, those related to execution and those related to special
+purposes. The options are as follows:
+(TBT)
+
+=over 4
+
+=item Compile related options
+
+=over 4
+
+=item COMPILE
+
+=begin original
+
+Turns on all compile related debug options.
+
+=end original
+
+Turns on all compile related debug options.
+(TBT)
+
+=item PARSE
+
+=begin original
+
+Turns on debug output related to the process of parsing the pattern.
+
+=end original
+
+Turns on debug output related to the process of parsing the pattern.
+(TBT)
+
+=item OPTIMISE
+
+=begin original
+
+Enables output related to the optimisation phase of compilation.
+
+=end original
+
+Enables output related to the optimisation phase of compilation.
+(TBT)
+
+=item TRIEC
+
+=begin original
+
+Detailed info about trie compilation.
+
+=end original
+
+Detailed info about trie compilation.
+(TBT)
+
+=item DUMP
+
+=begin original
+
+Dump the final program out after it is compiled and optimised.
+
+=end original
+
+Dump the final program out after it is compiled and optimised.
+(TBT)
+
+=back
+
+=item Execute related options
+
+=over 4
+
+=item EXECUTE
+
+=begin original
+
+Turns on all execute related debug options.
+
+=end original
+
+Turns on all execute related debug options.
+(TBT)
+
+=item MATCH
+
+=begin original
+
+Turns on debugging of the main matching loop.
+
+=end original
+
+Turns on debugging of the main matching loop.
+(TBT)
+
+=item TRIEE
+
+=begin original
+
+Extra debugging of how tries execute.
+
+=end original
+
+Extra debugging of how tries execute.
+(TBT)
+
+=item INTUIT
+
+=begin original
+
+Enable debugging of start point optimisations.
+
+=end original
+
+Enable debugging of start point optimisations.
+(TBT)
+
+=back
+
+=item Extra debugging options
+
+=over 4
+
+=item EXTRA
+
+=begin original
+
+Turns on all "extra" debugging options.
+
+=end original
+
+Turns on all "extra" debugging options.
+(TBT)
+
+=item BUFFERS
+
+=begin original
+
+Enable debugging the capture buffer storage during match. Warning,
+this can potentially produce extremely large output.
+
+=end original
+
+Enable debugging the capture buffer storage during match. Warning,
+this can potentially produce extremely large output.
+(TBT)
+
+=item TRIEM
+
+=begin original
+
+Enable enhanced TRIE debugging. Enhances both TRIEE
+and TRIEC.
+
+=end original
+
+Enable enhanced TRIE debugging. Enhances both TRIEE
+and TRIEC.
+(TBT)
+
+=item STATE
+
+=begin original
+
+Enable debugging of states in the engine.
+
+=end original
+
+Enable debugging of states in the engine.
+(TBT)
+
+=item STACK
+
+=begin original
+
+Enable debugging of the recursion stack in the engine. Enabling
+or disabling this option automatically does the same for debugging
+states as well. The output from this can be quite large.
+
+=end original
+
+Enable debugging of the recursion stack in the engine. Enabling
+or disabling this option automatically does the same for debugging
+states as well. The output from this can be quite large.
+(TBT)
+
+=item OPTIMISEM
+
+=begin original
+
+Enable enhanced optimisation debugging and start point optimisations.
+Probably not useful except when debugging the regex engine itself.
+
+=end original
+
+Enable enhanced optimisation debugging and start point optimisations.
+Probably not useful except when debugging the regex engine itself.
+(TBT)
+
+=item OFFSETS
+
+=begin original
+
+Dump offset information. This can be used to see how regops correlate
+to the pattern. Output format is
+
+=end original
+
+Dump offset information. This can be used to see how regops correlate
+to the pattern. Output format is
+(TBT)
+
+    NODENUM:POSITION[LENGTH]
+
+=begin original
+
+Where 1 is the position of the first char in the string. Note that position
+can be 0, or larger than the actual length of the pattern; likewise, length
+can be zero.
+
+=end original
+
+Where 1 is the position of the first char in the string. Note that position
+can be 0, or larger than the actual length of the pattern; likewise, length
+can be zero.
+(TBT)
+
+=item OFFSETSDBG
+
+=begin original
+
+Enable debugging of offsets information. This emits copious
+amounts of trace information and doesn't mesh well with other
+debug options.
+
+=end original
+
+Enable debugging of offsets information. This emits copious
+amounts of trace information and doesn't mesh well with other
+debug options.
+(TBT)
+
+=begin original
+
+Almost definitely only useful to people hacking
+on the offsets part of the debug engine.
+
+=end original
+
+Almost definitely only useful to people hacking
+on the offsets part of the debug engine.
+(TBT)
+
+=back
+
+=item Other useful flags
+
+=begin original
+
+These are useful shortcuts to save on the typing.
+
+=end original
+
+These are useful shortcuts to save on the typing.
+(TBT)
+
+=over 4
+
+=item ALL
+
+=begin original
+
+Enable all options at once except OFFSETS, OFFSETSDBG and BUFFERS.
+
+=end original
+
+Enable all options at once except OFFSETS, OFFSETSDBG and BUFFERS.
+(TBT)
+
+=item All
+
+=begin original
+
+Enable DUMP and all execute options. Equivalent to:
+
+=end original
+
+Enable DUMP and all execute options. Equivalent to:
+(TBT)
+
+    use re 'debug';
+
+=item MORE
+
+=item More
+
+=begin original
+
+Enable TRIEM and all compile and execute options.
+
+=end original
+
+Enable TRIEM and all compile and execute options.
+(TBT)
+
+=back
+
+=back
+
+=begin original
+
+As of 5.9.5 the directive C<use re 'debug'> and its equivalents are
+lexically scoped, as the other directives are. However they have both
+compile-time and run-time effects.
+
+=end original
+
+As of 5.9.5 the directive C<use re 'debug'> and its equivalents are
+lexically scoped, as the other directives are. However they have both
+compile-time and run-time effects.
+(TBT)
+
+=head2 Exportable Functions
+
+=begin original
+
+As of perl 5.9.5 're' debug contains a number of utility functions that
+may be optionally exported into the caller's namespace. They are listed
+below.
+
+=end original
+
+As of perl 5.9.5 're' debug contains a number of utility functions that
+may be optionally exported into the caller's namespace. They are listed
+below.
+(TBT)
+
+=over 4
+
+=item is_regexp($ref)
+
+=begin original
+
+Returns true if the argument is a compiled regular expression as returned
+by C<qr//>, false if it is not.
+
+=end original
+
+Returns true if the argument is a compiled regular expression as returned
+by C<qr//>, false if it is not.
+(TBT)
+
+=begin original
+
+This function will not be confused by overloading or blessing. In
+internals terms, this extracts the regexp pointer out of the
+PERL_MAGIC_qr structure so it cannot be fooled.
+
+=end original
+
+This function will not be confused by overloading or blessing. In
+internals terms, this extracts the regexp pointer out of the
+PERL_MAGIC_qr structure so it cannot be fooled.
+(TBT)
+
+=item regexp_pattern($ref)
+
+=begin original
+
+If the argument is a compiled regular expression as returned by C<qr//>,
+then this function returns the pattern.
+
+=end original
+
+If the argument is a compiled regular expression as returned by C<qr//>,
+then this function returns the pattern.
+(TBT)
+
+=begin original
+
+In list context it returns a two element list, the first element
+containing the pattern and the second containing the modifiers used when
+the pattern was compiled.
+
+=end original
+
+In list context it returns a two element list, the first element
+containing the pattern and the second containing the modifiers used when
+the pattern was compiled.
+(TBT)
+
+    my ($pat, $mods) = regexp_pattern($ref);
+
+=begin original
+
+In scalar context it returns the same as perl would when stringifying a raw
+C<qr//> with the same pattern inside. If the argument is not a compiled
+reference then this routine returns false but defined in scalar context,
+and the empty list in list context. Thus the following
+
+=end original
+
+In scalar context it returns the same as perl would when stringifying a raw
+C<qr//> with the same pattern inside. If the argument is not a compiled
+reference then this routine returns false but defined in scalar context,
+and the empty list in list context. Thus the following
+(TBT)
+
+    if (regexp_pattern($ref) eq '(?i-xsm:foo)')
+
+=begin original
+
+will be warning free regardless of what $ref actually is.
+
+=end original
+
+will be warning free regardless of what $ref actually is.
+(TBT)
+
+=begin original
+
+Like C<is_regexp> this function will not be confused by overloading
+or blessing of the object.
+
+=end original
+
+Like C<is_regexp> this function will not be confused by overloading
+or blessing of the object.
+(TBT)
+
+=item regmust($ref)
+
+=begin original
+
+If the argument is a compiled regular expression as returned by C<qr//>,
+then this function returns what the optimiser considers to be the longest
+anchored fixed string and longest floating fixed string in the pattern.
+
+=end original
+
+If the argument is a compiled regular expression as returned by C<qr//>,
+then this function returns what the optimiser considers to be the longest
+anchored fixed string and longest floating fixed string in the pattern.
+(TBT)
+
+=begin original
+
+A I<fixed string> is defined as being a substring that must appear for the
+pattern to match. An I<anchored fixed string> is a fixed string that must
+appear at a particular offset from the beginning of the match. A I<floating
+fixed string> is defined as a fixed string that can appear at any point in
+a range of positions relative to the start of the match. For example,
+
+=end original
+
+A I<fixed string> is defined as being a substring that must appear for the
+pattern to match. An I<anchored fixed string> is a fixed string that must
+appear at a particular offset from the beginning of the match. A I<floating
+fixed string> is defined as a fixed string that can appear at any point in
+a range of positions relative to the start of the match. For example,
+(TBT)
+
+    my $qr = qr/here .* there/x;
+    my ($anchored, $floating) = regmust($qr);
+    print "anchored:'$anchored'\nfloating:'$floating'\n";
+
+=begin original
+
+results in
+
+=end original
+
+results in
+(TBT)
+
+    anchored:'here'
+    floating:'there'
+
+=begin original
+
+Because the C<here> is before the C<.*> in the pattern, its position
+can be determined exactly. That's not true, however, for the C<there>;
+it could appear at any point after where the anchored string appeared.
+Perl uses both for its optimisations, preferring the longer, or, if they are
+equal, the floating.
+
+=end original
+
+Because the C<here> is before the C<.*> in the pattern, its position
+can be determined exactly. That's not true, however, for the C<there>;
+it could appear at any point after where the anchored string appeared.
+Perl uses both for its optimisations, preferring the longer, or, if they are
+equal, the floating.
+(TBT)
+
+=begin original
+
+B<NOTE:> This may not necessarily be the definitive longest anchored and
+floating string. This will be what the optimiser of the Perl that you
+are using thinks is the longest. If you believe that the result is wrong
+please report it via the L<perlbug> utility.
+
+=end original
+
+B<NOTE:> This may not necessarily be the definitive longest anchored and
+floating string. This will be what the optimiser of the Perl that you
+are using thinks is the longest. If you believe that the result is wrong
+please report it via the L<perlbug> utility.
+(TBT)
+
+=item regname($name,$all)
+
+=begin original
+
+Returns the contents of a named buffer of the last successful match. If
+$all is true, then returns an array ref containing one entry per buffer,
+otherwise returns the first defined buffer.
+
+=end original
+
+Returns the contents of a named buffer of the last successful match. If
+$all is true, then returns an array ref containing one entry per buffer,
+otherwise returns the first defined buffer.
+(TBT)
+
+=item regnames($all)
+
+=begin original
+
+Returns a list of all of the named buffers defined in the last successful
+match. If $all is true, then it returns all names defined, if not it returns
+only names which were involved in the match.
+
+=end original
+
+Returns a list of all of the named buffers defined in the last successful
+match. If $all is true, then it returns all names defined, if not it returns
+only names which were involved in the match.
+(TBT)
+
+=item regnames_count()
+
+=begin original
+
+Returns the number of distinct names defined in the pattern used
+for the last successful match.
+
+=end original
+
+Returns the number of distinct names defined in the pattern used
+for the last successful match.
+(TBT)
+
+=begin original
+
+B<Note:> this result is always the actual number of distinct
+named buffers defined, it may not actually match that which is
+returned by C<regnames()> and related routines when those routines
+have not been called with the $all parameter set.
+
+=end original
+
+B<Note:> this result is always the actual number of distinct
+named buffers defined, it may not actually match that which is
+returned by C<regnames()> and related routines when those routines
+have not been called with the $all parameter set.
+(TBT)
+
+=back
+
+=head1 SEE ALSO
+
+L<perlmodlib/Pragmatic Modules>.
+
+=begin meta
+
+Created: KIMURA Koichi (0.04)
+Updated: Kentaro Shirakata <argra****@ub32*****> (0.08)
+
+=end meta
+
+=cut
+
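The exportable functions documented in the diff above can be tried out with a short script. This is only a minimal sketch and not part of the commit: it assumes a perl new enough to ship these functions (5.9.5 or later, per the text above), and C<My::Pattern> is a hypothetical class name chosen purely for illustration.

```perl
use strict;
use warnings;
use re qw(is_regexp regexp_pattern);

# My::Pattern is a hypothetical class, used only to show that
# blessing does not hide the underlying regexp.
my $plain   = qr/foo/i;
my $blessed = bless qr/foo/i, 'My::Pattern';

# is_regexp() inspects the underlying regexp, not the class name.
my $plain_ok   = is_regexp($plain)   ? 1 : 0;
my $blessed_ok = is_regexp($blessed) ? 1 : 0;
my $string_ok  = is_regexp('foo')    ? 1 : 0;

# In list context regexp_pattern() returns the pattern and its modifiers.
my ($pat, $mods) = regexp_pattern($blessed);

print "plain=$plain_ok blessed=$blessed_ok string=$string_ok\n";
print "pattern=$pat modifiers=$mods\n";
```

The list-context form of regexp_pattern is used here because, as the documentation above notes, the scalar form returns whatever perl's own stringification of the C<qr//> object would be, which makes it less convenient to compare against a fixed string.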