I have the following test:
use Test::More;
use Lingua::EN::NameCase 'nc';
use utf8;
my $output = Test::Builder->new->todo_output;
binmode $output, ':encoding(UTF-8)';
$output = Test::Builder->new->failure_output;
binmode $output, ':encoding(UTF-8)';
my $name = 'Lintão';
is nc($name), $name, 'nc() should not change a properly namecased name';
diag nc($name);
done_testing;
On Mac OS X with Perl 5.10.1, I get the following output:
nc.t ..
ok 1 - nc() should not change a properly namecased name
1..1
# Lintão
ok
All tests successful.
Files=1, Tests=1, 0 wallclock secs ( 0.02 usr 0.01 sys + 0.04 cusr 0.00 csys = 0.07 CPU)
Result: PASS
Unfortunately, the same test run with Perl 5.10.1 on Debian Squeeze produces the following output:
nc.t ..
not ok 1 - nc() should not change a properly namecased name
# Failed test 'nc() should not change a properly namecased name'
# at nc.t line 10.
# got: 'LintãO'
# expected: 'Lintão'
# LintãO
1..1
# Looks like you failed 1 test of 1.
Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/1 subtests
Test Summary Report
-------------------
nc.t (Wstat: 256 Tests: 1 Failed: 1)
Failed test: 1
Non-zero exit status: 1
Files=1, Tests=1, 0 wallclock secs ( 0.01 usr 0.00 sys + 0.03 cusr 0.00 csys = 0.04 CPU)
Result: FAIL
The offending line in the nc() subroutine appears to be this one:
s{ \b (\w) }{\u$1}gox ; # Uppercase first letter of every word.
So somehow the same version of Perl, when run on Debian, interprets the word boundary (\b) differently. Can anyone help me debug this further?
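One way to narrow this down is to take the substitution out of the module and run it on its own. The following is a minimal sketch, not the code from Lingua::EN::NameCase: the helper names upcase_unicode and upcase_ascii_only are made up for illustration, and the /a modifier (Perl 5.14 or later, so not available on 5.10.1) is only a stand-in for a platform that does not classify "ã" as a word character.

```perl
use 5.014;      # the /a modifier needs Perl 5.14 or later
use warnings;
use utf8;

# Hypothetical helper: the same substitution as in nc(), with Unicode
# rules for \w and \b (and without the /o flag, which is not needed here).
sub upcase_unicode {
    my ($s) = @_;
    $s =~ s{ \b (\w) }{\u$1}gx;
    return $s;
}

# Hypothetical helper: the same substitution, but /a restricts \w (and
# therefore \b) to ASCII, so "ã" is treated as a non-word character.
sub upcase_ascii_only {
    my ($s) = @_;
    $s =~ s{ \b (\w) }{\u$1}gax;
    return $s;
}

binmode STDOUT, ':encoding(UTF-8)';
say upcase_unicode('Lintão');       # Lintão
say upcase_ascii_only('Lintão');    # LintãO
```

When "ã" is not a word character, \b matches between "ã" and "o", so the trailing "o" is seen as the start of a new word and gets uppercased. That reproduces the 'LintãO' from the failing run, which suggests the difference lies in how \w classifies "ã" rather than in \b itself.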
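The locale on your Linux machine does not consider ã a word character. Lingua::EN::NameCase has use locale; in effect, so it uses the current LC_CTYPE setting for character classification. Using perls ranging from 5.8.1 to 5.18.1, I consistently get the following output on Ubuntu 12.04 LTS with an en_GB.UTF-8 locale: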
$ perl -Mutf8 -le 'print 0+("ã" =~ /\w/); use locale; print 0+("ã" =~ /\w/)'
1
0
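To compare the two machines directly, a small script along these lines could be run on both. It is a rough sketch: word_char_report is a hypothetical helper, not part of any module, and the printed values depend on the Perl version and the locale in the environment, so no particular output is claimed here.

```perl
use strict;
use warnings;
use utf8;
use POSIX qw(setlocale LC_CTYPE);

# Hypothetical helper (not part of Lingua::EN::NameCase): report whether
# $char matches \w without and with "use locale" in effect.
sub word_char_report {
    my ($char) = @_;
    my $plain = $char =~ /\w/ ? 1 : 0;
    my $with_locale;
    {
        use locale;    # lexically scoped: only this block consults LC_CTYPE
        $with_locale = $char =~ /\w/ ? 1 : 0;
    }
    return { plain => $plain, locale => $with_locale };
}

binmode STDOUT, ':encoding(UTF-8)';
my $ctype  = setlocale(LC_CTYPE);    # one-argument form queries the current setting
my $report = word_char_report('ã');
printf "LC_CTYPE=%s  \\w without locale: %d  \\w under use locale: %d\n",
    ( defined $ctype ? $ctype : '(unknown)' ),
    $report->{plain}, $report->{locale};
```

If the LC_CTYPE value or the last number differs between the Mac and the Debian box, that would support the locale explanation for the failing test.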