前言:
该阅读的书籍:
《Mastering Regular Expressions》, Jeffrey Friedl
《Perl Cookbook》,Tom Christiansen和Nathan Torkington
第一部分:概述
第一章:Perl概述
赋值:双引号可以用来变量内插和反斜扛内插,比如把\n解释成换行符;单引号取消内插;反引号用来执行外部程序并返回程序的输出。
举例:
$pi=3.14159265; #一个实数
$avocados=6.02e23; #科学计数法
$sign="I love my $pet"; #带内插的字符串
$thence=$whence; #另一个变量的值
$exit system("vi $file"); #一条命令的数字状态
$cwd=`pwd`; #从一个命令输出的字符串
标量也可以保存对其他数据结构的引用,包括子例程和对象。
$ary=\@myarray; #引用一个命名数组
$hah=\%myhash; #引用一个命名散列
$sub=\%mysub; #引用一个命名子例程
$ary=[1,2,3,4,5]; #引用一个未命名的数组
$fido=new Camel "Amelia"; #引用一个对象
如果使用的是未赋值的变量,perl会自动根据上下文的环境来确定变量的类型和初值:
$camels='123';
print $camels+1, "\n";
#最初变量camels是个字符串变量,但是遇到+1,perl会先转换成数值型变量,然后再转换成字符串显示出来。
数组赋值:
@home = ("couch", "chair", "table", "stove");
从数组给4个变量赋值:
($potato, $lift, $tennis, $pipe) = @home;
交换两个变量不需要中间值:
($alpha,$omega) = ($omega,$alpha);
数组是从0开始计数的,如果你想一次对一个数组元素进行赋值:
$home[0] = "couch";
$home[1] = "chair";
$home[2] = "table";
$home[3] = "stove";
散列是无序的,不象数组类似于一个堆栈,有开始也有结尾。因此构建散列时必须同时指定值和健。
%longday = ("Sun", "Sunday", "Mon", "Monday", "Tue", "Tuesday",
"Wed", "Wednesday", "Thu", "Thursday", "Fri",
"Friday", "Sat", "Saturday"); #非常难读
%longday = (
"Sun" => "Sunday",
"Mon" => "Monday",
"Tue" => "Tuesday",
"Wed" => "Wednesday",
"Thu" => "Thursday",
"Fri" => "Friday",
"Sat" => "Saturday",
); #非常清晰,"健" => "值"
散列单个赋值:
$wife{"Adam"} = "Eve";
复杂的数据结构:
$wife{"Jacob"} = ["Leah", "Rachel", "Bilhah", "Zilpah"]; #
等于下面4行
$wife{"Jacob"}[0] = "Leah";
$wife{"Jacob"}[1] = "Rachel";
$wife{"Jacob"}[2] = "Bilhah";
$wife{"Jacob"}[3] = "Zilpah";
$kids_of_wife{"Jacob"} = {
"Leah" => ["Reuben", "Simeon", "Levi", "Judah", "Issachar",
"Zebulun"],
"Rachel" => ["Joseph", "Benjamin"],
"Bilhah" => ["Dan", "Naphtali"],
"Zilpah" => ["Gad", "Asher"],
};
$kids_of_wife{"Jacob"}{"Leah"}[0] = "Reuben";
$kids_of_wife{"Jacob"}{"Leah"}[1] = "Simeon";
$kids_of_wife{"Jacob"}{"Leah"}[2] = "Levi";
$kids_of_wife{"Jacob"}{"Leah"}[3] = "Judah";
$kids_of_wife{"Jacob"}{"Leah"}[4] = "Issachar";
$kids_of_wife{"Jacob"}{"Leah"}[5] = "Zebulun";
$kids_of_wife{"Jacob"}{"Rachel"}[0] = "Joseph";
$kids_of_wife{"Jacob"}{"Rachel"}[1] = "Benjamin";
$kids_of_wife{"Jacob"}{"Bilhah"}[0] = "Dan";
$kids_of_wife{"Jacob"}{"Bilhah"}[1] = "Naphtali";
$kids_of_wife{"Jacob"}{"Zilpah"}[0] = "Gad";
$kids_of_wife{"Jacob"}{"Zilpah"}[1] = "Asher";
名字空间:建立自己的名字空间使得程序中使用的变量不会被混淆,比如:
package Camel;
$fido = &fetch(); # $fido的真实名字应该时$Camel::fido
&Camel::fetch
package Dog;
$fido = &fetch(); # $fido代表$Dog:fido &Dog:fetch
两个名字空间中的变量不会被混淆,但是同一时间只能有一个名字空间。
引用已经写好的模块:CPAN中有大量的已经写好的模块供使用。
use Camel;
use strict; #
严格的约束Perl中的一些规则,一般代码超过一屏最好使用它
执行单行perl程序:
perl -e 'print "Hello, world!\n";'
文件句柄:
open(SESAME, "filename") # read from existing file
open(SESAME, "<filename") # 同上,显式指定
open(SESAME, ">filename") # create file and write to it
open(SESAME, ">>filename") # append to existing file
open(SESAME, "| output-pipe-command") # set up an output filter
open(SESAME, "input-pipe-command |") # set up an input filter
钻石操作符:<STDIN>
这本书里面没有这么说,不过在入门的那本书确实这么形容的。
print STDOUT "Enter a number: "; # ask for a number
$number = <STDIN>; # input the number
print STDOUT "The number is $number.\n"; # print the number
chomp和chop:我认为只要记住chomp就可以了,因为chomp会自动去除结束标记,例如换行,而chop不管最后一个是什么字符,都会去掉。
chomp($number = <STDIN>); #输入数字并删除换行符
if和unless语句
if ($debug_level > 0) {
# Something has gone wrong. Tell the user.
print "Debug: Danger, Will Robinson, danger!\n";
print "Debug: Answer was '54', expected '42'.\n";
}
在perl中大括号是必须的。
if ($city eq "New York") {
print "New York is northeast of Washington, D.C.\n";
}
elsif ($city eq "Chicago") {
print "Chicago is northwest of Washington, D.C.\n";
}
elsif ($city eq "Miami") {
print "Miami is south of Washington, D.C. And much warmer!\n";
}
else {
print "I don't know where $city is, sorry.\n";
}
对的,这里是elsif
如果这不是真的,就做某事:
unless ($destination eq $home) {
print "I'm not going home.\n";
}
循环结构:while, until , for和foreach
while和until语句类似于if和unless,如果条件满足(while语句为真,until是假)
while ($tickets_sold < 10000) {
$available = 10000 - $tickets_sold;
print "$available tickets are available. How many would you like:
";
$purchase = <STDIN>;
chomp($purchase);
$tickets_sold += $purchase;
}
while (@ARGV) {
process(shift @ARGV);
}
#每次循环,shift操作符都从参数数组中删除一个元素(同时返回这个元素),当数组@ARGV用完后循环自动推出,这时数组长度为0。
for语句:和C语言中非常类似
for ($sold = 0; $sold < 10000; $sold += $purchase) {
$available = 10000 - $sold;
print "$available tickets are available. How many would you like:
";
$purchase = <STDIN>;
chomp($purchase);
}
foreach语句:条件表达式是在列表环境,而不实标量环境中。循环变量直接指向元素本身,修改循环变量,就是修改原始数组,foreach比for用的多的多。
foreach $user (@users) {
if (-f "$home{$user}/.nexrc") {
print "$user is cool... they use a perl-aware vi!\n";
}
}
跳出控制结构:next和last
next操作符跳至本次循环的结束
last操作符跳至整个循环的结束
foreach $user (@users) {
if ($user eq "root" or $user eq "lp") {
next;
}
if ($user eq "special") {
print "Found the special account.\n";
# do some processing
last;
}
}
一些正则表达式的例子:
while ($line = <FILE>) {
if ($line =~ /http:/) {
print $line;
}
}
while (<FILE>) {
print if /http:/;
} #和上面的例子想同,但更简单
匹配最少7个数字,最多11个数字:\d{7,11}
匹配7个数字:\d{7}
匹配大于等于7的数字:\d{7, }
+代表{1, }
*代表{0, }
?代表{0,1}
/bam{2}/匹配"bamm"
/(bam){2}/匹配"bambam"
最小匹配:一般的匹配都会进行贪婪匹配,匹配到最右边,可以强制指定第一个匹配。
/.*?:/ 将停止在第一个冒号,而不是最后一个冒号
\b为单词边界,/\bFred\b/将匹配"The Great
Fred"但不匹配"Frederick"因为d后面的字符是单词字符
因此前面所说的/\d{7,11}/
并不排除11位后面的数字,因此需要在量词两头实用锚点,如\b
/<(.*?)>.*?<\/\1>/ 将匹配"<B>Blod</B>"
互换两个词: s/(\S+)\s+(\S+)/$2 $1/
@days Array containing ($days[0], $days[1],... $days[n])
@days[3, 4, 5] Array slice containing ($days[3], $days[4], $days[5])
@days[3..5] Array slice containing ($days[3], $days[4], $days[5])
@days{'Jan','Feb'} Hash slice containing ($days{'Jan'},$days{'Feb'})
%days (Jan => 31, Feb => $leap ? 29 : 28, ...)
名字查找,也就是变量名的搜索,perl会预先在词法作用域里查找,文件是最大的词法作用域,如果找不到则代表这个变量是一个package的声明。不管多大的perl都编译成至少一个词法作用域,一个包作用域。
数字直接量:可以用下划线提高可读性,另外可以表示8,16进制的数
$x = 12345; # integer
$x = 12345.67; # floating point
$x = 6.02e23; # scientific notation
$x = 4_294_967_296; # underline for legibility
$x = 0377; # octal
$x = 0xffff; # hexadecimal
$x = 0b1100_0000; # binary
其他字符来替代'和''
常用 通用 含义 可否内插
' ' q/ / 直接量字符串 否
" " qq/ / 直接量字符串 是
` ` qx/ / 执行命令 是
( ) qw/ / 单词列表 否
/ / m/ / 模式匹配 是
s/ / / s/ / / 模式替换 是
y/ / / tr/ / / 字符转换 否
" " qr/ / 正则表达式 是
列表中通用的一列后面的/
/都可以用成对的其他非子字母数字来代替,非常方便。
例如:
$single = q!I said, "You said, 'she said it.'"!;
#这里就用了!来做为定界符替代/
使用use strict 'subs'; 来禁止bare
word,一旦出现,则编译出错。如
print hello, ' ', world, "\n"; #系统将会报错
here-document: 和shell中的用法查不多
print <<EOF;
The price is $Price.
EOF
print <<"EOF"; # 和上面的一样
The price is $Price.
EOF
print <<"dromedary", <<"camelid"; # 可以嵌套
I said bactrian.
dromedary
She said llama.
camelid
funkshun(<<"THIS", 23, <<'THAT'); # doesn't matter if they're in
parens
Here's a line
or two.
THIS
And here's another.
THAT
print << x 10; #
打印下面一行十次,注意后面要留空格或者空行
The camels are coming! Hurrah! Hurrah!
print <<"" x 10; #
比上面好的写法,注意后面要留空格或者空行
The camels are coming! Hurrah! Hurrah!
如果你的here-document在其他代码中是缩进的,可以收工从每行删除开头的空白;
($quote = <<'QUOTE') =~ s/^\s+//gm;
The Road goes ever on and on,
down from the door where it began.
QUOTE