Ruby's syntax is really a bit odd (1): pack and unpack

volve...@gmail.com

Oct 26, 2006, 11:12:35 PM

to 闲敲棋子落灯花

a: ASCII string (pack pads with null characters / unpack keeps trailing null characters and spaces)

["abc"].pack("a") => "a"
["abc"].pack("a*") => "abc"
["abc"].pack("a4") => "abc\0"

"abc\0".unpack("a4") => ["abc\0"]
"abc ".unpack("a4") => ["abc "]

A: ASCII string (pack pads with spaces / unpack strips trailing null characters and spaces)

["abc"].pack("A") => "a"
["abc"].pack("A*") => "abc"
["abc"].pack("A4") => "abc "

"abc ".unpack("A4") => ["abc"]
"abc\0".unpack("A4") => ["abc"]

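Z: null-terminated string (pack works like a, except that Z* appends a null character / unpack drops the trailing null characters)

["abc"].pack("Z") => "a"
["abc"].pack("Z*") => "abc\0"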
["abc"].pack("Z4") => "abc\0"

"abc\0".unpack("Z4") => ["abc"]
"abc ".unpack("Z4") => ["abc "]

b: bit string (low-order bit first)

"\001\002".unpack("b*") => ["1000000001000000"]
"\001\002".unpack("b3") => ["100"]

["1000000001000000"].pack("b*") => "\001\002"

B: bit string (high-order bit first)

"\001\002".unpack("B*") => ["0000000100000010"]
"\001\002".unpack("B9") => ["000000010"]

["0000000100000010"].pack("B*") => "\001\002"
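
A small sketch of why the low-order-first form can be handy: with "b*", character i of the result is simply bit i of the data, counting from the lowest bit of the first byte. The helper names bit_set? and count_set_bits are hypothetical.

```ruby
# Test a single bit by its index in the "b*" string.
def bit_set?(bytes, index)
  bytes.unpack("b*").first[index] == "1"
end

# Count the 1 bits; the bit order does not matter here.
def count_set_bits(bytes)
  bytes.unpack("B*").first.count("1")
end

flags = [1, 2].pack("C*")   # same bytes as "\001\002"
puts bit_set?(flags, 0)     # bit 0 of the first byte
puts bit_set?(flags, 9)     # bit 1 of the second byte
puts count_set_bits(flags)
```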

h: hex string (low nibble first)

"\x01\xfe".unpack("h*") => ["10ef"]
"\x01\xfe".unpack("h3") => ["10e"]

["10ef"].pack("h*") => "\001\376"

H: hex string (high nibble first)

"\x01\xfe".unpack("H*") => ["01fe"]
"\x01\xfe".unpack("H3") => ["01f"]

["01fe"].pack("H*") => "\001\376"
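
The H directive is enough for a quick hex dump. The following is only a sketch; hex_dump and from_hex are names invented here, not part of Ruby.

```ruby
# Render bytes as space-separated hex pairs, high nibble first.
def hex_dump(bytes)
  bytes.unpack("H*").first.scan(/../).join(" ")
end

# Reverse of hex_dump: drop the spaces and pack the digits back into bytes.
def from_hex(text)
  [text.delete(" ")].pack("H*")
end

data = [1, 254].pack("C*")            # same bytes as "\x01\xfe"
puts hex_dump(data)
puts from_hex(hex_dump(data)) == data
```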

c: char (8-bit signed integer)

"\001\376".unpack("c*") => [1, -2]

[1, -2].pack("c*") => "\001\376"
[1, 254].pack("c*") => "\001\376"

C: unsigned char (8-bit unsigned integer)

"\001\376".unpack("C*") => [1, 254]

[1, -2].pack("C*") => "\001\376"
[1, 254].pack("C*") => "\001\376"
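
As the examples above suggest, c and C write the same byte when packing and differ only in how unpack reads it. A rough sketch of using that to convert between the two views (the method names are hypothetical):

```ruby
# pack keeps only the low 8 bits; unpack decides how to interpret them.
def to_signed_byte(number)
  [number].pack("C").unpack("c").first
end

def to_unsigned_byte(number)
  [number].pack("c").unpack("C").first
end

puts to_signed_byte(254)
puts to_unsigned_byte(-2)
```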

s: short (16-bit signed integer, endian dependent) (s! is not fixed at 16 bits; it follows the size of the C short)

Little-endian:

"\001\002\376\375".unpack("s*") => [513, -514]

[513, 65022].pack("s*") => "\001\002\376\375"
[513, -514].pack("s*") => "\001\002\376\375"

Big-endian:

"\001\002\376\375".unpack("s*") => [258, -259]

[258, 65277].pack("s*") => "\001\002\376\375"
[258, -259].pack("s*") => "\001\002\376\375"

S: unsigned short (16-bit unsigned integer, endian dependent) (S! is not fixed at 16 bits; it follows the size of the C short)

Little-endian:

"\001\002\376\375".unpack("S*") => [513, 65022]

[513, 65022].pack("S*") => "\001\002\376\375"
[513, -514].pack("S*") => "\001\002\376\375"

Big-endian:

"\001\002\376\375".unpack("S*") => [258, 65277]

[258, 65277].pack("S*") => "\001\002\376\375"
[258, -259].pack("S*") => "\001\002\376\375"
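
Since the results above depend on the host, it can help to ask the machine which case applies. This sketch compares the native S with the always-little-endian v, and measures how many bytes each directive takes. The helper names are hypothetical. The < and > modifiers are not covered in this post; the sketch assumes a Ruby new enough to have them (1.9.3 or later, as far as I know).

```ruby
# Compare the native 16-bit layout with the always-little-endian "v".
def native_byte_order
  [1].pack("S") == [1].pack("v") ? :little : :big
end

# Number of bytes a directive occupies on this machine.
def packed_size(directive)
  [0].pack(directive).bytesize
end

puts native_byte_order
%w[s s! i l l!].each { |d| puts "#{d}: #{packed_size(d)} bytes" }

# "<" and ">" force the byte order, whatever the host is.
bytes = [1, 2, 254, 253].pack("C*")   # same bytes as "\001\002\376\375"
p bytes.unpack("s<*")
p bytes.unpack("s>*")
```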

i: int (signed integer, depends on endianness and on the size of int)

Little-endian, 32-bit int:

"\001\002\003\004\377\376\375\374".unpack("i*") => [67305985, -50462977]

[67305985, 4244504319].pack("i*") => RangeError (on the Ruby of this era; current versions wrap the value and return the bytes shown on the next line)
[67305985, -50462977].pack("i*") => "\001\002\003\004\377\376\375\374"

Big-endian, 32-bit int:

"\001\002\003\004\377\376\375\374".unpack("i*") => [16909060, -66052]

[16909060, 4294901244].pack("i*") => RangeError (same remark as above)
[16909060, -66052].pack("i*") => "\001\002\003\004\377\376\375\374"

I: unsigned int (unsigned integer, depends on endianness and on the size of int)

Little-endian, 32-bit int:

"\001\002\003\004\377\376\375\374".unpack("I*") => [67305985, 4244504319]

[67305985, 4244504319].pack("I*") => "\001\002\003\004\377\376\375\374"
[67305985, -50462977].pack("I*") => "\001\002\003\004\377\376\375\374"

Big-endian, 32-bit int:

"\001\002\003\004\377\376\375\374".unpack("I*") => [16909060, 4294901244]

[16909060, 4294901244].pack("I*") => "\001\002\003\004\377\376\375\374"
[16909060, -66052].pack("I*") => "\001\002\003\004\377\376\375\374"

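l: long (32-bit signed integer, endian dependent) (l! is not fixed at 32 bits; it follows the size of the C long)

Little-endian, 32-bit long:

"\001\002\003\004\377\376\375\374".unpack("l*") => [67305985, -50462977]

[67305985, 4244504319].pack("l*") => RangeError (on the Ruby of this era; current versions wrap the value and return the bytes shown on the next line)
[67305985, -50462977].pack("l*") => "\001\002\003\004\377\376\375\374"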

L: unsigned long (32-bit unsigned integer, endian dependent) (L! is not fixed at 32 bits; it follows the size of the C long)

Little-endian, 32-bit long:

"\001\002\003\004\377\376\375\374".unpack("L*") => [67305985, 4244504319]

[67305985, 4244504319].pack("L*") => "\001\002\003\004\377\376\375\374"
[67305985, -50462977].pack("L*") => "\001\002\003\004\377\376\375\374"

q: Ruby 1.7 feature: long long (signed integer, depends on endianness and on the size of long long) (64 bits when C cannot handle long long)

Little-endian, 64-bit long long:

"\001\002\003\004\005\006\007\010\377\376\375\374\373\372\371\370".unpack("q*")
=> [578437695752307201, -506097522914230529]

[578437695752307201, -506097522914230529].pack("q*")
=> "\001\002\003\004\005\006\a\010\377\376\375\374\373\372\371\370"
[578437695752307201, 17940646550795321087].pack("q*")
=> "\001\002\003\004\005\006\a\010\377\376\375\374\373\372\371\370"

Q: Ruby 1.7 feature: unsigned long long (unsigned integer, depends on endianness and on the size of long long) (64 bits when C cannot handle long long)

Little-endian, 64-bit long long:

"\001\002\003\004\005\006\007\010\377\376\375\374\373\372\371\370".unpack("Q*")
=> [578437695752307201, 17940646550795321087]

[578437695752307201, 17940646550795321087].pack("Q*")
=> "\001\002\003\004\005\006\a\010\377\376\375\374\373\372\371\370"
[578437695752307201, -506097522914230529].pack("Q*")
=> "\001\002\003\004\005\006\a\010\377\376\375\374\373\372\371\370"

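m: Base64-encoded string. A newline is added after every 60 octets of output (and at the end).

Base64 is an encoding that uses only 65 ASCII characters (the 64 characters [A-Za-z0-9+/] plus '=' for padding) and converts 3 octets of binary data (8 bits * 3 = 24 bits) into 4 printable characters (6 bits * 4 = 24 bits). See RFC 2045 for details.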

[""].pack("m") => ""
["\0"].pack("m") => "AA==\n"
["\0\0"].pack("m") => "AAA=\n"
["\0\0\0"].pack("m") => "AAAA\n"
["\377"].pack("m") => "/w==\n"
["\377\377"].pack("m") => "//8=\n"
["\377\377\377"].pack("m") => "////\n"

["abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"].pack("m")
=>
"YWJjZGVmZ2hpamtsbW5vcHFyc3R1dnd4eXpBQkNERUZHSElKS0xNTk9QUVJT\nVFVWV1hZWg==\n"
["abcdefghijklmnopqrstuvwxyz"].pack("m3")
=> "YWJj\nZGVm\nZ2hp\namts\nbW5v\ncHFy\nc3R1\ndnd4\neXo=\n"

"".unpack("m") => [""]
"AA==\n".unpack("m") => ["\000"]
"AA==".unpack("m") => ["\000"]

"YWJjZGVmZ2hpamtsbW5vcHFyc3R1dnd4eXpBQkNERUZHSElKS0xNTk9QUVJT\nVFVWV1hZWg==\n".unpack("m")
=> ["abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"]

"YWJjZGVmZ2hpamtsbW5vcHFyc3R1dnd4eXpBQkNERUZHSElKS0xNTk9QUVJTVFVWV1hZWg==\n".unpack("m")
=> ["abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"]
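
To make the "3 octets into 4 characters" description concrete, here is a hand-rolled encoder that follows it literally and checks itself against pack. It is only an illustration, not how Ruby implements m. The names BASE64_CHARS and base64_by_hand are invented here, and "m0" (no newlines at all) assumes Ruby 1.9 or later.

```ruby
# The 64 characters in index order: A-Z, a-z, 0-9, +, /
BASE64_CHARS = (("A".."Z").to_a + ("a".."z").to_a + ("0".."9").to_a + ["+", "/"]).freeze

def base64_by_hand(data)
  encoded = data.bytes.each_slice(3).map do |group|
    bits = group.map { |byte| format("%08b", byte) }.join
    bits += "0" * ((6 - bits.size % 6) % 6)   # pad the last group to whole 6-bit units
    chars = bits.scan(/.{6}/).map { |six| BASE64_CHARS[six.to_i(2)] }
    chars.join + "=" * (3 - group.size)       # "=" stands in for the missing octets
  end
  encoded.join
end

text = "abcdefghijklmnopqrstuvwxyz"
puts base64_by_hand(text)
puts base64_by_hand(text) == [text].pack("m0")   # m0: same encoding, no newlines
```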

M: string encoded with quoted-printable encoding

["a b c\td \ne"].pack("M") => "a b c\td =\n\ne=\n"

"a b c\td =\n\ne=\n".unpack("M") => ["a b c\td \ne"]

n: unsigned short in network byte order (big-endian) (16-bit unsigned integer)

[0,1,-1,32767,-32768,65535].pack("n*")
=> "\000\000\000\001\377\377\177\377\200\000\377\377"

"\000\000\000\001\377\377\177\377\200\000\377\377".unpack("n*")
=> [0, 1, 65535, 32767, 32768, 65535]

N: unsigned long in network byte order (big-endian) (32-bit unsigned integer)

[0,1,-1].pack("N*") => "\000\000\000\000\000\000\000\001\377\377\377\377"

"\000\000\000\000\000\000\000\001\377\377\377\377".unpack("N*")
=> [0, 1, 4294967295]

v: unsigned short in "VAX" byte order (little-endian) (16-bit unsigned integer)

[0,1,-1,32767,-32768,65535].pack("v*")
=> "\000\000\001\000\377\377\377\177\000\200\377\377"

"\000\000\001\000\377\377\377\177\000\200\377\377".unpack("v*")
=> [0, 1, 65535, 32767, 32768, 65535]

V: unsigned long in "VAX" byte order (little-endian) (32-bit unsigned integer)

[0,1,-1].pack("V*") => "\000\000\000\000\001\000\000\000\377\377\377\377"

"\000\000\000\000\001\000\000\000\377\377\377\377".unpack("V*")
=> [0, 1, 4294967295]
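
The fixed-order directives are what you would normally reach for when a byte layout has to be the same on every machine. Below is a sketch with a made-up 8-byte header; the field layout, HEADER_TEMPLATE, build_header and parse_header are all hypothetical.

```ruby
# Hypothetical header: version (C), flags (C), length (n), sequence (N).
# Spaces inside a template are ignored.
HEADER_TEMPLATE = "C C n N"

def build_header(version, flags, length, sequence)
  [version, flags, length, sequence].pack(HEADER_TEMPLATE)
end

def parse_header(bytes)
  version, flags, length, sequence = bytes.unpack(HEADER_TEMPLATE)
  { :version => version, :flags => flags, :length => length, :sequence => sequence }
end

header = build_header(1, 0, 513, 1)
p header.bytes
p parse_header(header)
```

Because n and N are always big-endian, the bytes come out the same on IA-32 and sparc alike.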

f: single-precision floating-point number (platform dependent)

IA-32 (x86) (IEEE 754 single precision, little-endian):

[1.0].pack("f") => "\000\000\200?"

sparc (IEEE 754 single precision, big-endian):

[1.0].pack("f") => "?\200\000\000"

d: double-precision floating-point number (platform dependent)

IA-32 (IEEE 754 double precision, little-endian):

[1.0].pack("d") => "\000\000\000\000\000\000\360?"

sparc (IEEE 754 double precision, big-endian):

[1.0].pack("d") => "?\360\000\000\000\000\000\000"

e: little-endian single-precision floating-point number (platform dependent)

IA-32:

[1.0].pack("e") => "\000\000\200?"

sparc:

[1.0].pack("e") => "\000\000\200?"

E: little-endian double-precision floating-point number (platform dependent)

IA-32:

[1.0].pack("E") => "\000\000\000\000\000\000\360?"

sparc:

[1.0].pack("E") => "\000\000\000\000\000\000\360?"

g: big-endian single-precision floating-point number (platform dependent)

IA-32:

[1.0].pack("g") => "?\200\000\000"

sparc:

[1.0].pack("g") => "?\200\000\000"

G: big-endian double-precision floating-point number (platform dependent)

IA-32:

[1.0].pack("G") => "?\360\000\000\000\000\000\000"

sparc:

[1.0].pack("G") => "?\360\000\000\000\000\000\000"
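
A short sketch for checking the float directives on your own machine, assuming IEEE 754 floats as in the examples above. It also shows that a single-precision round trip loses precision while a double-precision one does not. float_bytes and round_trip are hypothetical helper names.

```ruby
# Byte values produced by a float directive.
def float_bytes(value, directive)
  [value].pack(directive).bytes
end

# Pack and unpack with the same directive.
def round_trip(value, directive)
  [value].pack(directive).unpack(directive).first
end

p float_bytes(1.0, "e")          # little-endian single
p float_bytes(1.0, "g")          # big-endian single
p round_trip(0.1, "E") == 0.1    # double keeps the value
p round_trip(0.1, "e") == 0.1    # single does not
```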

p: pointer to a null-terminated string

[""].pack("p") => "\310\037\034\010"
["a", "b", "c"].pack("p3") => " =\030\010\340^\030\010\360^\030\010"
[nil].pack("p") => "\000\000\000\000"

P: pointer to a structure (fixed-length string)

[nil].pack("P") => "\000\000\000\000"
["abc"].pack("P3") => "x*\024\010"

["abc"].pack("P4") => ArgumentError: too short buffer for P(3 for 4)
[""].pack("P") => ArgumentError: too short buffer for P(0 for 1)

u: uuencoded string

[""].pack("u") => ""
["a"].pack("u") => "!80``\n"
["abc"].pack("u") => "#86)C\n"
["abcd"].pack("u") => "$86)C9```\n"
["a"*45].pack("u") =>
"M86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A\n"
["a"*46].pack("u") =>
"M86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A86%A\n!80``\n"
["abcdefghi"].pack("u6") => "&86)C9&5F\n#9VAI\n"
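
One thing the examples above do not spell out: in uuencode, the first character of each line encodes how many data bytes that line carries, as length + 32 (so "M" means 45 and "#" means 3). A sketch that reads those lengths back, with hypothetical helper names:

```ruby
# Data length of each uuencoded line, taken from its first character.
def uu_line_lengths(encoded)
  encoded.lines.map { |line| line.getbyte(0) - 32 }
end

# Encode and decode again.
def uu_round_trip(data)
  [data].pack("u").unpack("u").first
end

p uu_line_lengths(["a" * 46].pack("u"))
p uu_line_lengths(["abcdefghi"].pack("u6"))
p uu_round_trip("abcd")
```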

U: UTF-8

[0].pack("U") => "\000"
[1].pack("U") => "\001"
[0x7f].pack("U") => "\177"
[0x80].pack("U") => "\302\200"
[0x7fffffff].pack("U") => "\375\277\277\277\277\277"
[0x80000000].pack("U") => ArgumentError
[0,256,65536].pack("U3") => "\000\304\200\360\220\200\200"

"\000\304\200\360\220\200\200".unpack("U3") => [0, 256, 65536]
"\000\304\200\360\220\200\200".unpack("U") => [0]
"\000\304\200\360\220\200\200".unpack("U*") => [0, 256, 65536]

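w: BER-compressed integer

Each byte carries 7 bits of data, so any non-negative integer, whatever its size, is represented in the fewest possible bytes. The high bit of every byte except the last one is set to 1 (in other words, the high bit tells how far the data extends).

BER stands for Basic Encoding Rules (BER is not limited to integers; it is also used for encoding ASN.1).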
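
The 7-bits-per-byte rule is easy to write out by hand and compare with pack("w"). This is a sketch for illustration only; ber_encode and ber_decode are hypothetical names, and ber_decode handles a single integer, not a stream of them.

```ruby
# Split the number into 7-bit groups, most significant group first.
def ber_encode(number)
  raise ArgumentError, "BER integers must be non-negative" if number < 0
  groups = [number & 0x7f]                   # last byte: high bit clear
  number >>= 7
  while number > 0
    groups.unshift((number & 0x7f) | 0x80)   # earlier bytes: high bit set
    number >>= 7
  end
  groups.pack("C*")
end

# Reassemble one integer from its 7-bit groups.
def ber_decode(bytes)
  bytes.each_byte.inject(0) { |value, byte| (value << 7) | (byte & 0x7f) }
end

[0, 127, 128, 300, 2**70].each do |n|
  encoded = ber_encode(n)
  puts "#{n}: #{encoded.unpack("H*").first} #{encoded == [n].pack("w")}"
end
```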
x: null byte (pack) / skip one byte forward (unpack)

X: back up one byte

@: move to an absolute position
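
The post gives no examples for these three, so here is a small sketch of each. The method names and the sample record are made up; spaces inside the templates are only there for readability.

```ruby
# x in pack: insert one null byte between the two values.
def pack_with_gap(first, second)
  [first, second].pack("C x C")
end

# x in unpack: skip two bytes in the middle of the record.
def first_and_last_pair(record)
  record.unpack("a2 x2 a2")
end

# X in unpack: step back one byte, so the last byte is read twice.
def reread_last_byte(record)
  record.unpack("a3 X a3")
end

# @ in unpack: jump to an absolute offset before reading.
def pair_at(record, offset)
  record.unpack("@#{offset} a2").first
end

p pack_with_gap(1, 2).bytes
p first_and_last_pair("abcdef")
p reread_last_byte("abcdef")
p pair_at("abcdef", 4)
```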
