Can awk recognize bracketed fields?

Qf Yang

Sep 1, 2015, 11:41:39 PM
to shlug
I'm analyzing Apache logs in the default combined format, as shown below:

220.181.108.91 - - [02/Sep/2015:11:24:48 +0800] "GET /Ultrasonic/policy.html HTTP/1.1" 200 40366 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)" 244 40932
68.180.229.57 - - [02/Sep/2015:11:24:47 +0800] "GET /indian/list-23.html HTTP/1.1" 200 38850 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)" 192 39452

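The field separator is a space, but some fields contain spaces themselves; those fields are enclosed in quotation marks.

How can I get awk to parse these fields correctly? Or is there another tool you would recommend?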

Qf Yang

Sep 2, 2015, 3:01:45 AM
to shlug
After digging through man awk, I found the built-in variable FPAT:

       FPAT        A  regular  expression describing the contents of the fields in a record.  When set, gawk
                   parses the input into fields, where the fields match the regular expression,  instead  of
                   using the value of the FS variable as the field separator.  See Fields, above.

See the official gawk manual:
  https://www.gnu.org/software/gawk/manual/html_node/User_002dmodified.html#index-differences-in-awk-and-gawk_002c-FPAT-variable

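This requires gawk 4.x or later. The awk that ships with CentOS 6 is 3.1.7, which does not support FPAT, so I got bitten by that. Fortunately, Cygwin's gawk is 4.x.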
