2013/9/24 Perchouli Chen <
jp.ch...@gmail.com>:
> 要处理的字符串是一些nginx日志,例如:
> "127.0.0.1" "21" "" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1;
> .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)" "-" "2013-09-23T10:00:04+08:00"
> "“
针对这种特殊情况,可以考虑用 " " 做分隔符,例如:
echo '"abcd d" "efgh" "kkk adad"'|awk -F'" "' '{print $2}'
如果要兼顾头尾的",
还可以写成这样:
echo '"abcd d" "efgh" "kkk adad"'|awk -F'(" "|^"|"$)' '{print $2}'
但是:如果输入是一般的CSV规范的格式的话,即,引号可以选择性的出现,比如
field1 "field2 with delimiter" field3
上面的办法就无能为力了。
这就是正则表达式的局限之处。
引王垠的说法是,这玩意用多了也不见得有多好,很多时候不如不用。
> --
> -- You received this message because you are subscribed to the Google Groups
> Shanghai Linux User Group group. To post to this group, send email to
>
sh...@googlegroups.com. To unsubscribe from this group, send email to
>
shlug+un...@googlegroups.com. For more options, visit this group at
>
https://groups.google.com/d/forum/shlug?hl=zh-CN
> ---
> 您收到此邮件是因为您订阅了 Google 网上论坛的“Shanghai Linux User Group”论坛。
> 要退订此论坛并停止接收此论坛的电子邮件,请发送电子邮件到
shlug+un...@googlegroups.com。
> 要查看更多选项,请访问
https://groups.google.com/groups/opt_out。