awk -F / cut -d, whitespace, fields

andre.a...@gmail.com

Apr 8, 2012, 2:30:33 AM
Hi,

cut -d seems to treat every occurrence of the delimiter as starting a new field... ok, that's consistent.

awk -F seems to do the same if the delimiter is not a whitespace character, but if it is a whitespace character then it treats contiguous delimiters as a single separator. See:

> echo ":::hi:::" | awk -F: '{print $4}'
hi

> echo " hi " | grep hi | awk -F' ' '{print $1}'
hi


How do I make awk (in the first case above) treat contiguous :'s as one delimiter, like it does in the second case? We already have cut to cover the first case, so I'm not sure why awk becomes another cut when -F is called with a non-whitespace character.

Hermann Peifer

Apr 8, 2012, 3:23:58 AM
On 08/04/2012 08:30, andre.a...@gmail.com wrote:
>
>> echo ":::hi:::" | awk -F: '{print $4}'
> hi
>
>> echo " hi " | grep hi | awk -F' ' '{print $1}'
> hi
>
>
> How do I make awk (in first case above) consider contiguous :'s as 1 delimiter like it does in the second case?

awk -F:+ '{print $2}'

http://www.gnu.org/software/gawk/manual/html_node/Field-Separators.html
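
For illustration, here is a minimal sketch of what that one-liner does with the sample input from the question. The variable names (line, second, count) are made up for this example, and it assumes an awk that treats a multi-character -F value as a regular expression, as POSIX awk does.

```shell
# Sample input from the question: three colons on each side of the word.
line=':::hi:::'

# -F':+' makes any run of one or more colons a single separator.
# Quoting the pattern keeps the shell from interpreting it.
second=$(printf '%s\n' "$line" | awk -F':+' '{print $2}')

# The leading and trailing runs still delimit empty fields,
# so the record has three fields: "", "hi", "".
count=$(printf '%s\n' "$line" | awk -F':+' '{print NF}')

echo "second=$second NF=$count"
# prints: second=hi NF=3
```

The leading run of colons still produces an empty $1, which is why the command prints $2 rather than $1.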


Ed Morton

Apr 10, 2012, 3:40:02 PM
awk -F ' ' = treat all contiguous white space as a single field
separator AND ignore any leading and trailing white space when determining
fields $1 to $NF. This is THE special-case (and default) field separator
for awk.

awk -F '[X]' = treat each occurrence of the character X as a field
separator. This is equivalent to cut -d X.

awk -F '[X]+' = treat all contiguous occurrences of the character X as a
single field separator.

awk -F '[[:space:]]+' = treat all contiguous white space as a single
field separator. Note that this is not quite the same as awk -F ' ' since
it has no requirement to ignore leading/trailing white space.

awk -F 'X' = equivalent to awk -F '[X]' except when X is a single blank
character (see above) or an RE metacharacter.
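
The differences listed above can be checked by counting fields (NF) for each form. This is only a sketch: the sample strings and variable names are hypothetical, X is taken to be a colon, and it assumes an awk that supports POSIX character classes such as [[:space:]].

```shell
# Hypothetical inputs with leading/trailing and repeated separators.
blanks='  a  b  '
colons='a::b'

# -F ' ' : runs of white space collapse; leading/trailing white space is ignored.
nf_default=$(printf '%s\n' "$blanks" | awk -F ' ' '{print NF}')

# -F '[[:space:]]+' : runs collapse, but the leading and trailing runs
# each add an empty field.
nf_space=$(printf '%s\n' "$blanks" | awk -F '[[:space:]]+' '{print NF}')

# -F '[:]' : every colon separates, so "a::b" has an empty middle field,
# and cut -d: finds "b" in the third field as well.
nf_each=$(printf '%s\n' "$colons" | awk -F '[:]' '{print NF}')
cut_third=$(printf '%s\n' "$colons" | cut -d: -f3)

# -F '[:]+' : the run of colons counts as one separator.
nf_run=$(printf '%s\n' "$colons" | awk -F '[:]+' '{print NF}')

echo "$nf_default $nf_space $nf_each $nf_run $cut_third"
# prints: 2 4 3 2 b
```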

Hope that helps.

Ed.

