How to convert tab character string ("\t") from command line flag into a rune?


spiffytech

May 15, 2013, 12:27:13 PM
to golan...@googlegroups.com
I'm trying to accept a column separator for CSV parsing as a command line flag. The problem is that the CSV library expects the separator to be a rune, not a string, and that trips me up when I try to declare a tab as the column separator.

If I hardcode a tab character into the code, Go correctly parses it as a single-rune string and returns the whole tab character as the rune. If I try to read it from the CLI flag, Go interprets it as a string containing a backslash and a 't', and only converts the backslash to a rune.

What do I need to do to get the whole tab character from the command line converted to a rune?


// Declared at package level.
var separator = flag.String("separator", ",", "Single character to be used as a separator between fields")

func processLine(line string) ([]string, error) {
    strReader := strings.NewReader(line)
    csvReader := csv.NewReader(strReader)

    sepString := *separator
    // Debug output: sepString[0] is only the first byte of the flag value.
    fmt.Println("Separator is", string(sepString[0]))
    fmt.Println("'", rune(sepString[0]), "'")
    // For comparison: a tab hardcoded in the source.
    fmt.Println("'", string(rune("\t"[0])), "'")

    csvReader.Comma = rune(sepString[0])
    return csvReader.Read()
}

$ cat ~/T-609-group-names.csv | go run csvmaster.go --separator='\t'
Separator is \
' 92 '
'        '



====

P.S. - Yes, I could add some logic that specifically checks whether separator is a tab character and apply the hardcoded string then, but I'd prefer a less hacky, and more general, solution.

Matthew Kane

May 15, 2013, 1:41:09 PM
to spiffytech, golang-nuts
You are passing your program a literal backslash followed by a literal
't'. If you want to pass in a single tab character as an argument, use
$'\t' when you invoke your program.



--
matt kane
twitter: the_real_mkb / nynexrepublic
http://hydrogenproject.com

Andy Balholm

May 15, 2013, 4:19:13 PM
to golan...@googlegroups.com
If you want to be able to handle backslash escapes in your input, you can use the %q format in fmt.Sscanf. You will need to enclose your input in double quotes, though. http://play.golang.org/p/L9XysXuxIs
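
A minimal sketch of the %q approach, not the code behind the playground link. The helper name unescapeQuoted is hypothetical; it adds the double quotes itself so the user does not have to type them, which assumes the input contains no unescaped double quote.

```go
package main

import "fmt"

// unescapeQuoted is a hypothetical helper. It wraps s in double quotes and
// lets the %q verb interpret Go-style backslash escapes such as \t.
// Assumption: s itself contains no unescaped double quote.
func unescapeQuoted(s string) (string, error) {
	var out string
	_, err := fmt.Sscanf(`"`+s+`"`, "%q", &out)
	return out, err
}

func main() {
	// `\t` is what the program receives from --separator='\t':
	// a backslash followed by the letter t.
	sep, err := unescapeQuoted(`\t`)
	if err != nil {
		fmt.Println("bad separator:", err)
		return
	}
	fmt.Printf("%q has %d byte(s)\n", sep, len(sep))
}
```

Note that nothing here checks that the result is a single character; the caller would still have to do that.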

Carlos Castillo

May 15, 2013, 7:48:55 PM
to golan...@googlegroups.com
There are a few issues you are running into here:
  1. The shell you are running may be attempting command line completion (due to the tab)
  2. The shell could treat a tab as whitespace and not pass it as an argument
  3. "\t" is only a representation of a tab, and the literal \t characters are passed through to your program, which isn't interpreting them as a tab
  4. rune(str[0]) is a horrible way to get the first rune from a string. You should use either http://golang.org/pkg/unicode/utf8/#DecodeRuneInString or ([]rune(str))[0]; otherwise the code will only work for ASCII characters, because it converts just the first byte to a rune, while a rune is a 32-bit value that may take several bytes in UTF-8.
To solve 1 and 2, the users could type: '<CTRL-V><TAB>' (or mkb's $'\t') as the argument so that file/command completion doesn't kick in, and the value is treated as an argument. This change in user behaviour requires you to do nothing to your code (although you should still fix #4). It also will probably not work for Windows.

Most likely you should solve #3, so you will need to write or find code that turns "\t" (or whatever you decide on) and other escape sequences into correct runes. Taking Andy's logic one step further: http://play.golang.org/p/zT3jqgx7jC
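
To illustrate point 4, here is a small sketch comparing rune(str[0]) with utf8.DecodeRuneInString on a non-ASCII string. The helper firstRune is a hypothetical name used only for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstRune is a hypothetical helper that returns the first complete rune
// of s. It reports false if s is empty or starts with invalid UTF-8.
func firstRune(s string) (rune, bool) {
	r, size := utf8.DecodeRuneInString(s)
	if r == utf8.RuneError && size <= 1 {
		return 0, false
	}
	return r, true
}

func main() {
	s := "\u00e9|x" // starts with U+00E9, which is two bytes in UTF-8

	// Indexing yields the first byte, not the first rune.
	fmt.Println(rune(s[0])) // 195

	r, ok := firstRune(s)
	fmt.Println(r, ok) // 233 true

	// Converting to []rune also works, but allocates a slice for the
	// whole string just to read one element.
	fmt.Println([]rune(s)[0]) // 233
}
```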

minux

May 16, 2013, 4:27:29 AM
to Andy Balholm, golan...@googlegroups.com

On Thursday, May 16, 2013, Andy Balholm wrote:
> If you want to be able to handle backslash escapes in your input, you can use the %q format in fmt.Sscanf. You will need to enclose your input in double quotes, though. http://play.golang.org/p/L9XysXuxIs

Enclosing the input in single quotes and using strconv.Unquote seems better,
as Sscanf is heavyweight and won't check that the input contains only one
rune.

spiffytech

May 17, 2013, 10:51:12 PM
to golan...@googlegroups.com, Andy Balholm
Thanks everyone for the help! strconv.Unquote wound up working nicely for me. And thanks for the pointers on the rune stuff. I'd somehow been under the impression that indexing a string returned runes, when I see that it, in fact, returns a byte (type uint8).

If anyone's interested in my project, it's a tool to make handling CSV files on the command line easier. cut/awk don't play nice with RFC 4180-compliant, quoted CSV files, and I haven't found another tool that does, so I made one. It also handles something that is more difficult to work around in the shell with other tools: converting a file from, e.g., tab-separated to comma-separated.
