How to go from token list-> string again

Sania

Apr 13, 2012, 2:34:05 PM
to nltk-users
I have a token list but I want to change all of them back into
strings.
Is this possible? I have been searching but I can't really find
anything, or maybe I just don't understand it.

Thanks,
Sania

Alex Rudnick

Apr 13, 2012, 2:51:30 PM
to nltk-...@googlegroups.com
Hey Sania,

If you just have a list of strings, consider:

mylistofstrings = ["foo", "bar", "baz"]
joinedtogether = " ".join(mylistofstrings)

Now joinedtogether is the string "foo bar baz".

If you have a list of tuples of strings (say where the second element
is a pos tag), then that's a little more complicated, but just go
through the list and take out the first element of each tuple.

taggedwords = [("the", "DT"), ("dog", "NN")]
words = [word for (word,tag) in taggedwords]

Then just do a join on the variable words.
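
Putting the two steps together, a minimal sketch might look like the following. The sample data and the names tagged_words and sentence are made up for illustration; only the comprehension and the join come from the reply above.

```python
# Hypothetical sample data: (word, POS tag) pairs, as in the example above.
tagged_words = [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")]

# Keep only the first element of each tuple, then join with single spaces.
words = [word for (word, tag) in tagged_words]
sentence = " ".join(words)

print(sentence)  # the dog barks
```

Note that str.join only accepts strings, so this assumes every token is already a str.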

In the long run, not sure this is the best place to ask basic Python
questions :)

Cheers,

--
-- alexr

Sania

Apr 13, 2012, 4:07:13 PM
to nltk-users
Thanks :)

Alex Rudnick

Apr 14, 2012, 10:54:50 PM
to nltk-...@googlegroups.com
Francisco,

I think most of the time, that won't work in Python :) Unless you've
built a Java interpreter into your copy of Python...

On Sat, Apr 14, 2012 at 8:27 AM, FranciscoMXCA
<francisc...@gmail.com> wrote:
>
> Try yourList.toArray() as in
> http://www.exampledepot.com/egs/java.util/coll_GetArrayFromVector.html

--
-- alexr

Erick Fonseca

Apr 15, 2012, 9:41:50 AM
to nltk-users
I don't think there is a straightforward solution to this. Calling
" ".join() works, but it will leave you with every token, including
periods and commas, separated by spaces.

If you want the string in a more readable format, you could call
s = s.replace(' ,', ',')
s = s.replace(' .', '.')

to eliminate spaces before whatever tokens you want. (str.replace
returns a new string, so the result has to be assigned back.) Or maybe
use a regular expression.
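
As a rough sketch of the regular-expression route, using only the standard re module: the helper name join_tokens and the choice of punctuation characters are assumptions for illustration, not something NLTK provides.

```python
import re

def join_tokens(tokens):
    """Join tokens with spaces, then drop the space before common punctuation."""
    text = " ".join(tokens)
    # Remove whitespace that directly precedes , . ; : ! or ?
    return re.sub(r"\s+([,.;:!?])", r"\1", text)

tokens = ["Hello", ",", "world", ".", "How", "are", "you", "?"]
print(join_tokens(tokens))  # Hello, world. How are you?
```

This only handles the listed punctuation marks. Quotes, brackets, and split contractions such as "do" + "n't" would need extra rules.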

Cheers,
Erick

deepti patil

Jul 2, 2018, 5:16:39 PM
to nltk-users
Hi,

If we have a list of lists, say:

sample = [[ 'I' ,'am', 'studying','Physics'],['I','am', 'going','to','the','wedding']]

The output should be :

output = [[I am studying Physics],[I am going to the wedding]]

In this case I tried to join the sentences using " ".join() but was not able to. How should I proceed in such cases?

Jordi Carrera

Jul 3, 2018, 11:34:22 AM
to nltk-users
Hey Deepti,

I think what you need is a list comprehension, like this:

>>> sample = [[ 'I' ,'am', 'studying','Physics'],['I','am', 'going','to','the','wedding']]
>>> output = [' '.join(x) for x in sample]
>>> print(output)
['I am studying Physics', 'I am going to the wedding']


Notice that the final output is a list of strings, whereas the input was a list of lists of strings, so it's not exactly the same as what you provided as the ideal output:


[[I am studying Physics],[I am going to the wedding]]

That form does not actually seem to be valid Python code, though :) so I assume you meant the output to be as shown above.
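
If the nested shape does matter, one possible reading of that ideal output is a list of one-element lists. A minimal sketch under that assumption (the name nested is just for illustration):

```python
sample = [["I", "am", "studying", "Physics"],
          ["I", "am", "going", "to", "the", "wedding"]]

# One joined string per sentence, each wrapped in its own single-element list.
nested = [[" ".join(tokens)] for tokens in sample]

print(nested)
# [['I am studying Physics'], ['I am going to the wedding']]
```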


Hope this helps!



Jordi

deepti patil

Jul 4, 2018, 1:11:21 AM
to nltk-...@googlegroups.com
Thanks a lot!!

Thanks & Regards,
Deepti R. Patil
