Support JSONL format (from twarc)


Frederik Elwert

Oct 24, 2018, 3:59:03 AM
to FireAnt-Discussion
Currently, FireAnt reads JSON data (from Twitter) and CSV. It would be super practical if it could also read the JSONL format (JSON Lines: one JSON object per line, rather than all objects wrapped in a list). That format is used, e.g., by twarc[1] for collecting tweets, and it would be nice to be able to use FireAnt for initial analysis and filtering instead of having to go through CSV.
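
To illustrate the difference between the two layouts, here is a minimal Python sketch. The records and the helper name `iter_jsonl` are made up for the example; this is not FireAnt's or twarc's actual code, only a demonstration that the same data can be stored either as one JSON list or as one object per line.

```python
import json
from typing import Iterator


def iter_jsonl(lines) -> Iterator[dict]:
    """Yield one parsed object per non-empty line of JSONL text."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines, e.g. after a trailing newline
            yield json.loads(line)


# The same two (hypothetical) records in both layouts.
json_array_text = '[{"id": 1, "text": "first"}, {"id": 2, "text": "second"}]'
jsonl_text = '{"id": 1, "text": "first"}\n{"id": 2, "text": "second"}\n'

# A JSON file holds a single document: here, a list wrapping all objects.
from_array = json.loads(json_array_text)

# A JSONL file holds one complete JSON document per line.
from_jsonl = list(iter_jsonl(jsonl_text.splitlines()))
```

Because each line is parsed on its own, a JSONL file can be read as a stream without loading the whole collection into memory first.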

Thanks,
Frederik


Laurence Anthony

Oct 24, 2018, 4:44:15 AM
to Frederik Elwert, fir...@googlegroups.com
Hi Frederik,

The current version of FireAnt already reads JSONL (it just uses the json extension name). Try it and you should find that it works. If it doesn't work, can you just send me 10-20 lines of your JSONL file so that I can check?
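
One way to cut such a sample out of a large file is sketched below in Python. The function name `write_sample` and any file names passed to it are hypothetical; the sketch only assumes that the file is UTF-8 text with one record per line.

```python
from itertools import islice
from pathlib import Path


def write_sample(src: Path, dst: Path, n: int = 20) -> int:
    """Copy the first n lines of src to dst; return how many were written."""
    with src.open(encoding="utf-8") as fin, dst.open("w", encoding="utf-8") as fout:
        # islice stops after n lines, so the rest of the file is never read.
        lines = list(islice(fin, n))
        fout.writelines(lines)
    return len(lines)
```

For example, `write_sample(Path("tweets.jsonl"), Path("sample.jsonl"))` would write up to 20 lines to `sample.jsonl`.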

Regards,

Laurence.

###############################################################
Laurence ANTHONY, Ph.D.
Professor of Applied Linguistics
Faculty of Science and Engineering
Waseda University
3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
E-mail: antho...@gmail.com
WWW: http://www.laurenceanthony.net/
###############################################################


--
You received this message because you are subscribed to the Google Groups "FireAnt-Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fireant+u...@googlegroups.com.
To post to this group, send email to fir...@googlegroups.com.
Visit this group at https://groups.google.com/group/fireant.
To view this discussion on the web visit https://groups.google.com/d/msgid/fireant/5d9968d8-9539-4f85-95ce-4ec2783deb44%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Frederik Elwert

Oct 25, 2018, 3:03:55 AM
to Laurence Anthony, fir...@googlegroups.com
Hi Laurence,

Thank you for the reply! It does indeed work. I had previously tested
with a file with a .jsonl extension, since that's what the twarc
documentation uses, but after renaming it to .json it loads perfectly.
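
For anyone who wants to script the workaround, here is a small Python sketch. The helper name `copy_as_json` is made up for illustration; it copies rather than renames, so the original twarc output stays untouched, and it is only useful for as long as FireAnt does not accept the .jsonl extension directly.

```python
import shutil
from pathlib import Path


def copy_as_json(jsonl_path: Path) -> Path:
    """Copy e.g. tweets.jsonl to tweets.json and return the new path."""
    target = jsonl_path.with_suffix(".json")
    if target.exists():
        # Refuse to overwrite an existing file with the same name.
        raise FileExistsError(f"{target} already exists")
    shutil.copyfile(jsonl_path, target)
    return target
```

The file contents are not changed at all; only the extension on the copy differs.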

Maybe .jsonl could be added to the list of accepted extensions?

Best,
Frederik


On 24.10.18 at 10:43, Laurence Anthony wrote:
> Hi Frederik,
>
> The current version of FireAnt already reads JSONL (it just uses the
> json extension name). Try it and you should find that it works. If it
> doesn't work, can you just send me 10-20 lines of your JSONL file so
> that I can check?
>
> Regards,
>
> Laurence.
>
> On Wed, 24 Oct 2018 at 16:59, 'Frederik Elwert' via FireAnt-Discussion
> <fir...@googlegroups.com> wrote:
>
> Currently, FireAnt reads JSON data (from twitter) and CSV. It would
> be super practical if it could also read the JSONL format (JSON
> lines, which is one JSON object per line, instead of wrapping all
> objects in a list). That format is used e.g. by twarc[1] for
> collecting tweets, and it would be nice to be able to use FireAnt
> for initial analysis and filtering instead of having to go through CSV.
>
> Thanks,
> Frederik
>
>
> [1]: https://github.com/DocNow/twarc

--
Dr. Frederik Elwert

Digital Humanities Coordinator
Center for Religious Studies
Ruhr-University Bochum

Universitätsstr. 90a
D-44780 Bochum

Phone +49(0)234 32-23024

https://dh.ceres.rub.de/

Laurence Anthony

Oct 25, 2018, 5:07:09 AM
to Frederik Elwert, fir...@googlegroups.com
Dear Frederik,

Yes, I'll certainly add jsonl as an accepted file format. To be honest, I didn't know it was used as a valid extension name until now. I thought it was still in "proposed" status.

Laurence.

