How to handle the Amazon review dataset in JSON format


Madhusudhanan Sambath

Feb 4, 2017, 1:31:57 AM
to nltk-users, SentimentAI
Hi all,

I have collected Amazon review data from SNAP, which is in JSON format.
I used Python code to convert it to text format; it converts the JSON to a text file.

This is the code:
import pandas as pd
import gzip

def parse(path):
    g = gzip.open(path, 'rb')
    for l in g:
        yield eval(l)

def getDF(path):
    i = 0
    df = {}
    for d in parse(path):
        df[i] = d
        i += 1
    return pd.DataFrame.from_dict(df, orient='index')

df = getDF('reviews_books_5.json.gz')

I am running this code in Anaconda on a machine with a 6th-gen i3 processor, 4 GB of RAM, and a 1 TB hard disk. After I run the code above, it sits idle for a long time. When I try to open the text file for analysis in Notepad, TextPad, or Word, it says the file size is too large, and it hangs my system.

Kindly help me with how to proceed further.
Thanks and regards,

S. Madhusudhanan

Jeff Silverman

Feb 9, 2017, 4:39:52 AM
to nltk-users
How big is the text file? It may be too big to be edited. You may have to process it sequentially.
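
As a rough illustration of what processing it sequentially could look like, here is a minimal sketch that reads the gzipped file one line at a time and keeps only a small running tally, so the full dataset is never held in memory or opened in an editor. The helper names (parse_line, iter_reviews, rating_counts) are made up for this example, and the "overall" rating field is an assumption about the review records, not something shown in this thread. It also swaps eval for json.loads with an ast.literal_eval fallback, on the assumption that some lines may be loose, Python-style JSON.

```python
import ast
import gzip
import json
from collections import Counter


def parse_line(line):
    """Decode one record; fall back to literal_eval for loose (Python-style) JSON."""
    try:
        return json.loads(line)
    except json.JSONDecodeError:
        return ast.literal_eval(line)


def iter_reviews(path):
    """Yield one review dict at a time so the whole file never sits in memory."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield parse_line(line)


def rating_counts(path, limit=None):
    """Tally ratings in a single pass; `limit` stops early for a quick trial run."""
    counts = Counter()
    for index, review in enumerate(iter_reviews(path)):
        if limit is not None and index >= limit:
            break
        # "overall" is an assumed field name for the star rating.
        counts[review.get("overall")] += 1
    return counts
```

Something like rating_counts('reviews_books_5.json.gz', limit=100000) would be a cheap first check that the file parses at all before committing to a full pass. The same loop can write out or analyse each review as it goes instead of counting.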

Let me know how this turns out.

