Json Decoding | ghetto queue consume


Abhinav Shukla

May 14, 2015, 3:22:03 AM
to phireho...@googlegroups.com
Whenever I try to process the queue files, I get an error message saying the file is not valid JSON.

Basically, I want to read the contents of the file into a variable, JSON-decode it, and put it into a database.

phirehose-ghettoqueue.20150508-030441.queue

Fenn Bailey

May 14, 2015, 4:33:37 AM
to phireho...@googlegroups.com
Hey there,

You're right, it looks like the ghetto-queue-collect example is potentially out of date.

The consume script expects the records to be newline-separated (i.e., each JSON tweet on its own line), but that is definitely not the case in the file you attached.

If you append "\n" to the stream in this line: https://github.com/fennb/phirehose/blob/master/example/ghetto-queue-collect.php#L66, you should find things work as expected.
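As a rough illustration of the newline-separated layout described above, here is a minimal sketch. The helper names appendStatus and readStatuses are made up for this example and are not part of Phirehose; the two records are invented sample data.

```php
<?php
// Hypothetical helpers for illustration only; Phirehose does not ship these.

// Write one raw JSON status to the queue stream, terminated by a newline.
function appendStatus($stream, $status)
{
    fputs($stream, $status . "\n");
}

// Read a queue file back, decoding one JSON document per line.
function readStatuses($path)
{
    $tweets = array();
    $handle = fopen($path, 'r');
    while (($line = fgets($handle)) !== false) {
        $line = trim($line);
        if ($line === '') {
            continue; // Ignore blank lines between records.
        }
        $tweets[] = json_decode($line, true);
    }
    fclose($handle);
    return $tweets;
}

// Demo: write two invented records to a temporary file, then read them back.
$path = tempnam(sys_get_temp_dir(), 'queue');
$stream = fopen($path, 'w');
appendStatus($stream, '{"id":1,"text":"first"}');
appendStatus($stream, '{"id":2,"text":"second"}');
fclose($stream);

$tweets = readStatuses($path);
echo count($tweets), "\n"; // prints 2
unlink($path);
```

Reading with fgets keeps only one record in memory at a time, which matters if the queue files grow large.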

Cheers!

--

---
You received this message because you are subscribed to the Google Groups "Phirehose Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to phirehose-use...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Abhinav Shukla

May 14, 2015, 6:03:16 AM
to phireho...@googlegroups.com
Hi Fenn,

Thank you so much for the reply. However, I added
 
$status .= "\n";

before line 66, but whenever I pass the results through a JSON parser it says the format is not valid JSON after the first tweet. Everything up to the end of the first tweet is fine, but beyond that it's not.

Thanks again


Fenn Bailey

May 15, 2015, 2:11:02 AM
to phireho...@googlegroups.com
You may be best to look at the output file itself (and/or feed it through a validator) and see if you can work out why it isn't valid.
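One way to narrow this down without an external validator is to decode the file line by line and report which lines fail. The sketch below assumes the file is meant to hold one JSON document per line; findInvalidLines is a hypothetical helper name and the sample string is invented.

```php
<?php
// Hypothetical helper: returns an array of lineNumber => error message
// for every non-empty line that json_decode() rejects.
function findInvalidLines($contents)
{
    $problems = array();
    foreach (explode("\n", $contents) as $index => $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // Blank lines are not records.
        }
        json_decode($line);
        if (json_last_error() !== JSON_ERROR_NONE) {
            $problems[$index + 1] = json_last_error_msg(); // 1-based line number.
        }
    }
    return $problems;
}

// Invented sample: line 2 holds two objects run together with no newline.
$sample = "{\"id\":1}\n{\"id\":2}{\"id\":3}\n";
foreach (findInvalidLines($sample) as $lineNumber => $message) {
    echo "line $lineNumber: $message\n";
}
```

A failure on every line after the first would suggest the records are still being written without a separator.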

KCL

Sep 11, 2015, 1:11:06 PM
to Phirehose Users
I found the same issue when I first started with Phirehose.  The suggestion to go through a validator is an excellent one -- I use pro.jsonlint.com to debug the JSON stuff.

 We break our tweets into an array before we start our analysis, so my solution was more like this:


$rawData = file_get_contents($sourceFile);         // Source file is the ghetto-queue generated file.
if (strpos($rawData, "}{") !== false) {            // Multiple nodules in a single file - JSON not formatted correctly, fix.
    // Mark each "}{" boundary as "}}-{{", then split on "}-{" so both halves keep their braces.
    $tmpNodules = explode("}-{", str_replace("}{", "}}-{{", $rawData)); // Pull 'em apart...
    $serviceNodules = array_merge($serviceNodules, $tmpNodules);        // ...then re-merge them.
}

The second line above is looking for that same issue.  $serviceNodules is the array that holds all the tweets.
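One caveat with splitting on "}{": the same two characters can also occur inside a tweet's text, in which case a plain string split would cut a record in half. A possible alternative, sketched here under the assumption that the file contains only top-level JSON objects written back to back, is to track brace depth and string literals. splitConcatenatedJson is a hypothetical name and the sample data is invented.

```php
<?php
// Hypothetical helper: split back-to-back JSON objects into separate strings.
// Braces inside string literals are ignored, so "}{" in tweet text is safe.
function splitConcatenatedJson($raw)
{
    $documents = array();
    $depth = 0;          // Current nesting level of { } outside strings.
    $inString = false;   // True while scanning inside a "..." literal.
    $escaped = false;    // True if the previous character was a backslash.
    $start = 0;          // Offset where the current top-level object began.
    $length = strlen($raw);

    for ($i = 0; $i < $length; $i++) {
        $char = $raw[$i];
        if ($inString) {
            if ($escaped) {
                $escaped = false;
            } elseif ($char === '\\') {
                $escaped = true;
            } elseif ($char === '"') {
                $inString = false;
            }
            continue;
        }
        if ($char === '"') {
            $inString = true;
        } elseif ($char === '{') {
            if ($depth === 0) {
                $start = $i;
            }
            $depth++;
        } elseif ($char === '}' && $depth > 0) {
            $depth--;
            if ($depth === 0) {
                $documents[] = substr($raw, $start, $i - $start + 1);
            }
        }
    }
    return $documents;
}

// Invented sample: the first record contains "}{" inside its text.
$raw = '{"id":1,"text":"braces }{ in text"}{"id":2,"text":"plain"}';
foreach (splitConcatenatedJson($raw) as $json) {
    $tweet = json_decode($json, true);
    echo $tweet['id'], "\n";
}
```

Scanning byte by byte is fine for UTF-8 input, because the quote, backslash, and brace bytes never appear inside a multi-byte character.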

I realize this may be out of date, but it had us stuck for a while.


Scott.

mgr...@cloudappwares.com

Oct 24, 2015, 10:05:05 PM
to Phirehose Users
line 66:  fputs($this->getStream(), $status."\n");