Failing to import from Ticketfly where incoming data contains certain hexadecimal values

Carlos Mossman

Mar 16, 2015, 10:55:27 PM
to ticket...@googlegroups.com
We run a daily import from Ticketfly which almost never manages to complete. The specific error is difficult to diagnose. We receive what appears to be a valid response, but the moment we attempt to interact with the data (print_r, json_encode, fwrite) our process dies. To troubleshoot, we began noting the page on which each error occurs and logging the data we received.

When an import fails, we query for that specific page (just in a browser), plus several on either side, manually save the data from each page, and then try to json_encode each. There is always a page which causes an error. We ran this by someone with a more fundamental understanding of what might be going on, and his feedback was:

"OK, I opened up the file in a hex editor in order to inspect the bytes.
 At the position you mention in p81.json, there is some obvious
corruption going on.  The bytes are as follows near the end of the sentence:

e  r  e  .  ENDQUOTE \n  DC4 \  r  \  n  \  r  \  n
65 72 65 2e e2 80 9d 0a  20  5c 72 5c 6e 5c 72 5c 6e

I believe that the problem is coming in right after the first newline
(\n), where the byte is 0x20, which is an ASCII control code that should
definitely *not* be in your text file."
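
One way to double-check a dump like this is to scan the saved response for every byte below 0x20 and print its offset. In ASCII, DC4 is 0x14 (decimal 20), while 0x20 is an ordinary space, so it is worth confirming which byte is really in the file. Below is a minimal sketch in Python, used here purely for illustration. The function name `find_control_bytes` and the sample bytes are hypothetical, not part of any Ticketfly tooling. Tab, LF and CR are skipped by default because they are legal whitespace between JSON tokens, although JSON does not allow them unescaped inside a string.

```python
def find_control_bytes(data: bytes, allowed: bytes = b"\t\n\r"):
    """Return (offset, byte_value) for each ASCII control byte in data.

    Bytes below 0x20 and 0x7F (DEL) count as control bytes; anything
    listed in `allowed` is skipped.
    """
    return [
        (offset, value)
        for offset, value in enumerate(data)
        if (value < 0x20 or value == 0x7F) and value not in allowed
    ]


# Hypothetical sample: a DC4 (0x14) byte hiding right after a newline.
sample = b"ere.\xe2\x80\x9d\n\x14\\r\\n"
for offset, value in find_control_bytes(sample):
    print(f"offset {offset}: 0x{value:02x}")  # prints: offset 8: 0x14
```

Passing `allowed=b""` also reports raw newlines, which helps when a line break may be sitting inside a string value.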

He has general ideas about how we might go about cleaning up the data but believes the problem could be more easily handled at the point of origin. To that end, I've attached an example page of data which causes our script to fail. Any chance the contents can be evaluated to determine the cause and likelihood that outgoing content can be stripped of this type of thing?

Thanks for any feedback!
p81.json

Bill Rousseau

Mar 16, 2015, 11:31:44 PM
to ticket...@googlegroups.com
Please specify the exact URL of the endpoint you are having difficulty with.

-b
--
You received this message because you are subscribed to the Google Groups "ticketfly-api" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ticketfly-ap...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



Carlos Mossman

Mar 17, 2015, 1:43:42 AM
to ticket...@googlegroups.com

Bill Rousseau

Mar 18, 2015, 2:51:26 PM
to ticket...@googlegroups.com
You would need to apply parameters to that request. Also the list method returns all events, past and present. 

I would recommend using the upcoming method, but you should review the documentation and find the method that helps you achieve your objective. 

Example of the upcoming method, and specifying a venue. This will return all upcoming events at the 930 Club:


Carlos Mossman

Mar 18, 2015, 3:11:35 PM
to ticket...@googlegroups.com
Again, sorry. I thought you simply wanted to know the endpoint so that you had some context for our code. We do parameterize our URLs. Here's a sample that actually ran:

http://www.ticketfly.com/api/events/list.json?orgId=1&fromDate=2015-03-06&country=0&fields=venue,id,name,image,startDate,endDate,ticketPurchaseUrl,ticketPrice,headliners

In terms of our objective, I believe the current endpoint is already providing us with what we need - that's not the issue. 

The issue is that, occasionally and unpredictably, we appear to receive characters in the response which PHP doesn't want to deal with. While I have not tested the other endpoint you've provided, I suspect I'd encounter a similar issue if I just switched.

Were you able to reproduce the issue I described using the attached json? Can you identify any of the characters I referred to?

Thanks for your support.

-Carlos

Bill Rousseau

Mar 18, 2015, 6:07:36 PM
to ticket...@googlegroups.com
Thanks Carlos. I'm not able to recreate or spot that specific text, either in the attached file or by hitting the request you shared. Can you give me the eventId of the object it's in?

I'm wondering if any of these filters might help:


Carlos Mossman

Mar 19, 2015, 2:25:58 PM
to ticket...@googlegroups.com
No - at present, I cannot identify the actual event which is (theoretically) the culprit. That would certainly help narrow things down, though.

How about this - I've attached a zip containing test-ticketfly-response-data.php, which attempts to json_decode the contents of an actual Ticketfly response. Included are two responses: good-data.php and bad-data.php. These are responses saved directly from a browser by re-composing the URLs which were being processed when a failure occurred. So, page 81 (good-data.php) works while page 82 (bad-data.php) dies.

You should be able to run that file with good-data.php and see the results of print_r without error. When you switch the file reference in $filename to 'bad-data.php', you should see an error. Assuming that is so, then there is something in that file - bad-data.php - which PHP does not like. And that's what I need help with.
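
To narrow down where a saved response breaks, it can help to let a strict JSON parser report the position of the first failure and show the surrounding text. Below is a minimal sketch of that idea, written in Python for illustration rather than as the attached PHP test. The function name `locate_json_error` and the sample fragment are hypothetical. In PHP, checking `json_last_error()` after `json_decode` gives a comparable signal (for example `JSON_ERROR_CTRL_CHAR`), though without a position.

```python
import json


def locate_json_error(text: str, context: int = 20):
    """Try to decode text; return None on success, else error details."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        start = max(err.pos - context, 0)
        return {
            "message": err.msg,
            "line": err.lineno,
            "column": err.colno,
            # repr() makes invisible characters such as \x14 visible.
            "context": repr(text[start:err.pos + context]),
        }
    return None


# Hypothetical response fragment with a raw control character in a string.
bad = '{"name": "The Cribs", "description": "here.\x14 more text"}'
print(locate_json_error(bad))
```

For a saved file, the text could be read with `open(path, encoding="utf-8").read()` first. The reported context should make it possible to search the response for the event that contains the offending character.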

I appreciate your suggestion that perhaps a filter or variety of filters might be effective. I believe it'd be best to know what exactly I'm trying to filter out. Realistically, if data originates from the Ticketfly API which can't be processed by PHP, I would think filters would be most effectively implemented on your end. But, of course, the source of this problem has not yet been identified...  
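
If the culprit does turn out to be raw control characters, a filter on the receiving side could be quite small. The sketch below (Python for illustration; `decode_lenient` and the sample data are hypothetical) replaces every character in the range U+0000 to U+001F with a space before decoding. That is safe for the JSON structure, since a space is valid both between tokens and inside strings, but it is an assumption about what is acceptable for the data: any line breaks embedded in a description would be flattened.

```python
import json
import re

# C0 control characters U+0000 through U+001F. JSON requires these to be
# escaped inside strings; outside strings only tab, LF and CR are legal.
CONTROL_CHARS = re.compile(r"[\x00-\x1f]")


def decode_lenient(text: str):
    """Replace raw control characters with spaces, then decode the JSON."""
    return json.loads(CONTROL_CHARS.sub(" ", text))


# Hypothetical fragment: a raw newline and a DC4 byte inside a string value.
dirty = '{"name": "The Cribs",\n "description": "here.\n\x14 more"}'
print(decode_lenient(dirty))
```

The same idea carries over to PHP with a single `preg_replace` call ahead of `json_decode`.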

Can you test the attached files and provide any observations/feedback? Thanks again!

-Carlos
ticketfly-response-data-test.zip

Carlos Mossman

Mar 23, 2015, 2:15:09 PM
to ticket...@googlegroups.com
Hi Bill,

Were you able to take a look at the attached files? Any thoughts regarding the odd content in the second sample of TF response data? 

Thanks!

-Carlos

Bill Rousseau

Mar 23, 2015, 8:11:24 PM
to ticket...@googlegroups.com
Hey Carlos,

Code review goes beyond the scope of the support we can provide. I did take a quick look, and even in the good-data.php data set I see all kinds of "gremlin" characters.

I took one example eventID=772353, which is the first event in your dataset:

  • good-data.php has: Wakefield‚ The Cribs‚ 
  • Our API has: Wakefield‚ The Cribs‚


Looks like something in your initial ingestion tries to encode commas?

I would definitely try one of the sanitization filters I recommended before: http://php.net/manual/en/filter.filters.sanitize.php


-b

