ACRIS public data is blocked


Ralph Yozzo

Aug 15, 2014, 4:24:49 PM
to betan...@googlegroups.com
Has anyone seen this? http://a836-acris.nyc.gov/BandwidthPolicy/ACRIS-BW-POL.html

It appears that hosts on Amazon EC2 are blocked.

This happened on the very first request, and the page says:

Further access to ACRIS is denied.  This can be due to multiple reasons such as detection of automated scripts/robots that are capturing data from the website or having exceeded the bandwidth limits we have established to ensure that all users of the ACRIS system experience high performance.  If you need large amounts of data, please contact the City Register (Ph: 212-487-6300) to learn about our subscription data services.



Ralph Yozzo

Aug 15, 2014, 4:31:34 PM
to betan...@googlegroups.com
I called the number and the person at ACRIS HELP DESK is not aware of any blocking or subscription data service :)

And of course, this is the first time they have ever heard of this problem :)

Jeremy Baron

Aug 15, 2014, 4:41:28 PM
to betan...@googlegroups.com

On Aug 15, 2014 4:31 PM, "Ralph Yozzo" <ra...@brooklynmarathon.com> wrote:
> I called the number and the person at ACRIS HELP DESK is not aware of any blocking or subscription data service :)
>
> And of course, this is the first time they have ever heard of this problem :)

wunderbar :)

the page source of the error message has some potentially useful metadata:

<o:LastAuthor>Sandip Desai</o:LastAuthor>
<o:Created>2006-04-26T15:28:00Z</o:Created>
<o:LastSaved>2014-02-10T22:08:00Z</o:LastSaved>
<o:Company>New York City Dept. of Finance</o:Company>
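
For anyone who wants to pull these fields out of a saved copy of the page, here is a minimal sketch. It assumes the metadata appears as simple `<o:Name>value</o:Name>` pairs like the four lines above; the function name `office_metadata` is made up for illustration, and a regex is only a rough stand-in for a real HTML/XML parser.

```python
import re

# Matches <o:Name>value</o:Name>; the backreference \1 requires the same
# tag name in the closing tag.
OFFICE_TAG = re.compile(r"<o:(\w+)>(.*?)</o:\1>", re.DOTALL)


def office_metadata(page_source):
    """Return a dict of <o:Name>value</o:Name> pairs found in page_source."""
    return {name: value.strip() for name, value in OFFICE_TAG.findall(page_source)}


if __name__ == "__main__":
    sample = """
    <o:LastAuthor>Sandip Desai</o:LastAuthor>
    <o:Created>2006-04-26T15:28:00Z</o:Created>
    <o:LastSaved>2014-02-10T22:08:00Z</o:LastSaved>
    <o:Company>New York City Dept. of Finance</o:Company>
    """
    for key, value in office_metadata(sample).items():
        print(key, "=", value)
```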

-Jeremy

Ralph Yozzo

Aug 15, 2014, 5:17:00 PM
to betan...@googlegroups.com
Yes, thank you.  Why this particular outdated page would be updated on 2/10/2014 is not clear.

The last saved date is probably a release date for the entire package.

It would be nice if there was an email address.

Also the person answering the phone had no technical contact that she could refer me to.

It seems that even she did not have technical support, which seems odd.



--
This is the BetaNYC Developers list. You can find projects at < projects.betanyc.us > or ideas at < betaNYC.ideascale.com >.
 
BetaNYC is committed to hosting safe and open spaces for all. By participating in this space you are committing yourself to BetaNYC's Code of Conduct and Anti-Harassment Policy. < bit.ly/betanyc-coc >
---
You received this message because you are subscribed to the Google Groups "BetaNYC-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to betanyc-dev...@googlegroups.com.
To post to this group, send email to betan...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/betanyc-dev/CAE-2OCaEMvPk_NT8ePZ6spmVWEGvOFCEdaLy7%2BE0ZmjTU_rQwQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Noel Hidalgo

Aug 19, 2014, 1:49:56 AM
to betan...@googlegroups.com
Ralph,

You seem to have located an error page, but how did that come up? It isn't happening on my end.

Can you fill in the details on how you are using this via EC2? That seems like a big block of internet to exclude. 

n





--
Sent from an Apple //c

Ralph Yozzo

Aug 19, 2014, 4:18:27 PM
to betan...@googlegroups.com
Hi Noel,

Thanks for asking.

I've tried a few EC2 machines and they are blocked.

A simple test is to run this:


or 


On blocked machines you will see:

<head><title>Document Moved</title></head>
<body><h1>Object Moved</h1>This document may be found <a HREF="http://a836-acris.nyc.gov/BandwidthPolicy/ACRIS-BW-POL.html">here</a></body>
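
A check for this response can be scripted. The sketch below is only an illustration: it assumes the block always shows up as a redirect (or body link) pointing at the BandwidthPolicy page quoted above, and the root URL passed to `check_acris` is a placeholder rather than the request used in this thread. The helper names are made up.

```python
import urllib.error
import urllib.request

# Path of the bandwidth-policy page that blocked hosts are sent to.
BLOCK_MARKER = "BandwidthPolicy/ACRIS-BW-POL.html"


def looks_blocked(location_header, body):
    """Return True if a response points at the ACRIS bandwidth-policy page."""
    return BLOCK_MARKER in (location_header or "") or BLOCK_MARKER in (body or "")


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so the 3xx response can be inspected."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError for the 3xx response


def check_acris(url="http://a836-acris.nyc.gov/", timeout=10):
    """Fetch url (an assumed placeholder) and report whether it looks blocked."""
    opener = urllib.request.build_opener(_NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as resp:
            location = resp.headers.get("Location")
            body = resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # Unfollowed redirects and other HTTP errors land here.
        location = err.headers.get("Location")
        body = err.read().decode("utf-8", errors="replace")
    return looks_blocked(location, body)
```

Running `check_acris()` from an EC2 host and from a non-AWS host gives a quick side-by-side comparison.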


Noel Hidalgo

Aug 19, 2014, 7:45:10 PM
to betan...@googlegroups.com
Let me ask my DOITT contacts.

n




Ralph Yozzo

Aug 19, 2014, 8:37:37 PM
to betan...@googlegroups.com
Thank you Noel.  I can say that if your contact is Albert Webber, his response is not helpful.  In fact, when you follow up on the same issue, he does not respond at all.

I hope you get a better response!

Thank you for everything that you do!


David Schwartz

Nov 10, 2014, 4:08:37 PM
to betan...@googlegroups.com
Hi Ralph and Noel,

Did you ever find an explanation for this?  I am also being blocked from accessing ACRIS from my AWS machines.  It works from Azure, but not AWS, even if I try different availability zones.

Thanks in advance for any help.

David

Ralph Yozzo

Nov 10, 2014, 4:58:58 PM
to betan...@googlegroups.com
Hi David,

No, I never heard back from the NYC DOF about this.  I did not pursue it as there are many workarounds.

One thing that seems clear is that it is a very simple "protection".

It seems to be based on IP address.

I've seen hundreds of thousands of requests from one machine with no blocking at all.

Yet even one request from an AWS machine is blocked instantly.

Seems like a bug to me.
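
To illustrate what a purely IP-based block would look like, here is a minimal sketch of a range check. Nothing in this thread confirms that ACRIS works this way; the blocklist below uses documentation-only addresses as stand-ins, and `in_any_range` is a made-up helper name.

```python
import ipaddress


def in_any_range(ip, cidr_prefixes):
    """Return True if ip falls inside any of the given CIDR prefixes."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(prefix) for prefix in cidr_prefixes)


# Documentation-only ranges (RFC 5737), standing in for a provider's blocks.
EXAMPLE_BLOCKLIST = ["192.0.2.0/24", "198.51.100.0/24"]

if __name__ == "__main__":
    print(in_any_range("192.0.2.17", EXAMPLE_BLOCKLIST))   # inside the first range
    print(in_any_range("203.0.113.5", EXAMPLE_BLOCKLIST))  # outside both ranges
```

A filter like this rejects a listed address on its first request and never counts requests from anyone else, which would match the behavior described above.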

Thanks for asking and bringing attention to this.

Ralph

David Schwartz

Nov 10, 2014, 11:11:36 PM
to betan...@googlegroups.com
Hi Ralph,

I may pursue this a bit more and will keep you posted if I find anything out.

You mentioned "many workarounds".  Could you share what some of those are?

Thanks

David

Ralph Yozzo

Nov 10, 2014, 11:50:34 PM
to betan...@googlegroups.com
I apologize for making it sound like I can make it work from AWS.

All I meant was that it works from most machines except AWS machines, so I simply use non-AWS machines.

But now that you bring it up, there may be a proxy solution that would make it work from AWS but I have not tried that.

Sorry for the confusion.

Ralph

david.s...@credifi.com

Nov 11, 2014, 2:22:20 AM
to betan...@googlegroups.com
Hi Ralph,

No worries... I was just searching for solutions.  I found that it does work from Azure and other cloud computing platforms.

Strangely, several of my AWS machines were suddenly unblocked this morning.  Not complaining, but it seems odd that they were suddenly given access where there was none before.  You can try it as well; it may work now.

Thanks

David

Ralph Yozzo

Nov 11, 2014, 1:04:56 PM
to betan...@googlegroups.com
Hi David,

I tried this just now from an AWS machine.   What command or HTTP request are you using?

curl 'https://a836-acris.nyc.gov/DS/DocumentSearch/DocumentTypeResult' -H 'Cookie: _ga=GA1.2.91831431.1410968089; WT_FPC=id=0a96a54e-0df0-4b3d-92d0-cc431a524716:lv=1415378163345:ss=1415378163345; __utma=48278585.91831431.1410968089.1415639509.1415661163.39; __utmc=48278585; __utmz=48278585.1415639509.38.27.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided); __RequestVerificationToken_L0RT=K72hOsVkGiUE3kqMR+HwDXtko0rAZqMtOK6vyTzkHsd8jtkdzYc90RpZ5Ph+UAnwfDWJdiQp4h4NwkFvUM9o48B3qEaHdacbo8oJIExz7jtq06NnRpuyr00TdX3oJBIXAmr2+L80j3Yc9jQkykaPOxzr3KFE4gOA5YrrxAnZ+V8=; ASP.NET_SessionId=d41psjz4e0dpauzg4s4sodze' -H 'Origin: https://a836-acris.nyc.gov' -H 'Accept-Encoding: gzip,deflate' -H 'Accept-Language: en-US,en;q=0.8' -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36' -H 'Content-Type: application/x-www-form-urlencoded' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8' -H 'Cache-Control: max-age=0' -H 'Referer: https://a836-acris.nyc.gov/DS/DocumentSearch/DocumentType' -H 'Connection: keep-alive' --data '__RequestVerificationToken=rQmDvRaRhwTaGK1hsNkgQtNZ6T9t40%2FrCsgWknRIl6pu0a7%2FzOG2KKj5aQSWsOsE%2BThL%2F6LbtaYMrBGuGhumm7IrgX188V7s5e3OQY3HJVjbfd79Hdzp9qnW53MhkSjqCw49v3u8Eo1UhM7Vp4pidGJaPEa43tOkqYNGkq2SnGg%3D&hid_doctype=DEED&hid_doctype_name=DEED&hid_selectdate=7&hid_datefromm=&hid_datefromd=&hid_datefromy=&hid_datetom=&hid_datetod=&hid_datetoy=&hid_borough=0&hid_borough_name=ALL+BOROUGHS&hid_max_rows=10&hid_page=1&hid_ReqID=&hid_SearchType=DOCTYPE&hid_ISIntranet=N&hid_sort=' --compressed

And it returns this:

<head><title>Document Moved</title></head>
<body><h1>Object Moved</h1>This document may be found <a HREF="http://a836-acris.nyc.gov/BandwidthPolicy/ACRIS-BW-POL.html">here</a></body>

John Krauss

Nov 11, 2014, 1:58:20 PM
to betan...@googlegroups.com
I would recommend just making your requests through a different cloud service (like Digital Ocean) or setting up a proxy on another cloud provider.

This would probably take much less time than waiting for DoF to get back to you. :D
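
A rough sketch of the proxy route, using only the Python standard library. The proxy address is hypothetical (you would run your own HTTP proxy on a host outside AWS), and the helper names are made up; this has not been tested against ACRIS.

```python
import urllib.request

# Hypothetical proxy address; replace with a proxy you run on another provider.
PROXY_URL = "http://proxy.example.com:3128"


def build_proxied_opener(proxy_url=PROXY_URL):
    """Build an opener that sends HTTP and HTTPS requests through proxy_url."""
    proxy = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(proxy)


def fetch_via_proxy(url, proxy_url=PROXY_URL, timeout=10):
    """Fetch url through the proxy and return the raw response bytes."""
    opener = build_proxied_opener(proxy_url)
    with opener.open(url, timeout=timeout) as resp:
        return resp.read()
```

From the AWS machine you would call `fetch_via_proxy(...)` with the ACRIS URL you need, so the request reaches ACRIS from the proxy's address instead of the EC2 one.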

David Schwartz

Nov 11, 2014, 10:27:44 PM
to betan...@googlegroups.com
Hi Ralph,

I tried that exact URL, but without all of the extra headers and cookies.  Maybe I just got lucky with the IP address of my machines, or maybe they locked it again since I tried.

I will keep you posted if I find any further details.

David


Ralph Yozzo

Nov 12, 2014, 3:42:25 PM
to betan...@googlegroups.com
Hi David,

The NYC server security appears to be haphazard at best :)

Thanks,
Ralph
