--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-commu...@googlegroups.com.
To post to this group, send email to dataverse...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/66b45a86-3915-4281-a2e7-5607b9398b11%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Hi,
I’m reviving this thread…
Our use case: a single endpoint (e.g. /api/access/datafile/$id) is being crawled by a collection of bots with dynamic IPs. The effect is similar to a DDoS: it can slow down Dataverse services or result in 5xx errors.
I am consolidating some of the mitigation strategies mentioned here and in the GitHub issues.
If anyone has interest or expertise in these, I would like to ask a couple of questions.
Other mitigation strategies, and how they might be implemented, are also welcome.
On this thread
(1) robots.txt
Q: How do we add an explicit rule telling bots not to crawl /api/access/datafile/$id, on top of the production robots.txt file?
Q: How and where do you deploy the robots.txt file? At deployment time, under /payara/glassfish/domain/domain1/dataverse-*/robots.txt? Or is it served from the Apache proxy in front?
Q: Has it worked in your experience? Is it correct to say that robots.txt is only a code of conduct, which bots may simply choose to ignore?
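To make the first question concrete, here is a minimal sketch of what such a rule might look like and how a well-behaved crawler would read it. It uses Python's standard urllib.robotparser; the two-line rule, the is_allowed helper and the "ExampleBot" user agent are illustrative assumptions, not the contents of the production robots.txt file. It only shows how a compliant bot interprets the rule and says nothing about bots that ignore robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule: ask every crawler to stay away from the file access endpoint.
# (Assumption for illustration; not the production Dataverse robots.txt.)
ROBOTS_TXT = """\
User-agent: *
Disallow: /api/access/datafile/
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if a compliant crawler with this user agent may fetch the path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # "ExampleBot" is a made-up user agent.
    print(is_allowed(ROBOTS_TXT, "ExampleBot", "/api/access/datafile/42"))  # False
    print(is_allowed(ROBOTS_TXT, "ExampleBot", "/dataset.xhtml"))           # True
```

Because Disallow is a path-prefix match, a single line covers every $id under /api/access/datafile/.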
(2) Restarting the Apache proxy
Q: How does this reduce the bot traffic?
Other ways
(3) Rate limiting
This seems particularly useful in our case, because the bots are hitting a single endpoint. However, the feature does not seem to be implemented at the application level: https://github.com/IQSS/dataverse/issues/1339.
Q: Would it be in the pipeline for implementation? If so when?
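To illustrate the kind of rate limiting meant here, below is a minimal token-bucket sketch in Python. It is a generic illustration, not something Dataverse or the linked issue provides: the TokenBucket class, its parameters and the numbers in the usage are all assumptions. The bucket is keyed on the endpoint rather than on the client IP, since the bots in this scenario change IPs.

```python
import time


class TokenBucket:
    """Allow at most `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum number of stored tokens
        self.clock = clock          # injectable clock, handy for testing
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self) -> bool:
        """Consume one token if available; return False if the request should be rejected."""
        now = self.clock()
        # Refill in proportion to the time elapsed since the last call.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# One bucket per protected endpoint (hypothetical limits: 5 requests/s, burst of 10).
buckets = {"/api/access/datafile": TokenBucket(rate=5.0, capacity=10)}

if __name__ == "__main__":
    bucket = buckets["/api/access/datafile"]
    results = [bucket.allow() for _ in range(12)]
    print(results.count(True), "allowed,", results.count(False), "rejected")
```

A rejected request would typically be answered with HTTP 429. The trade-off of a per-endpoint bucket is that legitimate users share the same budget as the bots.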
(4) Web application firewall
Q: If it is a botnet (i.e. a collection of bots with different IP addresses), with no definite pattern and dynamic IPs, is this a feasible solution?
Q: Has anyone implemented this?
Thanks in advance,
Eunice