I am doing research for Open Knowledge International on issues with text and data mining (TDM). We have already identified some of the issues you may have experienced yourself in:
- getting the data you need (access to content via API and licensing/copyright issues)
- the data itself (formats, cleaning data, lack of digitized content or sharing) and TDM tools
- skills (being able to program, not knowing where to start)
If you are interested, I would very much like to hear how you are dealing with these issues and what you would recommend as good practices for others involved in the process. Your insights will help us develop recommendations for the EC on how to improve the uptake of TDM.
Do not hesitate to contact me or share any relevant info or case studies!