I have a question about how the GitHub dataset was built...
Does the "2015 September/GitHub" dataset contain forked projects
written in Java? Or were forked projects excluded
during the construction of the dataset?
How was the aforementioned dataset extracted? If it
contains forked projects, how can I keep only the Java projects that are not forks?
Many thanks and best regards,
Eduardo Cunha Campos
Software Engineering Ph.D. Student at Federal University of Uberlândia, Brazil