Mining Python Projects


malw...@oregonstate.edu

Nov 28, 2018, 2:33:44 PM
to Boa Language and Infrastructure User Forum

Hi all, 


I am impressed by the Boa platform, which helps us mine large numbers of code repositories. Thank you for this contribution to the research community.


I went through the list of examples and the programming guide. Before diving in more deeply, I have a basic question. I need to mine Python projects on GitHub, and I would like to know how well Boa currently supports this. In other topics in this forum, I noticed that you plan to support multiple languages.


I assume Boa's mining capabilities are language dependent. How does the level of support for mining Python projects compare with that for Java projects? (I noticed a couple of Java-specific keywords in the programming guide.)


For example, to be specific:

The following query counts the valid Java files in the latest snapshot. If I want the number of Python files in the snapshot, what keyword should I use instead of "SOURCE_JAVA_JLS"?

p: Project = input;
counts: output sum of int;

visit(p, visitor {
    before node: CodeRepository ->
        counts << len(getsnapshot(node, "SOURCE_JAVA_JLS"));
});


Regards,

Malinda

Robert Dyer

Nov 28, 2018, 3:11:09 PM
to boa-...@googlegroups.com
Hi Malinda,

As of today, the only source code available for mining within the published Boa datasets is Java source code.

Our next dataset will extend the data to include other programming languages. From what I recall, however, Python was not on that list. We were targeting PHP and JavaScript next.

- Robert

--
More information about Boa: http://boa.cs.iastate.edu/

Rajan, Hridesh [COM S]

Nov 28, 2018, 4:43:43 PM
to boa-...@googlegroups.com

Hi all,


Python is also on the way... stay tuned. 


Best wishes,
Hridesh

Dr. Hridesh Rajan
Professor, Department of Computer Science
Kingland Professor of Data Analytics
Professor-In-Charge, ISU Data Science Program
Chair, Graduate Admissions & Recruitment
Director, Laboratory for Software Design
Iowa State University of Science and Technology


Malinda Dilhara Malwala Arachchige

Nov 29, 2018, 1:02:31 PM
to boa-...@googlegroups.com
Hi Hridesh and Robert,

Thank you so much for the information. I look forward to seeing Python support.

One small question: is it currently possible to get information about all the Python projects (project name, link, etc.) from your project database?

Regards,
Malinda

Hoan Nguyen

Nov 29, 2018, 1:15:21 PM
to boa-...@googlegroups.com, malw...@oregonstate.edu
Hi Malinda,

Yes, you can write a Boa query to get Python projects with all their metadata.
I believe there are examples that do a similar task for Java projects.
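
A minimal sketch of such a query is below. It assumes the Project type exposes the name, project_url, and programming_languages attributes described in the Boa programming guide; please check these against the schema of the dataset you run on. The output variable name "projects" is just illustrative. Note that this selects projects by their declared language metadata only; it does not inspect any Python source files.

```
# Sketch: list projects whose metadata declares Python as a language.
# Assumes Project has name, project_url, and programming_languages.
p: Project = input;

# One output row per matching project: name -> URL.
projects: output collection[string] of string;

# Case-insensitive match against each declared language.
exists (i: int; match(`^python$`, lowercase(p.programming_languages[i])))
    projects[p.name] << p.project_url;
```

Other metadata fields can be emitted the same way by adding further output variables.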

Best regards,
Hoan