Get the multilanguage projects correctly


Stratos K.

Sep 16, 2019, 10:32:32 AM
to Boa Language and Infrastructure User Forum

Hello,
I'm new to Boa. I'm trying to query GitHub repos 1015 (small) to get all projects with more than one programming language, using this query:

p: Project = input;

names: output collection[string] of string;
languagenames: string;
length := len(p.programming_languages);
if (len(p.programming_languages) > 1)
{
    for (i := 0; i < length; i++)
        languagenames = languagenames + ", " + p.programming_languages[i];
}
names[p.name] << languagenames;

I get an error at execution, although it compiles. If I change the last line to:
names[languagenames]<<p.name;
then I get results, but they seem weird. For example, I get this:
names[null] = tvbabu/htc-Pico
or this:
names[null, PHP, Ruby, JavaScript, C, Java, D, C, Objective-C, C, Shell, Ruby, JavaScript, CoffeeScript, Python, C, Perl, Ruby, JavaScript, Erlang, Shell, Perl, Ruby, Java, D, JavaScript, Python, Awk, C++, C, Ruby, Shell, JavaScript, Perl, Python, Ruby, JavaScript, Ruby, CoffeeScript, JavaScript, C, Shell, Objective-C, Ruby, CoffeeScript, JavaScript, Objective-C, C, D, C++, CSS, JavaScript, Ruby, Perl, D, Java, C++, Shell, CoffeeScript, Ruby, JavaScript, JavaScript, CoffeeScript, Ruby, Python, C++, Go, Java, JavaScript, Perl, Ruby, VimL, JavaScript, PHP, M, Matlab, JavaScript, Shell, Perl, Puppet, Objective-C, C++, C, JavaScript, CSS, C#, Visual Basic, PHP, CSS, JavaScript, C++, Processing, C, Objective-C, Shell, Java, XML, Shell, Perl, Objective-C, Racket, C, C++, Shell, JavaScript, Python, C, Shell, PHP, JavaScript, CSS, JavaScript, Objective-C, C, C++, PHP, CSS, JavaScript, Ruby, JavaScript, Ruby, JavaScript, CSS, Haskell, Shell, C#, JavaScript, ASP, Perl, Java, Python, Perl, Ruby, Objective-C, Shell, Lua, C, C++, Java, Visual Basic, PHP, JavaScript, Java, PHP, JavaScript, Shell, JavaScript, Shell, TypeScript, PHP] = DirkWellmann/Swipe

Both seem wrong. Is something wrong with my query?

Robert Dyer

Sep 16, 2019, 11:24:21 AM
to Boa Language and Infrastructure User Forum
Hi Stratos,

Actually the behavior you see is exactly as expected.

The first version crashed because of that null (the one you see in your output after your change). Some projects (for example, 'tvbabu/htc-Pico') have no data in the programming_languages field. This just means GitHub wasn't able to confidently guess the PLs for that repo. So for some projects, your original code never assigned a value to languagenames; it was left undefined, which caused a runtime error when the code tried to access it.

Thus when you make your change, the list of languages includes 'null'. Again, this is because the variable is undefined the first time through. When you concatenate onto it, it is still null, and apparently in Java (since we generate Java code and rely on some of its semantics) concatenating onto a null string makes the string 'null' show up in the result.
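The Java behavior described above can be reproduced in isolation. This is only a minimal sketch: the class NullConcatDemo and the method joinLanguages are hypothetical names chosen for illustration and are not part of Boa or of the code it generates. It assumes the query's loop boils down to repeated string concatenation.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; not part of Boa or its generated code.
final class NullConcatDemo {

    // Mirrors the loop in the query: append ", " + language to a running string.
    static String joinLanguages(String start, List<String> languages) {
        String result = start;
        for (String language : languages) {
            // String concatenation renders a null operand as the text "null".
            result = result + ", " + language;
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> langs = Arrays.asList("PHP", "Ruby");
        System.out.println(joinLanguages(null, langs)); // null, PHP, Ruby
        System.out.println(joinLanguages("", langs));   // , PHP, Ruby
    }
}
```

Note that starting from an empty string removes the 'null' but still leaves a leading ", ", because the loop adds the separator before every language, including the first one.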

Hope that clarifies things!

PS - maybe the simplest solution that leads to the behavior you were expecting is to initialize that string to an empty string: languagenames := "";

Stratos

Sep 16, 2019, 12:01:06 PM
to boa-...@googlegroups.com

Thank you, Robert. Still, I was wondering whether these results can be correct. For example, the second one has so many languages that it can't be right.

 


 

Robert Dyer

Sep 16, 2019, 12:51:09 PM
to Boa Language and Infrastructure User Forum
Hi Stratos,

That does look a bit odd, especially when viewing the repository on GitHub.

We will investigate to see if this was some oddity on their API's end or on our side processing the data.  Thanks for pointing it out!

- Robert

Robert Dyer

Sep 16, 2019, 1:22:05 PM
to Boa Language and Infrastructure User Forum
Ok - good news bad news!

The good news - the data is actually perfectly fine!  Phew!

If you initialize your string variable to empty string, you'll see the results you expect and want.

The bad news (for us) - you discovered a bug in the compiler, related to your leaving that string undefined. The map process actually gets called multiple times inside a single JVM, so on later calls the variable reuses the old value from prior map calls instead of being reset to null. It's an easy fix, but one we can't deploy to older datasets such as this one, because it could change the results of existing queries. I will document the bug for this dataset and fix it for future ones. Thanks for pointing it out!
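To make the reuse problem concrete, here is a rough Java sketch of how a value can carry over between map calls that share one object in one JVM. It is an assumption-laden illustration, not Boa's actual generated code: the class ReusedStateDemo, its field languageNames, and the resetEachCall flag are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; NOT Boa's actual generated code.
final class ReusedStateDemo {

    // Stands in for the uninitialized query variable; it outlives a single map() call.
    private String languageNames; // starts out as null

    private final boolean resetEachCall;

    ReusedStateDemo(boolean resetEachCall) {
        this.resetEachCall = resetEachCall;
    }

    // Called once per project, always on the same object in the same JVM.
    String map(List<String> languages) {
        if (resetEachCall) {
            languageNames = null; // put the variable back to null before each project
        }
        if (languages.size() > 1) {
            for (String language : languages) {
                languageNames = languageNames + ", " + language;
            }
        }
        return languageNames;
    }

    public static void main(String[] args) {
        ReusedStateDemo buggy = new ReusedStateDemo(false);
        System.out.println(buggy.map(Arrays.asList("PHP", "Ruby"))); // null, PHP, Ruby
        System.out.println(buggy.map(Arrays.asList("C", "Java")));   // null, PHP, Ruby, C, Java

        ReusedStateDemo fixed = new ReusedStateDemo(true);
        System.out.println(fixed.map(Arrays.asList("PHP", "Ruby"))); // null, PHP, Ruby
        System.out.println(fixed.map(Arrays.asList("C", "Java")));   // null, C, Java
    }
}
```

Without the reset, each project's languages pile onto those of every earlier project handled by the same object, which would produce a very long list like the one reported for DirkWellmann/Swipe.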

- Robert

Robert Dyer

Sep 16, 2019, 2:58:55 PM
to Boa Language and Infrastructure User Forum
Just to keep you in the loop:


- Robert