[Niteration] May 2015

Jean Privat

Jun 2, 2015, 10:15:18 AM
to nitla...@googlegroups.com

Welcome to this year's sixth issue of Niteration, the newsletter for the Nit project.

Combined Changes for May 2015

  • changes: v0.7.4 .. v0.7.5
  • shortstat: 700 files changed, 22733 insertions(+), 10286 deletions(-)
  • Pull Requests: 82
  • non-merge commits: 410

Contributors (authors or co-authors of patches): Alexis Laferrière, Alexandre Terrasa, Jean Privat, Lucas Bajolet, Romain Chanoir, Julien Pagès, Arthur Delamare, Budi Kurniawan, Mehdi Ait Younes and Johan Kayser.

Highlight of The Month

The highlight of this month concerns low-level byte and file manipulations.

Bytes

Bytes are a fundamental low-level concept in computer science. Up to now, Strings were used in Nit to manipulate them. This was a non-optimal solution waiting for a replacement.

Here it comes. Two new classes are offered: Byte, as a low-level primitive representation of a single byte, and Bytes, as an efficient abstraction of a sequence of bytes (used to replace String and Buffer to hold bytes).

Readers and Writers can now read/write bytes in addition to strings. Currently, there are no real differences, but this step is required in order to have Unicode characters and strings.

Moreover, a literal byte notation is provided. The notation distinguishes the value (which can be written in decimal, hexadecimal, octal or binary) from the data type, which is Int by default and Byte if the suffix u8 is used (currently, only u8 is recognized).

var i = 10 # ten as an Int
var j = 0b1010 # still ten (in binary) and still an Int
assert i == j

var k = 10u8 # ten as a Byte
var l = 0b1010u8 # still ten (in binary) and still a Byte
assert k == l
  • Merge: Binary/Octal literal Ints 9f1d610
  • Merge: Byte literals be87b9f

Independently, new services were added to streams to manipulate standard low-level binary representations of data. The module binary adds services to Writer and Reader to read/write binary data.

The module introduces different services for various types of data. They can be used to save space on a stream or for compatibility with external programs. They use the C typing convention:

  • write_float writes a Nit Float on 32 bits, with a loss of precision.
  • write_double writes a Nit Float on 64 bits.
  • write_int64 writes a Nit Int on 64 bits, usually with no loss of data, although a loss may still happen depending on the platform.
  • write_bool and write_bits write booleans as a byte.

The endianness of each stream is configurable.
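The size and byte-order trade-offs listed above can be illustrated outside Nit. The following is a minimal Python sketch using the standard struct module. It is only an analogy, not the API of Nit's binary module, and the helper names pack_float, pack_double and pack_int64 are hypothetical.

```python
import struct

def pack_float(value, big_endian=True):
    """Encode a float on 32 bits (C float); precision may be lost."""
    return struct.pack(">f" if big_endian else "<f", value)

def pack_double(value, big_endian=True):
    """Encode a float on 64 bits (C double)."""
    return struct.pack(">d" if big_endian else "<d", value)

def pack_int64(value, big_endian=True):
    """Encode an integer on 64 bits (C int64_t)."""
    return struct.pack(">q" if big_endian else "<q", value)

# Round-trip 0.1 through both float encodings.
as_float = struct.unpack(">f", pack_float(0.1))[0]
as_double = struct.unpack(">d", pack_double(0.1))[0]
print(as_float == 0.1)   # False: 32 bits cannot hold the full value
print(as_double == 0.1)  # True: 64 bits keep it unchanged

# The same integer under the two byte orders.
print(pack_int64(1, big_endian=True).hex())   # 0000000000000001
print(pack_int64(1, big_endian=False).hex())  # 0100000000000000
```

Both ends of a stream have to agree on the byte order, which is presumably why it is configurable per stream.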

  • Merge: Read/Write binary data ecc38fa
  • Merge: Binary over network aecfd53
  • Merge: Fix reading from Sockets decb8c2

Blocking Read

Since 39fcf4a7, and because people prefer the Python/Ruby semantics to the C semantics, eof is blocking.

Therefore BufferedReader.read was rewritten to follow the Python/Ruby semantics: if there is 1 character in the system buffer and the programmer asks for 2, the program waits for 1 more character. This can block the whole program if the missing character has to come from a keyboard, pipe or TCP connection.

Here's a comparison of the various specifications in Nit/Ruby/Python:

The protocol is the following: I read(4) bytes from stdin and print them. On stdin, I write ab\n (3 bytes) then cd\n (3 other bytes) and see when the read does its job. In the following, I only give the outputs, since the input is always the same and showing it would require distinguishing it from the output.

$ ./nit_old -e 'print sys.stdin.read(4)'
a

Why was only the character a printed? OK, the implementation before the change was buggier than expected.

$ ./nit_new -e 'print sys.stdin.read(4)'
ab
c

As expected, nothing is printed after the first ab\n because the read waits for exactly 4 bytes, then prints them. Thus we get ab\nc. The extra d\n is lost in Nit's buffer.

$ python -c 'import sys;print(sys.stdin.read(4))'
ab
c

So, exactly the same behavior as the new Nit.

$ ruby -e 'puts $stdin.read(4)'
ab
c
$ d
bash: d : commande introuvable

Same behavior with Ruby, so nothing is printed after the first ab\n. Nice surprise however: the extra d\n is not lost but kept in the system buffer, so it is still available (and read) when the shell takes control back. I am not really sure which behavior I prefer. The Ruby one might be saner from an OS point of view; but since I was surprised, one can assume that the POLA level is not that high.
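The difference between the two semantics can be reproduced with a pipe. This is a minimal Python sketch, not Nit's BufferedReader implementation: os.read stands in for the C-like read that returns whatever is available, and the hypothetical helper read_exact loops until the requested count is reached or the stream ends.

```python
import os

def read_exact(fd, n):
    """Keep reading until n bytes are collected or end of file is reached."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = os.read(fd, remaining)  # may return fewer bytes than asked
        if not chunk:  # empty result means end of file
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def demo():
    r, w = os.pipe()
    os.write(w, b"ab\n")
    short = os.read(r, 4)    # C-like: only the 3 available bytes come back
    os.write(w, b"ab\n")
    os.write(w, b"cd\n")
    os.close(w)              # no more input will arrive
    full = read_exact(r, 4)  # Python/Ruby-like: exactly 4 bytes
    rest = read_exact(r, 4)  # only 2 bytes are left before end of file
    os.close(r)
    return short, full, rest

print(demo())  # (b'ab\n', b'ab\nc', b'd\n')
```

With an interactive source instead of a closed pipe, the loop in read_exact is exactly where the program would block while waiting for the missing bytes.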

Standard Library

In addition to all the stream and byte updates, a lot of changes were made to the libraries.

Safe Collection Access

We updated the signatures of access methods in collections: has, [] and anything related that takes an E, a K or a V in read-collections. This makes these classes covariant-safe.

Previously, these methods strictly accepted only the bound type. This made the code unsafe, although it had the advantage of providing static hints.

assert [1,2,3].has("Hello") # static error, expected int got string

But this behavior had issues because:

  • the code is unsafe; that Nit can be unsafe does not mean that Nit should be unsafe.
  • a perfectly meaningful answer is possible, e.g. returning false in the previous example; thus the existing behavior is not POLA.

Because the philosophy of Nit is that static types should help the programmer without being annoying, this PR lifts the constraint on parameters of all read access methods.

The semantics are now more aligned with those of dynamically typed languages, where the behavior is decided by messages and not by the types of things.

This especially allows using collections of things when == does not imply the equality of static types (e.g. subclasses of Sequence).

var a = [1,2,3]
var l = new List[Int].from(a)
assert a == l # by the law of Sequence

var aa = [a]
assert aa.has(l) # before the PR, error; with the PR, true

var ma = new Map[Array[Int],String]
ma[a] = "hello"
assert ma.has_key(l) # before the PR, error; with the PR, true
  • Merge: Safe collection access 2309d93
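For comparison, this is how the same lookups behave in a dynamically typed language. The Python sketch below is only an analogy: Seq, ArraySeq and LinkedSeq are hypothetical stand-ins for Sequence, Array and List, with equality defined on the elements rather than on the concrete class.

```python
class Seq:
    """A sequence whose equality and hash depend only on its elements."""

    def __init__(self, *items):
        self.items = tuple(items)

    def __eq__(self, other):
        return isinstance(other, Seq) and self.items == other.items

    def __hash__(self):
        return hash(self.items)

class ArraySeq(Seq):
    pass

class LinkedSeq(Seq):
    pass

a = ArraySeq(1, 2, 3)
l = LinkedSeq(1, 2, 3)

print("Hello" in [1, 2, 3])  # False: a plain answer, not an error
print(a == l)                # True: same elements, different classes
print(l in [a])              # True: membership relies on ==
print(l in {a: "hello"})     # True: key lookup relies on hash and ==
```

Lifting the static constraint on the parameters of read access methods gives Nit the same answers, while writes stay statically checked.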

Steps on Iterations and Ranges

Iterator can advance by more than a single next, either locally (next_by) or globally (to_step).

Range gains step to provide a generic bidirectional and steppable iterator. So now people can write:

for i in [1..10].step(2) do print i # in order 1 3 5 7 9
for i in [10..1].step(-2) do print i # in reverse order 10 8 6 4 2
  • Merge: Steps on iterations and ranges 77e92d2
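The same ideas can be sketched in Python for readers who want to experiment. This is an analogy under assumptions, not the Nit implementation: the Python functions next_by and to_step below are hypothetical helpers that mimic what the Nit services of the same name are described as doing, and their exact Nit semantics may differ.

```python
from itertools import islice

def next_by(iterator, step):
    """Advance the iterator by up to `step` items, discarding them."""
    for _ in islice(iterator, step):
        pass

def to_step(iterable, step):
    """Yield every `step`-th item, starting with the first one."""
    return islice(iterable, 0, None, step)

print(list(range(1, 11, 2)))        # [1, 3, 5, 7, 9]
print(list(range(10, 0, -2)))       # [10, 8, 6, 4, 2]
print(list(to_step("abcdefg", 3)))  # ['a', 'd', 'g']

it = iter(range(1, 11))
next_by(it, 2)   # skip 1 and 2
print(next(it))  # 3
```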

Other Changes in Standard

  • Merge: Kill PushBackReader 90bb477
  • Merge: Copy to native 6257231
  • Merge: Fix nitunit for String::format 7c2fe92
  • Merge: lib/standard: reimplement files with light FFI ad9aa30
  • Merge: Intro an unrolled linked list, and fix List::insert c19db97
  • Merge: Int::times: clean nitunit and expose negative behavior 9183313
  • Merge: introduce plain_to_s d55330d
  • Merge: Nit objects to plain JSON a05897f

Other Libraries

  • Merge: lib: introduce MongoDB wrapper de6de4d

  • Merge: Markdown: some improvement and fixes 1e5151a

  • Merge: Markdown location 415ed3c

  • Merge: nitiwiki: introduce wikilinks 90770bf

  • Merge: nitiwiki next f502ba8

Language

Updates on the language side are mainly new warnings. There are also some preparations for a more rational specification of the variations on attributes.

Advice on Potential Null Pointer Exception

People do not like the current behavior of statically and silently accepting a method call on a nullable receiver. To prepare for a stricter policy, such unsafe calls are now counted and emit an advice if -W is used.

Currently, there are only 2000+ (3%) unsafe sites.

  • Merge: metrics: --nullables distinguishes safe and unsafe calls on null 52ecf97
  • Merge: Warn call on nullable receiver dff7da4

Advice on Useless Signatures

Nitpick and other tools are able to suggest the removal of useless types in signature redefinitions with the option -W.

Example:

class A
   fun foo(a: Bool) do end
end

class B
   super A
   redef fun foo(a: Bool) do end
end
$ nitc test.nit -W
test.nit:8,19--22: Warning: useless type repetition on parameter `a` for redefined method `foo` (useless-signature)
redef fun foo(a: Bool) do end
                 ^
Errors: 0. Warnings: 1.

Currently, there are 1500+ cases.

  • Merge: Advice on useless repeated types b5cd4c7

Annotation on Attributes

PR #857 introduced autoinit on attributes that have a default value to initialize them later.

This is a rarely used feature. It is also not POLA because it overloads the name autoinit, which has a different meaning on methods (cf. #1308).

Thus in order to POLAize the spec, the annotation is renamed lateinit. Note: the annotation might be removed in a future PR; this one is a last attempt to keep it.

  • Merge: Annotation lateinit f06cf5a

Without a value, attributes in introductions do not have the same semantics as attributes in refinements. In introductions, attributes are implicitly autoinit; in refinements, they are noautoinit.

This is not POLA since:

  • this confuses beginners
  • readers have to remember if they are in an intro or a refinement
  • additional cognitive fragility in constructors (more cases and rules to take into account)

This PR makes autoinit the default and asks that attributes declared in refinements are either annotated noautoinit or have a default value. This way, the writer has to think about the implication of adding new attributes in existing classes, especially to think about their initialization. Thus this might help the programmer to avoid errors.

There are only 163 cases of such attributes in refinements.

  • Merge: modelize: ask that attributes in refinements are either noautoinit or have a value 41dcf7c
  • Merge: Correct warn on noautoinit ff398dd

Abstract attributes are collected to be part of the initializers (autoinit) of the class.

  • Merge: Autoinit abstract attributes 3f00051

Semicolon

The semicolon ; is usable as a hard line break. While this is not really nit-ish, the only need for this is in fact the ability to write short one-liner scripts in environments where linefeeds are not an option or are not easy to use.

$ ./nit -e 'for line in stdin.each_line do; var xs = line.split(":"); if xs.not_empty then print xs.first; end' < /etc/passwd

Tools

Usual improvements in the tools used by Nit developers.

nitdoc and nitx

A lot of work was done on the documentation infrastructure. Now, nitx and nitdoc are fully merged with the same structure and services.

  • Merge: nitdoc: start using DocComponent rendering d0359ba
  • Merge: nitdoc: migrate the DocPage rendering to composite rendering ace6ea4
  • Merge: nitdoc: some fixes and cleaning aa7fd2f
  • Merge: nitx: start migration to doc_phases 479924a
  • Merge: nitx: finish migration to new nitdoc architecture b8d1117
  • Merge: nitdoc: introduce useful services 8e9acc1
  • Merge: nitdoc: refactoring and cleaning 2f99685
  • Merge: Kill model_utils f38919d

The only visible change is the introduction of a tabbed view to hide less important data like linearization tree, subclasses list etc. This makes the main page cleaner and more readable.

Readers just pass the mouse over a definition to make the tab menu appear and select what details they want to see.

  • Merge: nitdoc: introduce tabbed view to hide less important data 9a3c09d
  • Merge: nitdoc: display constructors list in MClass page 21446b5

Prettier Pretty Printing

Two new options, --no-inline and --line-width, control some of the rules about line breaks. This is especially useful for people who want more relaxed rules.

$ ./nitpretty --check ../lib | wc -l
224
$ ./nitpretty --check ../lib --line-width 0 --no-inline | wc -l
188

So 36 more files in lib/ are identical to their pretty printing with those relaxed rules.

Clean Nit Compilation Directory

The compilation directory is now removed after compilation.

This auto-removal is disabled if --compile-dir or --no-cc is used because it is a sane thing to do.

Also, the compilation directory is renamed by default to nit_compile since it is not useful anymore to hide it, as its presence means an active request (--no-cc).

  • Merge: Clean nit compilation directory 5250589

Misc

  • Merge: vim: look for `{ in deciding to highlight ffi language name. eg "C". b30507c
  • Merge: Rename all README to README.md 0ae0811

Portable UI API

Introduction of a basic API to create portable UI applications compatible with Android and GNU/Linux. Each implementation uses refinement to implement the API with platform-specific controls: native controls on Android and GTK on GNU/Linux.

A few notes on the API logic:

  • All attached AppComponent instances receive the life-cycle annotations. This includes all controls and views. So a visible Button will be notified when the application pauses.

  • All controls can be observed by other instances so that they are notified of input events. In the calculator example, we use this to implement the behaviour on button press in the window logic.

The scope of the portable UI at this point is intentionally limited to keep a small number of views while still tweaking the portable UI basics.

  • Merge: GTK clean up and a few new features 1ab76ac
  • Merge: Clean up Android and Linux app.nit libraries 80f4c61
  • Merge: Standardize the name of Android API level annotations c76ec25
  • Merge: Revamp app.nit print_error and print_warning 0dfb7f2
  • Merge: Intro the portable UI of app.nit 103fea4
  • Merge: lib/gtk: add GtkEntry::input_purpose ac80316
  • Merge: Intro AppComponent to notify all elements of an app of its life cycle 97795ca
  • Merge: share/libgc: option to use a local version of the source pkgs aa935cd
  • Merge: Fix a bit of everything: typos, doc, android, bucketed_game and vim 066fc50

FFI

Light FFI

The new light FFI uses only features that should be easy to implement in new/alternative engines to quickly achieve a bootstrap. For this reason, core features of the Nit standard library should be limited to using the light FFI.

  • Merge: Intro the light FFI and use it in nith be7ab8c

Features of the Light FFI

  • Extern methods implemented in C, nested within the Nit code. The body of these methods is copied directly to the generated C files for compilation. Also supports extern new factories.
  • Module level C code blocks, both "C Body" (the default) and "C Header". They will be copied to the beginning of the generated C files.
  • Automatic transformation of Nit primitive types from/to their equivalent in C.
  • Extern classes to create a Nit class around a C pointer. The equivalent C type of the Nit extern class can be specified.

Features of the Full FFI

  • More foreign languages: C++, Java and Objective-C.
  • Callbacks to Nit from foreign code. The callbacks are declared in Nit using the import annotation on extern methods. They are then generated on demand for the target foreign language.
  • Static Nit types in C for precise typing and static typing errors in C.
  • Propagating public code blocks at the module level (C Header). This allows using extern classes in foreign code in other modules without having to import the related headers. This is optional in C as it is easy to find the correct importation. However it is important in Java and other complex FFIs.
  • Reference pinning of Nit objects from foreign code. This ensures that objects referenced from foreign code are not freed by the GC.
  • FFI annotations:
    • cflags, ldflags and cppflags pass arguments to the C and C++ compilers and linker.
    • pkgconfig calls the pkg-config program to get the arguments to pass to the C compiler and linker.
    • extra_java_files adds Java source files to the compilation chain.

Light FFI only Compilation

Compilers using the module compiler_ffi::light_only do not compile extern methods with callbacks. This is a good heuristic to determine whether the method uses the full FFI or the light FFI.

The limitation of this heuristic is on 3 features: static Nit types, propagating public code blocks and reference pinning. These features do not require any declaration on the Nit side so they are not reliably detectable by the compiler. Using these features will cause GCC to raise errors on unfound types and functions.

In the case of public code block propagation, the user can fix it by importing the needed C headers in each module. In the other cases, static Nit types and reference pinning, they are used for callbacks, and the method should probably declare callbacks. Still, there are some very rare situations where these features could be used correctly and the method would still be recognized as light FFI. If this becomes a problem, we could add an annotation such as is light_ffi to force the heuristic.

Objcwrapper: Filters preprocessed C-like header files to remove static code

This tool is used in the process of parsing header files to extract information on the declared services (the functions and structures). This information is then used to generate bindings for Nit code to access these services.

C headers seldom contain static code. When they do, the tool removes the static code from the headers but keeps the signatures. This tool is an extension of header_keeper. It searches for the keyword static to identify static code and ignores the code inside the corresponding braces. The result is printed to stdout.

cat Pre-Processed/CGGeometry.h | header_static > Pre-Processed/static_header.h

This module can also be used as a library.
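The filtering idea described above (find the keyword static, keep the signature, skip the body between matching braces) can be sketched in a few lines of Python. This is a simplified illustration and not the code of header_static: the function name strip_static_bodies is hypothetical, and the sketch ignores comments, string literals and brace-enclosed initializers of static variables.

```python
import re

def strip_static_bodies(source):
    """Replace the body of each static definition by a semicolon."""
    result = []
    pos = 0  # start of the text not yet copied to the result
    for match in re.finditer(r"\bstatic\b", source):
        if match.start() < pos:
            continue  # this keyword was inside a body already skipped
        brace = source.find("{", match.end())
        semi = source.find(";", match.end())
        if brace == -1 or (semi != -1 and semi < brace):
            continue  # a declaration without a body: keep it unchanged
        # Walk to the closing brace that matches the opening one.
        depth = 0
        end = brace
        while end < len(source):
            if source[end] == "{":
                depth += 1
            elif source[end] == "}":
                depth -= 1
                if depth == 0:
                    break
            end += 1
        result.append(source[pos:brace].rstrip() + ";")
        pos = end + 1
    result.append(source[pos:])
    return "".join(result)

header = (
    "int a;\n"
    "static inline int twice(int x) { if (x) { return 2 * x; } return 0; }\n"
    "int g(void);\n"
)
print(strip_static_bodies(header))
```

On this input, the body of twice is dropped and only the line static inline int twice(int x); remains between the two other declarations.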

  • Merge: contrib/header_static: a cog in the toolchains to generate Objcwrapper 3bf3c96

FFI Code Now Uses self

As requested by everyone, the identifier recv is replaced by self in all Nit FFIs and user code.

  • Merge: Use self in the FFI d522940

Serialization

The serialization system, including the phase, nitserial and the library, has been updated and completed to be useful in real-world programs.

The phase supports the annotation serialize as a replacement for auto_serialize, and it is also more versatile. It can annotate a module so all its class definitions are serializable. It can also annotate an attribute so only this attribute is serialized. The noserialize annotation is for exceptions to serialize, such as passwords and data blobs.

The library declares more Nit collections and game-related classes to be serializable.

The extraction of the cache used by serialization engines allows for lasting remote communication (think client/server over network) by using references to already transmitted objects.
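The role of such a cache can be illustrated with a small Python sketch. It is a toy model under assumptions, not the Nit serialization engine: the class CachingSerializer and its message layout ("id", "data", "ref") are hypothetical. The first time an object is sent, its full content is emitted together with an identifier; later transmissions only send a reference to that identifier.

```python
class CachingSerializer:
    """Emit full data once per object, then short references."""

    def __init__(self):
        self._refs = {}   # id(obj) -> reference number
        self._alive = []  # keeps objects alive so that id() stays unique

    def serialize(self, obj):
        key = id(obj)
        if key in self._refs:
            # Already transmitted: a reference is enough.
            return {"ref": self._refs[key]}
        ref = len(self._refs)
        self._refs[key] = ref
        self._alive.append(obj)
        return {"id": ref, "data": obj}

player = {"name": "Alice", "score": 10}
engine = CachingSerializer()
print(engine.serialize(player))  # {'id': 0, 'data': {'name': 'Alice', 'score': 10}}
print(engine.serialize(player))  # {'ref': 0}
```

Because the cache outlives a single call, a long-running client/server session keeps benefiting from objects sent earlier.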

  • Merge: Serialization support SimpleCollection and Map, also bug fixes 42c96a2
  • Merge: Apply serialization in calculator, a_star, more_collections, Couple and Container bfb1f6b
  • Merge: Serialization caching and prepare for other engines 9be4b4e7
  • Merge: Serialization: change annotation to serialize and intro noserialize b68e456

Internationalization

Simple POC of internationalization with the module annotation i18n.

Right now, you can append it to a module declaration, with or without the locale of your choice. It then generates a standard gettext .pot template file and, if a locale was given, a .po file for the language you chose when defining your module. The generation of the corresponding .mo file (using msgfmt) is still the responsibility of the user.

There are still several modifications to do for real usability, like the possibility to untranslate a literal string via some kind of notranslate annotation (though if no translation is found, the key is kept as argument, so this should not be an issue, at least using gettext).

So now, when using another value for the $LANGUAGE environment variable, a different message will be printed when executing a Nit program.

  • Merge: i18n annotation a4969b4
  • Merge: i18n annot: superstrings fd08319

Examples and Contrib

More tasks of Rosetta Code are implemented in Nit.

  • Merge: Balanced brackets ef58241
  • Merge: Rosettacode grayscale implementation b621326
  • Merge: More Rosetta Code fcd9049

Implementation of convex polygon data structures in Nit. Based on the Java implementation done during the data structures course (INF7341). There also are some benchmarks for the Nit implementation and the Java implementation to compare their respective performance.

  • Merge: Introduction of convex polygons in geometry + benchmarks 91136cd

refund is a tool developed for the evaluation of the course INF2015: Développement de logiciels dans un environnement Agile. Its purpose is to parse a claim form submitted by an insurance client in JSON format, then to compute the allowed refunds depending on the policy contracted by the client. Even if this program is not really useful for the Nit project, it is a concrete example of how the language can be used in the real world.

  • Merge: contrib: Introduce refund 6018caf

Miscellaneous

  • Merge: nit rpg ack 688ae37
  • Merge: Rename the visit function of minilang example. a90c906

Internal

Originally, the AST was very abstract and only included nodes that had a semantic use; most keywords and symbols were thus dropped during the AST transformation done by sablecc. After all these years, we realized this was a very bad idea, as it engendered useless complexity for people who wanted to program precise error messages or develop tools like nitlight or nitpretty. So, this PR adds back most of these nodes in the AST. Future PRs may simplify the code of the tools to deal with a more complete AST, thus removing heuristics to retrieve the missing tokens.

  • Merge: Add missing nodes in the AST 985a55c

Other changes are slight internal improvements.

  • Merge: location: introduce from_string constructor 018e345
  • Merge: model: introduce the bottom type b70c8c4
  • Merge: Improve internal mechanisms of the nitvm 6af9921

Bug Fixes

The work on low-level libraries like bytes and streams uncovered some bugs that are now fixed, mostly standard boxing and sizeof C bugs.

  • Merge: Fix calls on primitive receivers b9fdcab
  • Merge: Fix C compiler warning on Java FFI and Android apps 5cdb506
  • Merge: Separate Erasure Compiler bugfix 4eeeca4
  • Merge: Missing a unboxing when compiling a call to new NativeArray 1cbdc9f
  • Merge: sepcomp: fix hardening related to the instantiation of dead types 0767ae3
  • Merge: Nitg-g new NativeArray fix 470cd27