Dart Language and Library Newsletter (2017-09-29)

Florian Loitsch

Sep 29, 2017, 3:55:54 PM
to General Dart Discussion

Dart Language and Library Newsletter

Welcome to the Dart Language and Library Newsletter.

Did You Know?

JSON Encoding

The dart:convert library can convert Dart objects to JSON strings. By default only the usual JSON types are supported, but with a toEncodable function (see the JsonEncoder constructor) any object can be encoded.

Example:

import 'dart:convert';

class Contact {
  final String name;
  final String mail;
  const Contact(this.name, this.mail);
}

main() {
  dynamic toEncodable(dynamic o) {
    if (o is DateTime) return ["DateTime", o.millisecondsSinceEpoch];
    if (o is Contact) return ["Contact", o.name, o.mail];
    return o;
  }

  var data = [
    new DateTime.now(),
    new Contact("Sundar Pichai", "sun...@google.com")
  ];

  var encoder = new JsonEncoder(toEncodable);
  print(encoder.convert(data));
}

This program prints: [["DateTime",1506703358743],["Contact","Sundar Pichai","sun...@google.com"]] (at least if you print it at the exact same time as I did).

Another option is to provide a toJson method to classes that should be encodable, but that has two downsides:

  1. It doesn't work for classes one can't modify (like the DateTime in our example).
  2. It forces a specific encoding. Users might want to encode the same object differently depending on where the encoding happens. For the same object, one RPC call might require different JSON than another.
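
For comparison, here is a minimal sketch of the toJson alternative. It assumes we are free to modify the Contact class; the map layout and the encodeContact helper are made up for illustration.

```dart
import 'dart:convert';

// Hypothetical variant of the Contact class from the example above,
// extended with a toJson method. The map layout is an assumption.
class Contact {
  final String name;
  final String mail;
  const Contact(this.name, this.mail);

  // The encoder's default toEncodable calls this and encodes the result.
  Map<String, dynamic> toJson() => {"name": name, "mail": mail};
}

// Illustrative helper: encodes with a plain JsonEncoder, that is,
// without a custom toEncodable function.
String encodeContact(Contact contact) => const JsonEncoder().convert(contact);

void main() {
  print(encodeContact(const Contact("Ada", "ada@example.com")));
  // prints {"name":"Ada","mail":"ada@example.com"}
}
```

Every encoder that meets this Contact now produces the same map, which is exactly the second downside listed above.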

When the JSON encoding fails, a JsonUnsupportedObjectError is thrown.

This error class provides rich information to help find the reason for the unexpected object.

Let's modify the previous example:

main() {
  var data = {
    "key": ["nested", "list", new Contact("Sundar Pichai", "sun...@google.com")]
  };
  var encoder = new JsonEncoder();
  print(encoder.convert(data));
}

Without a toEncodable function, the encoding now fails:

// On 1.24
Unhandled exception:
Converting object to an encodable object failed.


// On 2.0.0
Unhandled exception:
Converting object to an encodable object failed: Instance of 'Contact'

We recently (with Dart 2.0.0-dev) improved the output of the error message so it contains the name of the class that was the culprit of the failed conversion. For older versions, this information was already available in the cause field:

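main() {
  // Same data and encoder as in the previous example.
  var data = {
    "key": ["nested", "list", new Contact("Sundar Pichai", "sun...@google.com")]
  };
  var encoder = new JsonEncoder();
  try {
    print(encoder.convert(data));
  } on JsonUnsupportedObjectError catch (e) {
    print("Conversion failed");
    print("Root cause: ${e.cause}");
  }
}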

which outputs:

Conversion failed
Root cause: NoSuchMethodError: Class 'Contact' has no instance method 'toJson'.
Receiver: Instance of 'Contact'
Tried calling: toJson()

The cause field is the error the JSON encoder caught while it tried to convert the object. The default toEncodable function tries to call toJson on any object that is not one of the JSON types, and, in this case, the Contact class has no such method.

The unsupportedObject field contains the culprit. Aside from simply printing the object, this also makes it possible to inspect it with a debugger.
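
A small sketch of how unsupportedObject could be used programmatically. The findCulprit helper is hypothetical and only wraps the try/catch pattern shown above.

```dart
import 'dart:convert';

class Contact {
  final String name;
  final String mail;
  const Contact(this.name, this.mail);
}

// Hypothetical helper: returns the object the encoder could not convert,
// or null if the encoding succeeded.
Object? findCulprit(Object? data) {
  try {
    const JsonEncoder().convert(data);
    return null;
  } on JsonUnsupportedObjectError catch (e) {
    return e.unsupportedObject;
  }
}

void main() {
  var contact = const Contact("Ada", "ada@example.com");
  var culprit = findCulprit({
    "key": ["nested", "list", contact]
  });
  // The culprit is the very instance that was put into the data.
  print(identical(culprit, contact)); // prints true
  print(findCulprit(["only", "JSON", "types"])); // prints null
}
```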

Finally, there is a new field in Dart 2.0.0: partialResult.

// Only in Dart 2.0.0:
print("Partial result: ${e.partialResult}");

yields: Partial result: {"key":["nested","list",.

Usually, this really simplifies finding the offender in the source code.

Fixed-Size Integers

As part of our efforts to improve the output of AoT-compiled Dart programs we are limiting the size of integers. With Dart 2.0, integers in the VM will be 64-bit integers and no longer arbitrary-sized. Among all the investigated options, 64-bit integers have the smallest migration cost, and provide the most consistent and future-proof API capabilities.

Motivation

Dart 1 has infinite-precision integers (aka bignums). On the VM, almost every number-operation must check if the result overflowed, and if yes, allocate a next-bigger number type. In practice this means that most numbers are represented as SMIs (Small Integers), a tagged number type, that overflow into "mint"s (medium integers), and finally overflow into arbitrary-size big-ints.

In a jitted environment, the code for mints and bigints can be generated lazily during a bailout that is invoked when the overflow is detected. This means that almost all compiled code simply has the SMI assembly, and just checks for overflows. Only in the rare case where more than 31/63 bits (the SMI size on 32-bit and 64-bit architectures) are needed does the JIT generate the code for more number types. For precompilation it's not possible to generate the mint/bigint code lazily, and all code must get emitted eagerly, increasing the size of the output.

Also, fixed-size integers play very nicely with non-nullability. Contrary to the JIT, the AoT compiler can do global analyses (or simply use the provided information when non-nullability makes it into Dart) and optimize programs accordingly.

If an integer is known to be non-null, the AoT compiler can remove the null check. With fixed-size integers, the compiler can furthermore always pass the value in a register or on the stack (assuming both sides agree). Not only does this remove the SMI check, it also makes the GC faster, since it knows that the value cannot be a pointer. Without fixed-size integers the value could be in a mint or bigint box.

Experiments on Vipunen (an experimental Dart AoT compilation pipeline) have shown that the combination of non-nullability and fixed-size integers (unsurprisingly) yields the best results.

Semantics

An int represents a signed 64-bit two's complement integer. Ints have the following properties:

  • Integers wrap around, or worded differently, all operations are done modulo 2^64.
  • Integer literals must fit into the signed 64-bit range. For convenience, hexadecimal literals, such as 0xFFFFFFFFFFFFFFFF are also valid if they fit the unsigned 64-bit range.
  • The << operator is specified to shift "out" bits that leave the 64-bit range.
  • A new >>> operator is added to support "unsigned" right-shifts and is added as const-operation.
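
The properties above can be sketched as follows. This assumes a platform with 64-bit ints (VM or AoT, not JavaScript) and an SDK that already ships the >>> operator; wrapIncrement is just an illustrative helper.

```dart
// Illustrative helper: the addition wraps around on overflow.
int wrapIncrement(int x) => x + 1;

void main() {
  const int maxInt64 = 0x7FFFFFFFFFFFFFFF;

  // Wrap-around: the largest int plus one is the smallest int.
  print(wrapIncrement(maxInt64)); // prints -9223372036854775808

  // A hexadecimal literal in the unsigned 64-bit range is accepted
  // and interpreted as a two's complement value.
  print(0xFFFFFFFFFFFFFFFF); // prints -1

  // << shifts bits "out" of the 64-bit range.
  print(1 << 63); // prints -9223372036854775808
  print(1 << 64); // prints 0

  // >> keeps the sign, >>> treats the bits as unsigned.
  print(-1 >> 60); // prints -1
  print(-1 >>> 60); // prints 15
}
```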

Dart 2.0's integers wrap around when they over/underflow. We considered alternatives, such as saturation, unspecified behavior, or exceptions, but found that wrap-around provides the most convenient properties:

  1. It's efficient (one CPU instruction on 64-bit machines).
  2. It makes it possible to do unsigned int64 operations without too much code. For example, addition, subtraction, and multiplication can be done on int64s (representing unsigned int64 values) and the bits of the result can then simply be interpreted as unsigned int64.
  3. Some architectures (for example RISC-V) don't support overflow checks. (See https://www.slideshare.net/YiHsiuHsu/riscv-introduction slide 12).
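
To illustrate the second point, here is a minimal sketch that treats ints as unsigned 64-bit values. It assumes 64-bit ints (VM or AoT); the asUnsigned helper is hypothetical and uses BigInt only to display the unsigned interpretation of the bits.

```dart
// Hypothetical helper: reinterprets the 64 bits of an int as an unsigned value.
BigInt asUnsigned(int bits) => BigInt.from(bits).toUnsigned(64);

void main() {
  // 2^63 - 1 plus 1 overflows the signed range, but the resulting bits
  // are exactly the unsigned sum 2^63.
  int sum = 0x7FFFFFFFFFFFFFFF + 1;
  print(sum); // prints -9223372036854775808
  print(asUnsigned(sum)); // prints 9223372036854775808

  // (2^64 - 1) * 2 modulo 2^64 is 2^64 - 2.
  int product = 0xFFFFFFFFFFFFFFFF * 2;
  print(product); // prints -2
  print(asUnsigned(product)); // prints 18446744073709551614
}
```

Only the interpretation of the final bits differs between signed and unsigned; the addition and multiplication themselves are plain int operations.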

Compatibility & JavaScript

When compiling to JavaScript, we continue to use JavaScript's numbers. Implementing real 64 bit integers in JavaScript would degrade performance and make interop with existing JavaScript and the DOM much harder and slower.

With the exception of possible restrictions on the size of literals (only allowing literals that don't lose precision), there are no plans to change the behavior of numbers when compiling to JavaScript. These limitations are still under discussion.

Unfortunately, this also means that packages need to pay attention when developing for both the VM and the web. Their code might not behave the same on both platforms. These problems do exist already now, and there is unfortunately not a good solution.

Backwards-compatibility wise, this change has very positive properties: it is non-breaking for all web applications, and only affects VM programs that use integers of 65+ bits. These are relatively rare. In fact, many common operations will get simpler on the VM, since users don't need to think about SMIs anymore. For example, users often bit-and their numbers to ensure that the compiler can see that a number will never need more than a SMI. A typical example would be the JenkinsHash which has been modified to fit into SMIs:

/**
 * Jenkins hash function, optimized for small integers.
 * Borrowed from sdk/lib/math/jenkins_smi_hash.dart.
 */
class JenkinsSmiHash {
  static int combine(int hash, int value) {
    hash = 0x1fffffff & (hash + value);
    hash = 0x1fffffff & (hash + ((0x0007ffff & hash) << 10));
    return hash ^ (hash >> 6);
  }

  static int finish(int hash) {
    hash = 0x1fffffff & (hash + ((0x03ffffff & hash) << 3));
    hash = hash ^ (hash >> 11);
    return 0x1fffffff & (hash + ((0x00003fff & hash) << 15));
  }
  ...
}

For applications that compile to JavaScript this function would still be useful, but in a pure VM/AoT context the hash function could be simplified or updated to use a performant 64 bit hash instead.
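
As a rough sketch of what such a simplification could look like (this is not the SDK's implementation): with 64-bit ints the intermediate values of the 32-bit Jenkins one-at-a-time hash cannot overflow, so the inner masks can be dropped and only the results need to be truncated to 32 bits. The class name JenkinsHash32 and the hashAll helper are made up for this example.

```dart
// Hypothetical 64-bit-friendly variant. Assumes 64-bit ints (VM/AoT only).
class JenkinsHash32 {
  static const int _mask32 = 0xFFFFFFFF;

  static int combine(int hash, int value) {
    hash = _mask32 & (hash + value);
    // hash has at most 32 bits, so hash << 10 fits easily into 64 bits.
    hash = _mask32 & (hash + (hash << 10));
    return hash ^ (hash >> 6);
  }

  static int finish(int hash) {
    hash = _mask32 & (hash + (hash << 3));
    hash = hash ^ (hash >> 11);
    return _mask32 & (hash + (hash << 15));
  }
}

// Illustrative helper: hashes a sequence of ints.
int hashAll(Iterable<int> values) {
  var hash = 0;
  for (var value in values) {
    hash = JenkinsHash32.combine(hash, value);
  }
  return JenkinsHash32.finish(hash);
}

void main() {
  // 97 is the code unit of "a".
  print(hashAll([97]).toRadixString(16)); // prints ca2e9442
}
```

The results are full 32-bit values and therefore differ from those of JenkinsSmiHash, which truncates to 29 bits.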

Comparison to other Platforms

Among the common and popular languages we observe two approaches (different from bigints and ECMAScript's Number type):

  1. int having 32 bits.
  2. architecture specific integer sizes.

Java, C#, and all languages that compile onto their VMs use 32-bit integers. Given that Java was released in 1995 (JDK Beta), and C# first appeared in 2000, it is not surprising that they chose a 32-bit integer as default size for their ints. At that time, 64-bit processors were uncommon (the Athlon 64 was released in 2003), and a 32-bit int corresponds to the equivalent int type in the popular languages at that time (C, C++ and Pascal/Delphi).

32 bits are generally not big enough for most applications, so Java supports a long type. It also supports smaller sizes. However, contrary to C#, it only features signed numbers.

C, C++, Go, and Swift support a wide range of integer types, going from uint8 to int64. In addition to the specific types (imposing a specific size), Swift also supports Int and UInt which are architecture dependent: on 32-bit architectures an Int/UInt is 32 bits, whereas on a 64-bit architecture they are 64 bits.

C and C++ have a more complicated number hierarchy. Not only do they provide more architecture-specific types (short, int, long and long long), they also provide fewer guarantees. For example, an int simply has to be at least 16 bits. In practice shorts are exactly 16 bits, ints exactly 32 bits, and long longs exactly 64 bits wide. However, there are no reasonable guarantees for the long type. See the cppreference for a detailed table. This is why many developers use typedefs like int8 and uint32 instead of the builtin integer types.

Python uses architecture-dependent types, too. Its int type is mapped to C's long type (thus guaranteeing at least 32 bits, but otherwise being dependent on the architecture and the OS). Its long type is an unlimited-precision integer.

Looking forward, Swift will probably converge towards a 64-bit integer world since more and more architectures are 64 bits. Apple's iPhone 5S (2013) was the first 64-bit smartphone, and Android started shipping on 64-bit devices in 2014 with the Nexus 9.

Further Reading

The corresponding informal specification for this change contains more details and discussions of other alternatives that were considered.

Gerhard Piette

Sep 30, 2017, 1:39:00 PM
to Dart Misc
Hi,

IMO:
- Integer numbers should be simple and efficient. Wrap-around of integers is simple and efficient and useful for the computation of checksums.
- The default integer size (also used for checksums and offsets/indexes) should be platform specific and correspond to the pointer size of the platform. JavaScript integers as default for the JavaScript platform is the right choice.
- Dart should offer the expected set of signed and unsigned integer types: 8, 16, 32 and 64 bit.
- Floating point numbers should be platform specific.
- Maybe the Dart VM can efficiently implement unum or unum 2 numbers.

Gerhard Piette

Sep 30, 2017, 1:39:00 PM
to Dart Misc
 
If we added multiple integer types we would furthermore have to solve the following issues:
  • subtyping: is an int8 a subtype of int16 or int?
  • coercions: should it be allowed to assign an int8 to an int16? What about uint8 to int16? or uint16 to int16?
IMO:
- No subtyping for numbers.
- Automatic creation of an i16 (int16) number from an i8 (int8) number, because there is no loss of information.
- Explicit creation of an i8 number from an i16 number. Explicit because there might be loss of information.
- Explicit casting / coercion between signed and unsigned numbers.

Gerhard Piette

Oct 2, 2017, 9:57:15 AM
to mi...@dartlang.org

Florian Loitsch

Oct 3, 2017, 3:26:45 PM
to mi...@dartlang.org
I agree. The unum proposals are pretty cool. They are an alternative to floating-point numbers, though, and so are orthogonal to this proposal: the int64 section only affects integers.
Also, we could not just simply change our floating-point numbers to unums. Not only would it be impossible to implement efficiently in JavaScript, we would still need to support `double`s for backwards-compatibility.

On Mon, Oct 2, 2017 at 3:57 PM Gerhard Piette <gerhard...@gmail.com> wrote:
John L. Gustafson proposed another number type after unum 1 and unum 2: posit

Gerhard Piette

Oct 3, 2017, 5:29:08 PM
to mi...@dartlang.org
I used the occasion to post my opinion about number types in programming languages and Dart.
IMO:
- Only the explicitly sized integer types should behave the same on all platforms. They can be used for storage of data or integer computations (e.g. finance).
- The default integer type should be platform specific and used for offsets and hashcodes.
- Regarding floating point numbers, a programming language should offer what the contemporary and future platforms and devices offer. This is the simplest, most efficient, most useful and most future-proof approach.

In the keynote "Beating Floats at Their Own Game", John Gustafson said that the IEEE and big companies, including Google, are interested in the "posit" number type.
The "unum" numbers could be supported by hardware in the next years. Maybe sooner if Dart (as one of Google's languages) and other languages allow or support them in the specification.
He also said that the IEEE 754 standard allows that different devices compute numbers in a different way and give different results.
Thus even if the Dart specification guaranteed the IEEE 754 standard, it would not guarantee portability (e.g. for unit tests, cloud, IOT, desktop, phones, CPU, GPU) in practice.