Update on the generic api


Maik Schünemann

Apr 4, 2014, 10:54:44 AM
to numerica...@googlegroups.com
Hi,
After the last discussions on the generic api, I think I have managed to get it into a fairly usable state.
Below is a summary of the design, how it addresses the challenges for a generic api, and how to use it.

## Decomplecting the implementation and the scalar functions
This is the main idea of the proposed generic api. The design ideas came from a long discussion we had
on the mailing list during the GSoC period last year. Here is the problem again, in short:
- Implementing a new scalar type as a new implementation of the clojure.core.matrix api complects the container type
  with the scalar functions.
  Matrix implementations that support Objects as elements should be usable with ANY scalar type.
  For example, it should be possible to do complex arithmetic with ndarray, or to build up a computation graph with it.
- Implementing a new copy of the numeric clojure api for each new scalar type would solve that, as ndarray could just implement
  the protocols for the new scalar type. However, it introduces additional problems:
  1) It is a lot of work for every scalar type, so adding a new one is not easy.
  2) When different scalar types are used in one program, one has to include each specific api and be explicit about where to use
      the functions for which scalar type. In the first approach (implementing the numeric protocols for
      different scalars) that selection would be automatic, which is also the behaviour one expects from other matrix libraries: if only double matrices are involved,
      do double arithmetic without overhead for complex numbers, but do complex arithmetic with complex matrices. That way the actual choice
      of the scalar type can be abstracted away.

It should also be clear that there is no black-and-white answer to how the generic functionality should be provided. For some applications it is useful to let
the library decide which scalar operations are performed and when. For other applications it is sensible to use an api for one specific scalar type only, or it
is really important to be sure that the right scalar operations are performed (for example, so that no part of the computation is lost when one creates a graph of it).

The proposed generic api solves these problems in the following way:
The generic functions in the namespace clojure.core.matrix.generic are given the scalar operations as their first argument via a specialisation map.
A specialisation map is a map, or better a record, with keys/fields :add, :mul, :sub, :div, :one, :zero, :<, :>, etc.
Like the numeric functions in clojure.core.matrix, the generic functions are based upon protocols. These protocols are generic versions of the numeric protocols in clojure.core.matrix.protocols,
but all method names start with generic- and all methods get the provided specialisation map as an extra last argument. The default implementations for the generic protocols are defined
in clojure.core.matrix/impl/generic-default and are basically copies of the normal default implementation, with two changes: 1) there is a let at the beginning of each method which extracts the
scalar operations from the specialisation map (let [scalar? (:scalar? spec) add (:add spec) ...] ...), and 2) the implementations for Number are gone: their code is merged into the
protocol implementation for Object with a leading (if (scalar? m) generic-code-for-Number-implementation generic-code-for-Object-implementation).
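
To make the shape of this concrete, here is a minimal sketch in plain Clojure. It does not use core.matrix at all: nested vectors stand in for a matrix implementation, and the names real, generic-add and add are illustrative assumptions, not the code from the pull request. It only shows the let that pulls the scalar operations out of the specialisation map and the scalar? branch that replaces the separate Number implementation.

```clojure
;; Hypothetical sketch: plain nested vectors instead of a core.matrix
;; implementation; all names here are illustrative assumptions.

;; A specialisation map for ordinary numbers (a record would also work).
(def real
  {:scalar? number? :add + :mul * :zero 0 :one 1})

(defn generic-add
  "Element-wise addition. The specialisation map is the LAST argument,
   mirroring the generic- protocol methods."
  [a b spec]
  (let [scalar? (:scalar? spec)   ; extract the scalar operations once
        add     (:add spec)]
    (cond
      (and (scalar? a) (scalar? b)) (add a b)                       ; former Number case
      (scalar? a) (mapv #(generic-add a % spec) b)                  ; broadcast left scalar
      (scalar? b) (mapv #(generic-add % b spec) a)                  ; broadcast right scalar
      :else       (mapv #(generic-add %1 %2 spec) a b))))           ; former Object case

(defn add
  "User-facing function: the specialisation map is the FIRST argument."
  [spec a b]
  (generic-add a b spec))

(add real [[1 2] [3 4]] [[10 20] [30 40]])  ; element-wise
(add real [[1 2] [3 4]] 5)                  ; scalar broadcast
```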

Because default implementations are provided for all the numeric methods, the generic api is usable with any core.matrix implementation. In addition, implementations can implement the generic protocols
themselves to optimize performance.
Defining a new scalar type is just a matter of providing a new specialisation map. It can't be simpler than that.
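
As an illustration of "a new scalar type is just a new map", the following sketch defines one generic function and runs it with two specialisations. The function dot and the maps real and symbolic are hypothetical names chosen for this example; the symbolic map simply builds unevaluated expressions as lists and is not meant to represent expresso's actual api.

```clojure
;; Hypothetical sketch: dot, real and symbolic are illustrative names.

(def real
  {:add + :mul * :zero 0})

;; A "scalar type" whose operations build expression lists instead of
;; computing numbers.
(def symbolic
  {:add  (fn [a b] (list '+ a b))
   :mul  (fn [a b] (list '* a b))
   :zero 0})

(defn dot
  "Inner product of two flat vectors, using only the operations found
   in the specialisation map."
  [spec u v]
  (let [{:keys [add mul zero]} spec]
    (reduce add zero (map mul u v))))

(dot real [1 2 3] [4 5 6])      ; ordinary arithmetic
(dot symbolic '[x y] '[a b])    ; builds an expression tree
```

The same dot works for both scalar types without any change to the container code, which is the decoupling described above.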

Thereby, the generic api clearly decouples the implementation from the scalar operations. When using it, however, one has to pass the specialisation to every operation, which can become cumbersome quite
quickly, e.g. (add complex (mul real a b) (div real b c)). Also, both scenarios of how one might want to use a generic api should be supported (i.e. using the clojure.core.matrix functions directly, or using the functions from a separate namespace without having to specify the specialisation).
To accomplish this, I included:
1) a generate-api macro that takes a specialisation as argument and expands to definitions of the numeric functions, like those in clojure.core.matrix, which just call the generic functions with the given specialisation.
2) a wrap-generic function, which makes it possible to use the generic api with the functions from clojure.core.matrix. (wrap-generic container spec) creates an instance of a GenericWrapper deftype
    which implements all mandatory core.matrix protocols, the coercion protocols and all numeric protocols. For a method that has nothing to do with the scalar type (like mget, mset, etc.) it delegates to the implementation of the container; for a numeric method (like add, mul, div, etc.) it delegates to the generic-protocol implementation of the container. That way, in
   (mul (add (wrap-generic (matrix :implementation data) spec) [[1 2][3 4]]) 5) both add and mul use the generic api with the provided specialisation automatically.
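
Both conveniences can be sketched in a few lines of plain Clojure. Everything below is an assumption made for illustration: generic-op, generate-api-sketch, PNumericSketch, p-add, p-mul and unwrap are made-up names, nested vectors stand in for a matrix implementation, and the real GenericWrapper implements far more protocols than the two methods shown here.

```clojure
;; Hypothetical sketch; none of these names are taken from the pull request.

(def real {:scalar? number? :add + :mul *})

(def symbolic
  {:scalar? (fn [x] (or (number? x) (symbol? x) (seq? x)))
   :add     (fn [a b] (list '+ a b))
   :mul     (fn [a b] (list '* a b))})

(defn generic-op
  "Applies the scalar operation stored under op-key element-wise,
   broadcasting scalars over nested vectors."
  [op-key a b spec]
  (let [scalar? (:scalar? spec)
        op      (get spec op-key)]
    (letfn [(go [x y]
              (cond
                (and (scalar? x) (scalar? y)) (op x y)
                (scalar? x) (mapv #(go x %) y)
                (scalar? y) (mapv #(go % y) x)
                :else       (mapv go x y)))]
      (go a b))))

;; 1) Generate a spec-bound api: add and mul that take no spec argument.
(defmacro generate-api-sketch [spec]
  `(do
     (defn ~'add [a# b#] (generic-op :add a# b# ~spec))
     (defn ~'mul [a# b#] (generic-op :mul a# b# ~spec))))

(generate-api-sketch real)

(mul (add [[1 2] [3 4]] [[1 1] [1 1]]) 5)   ; real arithmetic throughout

;; 2) Wrap a container so that protocol-based functions pick up the spec.
;;    p-add / p-mul stand in for the protocol-backed core.matrix functions.
(defprotocol PNumericSketch
  (p-add [m other])
  (p-mul [m other]))

(deftype GenericWrapper [container spec]
  PNumericSketch
  ;; Numeric methods delegate to the generic operation with the stored spec
  ;; and re-wrap the result so that chained calls stay generic.
  (p-add [_ other] (GenericWrapper. (generic-op :add container other spec) spec))
  (p-mul [_ other] (GenericWrapper. (generic-op :mul container other spec) spec)))

(defn wrap-generic [container spec] (GenericWrapper. container spec))
(defn unwrap [^GenericWrapper w] (.-container w))

(unwrap (p-mul (p-add (wrap-generic '[[a b]] symbolic) [[1 2]]) 5))
```

The design point in the second half is that the specialisation travels with the wrapped value, so the calling code never mentions it again after wrap-generic.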

The generic api is tested generatively in clojure.core.matrix.test-generic: test.check runs it with a specialisation for real numbers and compares the results against the numeric clojure.core.matrix api. Using test.check here is quite powerful
and ensures that the numerical and the generic api don't get out of sync.
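
The idea behind that test can be sketched without any dependencies. Note that this is a stand-in: it uses plain random sampling with rand-int instead of test.check generators and properties, and generic-add, numeric-add, rand-matrix and agrees? are hypothetical names. Integer entries are used so that equality is exact.

```clojure
;; Hypothetical sketch: plain random sampling, NOT test.check.

(def real {:scalar? number? :add +})

(defn generic-add
  "Generic element-wise addition over nested vectors."
  [a b spec]
  (let [scalar? (:scalar? spec)
        add     (:add spec)]
    (if (scalar? a)
      (add a b)
      (mapv #(generic-add %1 %2 spec) a b))))

(defn numeric-add
  "Reference implementation hard-wired to + for 2D nested vectors."
  [a b]
  (mapv (fn [row-a row-b] (mapv + row-a row-b)) a b))

(defn rand-matrix
  "Random rows x cols matrix with integer entries in [-100, 100]."
  [rows cols]
  (vec (repeatedly rows
                   (fn [] (vec (repeatedly cols #(- (rand-int 201) 100)))))))

(defn agrees?
  "True if the generic api with the real spec matches the numeric api
   on the given number of random same-shaped matrix pairs."
  [trials]
  (every? true?
          (repeatedly trials
                      (fn []
                        (let [r (inc (rand-int 4))
                              c (inc (rand-int 4))
                              a (rand-matrix r c)
                              b (rand-matrix r c)]
                          (= (numeric-add a b) (generic-add a b real)))))))

(agrees? 100)
```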

I think the generic api is quite usable right now and that having the actual working code will speed up the process of getting all the details right (like it did for the selector functions ;)).

One thing to remember is that this flexibility comes at the cost of extra protocol calls and one record lookup per method per scalar operation. That should not matter much, especially for symbolic matrices or a computation graph as scalar type, but it could certainly be an overhead for a complex number implementation. In such a case it would be desirable to enjoy the flexibility of the generic api without paying the price for the extra protocol method invocations and record lookups. "Compiling in" a specialisation map would be a good way to provide better performance for such a scalar type.
However, all of this can also be achieved manually right now, and I think the recent experiments with code generation (thanks also to ranko for this) demonstrated that "compiling in" a specialisation map is certainly feasible.
But I think we should add support for it when the performance is needed, that is, when we get a complex number implementation for which the current generic api is too much overhead. Until then we should get going with the generic api without code generation, so that we can work with arbitrary scalar types in core.matrix and get the api right before we improve performance with code generation.
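
For readers wondering what "compiling in" could look like, here is one possible sketch, not the approach from the code generation experiments mentioned above. The macro def-compiled-add is a hypothetical name, and it assumes the specialisation is written as a literal map of symbols so that the macro can read it at expansion time. The scalar operation is spliced directly into the function body, so no map lookup happens at run time.

```clojure
;; Hypothetical sketch of "compiling in" a specialisation map.

(defmacro def-compiled-add
  "Defines fname as element-wise addition over flat vectors, with the
   :add operation of the literal spec map spliced into the body at
   macro-expansion time."
  [fname spec]
  (let [add (:add spec)]            ; read from the literal map at compile time
    `(defn ~fname [a# b#]
       (mapv (fn [x# y#] (~add x# y#)) a# b#))))

;; Expands to roughly:
;; (defn add-real [a b] (mapv (fn [x y] (+ x y)) a b))
(def-compiled-add add-real {:add +})

(add-real [1 2 3] [10 20 30])
```

The trade-off is that the specialisation has to be known when the code is compiled, whereas the map-passing generic api can pick it at run time.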

You can see what is added for the current generic implementation here: https://github.com/mikera/core.matrix/pull/132

Once this gets into core.matrix we will at least get support for symbolic matrices with expresso expressions as scalar types ;)
