dataToTag

Simon Peyton Jones

Nov 3, 2022, 6:00:51 AM
to Core Libraries Committee, Matthew Craven
Dear Core Libraries Committee

Matthew and I would like to consult you about the API exposed to users of `base` for the `dataToTag#` operator.

Currently we have a primop
dataToTag# :: forall (a :: Type). a -> Int#
and a strangely named alias in GHC.Base
getTag :: forall (a :: Type). a -> Int#
getTag x = dataToTag# x
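
For concreteness, here is a minimal sketch of how the operator is used today on an ordinary lifted data type. The `Colour` type and the `colourTag` helper are hypothetical names invented for illustration; only `dataToTag#` and `I#` come from `GHC.Exts`.

```haskell
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int (I#), dataToTag#)

-- Hypothetical example type; constructors are tagged 0, 1, 2
-- in declaration order.
data Colour = Red | Green | Blue

-- Hypothetical helper: box the raw Int# tag into an ordinary Int.
-- The bang forces the argument to WHNF before the tag is read.
colourTag :: Colour -> Int
colourTag !c = I# (dataToTag# c)
```

In GHCi, `map colourTag [Red, Green, Blue]` should give `[0,1,2]`.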

Of course this is all wrong:
  • dataToTag# is way too polymorphic: it can't possibly work for every type
  • It is levity-monomorphic, so you can't use it on unlifted data types
  • Plus, its implementation is a mess -- there is an ad hoc check that all uses of dataToTag# are at data types, but it's a fragile check.
So Matthew is fixing that by introducing

type DataToTag :: forall {lev :: Levity}. TYPE (BoxedRep lev) -> Constraint
class DataToTag a where
     dataToTag# :: a -> Int#

That fixes all three things at one blow:
  • dataToTag# now has a type-class-constrained type, so it's not over-polymorphic
  • Both the class and its operation are levity-polymorphic.
  • The implementation is very nice; no hacks any more.
So far so good: it's a change, but a backward-compatible change.

But we'd also like to kill off the strangely named `getTag` while we are about it (with a deprecation cycle).  And we propose to define
dataToTag :: forall (a :: TYPE LiftedRep). DataToTag a => a -> Int
dataToTag' :: forall (a :: TYPE UnliftedRep). DataToTag a => a -> Int
as wrappers for `dataToTag#` that return a civilised boxed Int.   It would be nice to make these levity-polymorphic too, but you can't write
dataToTag x = I# (dataToTag# x)
because there is a levity-polymorphic binder `x`.
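
A lifted-only wrapper, by contrast, is straightforward, because its binder has a fixed representation. The sketch below uses the hypothetical name `dataToTagLifted` and deliberately leaves the type signature to inference, so that it does not assume the `DataToTag` class is present in the compiler at hand; the signature proposed above is repeated in a comment.

```haskell
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int (I#), dataToTag#)

-- Proposed signature (from the message above), kept as a comment on purpose:
--   dataToTag :: forall (a :: TYPE LiftedRep). DataToTag a => a -> Int
-- With no signature, inference gives the binder `x` a lifted type, which is
-- why this one definition cannot also serve unlifted data types.
dataToTagLifted !x = I# (dataToTag# x)

-- Hypothetical usage at two different lifted types.
maybeTags :: [Int]
maybeTags = [dataToTagLifted (Nothing :: Maybe Char), dataToTagLifted (Just 'x')]

boolTags :: [Int]
boolTags = [dataToTagLifted False, dataToTagLifted True]
```

Both lists should evaluate to `[0,1]`, since `Nothing` and `False` are the first constructors of their types.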

An alternative would be to put dataToTag into the class like this

class DataToTag a where
     dataToTag# :: a -> Int#
     dataToTag :: a -> Int

and now dataToTag can be levity-polymorphic.  But the implementation is significantly more fiddly, because we have to build that dictionary on the fly.

The naming of the unlifted version is up for grabs.  I suggested dataToTag' by analogy with foldl'.

Questions:
  • Is this in scope for CLC?  My answer is firmly yes: the API of the base library is one of the CLC's primary responsibilities.
  • Should we open a ticket somewhere with the content of this email?
  • Any thoughts about the design?
Thanks!

Simon


chessai

Nov 3, 2022, 9:47:00 AM
to Simon Peyton Jones, Core Libraries Committee, Matthew Craven
Hi Simon,

This is a good idea. I've wanted such an improvement before.

That being said, can you make a proposal at https://github.com/haskell/core-libraries-committee ?

Note, while dataToTag# is in base, typically we (CLC) don't consider changes to the GHC namespace and instead leave that to GHC HQ. 


Simon Peyton Jones

Nov 3, 2022, 12:15:43 PM
to chessai, Core Libraries Committee, Matthew Craven
Thanks!
 
That being said, can you make a proposal at https://github.com/haskell/core-libraries-committee ?

I opened this issue. I hope that's the right thing to do?

Note, while dataToTag# is in base, typically we (CLC) don't consider changes to the GHC namespace and instead leave that to GHC HQ. 

That's interesting!  I'd never seen that in writing before.  It relates to another thread (a GHC proposal that is going nowhere I think) which asks the question:

What, precisely, is the API of base?

That is, which types and functions are (A)
  • visible to clients
  • protected by the PVP
  • kept as stable as possible
  • part of the CLC's purview
and which types and functions are (B)
  • properly considered as artefacts of the implementation of base
  • not protected by the PVP
  • may change at any time without notice or deprecation
  • are not part of the CLC's purview
As an implementor I would find it super-helpful to have an answer to this question, because then I know what I can change at will, and what I need to consult the CLC about.  Users may find it super-helpful too because they know that they cannot rely on the stability of (B).  So if they find themselves depending on stuff in (B) they may want to petition the CLC to move that function or type from (B) to (A).

As an example, the type Int is clearly in (A).  But is its representation in (A)?  That is, can I (at whim) change
data Int = I# Int#
to
data Int = MkInt Int#
on the grounds that the representation of Int is an affair purely internal to GHC? I bet if I did that there would be much yelling.
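
To make the concern concrete, here is the kind of client code that would break. This is a hypothetical function, not taken from any particular package, that pattern-matches on the `I#` constructor exported by `GHC.Exts`.

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int (I#), (+#))

-- Hypothetical client code that depends on the representation of Int.
-- Renaming the constructor from I# to MkInt would stop this compiling.
increment :: Int -> Int
increment (I# n) = I# (n +# 1#)
```

Any package containing a definition like `increment` relies on the constructor name as well as on the type, which is why the representation arguably behaves as if it were in (A).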

Your implied answer is:
  • Anything exported by a module not starting GHC.* is in category (A)
  • Anything exported only by modules GHC.* is in category (B)
That may be a fine answer but, if that is the CLC's view then
  1. We should make that Crystal Clear to everyone.
  2. The CLC might want to audit the libraries to ensure that all important stuff is in (A)  -- see examples above
I think it would be great to make progress on this topic.

Simon

