[Haskell-cafe] Standard package file format

Harendra Kumar

Sep 16, 2016, 2:20:22 AM
to Haskell-community, haskell-cafe, Patrick Pelletier, Duncan Coutts
I am starting a new thread for the package file format related discussion.

From a developer's perspective, the major benefit of a standard, widely adopted format is that people can reuse knowledge acquired elsewhere; they do not have to work through the differently presented and incomplete documentation of each tool. The benefit of a common config specification is that developers can choose tools freely without worrying about learning the same concepts presented in different ways.

Multiple formats flying around also create a psychological impression of complexity in the ecosystem for newcomers. If we have consistency there are better chances of attracting more people to the language ecosystem.

I gather the following from the discussion till now:

* We have cabal, YAML and TOML as potential candidates for a common package format which can additionally incorporate the concept of snapshots/package collections and potentially more extensions useful across build tools.

* cabal has the benefit of incumbency and backward compatibility. It has shortcomings, which are being addressed, but it is still a format that is very specific to the Haskell ecosystem. It is not a standard and is not going to become one. We will always have to maintain it ourselves, and everyone coming to Haskell will have to learn it.

* YAML (http://yaml.org/spec/1.2/spec.html) is a popular standard. A significant chunk of the developer community is already familiar with it. It is used by stack, and by hpack as an alternative to the cabal format. The complaint against it is that the specification/implementation is overly complex.

* TOML (https://github.com/toml-lang/toml) is promising and simpler than YAML, and is being used by a few important projects, but it is still evolving and is not completely stable. At first glance it looks pretty simple, and a lot of other tools use a similar config format. It is aiming to become a standard and aiming for wider adoption.

As a next step we can perhaps do an hpack-like experiment using the TOML format. That way we will gain some experience with it as well and find out whether there are any problems expressing the existing cabal files.

More thoughts, opinions on the topic will help create a better understanding about it.

-harendra

Tobias Dammers

Sep 16, 2016, 2:37:40 AM
to Harendra Kumar, Patrick Pelletier, Duncan Coutts, Haskell-community, haskell-cafe

Another factor in favor of YAML is that it is a superset of JSON, which eases the learning curve even more (with JSON being a de facto lingua franca for cross-platform untyped data structures), and offers some extra possibilities, although I admit that I can't think of any practical uses. The fact that both YAML and JSON can be represented as Aeson Values would also make things (arguably) easier for tool writers.


_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

Imants Cekusins

Sep 16, 2016, 3:05:45 AM
to Haskell-community, haskell-cafe
Why not adopt (a subset of) .hs AST file format to structure both project and package files?

This would simplify parsing config files as well as syncing code and config files in IDEs.


To draw an analogy, JSON derives from JavaScript. Isn't this a precedent?

Joachim Durchholz

Sep 16, 2016, 3:16:06 AM
to haskel...@haskell.org
On 16.09.2016 at 08:20, Harendra Kumar wrote:
> * TOML (https://github.com/toml-lang/toml) is promising, simpler than YAML
> and is being used by a few important projects but is still evolving and is
> not completely stable. On a first glance it looks pretty simple and a lot
> of other tools use a similar config format. It is aiming to become a
> standard and aiming for a wider adoption.

TOML is limited in its data types: numbers, dates, strings for
primitives, arrays and string-to-object maps.
I'd consider that too limited to ever become a universal configuration
format.

Harendra Kumar

Sep 16, 2016, 3:18:36 AM
to Imants Cekusins, Haskell-community, haskell-cafe
On 16 September 2016 at 12:35, Imants Cekusins <ima...@gmail.com> wrote:
> Why not adopt (a subset of) .hs AST file format to structure both project and package files?

Aha, that's my preferred choice. If there is a way to restrict features and allow just a subset, we can have a nice configuration language which is a real language. In fact, I have been toying around with this. If we have to express not just a package specification but a sophisticated build configuration, we need a real language. Expressing conditionals, reuse, etc. becomes a compromise in a purely declarative language.

For example, make has so many built-in functions that it has become a full-fledged language by itself. Google's Bazel uses a Python-like language (Skylark) as its build config language. Haskell would make a much better choice for such use cases. Pure declarative is a pain for such use cases.

-harendra

Alan & Kim Zimmerman

Sep 16, 2016, 3:23:12 AM
to Harendra Kumar, Haskell-community, haskell-cafe
The more power you put into the package file description, the harder it is for the surrounding ecosystem to reason about it.

So if you can execute arbitrary code in a new-gen cabal file, apart from the security aspects, it becomes difficult to be sure what is actually being specified, if you do not reproduce the original environment when evaluating the file.

Alan

MigMit

Sep 16, 2016, 3:31:05 AM
to Alan & Kim Zimmerman, haskell-cafe, Haskell-community
Sbt seems to be doing rather well, using full Scala in configurations.

I think package descriptions should be limited, but not syntactically. Using some specific monad might work OK.
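
To make the "specific monad" idea concrete, here is a minimal sketch using only base. Everything in it (Field, Pkg, name, version, depends, runPkg) is a hypothetical name invented for illustration, not anything from Cabal. The monad only accumulates fields and offers no operation for lifting IO; in a real module the Pkg constructor would be hidden so that the smart constructors are the whole vocabulary.

```haskell
-- Hypothetical sketch: a package description written in a restricted monad.
-- None of these names come from Cabal; they are made up for illustration.

-- The only facts a description may state.
data Field
  = Name String
  | Version [Int]
  | Depends String
  deriving (Eq, Show)

-- A tiny writer-style monad: it accumulates fields and nothing else.
-- It provides no operation for lifting IO into a description.
newtype Pkg a = Pkg ([Field], a)

instance Functor Pkg where
  fmap f (Pkg (fs, a)) = Pkg (fs, f a)

instance Applicative Pkg where
  pure a = Pkg ([], a)
  Pkg (fs, f) <*> Pkg (gs, a) = Pkg (fs ++ gs, f a)

instance Monad Pkg where
  Pkg (fs, a) >>= k = let Pkg (gs, b) = k a in Pkg (fs ++ gs, b)

-- Smart constructors: the vocabulary available to a config author.
name :: String -> Pkg ()
name n = Pkg ([Name n], ())

version :: [Int] -> Pkg ()
version v = Pkg ([Version v], ())

depends :: String -> Pkg ()
depends d = Pkg ([Depends d], ())

-- Tools interpret a description by collecting its fields, in order.
runPkg :: Pkg a -> [Field]
runPkg (Pkg (fs, _)) = fs

-- Ordinary Haskell (here mapM_) is still available for reuse.
example :: Pkg ()
example = do
  name "demo"
  version [0, 1]
  mapM_ depends ["base", "text"]
```

Note that this limits the description by type rather than by syntax: pure Haskell code inside the do block can still loop, so it does not by itself answer the concern about reasoning over arbitrary code.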

Imants Cekusins

Sep 16, 2016, 4:23:13 AM
to MigMit, Alan & Kim Zimmerman, Haskell-community, haskell-cafe
> So if you can execute arbitrary code in a new-gen cabal file, apart from the security aspects, ...
Well, config files could use a different (not .hs) extension. They could use their own Prelude and not allow importing other modules.

The main benefit is to reuse existing parsers and simplify code-config sync.

Chris Smith

Sep 16, 2016, 4:24:49 AM
to Haskell-community, haskell-cafe
I guess the overriding question I have here is: what is the PROBLEM being solved?  I know of basically no beginners who were confused or intimidated by the syntax of Cabal's file format.  It's fairly commonplace for beginners to be confused by the *semantics*: which fields are needed and what they mean, how package version bounds work, what flags are and how they interact with dependencies, the relationship between libraries and executables defined in the same file, etc.  But the syntax?  It's just not an issue.  I'm not sure what it means to say that people have to "learn" it, because in introducing dozens of people to building things in Haskell, I've never seen that learning process even be noticeable, much less an impediment.

With this in mind, a lot of the statements about these various languages are not entirely convincing.  That it's a superset of JSON?  It's not clear why this matters.  A psychological impression of complexity?  Just not anything I've seen evidence of.  Indeed, aside from the rather painful many-years-long migration, the *cost* (though certainly not a prohibitive one) of moving to something like YAML or TOML is that they have a bit louder syntax, that demands more attention and feels more complex.

There is one substantial disadvantage I'd point out to the Cabal file format as it stands, and that's that it's pretty non-obvious how to parse it, so we will always struggle to interact with it from automated tools, unless those tools are also written in Haskell and can use the Cabal library.  That's a real concern; pragmatic large-scale build environments are not tied to specific languages, and include a variety of ad-hoc third-party tooling that needs to be integrated, and Cabal remains opaque to them.  But that doesn't seem to be what's motivating this conversation.

Joachim Durchholz

Sep 16, 2016, 4:36:57 AM
to haskel...@haskell.org
On 16.09.2016 at 09:22, Alan & Kim Zimmerman wrote:
> The more power you put into the package file description, the harder it is
> for the surrounding ecosystem to reason about it.
>
> So if you can execute arbitrary code in a new-gen cabal file, apart from
> the security aspects, it becomes difficult to be sure what is actually
> being specified, if you do not reproduce the original environment when
> evaluating the file.

A little-hyped aspect of Gradle is that it has two strictly divided
phases: Phase 1 builds the dependency model, phase 2 executes it.
Once phase 1 finishes, the dependency model becomes read-only, phase 2
is not allowed to modify it.

On the plus side, this makes it easy for tools to reason about the
model: it's static and easy to reproduce (just run phase 1 on the config
file, or even better, ask the Gradle daemon that's caching the model).

On the minus side, it's hard to make out which code in the config is
phase-1 and which is phase-2: Same syntax, no static types to guide the
intuition; essentially, you have to know which parameters of what
phase-1 library functions are closures to be executed in phase 2.
Haskell might be able to do better in this area, though I'm in no
position to make any proposals for that.
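
As a rough illustration of how types could separate the two phases, here is a minimal sketch. It is not a proposal and not how Gradle works internally; Task, Model, configure, dependencyEdges and execute are all hypothetical names. Phase 1 is a pure function that returns the model, and phase-2 actions are stored in the model as IO values that are not run until the model is handed to execute.

```haskell
-- Hypothetical sketch: the two phases get different types.
-- All names here are invented for illustration.

-- One node of the dependency model.
data Task = Task
  { taskName :: String
  , taskDeps :: [String]
  , taskRun  :: IO ()      -- phase-2 action, stored but not yet run
  }

-- Phase 1 result: a plain, immutable value.
newtype Model = Model [Task]

-- Phase 1 is a pure function: its type rules out IO, and whatever it
-- returns is the whole model.
configure :: [String] -> Model
configure modules = Model (map compileTask modules ++ [linkTask])
  where
    compileTask m = Task ("compile:" ++ m) [] (putStrLn ("compiling " ++ m))
    linkTask      = Task "link" (map ("compile:" ++) modules) (putStrLn "linking")

-- Tools can inspect the model without executing anything.
dependencyEdges :: Model -> [(String, String)]
dependencyEdges (Model ts) = [ (taskName t, d) | t <- ts, d <- taskDeps t ]

-- Phase 2 only consumes the model and cannot feed anything back into it.
-- Tasks run in list order here; a real tool would order them by dependency.
execute :: Model -> IO ()
execute (Model ts) = mapM_ taskRun ts
```

With this split, the question of which code belongs to which phase is answered by the types: anything of type IO () is phase 2, everything else is phase 1.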

Imants Cekusins

Sep 16, 2016, 4:37:29 AM
to Chris Smith, Haskell-community, haskell-cafe
>  what is the PROBLEM being solved? 

By making config files follow .hs syntax, the cabal file structure may be defined as a data record. This would make it clear which fields are compulsory and which are optional.

Enums may be used.
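
A minimal sketch of what such a record might look like. The types below are hypothetical and much smaller than Cabal's real ones; they only illustrate compulsory fields (strict), optional fields (Maybe) and enums.

```haskell
-- Hypothetical sketch; these types are NOT Cabal's real ones.

-- Enumerations: only the listed values type-check.
data License = MIT | BSD3 | GPL3
  deriving (Eq, Show, Enum, Bounded)

data BuildType = Simple | Custom
  deriving (Eq, Show, Enum, Bounded)

-- Strict fields (marked with !) cannot be left out when a value is built
-- with record syntax; the compiler rejects the construction.
data Package = Package
  { pkgName     :: !String           -- compulsory
  , pkgVersion  :: ![Int]            -- compulsory
  , pkgLicense  :: !License          -- compulsory, restricted to the enum
  , pkgSynopsis :: Maybe String      -- optional
  , pkgBuild    :: Maybe BuildType   -- optional
  , pkgDepends  :: [String]          -- may be empty
  } deriving (Eq, Show)

-- A config file would then be a single value of this type.
demo :: Package
demo = Package
  { pkgName     = "demo"
  , pkgVersion  = [0, 1, 0]
  , pkgLicense  = BSD3
  , pkgSynopsis = Nothing
  , pkgBuild    = Just Simple
  , pkgDepends  = ["base"]
  }

-- The enum documents itself: list every legal license.
allLicenses :: [License]
allLicenses = [minBound .. maxBound]
```

One wrinkle: in plain Haskell the optional fields still have to be written out as Nothing, so a config syntax based on this would probably want some way to omit them.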

Harendra Kumar

Sep 16, 2016, 4:45:54 AM
to Chris Smith, Haskell-community, haskell-cafe
The discussion originated in an earlier thread, from a question about the possibility of using the same format across different tools (cabal and stack, which currently use different file formats) and, if they are to use the same format, what that format should be.

_______________________________________________
Haskell-community mailing list
Haskell-...@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-community


Joachim Durchholz

Sep 16, 2016, 4:46:17 AM
to haskel...@haskell.org
On 16.09.2016 at 10:24, Chris Smith wrote:
> With this in mind, a lot of the statements about these various languages
> are not entirely convincing. That it's a superset of JSON? It's not clear
> why this matters.

It does matter for people who already know JSON: They can skip over the
config file syntax and dive right into the semantics.
Given that a substantial fraction of programmers knows JSON, using that
syntax would create a lower entry barrier.

The same argument can be made for YAML.

This argument cannot be made for TOML at this time, maybe never if
TOML's limitations prevent widespread adoption.

> A psychological impression of complexity? Just not
> anything I've seen evidence of. Indeed, aside from the rather painful
> many-years-long migration, the *cost* (though certainly not a prohibitive
> one) of moving to something like YAML or TOML is that they have a bit
> louder syntax, that demands more attention and feels more complex.

YAML's complexity is partly because it tries to cover everything, partly
because it is pushing hard to be both human-readable and machine-readable.
It's pretty good at this actually, though I guess 20/20 hindsight could
lead to improvements - but not enough to make a new YAML version worth
the effort.

> There is one substantial disadvantage I'd point out to the Cabal file
> format as it stands, and that's that it's pretty non-obvious how to parse
> it, so we will always struggle to interact with it from automated tools,
> unless those tools are also written in Haskell and can use the Cabal
> library. That's a real concern; pragmatic large-scale build environments
> are not tied to specific languages, and include a variety of ad-hoc
> third-party tooling that needs to be integrated, and Cabal remains opaque
> to them. But that doesn't seem to be what's motivating this conversation.

That's implicit in the "it would be nice to have a standard format"
argument, even if it hasn't been explicitly voiced yet.

yogsototh

Sep 16, 2016, 4:57:29 AM
to Haskell-cafe, haskell-...@haskell.org, haskel...@haskell.org, cds...@gmail.com

> I guess the overriding question I have here is: what is the PROBLEM being solved?

Let me share my experience with Clojure and lein. They use a Clojure hash-map for their configuration. So yes, arbitrary code can be executed, and I believe this is a _very good thing_.

Why? Because it makes it very easy to add sub-configuration that can be used by third-party plugins. For example:

- a plugin that helps with the use of environment variables (lein-environ), which is really helpful for application development (not so much for library development)
- a plugin that uses S3 for our private dependencies (not supported by default by lein)


For deployment: we were able to add a request to our API server that returns not only the written version but also the git commit hash, so we could be certain of the version of the server. Too many times there were sysadmin deployment errors. That could only be achieved because we were able to run arbitrary commands in the project description file.

I am certainly forgetting many other advantages of having a package description format that is simply a data structure in the host language. But this is by far my preference.

- cabal is OK but very imperfect: I generally need a lot of copy/paste, and I need to change the file very often while writing an application with many dependencies
- JSON/YAML/TOML are simply not powerful enough to match all the semantics we might need to configure a project. For example, we might want a Set instead of a List for some properties. Or, I don't know, maybe ternary tree structures.

The point is: we pay a price by adding a step between the semantics and the syntax.
If our configuration format were Haskell, we could express the semantics more directly.

Imants Cekusins

Sep 16, 2016, 5:16:03 AM
to yogsototh, Haskell-community, haskell Cafe, Haskell-cafe
.. for interop with other packagers / builders, .hs compatible config content could be transformed / exported to other formats.

.hs -> YAML, JSON, ... is likely to be possible and easier than the other way around.

Paolo Giarrusso

Sep 16, 2016, 5:37:01 AM
to Haskell-cafe, haskell-...@haskell.org, Haskell-cafe
(Resending from right address)

We're talking about *three* options:
1. syntax for pure Haskell values, which I'll call HSON (Haskell
jSON). That's just an alternative to YAML/TOML/... That would need
extensions to allow omitting optional fields entirely.
2. a pure Haskell embedded domain-specific language (EDSL) that simply
generates cabal description records (GenericPackageDescription
values). That would allow abstraction over some patterns but not much
more. But that alone is already an argument for EDSLs—the one Harendra
already presented.
3. a Haskell embedded domain-specific language (EDSL) designed for an
extensible build tool, like Clojure's (apparently), SBT for Scala or
many others. That would potentially be a rabbit hole leading to a
rather *different* tool—with a different package format to boot. That
can't work as long as all libraries have to be built using the same
tool. But stack and cabal are really about how to manage package
databases/GHC/external environments, while extensible build tools are
about (a more powerful form of) writing custom setup scripts. I
suspect some extensions might be easier if more of the actual building
was done by the setup script, but I'm not sure.
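
To illustrate what "abstraction over some patterns" in option 2 could mean, here is a minimal sketch. The real target would be Cabal's GenericPackageDescription; the small Component and Description records and the withCommon helper below are hypothetical stand-ins, used only to keep the example self-contained.

```haskell
-- Hypothetical sketch of option 2: plain Haskell that builds a description.
-- Component and Description are stand-ins, not Cabal types.

data Component = Component
  { compName    :: String
  , compDepends :: [String]
  , compFlags   :: [String]
  } deriving (Eq, Show)

data Description = Description
  { descName       :: String
  , descComponents :: [Component]
  } deriving (Eq, Show)

-- Abstraction over a pattern: settings shared by every component are
-- written once instead of being copied into each stanza.
withCommon :: [String] -> [String] -> Component -> Component
withCommon deps flags c = c
  { compDepends = deps ++ compDepends c
  , compFlags   = flags ++ compFlags c
  }

library, executable, testSuite :: Component
library    = Component "lib"  ["containers"] []
executable = Component "exe"  []             ["-threaded"]
testSuite  = Component "test" ["hspec"]      []

-- The final value is ordinary data that a tool could export or inspect.
description :: Description
description = Description "demo"
  (map (withCommon ["base", "text"] ["-Wall"]) [library, executable, testSuite])
```

The result is still just a value, so a program along these lines could print it as a cabal file, which is the interoperability route mentioned for hpack.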



--
Paolo G. Giarrusso - Ph.D. Student, Tübingen University
http://ps.informatik.uni-tuebingen.de/team/giarrusso/

Imants Cekusins

Sep 16, 2016, 5:50:02 AM
to Paolo Giarrusso, Haskell-community, Haskell-cafe, Haskell-cafe
this may be one of the 3 points on Paolo's list. In case it is not, here is another option (4?):

  • define .hs data records for project config, package configs
  • write export tools to export config records to existing formats:
      ◦ cabal
      ◦ stack yaml
      ◦ ...

This way, there is no need to revise the current workflow or modify tools.

However we define a common standard content structure, most users do not need to worry about .cabal, .yaml syntax
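
A minimal sketch of the export idea, assuming a made-up Package record (not a Cabal type) and a hypothetical toCabal function that handles only a handful of fields. A real exporter would also have to emit cabal-version, build-type and the rest of the format.

```haskell
import Data.List (intercalate)

-- Hypothetical sketch; the record is a stand-in, not a Cabal type,
-- and only a few fields are handled.
data Package = Package
  { pkgName     :: String
  , pkgVersion  :: [Int]
  , pkgSynopsis :: Maybe String
  , pkgDepends  :: [String]
  }

-- Render the record in .cabal field syntax.
toCabal :: Package -> String
toCabal p = unlines $
  [ "name:    " ++ pkgName p
  , "version: " ++ intercalate "." (map show (pkgVersion p))
  ] ++
  -- An optional field is emitted only when it is present.
  [ "synopsis: " ++ s | Just s <- [pkgSynopsis p] ] ++
  [ ""
  , "library"
  , "  build-depends: " ++ intercalate ", " (pkgDepends p)
  ]
```

An exporter for stack's YAML would be a second function over the same record, which is what would let users write the record once.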

Paolo Giarrusso

Sep 16, 2016, 6:38:48 AM
to Patrick Pelletier, haskell-...@haskell.org, Haskell-cafe
On 16 September 2016 at 12:13, Patrick Pelletier
<ppel...@funwithsoftware.org> wrote:

> Options 2 and 3 both require running Haskell code at build time.

> But if all packages had to use the new EDSL, then cross-compilation would essentially become impossible.

"All packages migrate to new format" doesn't seem really a plausible
option, as I already hinted in the text you quote.
There are multiple JVM build tools because they're interoperable (like
cabal-install and Stack): each library picks its own build tool, but
they can still be linked together.
Hpack generates cabal files, stack reuses cabal or hpack files.

In principle, option 2 just needs a non-cross-compiled program to
produce a package description—say by producing a cabal file. You just
need to runghc it, either via ghci or by compiling and running a
binary. Option 3 can be trickier depending on details, but as long
as you account for cross-compilation in the design it should be
doable. For Template Haskell the problem is deeper (see
http://blog.ezyang.com/2016/07/what-template-haskell-gets-wrong-and-racket-gets-right/),
so let's *not* use it here.


--
Paolo G. Giarrusso - Ph.D. Student, Tübingen University
http://ps.informatik.uni-tuebingen.de/team/giarrusso/

Harendra Kumar

Sep 16, 2016, 7:05:52 AM
to Paolo Giarrusso, Haskell-community, Haskell-cafe, Patrick Pelletier
This seems to have gone in a different direction. The original point was about the package specification format, not about expressing a full-fledged build system. That is an entirely different ballgame. The main point of the thread was whether it makes sense to use a single specification format for both stack and cabal-install (YAML vs .cabal, and then TOML came into the picture). Haskell does not seem to be a choice for a package specification format unless we have a very different goal in mind.

-harendra


Paolo Giarrusso

Sep 16, 2016, 7:22:09 AM
to Harendra Kumar, Haskell-community, Haskell-cafe, Patrick Pelletier
On 16 September 2016 at 13:05, Harendra Kumar <harendr...@gmail.com> wrote:
> This seems to have gone into a different direction. The original point was
> about the package specification format and not expressing a full fledged
> build system. That is an entirely different ballgame. The main point of the
> thread was whether it makes sense to use a single specification format for
> both stack and cabal install (YAML vs .cabal and then TOML came into
> picture). Haskell does not seem to be a choice for a package specification
> format unless we have a very different goal in mind.

I agree "full-fledged build system" is not a possible immediate goal.
But an EDSL for expressing cabal projects (as they are today) would
still be in scope of your proposal—and I thought you liked the idea
(see quote below). Using the earlier options: option 3 is not in scope
of this thread, but option 2 is, with the only danger that the design
space is so big to present a challenge.

Quoting from Harendra Kumar's earlier mail:

>> Why not adopt (a subset of) .hs AST file format to structure both project and package files?

> Aha, that's my preferred choice. If there is a way to restrict features and we can allow just a subset we can have a nice configuration language which is a real language. In fact, I have been toying around this. If we have to express not just a package specification but a sophisticated build configuration, we need a real language. Expressing conditionals, reuse etc becomes a compromise in a purely declarative language.

> For example make has so many built-in functions in it that it has become a full fledged language by itself. The google bazel build uses python as the build config language. Haskell will make a much better choice for such use cases. Pure declarative is a pain for such use cases.

--
Paolo G. Giarrusso - Ph.D. Student, Tübingen University
http://ps.informatik.uni-tuebingen.de/team/giarrusso/

Harendra Kumar

Sep 16, 2016, 7:47:16 AM
to Paolo Giarrusso, Haskell-community, Haskell-cafe, Patrick Pelletier
On 16 September 2016 at 16:51, Paolo Giarrusso <p.gia...@gmail.com> wrote:

> I agree "full-fledged build system" is not a possible immediate goal.
> But an EDSL for expressing cabal projects (as they are today) would
> still be in scope of your proposal—and I thought you liked the idea
> (see quote below). Using the earlier options: option 3 is not in scope
> of this thread, but option 2 is, with the only danger that the design
> space is so big to present a challenge.

Yeah, I like the idea of using Haskell for configs, but perhaps in a different problem space, e.g. in a build spec. See the quote from my earlier mail below; sorry for the confusion :-) Yes, option 2 might work for package specifications, but it sounds pretty hairy to explore for this use case alone, unless we have other motivations.


Quoting from Harendra Kumar's earlier mail:

> If we have to express not just a package specification but a sophisticated build configuration, we need a real language. Expressing conditionals, reuse etc becomes a compromise in a purely declarative language.

-harendra 

David McBride

Sep 16, 2016, 8:39:23 AM
to yogsototh, Haskell Cafe, Haskell-cafe
While I would personally love having a package description in haskell, I don't think it is a good idea.

If you can't start or modify a package without already knowing haskell, it is a huge barrier to entry.  I remember trying to get started in scala and having a lot of trouble with sbt because I didn't know their operators for lists and arrays or hash tables or whatever it is that they use in their files.

Chris Kahn

Sep 16, 2016, 9:25:46 AM
to David McBride, yogsototh, Haskell Cafe, Haskell-cafe
I would like to second this thought. Using Haskell for package descriptions needs to be thought out and executed with great care and attention. It's really easy to go off the rails.

Scala's build system lets you do very powerful things, but it also makes things unnecessarily complicated and mystifying for beginners. At my previous work where we used Scala extensively, there were many times where the team simply resorted to external tools because figuring out how to make some seemingly trivial change to an SBT module was too time consuming.

Kosyrev Serge

Sep 16, 2016, 9:37:36 AM
to David McBride, Haskell Cafe
David McBride writes:
> While I would personally love having a package description in haskell,
> I don't think it is a good idea.

I think we can all agree that using a full-fledged language for
configuration is an extremely bad idea from many perspectives.

The worst of all, IMO, is that it makes reasoning about the
configuration equivalent to the halting problem.

And god, does it hurt in practice! -- speaking as someone who had spent
a non-trivial amount of time on doing exactly this stuff in another age
and for another language.

However.

This does not mean that we cannot find a subset of the language that
would be a point of balance between the needs of expressivity,
learnability and decidability.

After all JSON was born in roughly this spirit, wasn't it?

The wins are obvious to me:

- the syntax is immediately obvious to the target audience

- minimum effort to get existent Haskell tools to work with the "new"
format at the source level -- syntax highlighting, checking, etc.
The only required additions would be restriction enforcement

- no third-party libraries need to be used as dependencies for our
core tooling

> If you can't start or modify a package without already knowing
> haskell, it is a huge barrier to entry.

I'm unconvinced that this problem cannot be resolved within the subsetting approach.

> I remember trying to get
> started in scala and having a lot of trouble with sbt because I didn't
> know their operators for lists and arrays or hash tables or whatever
> it is that they use in their files.

That is because they committed the sin of employing the whole of
Scala for the thing. Bad for them.

But also.. let's not commit the mistake of conflating the surface syntax
and the semantics.

The semantics are dictated by need -- whose sharpening effect on the
learning curve is unavoidable. I'm willing to argue that a large part
of your confusion came from the /semantics/ of sbt, not the syntax.

The syntax differences, OTOH, can and ought to be trivialized.

--
с уважениeм / respectfully,
Косырев Сергей

Imants Cekusins

Sep 16, 2016, 9:38:08 AM
to Chris Kahn, Haskell Cafe, Haskell-cafe
my experience with Gradle was frustrating too. Even with some experience with Java.

a lot of what is often called "complexity" stems from lack of types and gaps in comments / documentation.

language needs to be learnt anyway. If config syntax needs to be learnt in addition to language syntax, this is always extra work.


for 1st timers to be able to write hello world, sample config files and a few tutorials would go a long way.

Kosyrev Serge

Sep 16, 2016, 9:42:48 AM
to Chris Kahn, Haskell Cafe, Haskell-cafe
Chris Kahn writes:
> I would like to second this thought. Using Haskell for package
> descriptions needs to be thought out and executed with great care and
> attention. It's really easy to go off the rails.
>
> Scala's build system lets you do very powerful things, but it also
> makes things unnecessarily complicated and mystifying for beginners.
> At my previous work where we used Scala extensively, there were many
> times where the team simply resorted to external tools because
> figuring out how to make some seemingly trivial change to an SBT
> module was too time consuming.

Let me guess (have no idea about sbt) -- unbridled Turing completeness?

Declarativity is king for configuration, and Turing completeness ain't it --
please, see my other mail about subsetting Haskell.


--
с уважениeм / respectfully,
Косырев Сергей

Andrew Butterfield

Sep 16, 2016, 9:47:48 AM
to Haskell-community, Haskell Cafe

> On 16 Sep 2016, at 09:24, Chris Smith <cds...@gmail.com> wrote:
>
> I guess the overriding question I have here is: what is the PROBLEM being solved? I know of basically no beginners who were confused or intimidated by the syntax of Cabal's file format.

As a "beginner"(*), I fully agree.
However having more than one language in the mix can be confusing and complicating...

> It's fairly commonplace for beginners to be confused by the *semantics*: which fields are needed and what they mean, how package version bounds work, what flags are and how they interact with dependencies, the relationship between libraries and executables defined in the same file, etc.

It's all about the semantics - it should preferably be formalised, and ideally the relevant library/package system should be able to check/enforce rules.

> But the syntax? It's just not an issue. I'm not sure what it means to say that people have to "learn" it, because in introducing dozens of people to building things in Haskell, I've never seen that learning process even be noticeable, much less an impediment.

I quite agree
>

Andrew Butterfield
School of Computer Science & Statistics
Trinity College
Dublin 2, Ireland

(*) I've only started to use cabal recently, because a TA of mine built a cabal-based coursework grading system for me - I generally do application development in Haskell
and the only build command I need is ghc --make.... Currently moving quickly onto stack this year....

Joachim Durchholz

Sep 16, 2016, 9:51:56 AM
to haskel...@haskell.org
On 16.09.2016 at 15:37, Kosyrev Serge wrote:
> The worst of all, IMO, is that it makes reasoning about the
> configuration equivalent to the halting problem.

That's a solved problem: Generate an execution plan, which would need to
be fully evaluated in Haskell; then execute it and don't feed anything
back into it.
It's easy to reason about the plan in that scenario.

This is what Gradle does.
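
A minimal sketch of the plan-then-execute idea, written in Haskell for illustration. The `Step` type, `plan`, `describe` and `execute` are hypothetical names, not part of Cabal, stack or Gradle; the point is only that the configuration evaluates to plain data which can be inspected before anything runs.

```haskell
-- Hypothetical build steps; the names are illustrative, not from any real tool.
data Step
  = Compile FilePath
  | Link [FilePath] FilePath
  deriving (Eq, Show)

-- The configuration is an ordinary pure function: evaluating it
-- yields the complete plan as a plain list, with no IO involved.
plan :: Bool -> [FilePath] -> [Step]
plan withTests sources =
  map Compile allSources ++ [Link (map objectOf allSources) "app"]
  where
    allSources = sources ++ ["Test.hs" | withTests]

-- Derive an object file name by cutting at the first dot.
objectOf :: FilePath -> FilePath
objectOf src = takeWhile (/= '.') src ++ ".o"

-- Because the plan is data, it can be inspected before anything runs.
describe :: [Step] -> [String]
describe = map render
  where
    render (Compile f) = "compile " ++ f
    render (Link os o) = "link " ++ unwords os ++ " -> " ++ o

-- Execution only consumes the plan; nothing it does feeds back into it.
-- Here "running" a step is simulated by printing it.
execute :: [Step] -> IO ()
execute = mapM_ (putStrLn . ("running: " ++)) . describe

main :: IO ()
main = execute (plan False ["Main.hs", "Util.hs"])
```

Reasoning about the build then amounts to reasoning about the value returned by `plan`, which always terminates here because it is built from total list functions.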

> And god, does it hurt in practice! -- speaking as someone who had spent
> a non-trivial amount of time on doing exactly this stuff in another age
> and for another language.

Which language?

> This does not mean that we cannot find a subset of the language that
> would be a point of balance between the needs of expressivity,
> learnability and decidability.

Subsetting makes it hard to know what works and what doesn't.
A Haskell subset would have to be strict - which begs the question
what's the point in calling this a subset of Haskell (and even if there
is a point, it will draw ridicule along the lines of "Haskell is
unsuitable for describing its own configurations").

> After all JSON was born in roughly this spirit, wasn't it?

JSON was/is a serialization format, first and foremost.

>> If you can't start or modify a package without already knowing
>> haskell, it is a huge barrier to entry.
>
> I'm unconvinced that this problem cannot be resolved within the subsetting approach.

Actually subsetting is making this worse: Things freshly learned for
Haskell won't work in the config language, restrictions encountered in
the config language will be unthinkingly transferred to Haskell.

Having two subtly but fundamentally different languages is about the
worst thing you can expose a learner to.

Imants Cekusins

Sep 16, 2016, 9:52:48 AM9/16/16
to Kosyrev Serge, Haskell Cafe, Haskell-cafe
.. if this helps the discussion, here is cabal file "spec"
(see also other files in the same directory)

these types are used to parse .cabal.

one benefit from using these types in reverse (to generate .cabal from .hs) would be:

it is possible to write a few libs which generate well-formed cabal files from a simplified API. It is possible this was done already.

Joachim Durchholz

Sep 16, 2016, 10:00:21 AM9/16/16
to haskel...@haskell.org
Am 16.09.2016 um 15:38 schrieb Imants Cekusins:
> a lot of what is often called "complexity" stems from lack of types and
> gaps in comments / documentation.

That's a big issue with Gradle.

The third problem I know Gradle for is that it makes it surprisingly
difficult to inspect the execution plan.

Geraldus

Sep 16, 2016, 10:03:34 AM9/16/16
to Joachim Durchholz, haskel...@haskell.org
> Actually subsetting is making this worse: Things freshly learned for
> Haskell won't work in the config language, restrictions encountered in
> the config language will be unthinkingly transferred to Haskell.

I agree.  This is exactly what I felt when I tried to use Fay language, which is a «proper subset» of Haskell (in the end I switched to GHCJS).

Fri, 16 Sep 2016 at 19:00, Joachim Durchholz <j...@durchholz.org>:

Mario Blažević

Sep 16, 2016, 10:10:55 AM9/16/16
to haskel...@haskell.org
On 2016-09-16 09:51 AM, Joachim Durchholz wrote:
>
>> This does not mean that we cannot find a subset of the language that
>> would be a point of balance between the needs of expressivity,
>> learnability and decidability.
>
> Subsetting makes it hard to know what works and what doesn't.
> A Haskell subset would have to be strict - which begs the question
> what's the point in calling this a subset of Haskell (and even if there
> is a point, it will draw ridicule along the lines of "Haskell is
> unsuitable for describing its own configurations").

Haskell is indeed unsuitable for describing the package configuration,
IMO, but not because it's lazy. It's because it lacks any syntax for
long and human-readable string literals (package description, anyone?).
That also condemns every subset of Haskell.
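
For reference, a small sketch of what a multi-line description looks like with the two options plain Haskell 2010 offers: string gaps and `unlines`. The description text is invented for illustration. Either way, every line needs explicit escapes or quoting, which is the readability cost being described.

```haskell
-- Haskell 2010 string gaps: a backslash, then whitespace (which may include
-- a newline), then another backslash are skipped inside a literal.
descriptionGap :: String
descriptionGap =
  "A library for parsing package descriptions.\n\
  \It supports several surface syntaxes.\n"

-- The common alternative: one literal per line, joined with unlines.
descriptionLines :: String
descriptionLines = unlines
  [ "A library for parsing package descriptions."
  , "It supports several surface syntaxes."
  ]

main :: IO ()
main = do
  putStr descriptionGap
  -- Both spellings denote the same string.
  print (descriptionGap == descriptionLines)
```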


>> After all JSON was born in roughly this spirit, wasn't it?

Yes, and JSON (and JavaScript) would suck for the very same reason.
This deficiency of JSON was a major incentive for creating YAML.

I'm mildly in favour of supporting another package format in addition
to .cabal, as long as compatibility is kept, and as long as the new
format is actually superior. I think any subset of Haskell would be a
setback from usability perspective.

One major benefit of YAML that I haven't seen mentioned is that it
could be used to replace the README.md file at the same time. Right now
a package description consists of both .cabal and (optionally) Markdown.
I suspect the latter language is actually harder for complete beginners.

Imants Cekusins

Sep 16, 2016, 10:29:09 AM9/16/16
to Mario Blažević, haskell Cafe
> it lacks any syntax for long and human-readable string literals (package description, anyone?).
can  {- comments -} be used for package description?

Mario Blažević

Sep 16, 2016, 10:38:40 AM9/16/16
to haskell Cafe
On 2016-09-16 10:29 AM, Imants Cekusins wrote:
>> it lacks any syntax for long and human-readable string literals (package description, anyone?).
> ​
> can {- comments -} be used for package description?
>

I suppose they could, but that would rather defeat the purpose of using
a Haskell subset in the first place. Haskell ignores comments, package
descriptions should not be ignored.

Imants Cekusins

Sep 16, 2016, 10:48:23 AM9/16/16
to Mario Blažević, haskell Cafe
ok how about a pragma:

7.13.6.3. Annotating modules

You can annotate modules with the ANN pragma by using the module keyword. For example:

{-# ANN module (Just "A `Maybe String' annotation") #-}


if the topic is Standard package file format, why not agree on e.g. adopting GenericPackageDescription or another similar haskell type (rather than a text-based file) as the standard?

then any format (cabal, yaml, json, ...) may be used as long as a library exists and is maintained for each such format,  which parses / produces the format from / to the standard type?


how about this?

Mario Blažević

Sep 16, 2016, 11:21:32 AM9/16/16
to haskell Cafe
On 2016-09-16 10:48 AM, Imants Cekusins wrote:
> ok how about a pragma:
>
> 7.13.6.3. Annotating modules
>
> You can annotate modules with the |ANN| pragma by using
> the |module| keyword. For example:
>
> {-# ANN module (Just "A `Maybe String' annotation") #-}

I suppose this could do, but there are some downsides:
- somewhat cumbersome syntax,
- reliance on a GHC extension, and worst of all,
- not a Haskell value.

The last point implies that the package.hs with this kind of module
annotation could not produce a proper GenericPackageDescription when
executed as a Haskell program.


> if the topic is _Standard package file format_, why not agree on e.g.
> adopting *GenericPackageDescription* or another similar haskell type
> (rather than a text-based file) as the standard?
>
> then any format (cabal, yaml, json, ...) may be used as long as a
> library exists and is maintained for each such format, which parses /
> produces the format from / to the standard type?

This makes perfect sense to me. The devil may be in the details. Would
cabal-install need to link in all these maintained libraries statically?
Or would there be some plug-in mechanism to load them on demand?

Imants Cekusins

Sep 16, 2016, 11:35:58 AM9/16/16
to Mario Blažević, haskell Cafe
> Would cabal-install need to link in all these maintained libraries statically? Or would there be some plug-in mechanism to load them on demand?

well the libraries would need to be official and some with the packager.​

the formats would be perfectly interchangeable i.e.
cabal -> standard_type -> yaml -> standard_type -> json -> standard_type -> cabal
would produce the same cabal file 

only 1 config file per package to avoid confusion

however if the user prefers working with format F, they can always convert the format which came with the package, to F

the file can always be validated by virtue of parsing and reproducing the original file without errors.


it comes at a price of duplicated efforts however it would give every choice one can wish for. If one must use yaml, they use yaml etc.
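
The round trip described above can be sketched as follows. This is only an illustration under stated assumptions: `Package` is a hypothetical, heavily simplified stand-in for a shared model such as GenericPackageDescription, and the two toy syntaxes ("key: value" and "key = value") stand in for real formats. None of these names come from the Cabal library.

```haskell
import Data.Char (isSpace)
import Data.List (intercalate)

-- Hypothetical shared model; the real type would be far richer.
data Package = Package
  { pkgName :: String, pkgVersion :: String, pkgDeps :: [String] }
  deriving (Eq, Show)

-- A "format" is just a renderer and a parser against the shared model.
data Format = Format
  { render :: Package -> String
  , parse  :: String -> Maybe Package
  }

trim :: String -> String
trim = f . f where f = reverse . dropWhile isSpace

splitOn :: Char -> String -> [String]
splitOn c s = case break (== c) s of
  (a, [])       -> [a]
  (a, _ : rest) -> a : splitOn c rest

-- Build a toy line-based format from a separator string and its key character.
keyValueFormat :: String -> Char -> Format
keyValueFormat sep sepChar = Format toText fromText
  where
    toText (Package n v ds) = unlines
      [ "name" ++ sep ++ n
      , "version" ++ sep ++ v
      , "depends" ++ sep ++ intercalate ", " ds
      ]
    fromText txt = do
      fields <- mapM field (lines txt)   -- fails if any line lacks the separator
      Package <$> lookup "name" fields
              <*> lookup "version" fields
              <*> (deps <$> lookup "depends" fields)
    field l = case break (== sepChar) l of
      (k, _ : v) -> Just (trim k, trim v)
      _          -> Nothing
    deps = filter (not . null) . map trim . splitOn ','

colonFormat, equalsFormat :: Format
colonFormat  = keyValueFormat ": " ':'
equalsFormat = keyValueFormat " = " '='

-- format a -> model -> format b -> model -> format a
roundTrip :: Format -> Format -> String -> Maybe String
roundTrip a b src = do
  viaA <- parse a src
  viaB <- parse b (render b viaA)
  pure (render a viaB)

main :: IO ()
main = do
  let src = render colonFormat (Package "demo" "1.0" ["base", "containers"])
  print (roundTrip colonFormat equalsFormat src == Just src)
```

The check only holds for files already in the canonical form the renderer produces; comments and layout in a hand-written file would not survive, which is one of the costs of this approach.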

Imants Cekusins

Sep 16, 2016, 11:36:58 AM9/16/16
to Mario Blažević, haskell Cafe
... the libraries would need to be official and come with the packager.​

Bardur Arantsson

Sep 16, 2016, 1:08:20 PM9/16/16
to haskel...@haskell.org, haskell-...@haskell.org
On 2016-09-16 09:30, MigMit wrote:
> Sbt seems to be doing rather well, using full Scala in configurations.
>

Sbt is a *build* description, *NOT* a package description format. Sbt
uses ivy.xml files for the latter. (With interop for consuming Maven
pom.xml files such that it can leverage the already-huge Maven
repositories.)

Bardur Arantsson

Sep 16, 2016, 1:15:21 PM9/16/16
to haskel...@haskell.org
On 2016-09-16 16:10, Mario Blažević wrote:
>>> After all JSON was born in roughly this spirit, wasn't it?
>
> Yes, and JSON (and JavaScript) would suck for the very same reason.
> This deficiency of JSON was a major incentive for creating YAML.
>
> I'm mildly in favour of supporting another package format in
> addition to .cabal, as long as compatibility is kept, and as long as the
> new format is actually superior. I think any subset of Haskell would be
> a setback from usability perspective.
>

This may be somewhat heretical, but I don't actually think we need to
have a human-editable format. (Of course it should probably be
*reasonably* human-readable/editable just for debugging and such.)

Just provide simple commands to view/manipulate whatever package
settings there are. Helpfully said commands could also sanity check
whatever you're trying to do and perhaps provide better error messages
than a tool which only has the "final" package description to work with.
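
A rough sketch of such a command, assuming a hypothetical settings store and a made-up list of known fields. `setField` is an illustrative name, not an existing cabal or stack command. It shows how validating at edit time lets the error message refer to what the user was trying to do.

```haskell
-- Hypothetical settings store; names are illustrative only.
type Settings = [(String, String)]

knownFields :: [String]
knownFields = ["name", "version", "license"]

-- Set a field, rejecting unknown fields and malformed versions up front.
setField :: String -> String -> Settings -> Either String Settings
setField key val settings
  | key `notElem` knownFields =
      Left ("unknown field '" ++ key ++ "'; expected one of " ++ unwords knownFields)
  | key == "version" && not (validVersion val) =
      Left ("'" ++ val ++ "' is not a dotted numeric version such as 1.2.3")
  | otherwise =
      -- Replace any previous value for the key.
      Right ((key, val) : filter ((/= key) . fst) settings)

-- A version is one or more non-empty runs of digits separated by dots.
validVersion :: String -> Bool
validVersion v = all ok (splitDots v)
  where
    ok p = not (null p) && all (`elem` ['0' .. '9']) p

splitDots :: String -> [String]
splitDots s = case break (== '.') s of
  (a, [])       -> [a]
  (a, _ : rest) -> a : splitDots rest

main :: IO ()
main = do
  print (setField "version" "1.2.0" [("name", "demo")])
  print (setField "version" "one" [("name", "demo")])
  print (setField "colour" "red" [])
```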

For beginners a simple GUI could be provided and IDEs could do their own
thing.

Problem solved.

Imants Cekusins

Sep 16, 2016, 1:58:13 PM9/16/16
to Bardur Arantsson, haskell Cafe
> I don't actually think we need to have a human-editable format.

.. store settings as serialized Haskell type, and use custom (non-official) viewers / editors to display them formatted to user preferences?
sounds good.

Sven Panne

Sep 16, 2016, 2:49:45 PM9/16/16
to Bardur Arantsson, Haskell Cafe
2016-09-16 19:14 GMT+02:00 Bardur Arantsson <sp...@scientician.net>:
> This may be somewhat heretical, but I don't actually think we need to
> have a human-editable format. [...]

Coming back to the central question (see Chris' mail): What problem do we solve by doing that? Replacing a relatively easy to read format by something unreadable by humans? That's probably the opposite of what we want...
 
> [...] For beginners a simple GUI could be provided and IDEs could do their own
> thing.

If somebody thinks a GUI is a good idea, we don't need to change something at all: Just write a GUI for reading/editing .cabal files.
 
> Problem solved.

Which problem? :-) Unless we really define what we want to improve and why, the whole discussion is pointless. Is it readability by humans? Being "standard" (whatever that means)? Being easily parsable, probably by a separate library? Being more flexible by what one can express? Having more abstraction facilities in the description? I have the impression that different people in this discussion try to solve different problems.

Cheers,
    S.

Imants Cekusins

Sep 16, 2016, 4:03:05 PM9/16/16
to Haskell Cafe
> what we want to improve and why

how about these:
  1. Adopt common standard for different package tools.
  2. Give users and packager devs a choice of config file formats / representations.
  3. Explore ways to simplify manual package configuration.
?

Sven Panne

Sep 16, 2016, 4:59:25 PM9/16/16
to Imants Cekusins, Haskell Cafe
2016-09-16 22:02 GMT+02:00 Imants Cekusins <ima...@gmail.com>:
> [...]
>   1. Adopt common standard for different package tools.

What are these tools? AFAICT we are talking about cabal and stack only, and from the recent discussion it seems that stack has slightly different goals: One stack.yaml can reference various cabal package descriptions, something I've never used until now, because I wasn't even aware that it is possible. :-) So apart from the different surface syntax, there seem to be more fundamental differences.

>   2. Give users and packager devs a choice of config file formats / representations.

Why is this even a goal? On the contrary, I see this as an anti-goal, because it leads to useless creativity and fragmentation.

>   3. Explore ways to simplify manual package configuration.

This is a worthwhile goal IMHO, but we need to be more concrete, e.g. how can repetitive stuff like the tons of almost-copy-n-paste in https://github.com/haskell-opengl/GLUT/blob/master/GLUT.cabal be avoided? This has nothing to do with syntax, more with abstraction facilities and semantics: If we just switch to JSON or YAML, GLUT.cabal would be as repetitive as before, only in a different surface syntax.

Herbert Valerio Riedel

Sep 16, 2016, 6:14:07 PM9/16/16
to Harendra Kumar, Patrick Pelletier, Duncan Coutts, Haskell-community, haskell-cafe

(resent from different account, sorry if dupe)

On 2016-09-16 at 08:20:15 +0200, Harendra Kumar wrote:

[...]

> * YAML (http://yaml.org/spec/1.2/spec.html) is standard and popular. A
> significant chunk of developer community is already familiar with it. It is
> being used by stack and by hpack as an alternative to cabal format. The
> complaint against it is that the specification/implementation is overly
> complex.

I'm not sure if this has been pointed out already, but beyond turning a
proper grammar into a stringly-typed one, shoehorning some features of
.cabal files into YAML syntax really appear like a case of the "Genius
Tailor"[1], e.g. consider the `hpack` example

   when:
     - condition: flag(fast)
       then:
         ghc-options: -O2
       else:
         ghc-options: -O0

besides looking quite awkward IMHO (just as an exercise, try inserting a
nested if/then/else in that example above), the prospect that a standard
format like YAML would allow to reuse standard tooling/libraries for
YAML seems quite weak to me; if, for instance, you run the above through
a YAML pretty-printer, you easily end up with something like

   when:
     - else:
         ghc-options: -O0
       then:
         ghc-options: -O2
       condition: flag(fast)

or any other ordering depending on how the keys are sorted/hashed.

Besides, many YAML (& JSON) parsers silently drop duplicate keys, so if
by accident you place a 2nd `else:` branch somewhere, you end up with an
ambiguous .yaml file which may either result in an error, in the first
key getting dropped (most likely variant), or in the 2nd key getting
dropped. Which one you get depends on the YAML parser implementation.
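
Both effects (lost key order and silently dropped duplicates) can be reproduced with an ordinary unordered map. The sketch below models the behaviour with `Data.Map` from the `containers` package that ships with GHC; it does not use any actual YAML library, and real parsers differ in which duplicate they keep.

```haskell
import qualified Data.Map.Strict as Map

-- Key/value pairs in source order, as a parser would encounter them.
-- Note the accidental second "else" entry.
entries :: [(String, String)]
entries =
  [ ("condition", "flag(fast)")
  , ("then", "ghc-options: -O2")
  , ("else", "ghc-options: -O0")
  , ("else", "ghc-options: -O1")
  ]

-- Decoding into an unordered mapping. Map.fromList keeps the last
-- value for a repeated key and reports nothing about the collision.
decoded :: Map.Map String String
decoded = Map.fromList entries

main :: IO ()
main = do
  print (Map.keys decoded)           -- source order is gone; keys come back sorted
  print (Map.lookup "else" decoded)  -- only one of the two "else" values survives
```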


I really don't understand the appeal of applying the golden hammer of
YAML, if `.cabal`'s grammar is already self-evident and concise with its
syntax:

   if flag(fast)
     ghc-options: -O2
   else
     ghc-options: -O0

where this if/then/else construct is encoded in the grammar proper
rather than being merely a semantic interpretation after decoding a
general grammar designed for simpler typed data-representations which
isn't even accurate enough (since it has additional symmetries/freedoms)
to capture the desired grammar faithfully, which make YAML quite
error-prone for this specific application.
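
To illustrate what "encoded in the grammar proper" buys, here is a hypothetical miniature syntax tree in Haskell. These are not Cabal's real types, and the `debug` flag is invented for the example. A conditional has exactly one condition, one then-branch and one else-branch by construction, so a second `else` or a reordered branch is simply not representable, and nesting comes for free.

```haskell
-- Hypothetical miniature of a package-description AST; not Cabal's real types.
data Cond = Flag String | Not Cond
  deriving (Eq, Show)

data Stmt
  = Field String String      -- e.g. ghc-options: -O2
  | If Cond [Stmt] [Stmt]    -- exactly one then-branch and one else-branch
  deriving (Eq, Show)

-- Nesting needs no extra machinery: an If is just another Stmt.
example :: [Stmt]
example =
  [ If (Flag "fast")
      [ Field "ghc-options" "-O2" ]
      [ If (Flag "debug")
          [ Field "ghc-options" "-O0 -g" ]
          [ Field "ghc-options" "-O0" ]
      ]
  ]

-- Evaluate a condition against the list of enabled flags.
eval :: [String] -> Cond -> Bool
eval on (Flag f) = f `elem` on
eval on (Not c)  = not (eval on c)

-- Resolve all conditionals, leaving only plain fields.
resolve :: [String] -> [Stmt] -> [(String, String)]
resolve on = concatMap go
  where
    go (Field k v) = [(k, v)]
    go (If c t e)  = resolve on (if eval on c then t else e)

main :: IO ()
main = mapM_ (print . (`resolve` example)) [["fast"], ["debug"], []]
```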

[1]: The "Genius Tailor" was mentioned recently in a related discussion here:
https://mail.haskell.org/pipermail/haskell-cafe/2016-September/124868.html

-- hvr

Harendra Kumar

Sep 16, 2016, 7:53:28 PM9/16/16
to Herbert Valerio Riedel, Patrick Pelletier, Duncan Coutts, Haskell-community, haskell-cafe


On 17 September 2016 at 03:43, Herbert Valerio Riedel <hvri...@gmail.com> wrote:

> I'm not sure if this has been pointed out already, but beyond turning a
> proper grammar into a stringly-typed one, shoehorning some features of
> .cabal files into YAML syntax really appear like a case of the "Genius
> Tailor"[1], e.g. consider the `hpack` example
>
>    when:
>      - condition: flag(fast)
>        then:
>          ghc-options: -O2
>        else:
>          ghc-options: -O0


I agree. Supporting conditionals with YAML looks hacky!

-harendra

Paolo Giarrusso

Sep 16, 2016, 8:27:14 PM9/16/16
to Haskell-cafe, ch...@kahn.pro, haskel...@haskell.org, skos...@ptsecurity.com


On Friday, September 16, 2016 at 3:42:48 PM UTC+2, Kosyrev Serge wrote:
> Chris Kahn writes:
>> I would like to second this thought. Using Haskell for package
>> descriptions needs to be thought out and executed with great care and
>> attention. It's really easy to go off the rails.
>>
>> Scala's build system lets you do very powerful things, but it also
>> makes things unnecessarily complicated and mystifying for beginners.
>> At my previous work where we used Scala extensively, there were many
>> times where the team simply resorted to external tools because
>> figuring out how to make some seemingly trivial change to an SBT
>> module was too time consuming.
>
> Let me guess (have no idea about sbt) -- unbridled Turing completeness?
>
> Declarativity is king for configuration, and Turing completeness ain't it --
> please, see my other mail about subsetting Haskell.


That's not the main problem with SBT. How do I explain it? Take this as an example of what Haskell should *not* do.

# SBT made difficult

Look, we all know a monad is just a monoid in the category of endofunctors, right? Now, an SBT build configuration is just a heterogeneously-typed map from keys to monadic values that can be evaluated to a graph of setting transformers and build actions, so what's the problem?
And oh, I forgot to mention keys aren't simple strings but have a hierarchy themselves, and this hierarchy is used for inheritance and overriding of settings (nothing as simple as OO inheritance, mind you, think of something like CSS but different).
Isn't using Haskell supposed to require a PhD? So why would its build tools use something so simple as nested records, like Cabal does?
</sarcasm>
I think I'm trolling, but the above is somewhat accurate (except for any misunderstanding of SBT I might have). I personally enjoy using SBT and its power, and once you learn it, it can be reasonably easy, but I think Kmett's lens library might be simpler to learn.

In fairness, many SBT builds can be read without having any clue of the above, because they look like imperative programs. But as soon as you need to do a bit more or you make a type error, you end up facing some of the above complexity—if you want, the "imperative program" abstraction is extremely leaky.

For instance, here's something "easy" (but count the amount of custom symbolic operators):

scalaVersion := "2.11.0"
scalacOptions += "-deprecation"
libraryDependencies += "org.scalatest" %% "scalatest" % "2.0"

Then you want to use one setting when defining another, and suddenly you end up with:

libraryDependencies <+= scalaVersion (ver => "org.scala-lang" % "scala-compiler" % ver)

Luckily, this can be done more easily nowadays, thanks to Scala macros O_O.

Harendra Kumar

Sep 16, 2016, 10:35:46 PM9/16/16
to Paolo Giarrusso, haskell-cafe, Haskell-cafe
Since I triggered this discussion I feel obligated to summarize the important points that were presented. Is there a good place to record Haskell ecosystem related discussions (some wiki)?

-harendra

Bardur Arantsson

Sep 17, 2016, 12:48:19 AM9/17/16
to haskel...@haskell.org, haskell-...@haskell.org
On 2016-09-16 23:57, Herbert Valerio Riedel wrote:

> Besides, many YAML (& JSON) parsers silently drop duplicate keys, so if
> by accident you place a 2nd `else:` branch somewhere, you end up with an
> ambiguous .yaml file which may either result in an error, in the first
> key getting dropped (most likely variant), or in the 2nd key getting
> dropped. Which one you get depends on the YAML parser implementation.

I was actually curious about this, and it's interesting to note that
even JSON which was supposed to have *ONE STANDARD* now apparently has
two, an ECMA one and an IETF RFC (seems to be more recent).

So I'd say JSON technically _allows_ duplicate keys, but that you cannot
reasonably expect any type of sane behavior in practice if you do
that.

Source: http://stackoverflow.com/a/23195243

(Didn't check up on what the situation is in YAML. YAML is too awful to
contemplate regardless.)

Regards,

Joachim Durchholz

Sep 17, 2016, 2:27:45 AM9/17/16
to haskel...@haskell.org
Am 17.09.2016 um 01:53 schrieb Harendra Kumar:
> I agree. Supporting conditionals with YAML looks hacky!

All I have seen was direct translation and conclusion that it doesn't work.
I haven't seen any attempts at making it look good.

Also, while aesthetics isn't irrelevant, it's a pretty weak argument.

Herbert Valerio Riedel

Sep 17, 2016, 2:29:10 AM9/17/16
to Bardur Arantsson, haskell-...@haskell.org, haskel...@haskell.org
On 2016-09-17 at 06:47:52 +0200, Bardur Arantsson wrote:

[...]

> I was actually curious about this, and it's interesting to note that
> even JSON which was supposed to have *ONE STANDARD* now apparently has
> two, an ECMA one and an IETF RFC (seems to be more recent).

Btw, that's partly because ECMA and IETF weren't able to agree who
"owns" JSON, for more details see

https://www.tbray.org/ongoing/When/201x/2014/03/05/RFC7159-JSON

-- hvr

Joachim Durchholz

Sep 17, 2016, 2:41:45 AM9/17/16
to haskel...@haskell.org
Am 17.09.2016 um 00:13 schrieb Herbert Valerio Riedel:
> the prospect that a standard
> format like YAML would allow to reuse standard tooling/libraries for
> YAML seems quite weak to me;

It's not about standard tooling, it's about tools written by third
parties. Tools that you didn't have the time or interest to write
yourself, but which still help make your ecosystem more useful to others.

> if, for instance, you run the above through
> a YAML pretty-printer, you easily end up with something like
>
>    when:
>      - else:
>          ghc-options: -O0
>        then:
>          ghc-options: -O2
>        condition: flag(fast)
>
> or any other ordering depending on how the keys are sorted/hashed.

Only if you use a bad pretty-printer that parses the YAML, then writes
it in prettified form.
Such a pretty-printer would also lose comments.

In other words: I'd be surprised to find a pretty-printer in actual use
that works that way.

> Besides, many YAML (& JSON) parsers silently drop duplicate keys,

That's indeed a common bug/misfeature due to historical accidents.
It's easy to fix though, and libraries have started to acquire options
to get that reported as an error.

> I really don't understand the appeal of applying the golden hammer of
> YAML, if `.cabal`'s grammar is already self-evident and concise with its
> syntax:
>
>    if flag(fast)
>      ghc-options: -O2
>    else
>      ghc-options: -O0
>
> where this if/then/else construct is encoded in the grammar proper
> rather than being merely a semantic interpretation after decoding a
> general grammar designed for simpler typed data-representations which
> isn't even accurate enough (since it has additional symmetries/freedoms)
> to capture the desired grammar faithfully, which make YAML quite
> error-prone for this specific application.

Yeah it isn't nice.
Changing the grammar always produces that kind of awkwardnesses.
However, for a fair comparison, you need to actively look for things
that work better with the alternate grammar before you conclude it's worse.

Imants Cekusins

Sep 17, 2016, 2:54:22 AM9/17/16
to haskell Cafe
> Give users and packager devs a choice of config file formats / representations.
> Why is this even a goal? On the contrary, I see this as an anti-goal, because it leads to useless creativity and fragmentation.
such creativity and fragmentation may actually give benefits.

can MVC [1] be relevant here?

currently both config content (let's call it a model) and representation (view: specific config file type) are bundled.

if a common model is agreed on, package tool and IDE devs could pick any view (format) that best suits their / users needs.

such fragmentation would not break the workflow. If someone thinks of a convenient format and believes it worth their time to write a controller for it, why not?

[1] mvc

Brandon Allbery

Sep 17, 2016, 2:57:39 AM9/17/16
to Joachim Durchholz, haskell-cafe

On Sat, Sep 17, 2016 at 2:41 AM, Joachim Durchholz <j...@durchholz.org> wrote:
> Changing the grammar always produces that kind of awkwardnesses.
> However, for a fair comparison, you need to actively look for things that work better with the alternate grammar before you conclude it's worse.

The burden is on you to prove that the massive upheaval of a switch is justified, not on others to prove that your preference won't work.

--
brandon s allbery kf8nh                               sine nomine associates
allb...@gmail.com                                  ball...@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad        http://sinenomine.net

Brandon Allbery

Sep 17, 2016, 3:00:55 AM9/17/16
to Imants Cekusins, haskell Cafe

On Sat, Sep 17, 2016 at 2:54 AM, Imants Cekusins <ima...@gmail.com> wrote:
> currently both config content (let's call it a model) and representation (view: specific config file type) are bundled.
>
> if a common model is agreed on, package tool and IDE devs could pick any view (format) that best suits their / users needs.
>
> such fragmentation would not break the workflow. If someone thinks of a convenient format and believes it worth their time to write a controller for it, why not?

Do I have to obtain whatever whizzy new controller you've come up with in order to work with your packages?
Do I have to do this when everyone has come up with their own whizzy new controller and I need to fit their packages into whatever I am trying to write?

Imants Cekusins

Sep 17, 2016, 3:06:28 AM9/17/16
to haskell Cafe
> Do I have to obtain whatever whizzy new controller you've come up with in order to work with your packages?
> Do I have to do this when everyone has come up with their own whizzy new controller and I need to fit their packages into whatever I am trying to write?
that's the whole point. If we could agree on a standard serializable model, each controller would ensure the link between the view and the model.

user could open a package in any IDE / environment. The environment's controller would display the model in its own / user preferred view.

Imants Cekusins

Sep 17, 2016, 3:17:30 AM9/17/16
to haskell Cafe
.. the model would be shipped with packages.

pretty printing the config model to formatted yet non-editable config view (like the docs) may be made part of build process.


Brandon Allbery

Sep 17, 2016, 3:18:40 AM9/17/16
to Imants Cekusins, haskell Cafe

On Sat, Sep 17, 2016 at 3:06 AM, Imants Cekusins <ima...@gmail.com> wrote:
> that's the whole point. If we could agree on a standard serializable model,

That seems like a big "if". Especially since many dev tools exist to extend the model, and quite aside from "so where's the 'standard' now", conflicts you can currently control (mostly) suddenly become problematic. (I'm tempted to point to how gtk2hs's configuration phase works. pTk may be an even more severe example, although non-Haskell.)

Joachim Durchholz

Sep 17, 2016, 3:22:58 AM9/17/16
to haskell Cafe
Am 17.09.2016 um 08:57 schrieb Brandon Allbery:
> On Sat, Sep 17, 2016 at 2:41 AM, Joachim Durchholz <j...@durchholz.org> wrote:
>
>> Changing the grammar always produces that kind of awkwardnesses.
>> However, for a fair comparison, you need to actively look for things that
>> work better with the alternate grammar before you conclude it's worse.
>
> The burden is on you to prove that the massive upheaval of a switch is
> justified, not on others to prove that your preference won't work.

I do like YAML, but I know far too little about the various use cases to
justify any preference; it's quite possible that it's not a good fit,
but I can't really decide it.

All I can do is provide knowledge about YAML, which in some cases was
really necessary, and point out one-sided arguments such as
Herbert's; doing a review of Cabal config usecases and see how well they
map to YAML is, sadly, beyond my capabilities.
Contributing the best I can and all that.

Imants Cekusins

Sep 17, 2016, 3:23:29 AM9/17/16
to haskell Cafe
> That seems like a big "if".

the model may be versioned. it may include "tool T only section" which the other tools skip over or simply display with show

Herbert Valerio Riedel

Sep 17, 2016, 3:25:57 AM9/17/16
to Joachim Durchholz, haskel...@haskell.org
Hello,

On 2016-09-17 at 08:41:37 +0200, Joachim Durchholz wrote:
> Am 17.09.2016 um 00:13 schrieb Herbert Valerio Riedel:
>> the prospect that a standard format like YAML would allow to reuse
>> standard tooling/libraries for YAML seems quite weak to me;
>
> It's not about standard tooling, it's about tools written by third
> parties. Tools that you didn't have the time or interest to write
> yourself, but which still help make your ecosystem more useful to
> others.

Sure, but we don't need to throw out the baby with the bathwater to
accomplish that!

Oleg is currently working on a new parser for cabal.config,
cabal.project & ${pkg}.cabal grammar (NB: cabal already uses one
standard unified syntax for all its configuration/description files)
which lends itself better to providing an equivalent of ghc-exactprint
(i.e. perfect roundtripping, allowing for faithful refactoring
tooling). 3rd parties can then use this new parser as a library.

[..]

>> I really don't understand the appeal of applying the golden hammer of
>> YAML, if `.cabal`'s grammar is already self-evident and concise with its
>> syntax:
>>
>>    if flag(fast)
>>      ghc-options: -O2
>>    else
>>      ghc-options: -O0
>>
>> where this if/then/else construct is encoded in the grammar proper
>> rather than being merely a semantic interpretation after decoding a
>> general grammar designed for simpler typed data-representations which
>> isn't even accurate enough (since it has additional symmetries/freedoms)
>> to capture the desired grammar faithfully, which make YAML quite
>> error-prone for this specific application.
>
> Yeah it isn't nice.
> Changing the grammar always produces that kind of awkwardnesses.
> However, for a fair comparison, you need to actively look for things
> that work better with the alternate grammar before you conclude it's
> worse.

Well, that burden of proof lies with those who argue YAML to be
superior to .cabal syntax, doesn't it?

The if/then/else awkwardness is just one aspect I pointed out
explicitly. I hinted at other issues which result from first parsing
into an inappropriate data-model just for the sake of using YAML, and
then having to re-parse that interim lossy data-model for real into the
actual data-model we're interested in (and hoping we didn't lose some
of the essential information).

But I see no need to invest time to spell those problems out until I see
a compelling argument that e.g. YAML syntax is really preferable (to
justify the costs incurred) to the status quo in the first place.
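
To make the two-stage parsing concern concrete, here is a minimal Haskell sketch. It is an illustration only: `Value`, `Field`, `fromValue` and `ghcOptions` are made-up names, the `if`/`flag`/`then`/`else` key encoding on the generic side is an assumption, and none of this is Cabal's or hpack's actual data model.

```haskell
-- A generic tree, roughly what a YAML/JSON decoder hands back.
data Value
  = Str String
  | List [Value]
  | Map [(String, Value)]
  deriving (Eq, Show)

-- A grammar-level model: the conditional is a constructor of its own.
data Field
  = GhcOptions [String]
  | If String [Field] [Field]   -- flag name, then-branch, else-branch
  deriving (Eq, Show)

-- Second pass from the generic tree into the model we actually want.
-- Every shape the generic tree allows but the model does not becomes
-- a runtime error here.
fromValue :: Value -> Either String [Field]
fromValue (Map kvs) = mapM field kvs
  where
    field :: (String, Value) -> Either String Field
    field ("ghc-options", List vs) = GhcOptions <$> mapM str vs
    field ("if", Map m) =
      case (lookup "flag" m, lookup "then" m, lookup "else" m) of
        (Just (Str f), Just t, Just e) -> If f <$> fromValue t <*> fromValue e
        _ -> Left "malformed conditional"
    field (k, _) = Left ("unexpected key or value shape: " ++ k)

    str :: Value -> Either String String
    str (Str s) = Right s
    str _       = Left "expected a string"
fromValue _ = Left "expected a mapping"

-- Collect the effective ghc-options for a set of enabled flags.
ghcOptions :: [String] -> [Field] -> [String]
ghcOptions enabled = concatMap go
  where
    go (GhcOptions os) = os
    go (If flag t e)
      | flag `elem` enabled = ghcOptions enabled t
      | otherwise           = ghcOptions enabled e

-- The if/else example from above, written directly in the typed model ...
typed :: [Field]
typed = [If "fast" [GhcOptions ["-O2"]] [GhcOptions ["-O0"]]]

-- ... and the same thing as a generic tree under the assumed encoding.
generic :: Value
generic =
  Map [ ("if", Map [ ("flag", Str "fast")
                   , ("then", Map [("ghc-options", List [Str "-O2"])])
                   , ("else", Map [("ghc-options", List [Str "-O0"])])
                   ]) ]
```

In this sketch `fromValue generic` yields `Right typed`, while a tree such as `Map [("if", Str "fast")]` is a perfectly valid `Value` that only fails in the second pass.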


-- hvr

Bardur Arantsson

Sep 17, 2016, 3:52:03 AM9/17/16
to haskel...@haskell.org
On 2016-09-17 09:25, Herbert Valerio Riedel wrote:
> Hello,
>
> On 2016-09-17 at 08:41:37 +0200, Joachim Durchholz wrote:
>> Am 17.09.2016 um 00:13 schrieb Herbert Valerio Riedel:
>>> the prospect that a standard format like YAML would allow to reuse
>>> standard tooling/libraries for YAML seems quite weak to me;
>>
>> It's not about standard tooling, it's about tools written by third
>> parties. Tools that you didn't have the time or interest to write
>> yourself, but which still help make your ecosystem more useful to
>> others.
>
> Sure, but we don't need to throw out the baby with the bathwater to
> accomplish that!
>
> Oleg is currently working on a new parser for cabal.config,
> cabal.project & ${pkg}.cabal grammar (NB: cabal already uses one
> standard unified syntax for all its configuration/description files)
> which lends itself better to provide equivalent of ghc-exactprint
> (i.e. perfect roundtripping, allowing for faithful refactoring
> tooling). Then 3rd parties can then use this new parser as a library.

I didn't see anything in the PR about exporting that parser as a
library. Do you have a reference for that?

Regardless: It will only help third party code written in Haskell. Much
as I like most userland software to be written in Haskell it won't help
e.g. IntelliJ IDEA one whit.

Regards,

Joachim Durchholz

Sep 17, 2016, 4:50:37 AM9/17/16
to haskel...@haskell.org
Am 17.09.2016 um 09:51 schrieb Bardur Arantsson:
> Regardless: It will only help third party code written in Haskell. Much
> as I like most userland software to be written in Haskell it won't help
> e.g. IntelliJ IDEA one whit.

Unless Haskell runs on the JVM.
Do you know whether Frege (https://github.com/Frege) is a viable option
for that? At least at the surface, it qualifies, but I don't know
whether the details (performance, Java library interoperability,
stability, availability of Haskell language extensions) work out well
enough for that.

Bardur Arantsson

Sep 17, 2016, 5:36:21 AM9/17/16
to haskel...@haskell.org
On 2016-09-17 10:50, Joachim Durchholz wrote:
> Am 17.09.2016 um 09:51 schrieb Bardur Arantsson:
>> Regardless: It will only help third party code written in Haskell. Much
>> as I like most userland software to be written in Haskell it won't help
>> e.g. IntelliJ IDEA one whit.
>
> Unless Haskell runs on the JVM.

I think people have been wishing for that for a while... some people
even worked on it, but so far nothing's come of it AFAIK.

> Do you know whether Frege (https://github.com/Frege) is a viable option
> for that?

Not in the least last time I checked. It's missing far too many of the
extensions that almost everybody uses as a matter of course.

Maybe given a few more years, but I'm not holding my breath.

Regards,

Edward Z. Yang

Sep 17, 2016, 5:48:20 AM9/17/16
to Harendra Kumar, Paolo Giarrusso, haskell-cafe, Haskell-cafe
https://github.com/ghc-proposals/ghc-proposals/ could be used for
this purpose. There is also the Trac wiki but I find it is a bit
too hard to keep under control with comments.

Edward

Excerpts from Harendra Kumar's message of 2016-09-17 08:05:38 +0530:

Joachim Durchholz

Sep 17, 2016, 7:31:49 AM9/17/16
to haskel...@haskell.org
Am 17.09.2016 um 11:35 schrieb Bardur Arantsson:
> On 2016-09-17 10:50, Joachim Durchholz wrote:
>> Do you know whether Frege (https://github.com/Frege) is a viable option
>> for that?
>
> Not in the least last time I checked. It's missing far too many of the
> extensions that almost everybody uses as a matter of course.

Pity.
Any idea how hard it would be to make it compile ghc?

Rahul Muttineni

Sep 17, 2016, 10:05:48 AM9/17/16
to Joachim Durchholz, haskell-cafe
Hi Joachim,

Besides Frege, Haskell does indeed run on the JVM now via GHCVM [1] - it was my HSoC project. I'll be doing a release in a couple days once I get a couple issues sorted out and the installation is streamlined. I'm currently working with Cary Robbins on getting the HaskForce Intellij Plugin working for GHCVM. If all goes well, GHCVM 0.0.1 will ship with ghcvm, ghcvm-pkg, cabalvm (a fork of cabal-install 1.22.9.0/Cabal 1.22.8.0 that supports GHCVM), a working Intellij plugin, and will support all of GHC 7.10.3 extensions other than Template Haskell + interoperation with Java libraries. You can join us on Gitter for live updates [2].


Thanks,
Rahul
--
Rahul Muttineni

Daniel Trstenjak

Sep 18, 2016, 6:31:19 AM9/18/16
to haskel...@haskell.org
On Sat, Sep 17, 2016 at 09:51:25AM +0200, Bardur Arantsson wrote:
> Regardless: It will only help third party code written in Haskell. Much
> as I like most userland software to be written in Haskell it won't help
> e.g. IntelliJ IDEA one whit.

If you're talking about more IDEs supporting Haskell, then having a
more standard package format really won't help that much.

To get good and stable support, there's a need for tools that can be called
by IDEs. To build a Haskell project, IDEs won't read the cabal file
and call ghc themselves; they just call cabal.

The same is the case for e.g. auto completion or any other IDE operation
that needs to consider the whole project, the configuration and all of
its dependencies.

Reimplementing cabal's logic in every IDE doesn't make that much sense; in
the end it won't work that well and it will easily break.

Tom Murphy

Sep 18, 2016, 7:45:33 AM9/18/16
to Joachim Durchholz, haskell-cafe
On Sat, Sep 17, 2016 at 7:27 AM, Joachim Durchholz <j...@durchholz.org> wrote:
> Am 17.09.2016 um 01:53 schrieb Harendra Kumar:
>> I agree. Supporting conditionals with YAML looks hacky!
>
> All I have seen was direct translation and conclusion that it doesn't work.
> I haven't seen any attempts at making it look well.
>
> Also, while aesthetics isn't irrelevant, it's a pretty weak argument.


Read the next paragraph in hvr's email: he was very much not talking about aesthetics.

Tom Murphy

Sep 18, 2016, 8:03:47 AM9/18/16
to Imants Cekusins, haskell Cafe
On Fri, Sep 16, 2016 at 4:35 PM, Imants Cekusins <ima...@gmail.com> wrote:
>> Would cabal-install need to link in all these maintained libraries statically? Or would there be some plug-in mechanism to load them on demand?
>
> well the libraries would need to be official and come with the packager.
>
> the formats would be perfectly interchangeable i.e.
> cabal -> standard_type -> yaml -> standard_type -> json -> standard_type -> cabal
> would produce the same cabal file
>
> only 1 config file per package to avoid confusion
>
> however if the user prefers working with format F, they can always convert the format which came with the package, to F


Even just looking at the set of features which is 1:1 between YAML and JSON, we're essentially just talking about key-value pairs with a couple of common types for the values. This isn't all .cabal files contain (e.g. see hvr's points about conditionals), but if it were true, is it really worth changing how Cabal works for a different color bikeshed?


Why not have .cabal files be the standard model, and anyone can write tools on top to translate to/from .cabal if users really want to use something else?

In general, though, I don't think the fragmentation is worth it.

Tom
 

Imants Cekusins

Sep 18, 2016, 10:27:39 AM9/18/16
to haskell Cafe
here are some charts to highlight differences between 
  • currently used text based config and
  • suggested model based config

Imants Cekusins

Sep 18, 2016, 11:40:19 AM9/18/16
to haskell Cafe
> Why not have .cabal files be the standard model, and anyone can write tools on top to translate to/from .cabal if users really want to use something else?
.cabal file is representation rather than a model. It is parsed to model. Being a distinct file type with its own AST, it needs quite a bit of attention. It  needs to be parsed, updated, validated, formatted.

Another config format emerged. More problems (distinct file type etc). More formats may follow.

Is there hope to agree on common format (as per thread title), if common content can not be agreed on? Isn't config first of all about content? Is the common format going to contain incompatible / conflicting data items?

With common content, display format will not matter at all, neither will package tool nor IDE used to work on a project.

Config being a Haskell type, it would be well formed. The options would be well known.

Users and IDE devs will not need to worry about indenting, commas, line breaks and other goodies.
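
As a rough illustration of what "config being a Haskell type" could look like, here is a minimal sketch. The type `PackageConfig`, its fields, and the helpers `serialize`/`deserialize` are hypothetical names chosen for this example; they are not an existing Cabal or stack API, and the derived `Show`/`Read` text form is only one possible serialization.

```haskell
import Text.Read (readMaybe)

-- Hypothetical model of a package description.
data Component
  = Library
  | Executable String
  deriving (Eq, Show, Read)

data PackageConfig = PackageConfig
  { pkgName       :: String
  , pkgVersion    :: [Int]
  , pkgDepends    :: [String]
  , pkgComponents :: [Component]
  } deriving (Eq, Show, Read)

-- Serialize via the derived Show instance.
serialize :: PackageConfig -> String
serialize = show

-- Deserialize via the derived Read instance; Nothing on malformed input.
deserialize :: String -> Maybe PackageConfig
deserialize = readMaybe

-- Example value with made-up package and dependency names.
example :: PackageConfig
example = PackageConfig
  { pkgName       = "demo"
  , pkgVersion    = [0, 1, 0]
  , pkgDepends    = ["base", "text"]
  , pkgComponents = [Library, Executable "demo-cli"]
  }
```

Under these assumptions `deserialize (serialize example)` gives back `Just example`, and ill-typed input such as `"PackageConfig {pkgName = 42}"` is rejected with `Nothing`. Note that `readMaybe` is still a parser, just a derived one, so the text form has to be read by something either way.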

Sven Panne

Sep 18, 2016, 12:58:54 PM9/18/16
to Imants Cekusins, haskell Cafe
2016-09-18 17:40 GMT+02:00 Imants Cekusins <ima...@gmail.com>:
> .cabal file is representation rather than a model. It is parsed to model.

Well, that's the case for basically everything you give to a program, so I don't see the point here. A .hs file is e.g. just a textual representation of the more abstract notion of a Haskell program/module, too. A .cabal file is just a textual representation of the abstract notion of a Haskell package description.
 
> Being a distinct file type with its own AST,

Distinct from what?

> it needs quite a bit of attention.

From whom?

> It needs to be parsed, updated, validated, formatted.

This will be the case for whatever is being used, so again: What's the point? It doesn't matter if it's in its own .cabal syntax, in some Haskell-like syntax, JSON, YAML, or even some graphical representation.
 
> Another config format emerged.

I'm not sure what config format is meant here. If it's stack.yaml, it *must* be somehow different (even if we ignore the surface syntax), because it describes a project, not a single package.
 
> More problems (distinct file type etc).

What are the actual problems here?
 
> More formats may follow.

If they are for different purposes, that's OK and is to be expected.
 
> Is there hope to agree on common format (as per thread title), if common content can not be agreed on? Isn't config first of all about content? Is the common format going to contain incompatible / conflicting data items?

.cabal files describe "what a package looks like" and a stack.yaml describes "how to build a project in a reproducible way", which are different (although related) things. What should "common" mean here?
 
> With common content, display format will not matter at all, neither will package tool nor IDE used to work on a project.
>
> Config being a Haskell type, it would be well formed. The options would be well known.
>
> Users and IDE devs will not need to worry about indenting, commas, line breaks and other goodies.

Somehow you will always need a concrete representation of abstract notions (call them "models", "ASTs", etc.), otherwise you won't be able to process them. So you will always need to care about some kind of syntax etc., I can't see how using a "Haskell type" will help here. And you will need some semantics for the representation. Even if we used e.g. JSON (or whatever is en vogue at the moment), IDEs will not magically start understanding and supporting Haskell projects.

Again: What is the actual problem we're trying to solve? I still haven't seen a concrete use case which is hard/impossible with the current state of affairs. Personally, I would e.g. like to see some abstraction facilities to avoid repetition in .cabal files with lots of executables, but I don't care about the concrete syntax (and Cabal's internal model/AST wouldn't be affected, either).

Imants Cekusins

Sep 18, 2016, 1:38:14 PM9/18/16
to haskell Cafe
> Well, that's the case for basically everything you give to a program, so I don't see the point here. A .hs file is e.g. just a textual representation of the more abstract notion of a Haskell program/module, too. A .cabal file is just a textual representation of a the abstract notion of a Haskell package description.

yes, .hs AST etc must be implemented. However implementing cabal in addition to that is more work.
 
> Distinct from what?
 from .hs. 
 
> [attention] From whom?
IDE devs
 
>> It  needs to be parsed, updated, validated, formatted.
> This will be the case for whatever is being used, so again: What's the point? It doesn't matter if it's in its own .cabal syntax, in some Haskell-like syntax, JSON, YAML, or even some graphical representation.

if serialized model is used, 
then parsing, update, validation, formatting are no longer necessary


> I'm not sure what config format is meant here. If it's stack.yaml, it *must* be somehow different (even if we ignore the surface syntax), because it describes a project, not a single package.

What standard package format are we trying to agree then?
 
 
>> More problems (distinct file type etc).
>
> What are the actual problems here?

implementing each new file type in IDE is a lot of work. That is, if IDE is trying to do anything with contents of that file. Such as support syncing renamed file to config.
 
 
>> More formats may follow.
>
> If they are for different purposes, that's OK and is to be expected.

Each new format would need to be implemented. Time spent on implementing new formats is time not spent on implementing any other features. It may take nearly as long as implementing .hs support itself. Is this even thought about?

If this may be avoided, why not at least consider this as an option?
 
> .cabal files describe "what a package looks like" and a stack.yaml describes "how to build a project in a reproducible way", which are different (although related) things. What should "common" mean here?


Standard package file format (as the thread is called). Isn't it about cabal and yaml? 

Anyway, can not a common config file be used for both purposes? If not, can common file type / model be used for both purposes - sharing the common parts of the type structure?

 
> Somehow you will always need a concrete representation of abstract notions (call them "models", "ASTs", etc.), otherwise you won't be able to process them. So you will always need to care about some kind of syntax etc., I can't see how using a "Haskell type" will help here. And you will need some semantics for the representation. Even if we used e.g. JSON (or whatever is en vogue at the moment), IDEs will not magically start understanding and supporting Haskell projects.


well if config is expressed in terms of Haskell syntax, implemented .hs support will be enough to support editing these config files. 
Each file type (including .cabal) takes time to implement.

 
> Again: What is the actual problem we're trying to solve? I still haven't seen a concrete use case which is hard/impossible with the current state of affairs. Personally, I would e.g. like to see some abstraction facilities to avoid repetition in .cabal files with lots of executables, but I don't care about the concrete syntax (and Cabal's internal model/AST wouldn't be affected, either).

adopting a standard package file format, which could be addressed even better by adopting typed standard config content.

the problems as I see them are:
  • users need to learn .cabal (.yaml, ...) syntax in addition to .hs syntax
  • IDEs need to implement each such syntax on top of .hs. That is, if support / sync of these configs to code files is expected.

Am I the only one who sees these as issues that need / can be solved?

Also maybe let's be more specific: what is this thread - Standard package file format - all about?

Joachim Durchholz

Sep 18, 2016, 2:39:59 PM9/18/16
to haskel...@haskell.org
Am 18.09.2016 um 14:03 schrieb Tom Murphy:
> Even just looking at the set of features which is 1:1 betw. YAML and JSON,
> we're essentially just talking about key-value pairs with a couple of
> common types for the values.

This is just as correct as saying that Haskell is about functions - i.e.
superficially correct but mostly beside the point.

For JSON, it's string-to-whatever maps, arrays, and primitive types.

For YAML, it's string-to-whatever maps, arrays, primitive types,
references (so you can have shared and circular data structures), and
arbitrary types (it will use constructors to deserialize).
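
The difference between the two data models can be sketched as Haskell types. This is a simplified illustration of the description above, not a faithful model of either specification; all type and function names (`Scalar`, `Json`, `Yaml`, `embed`, `project`) are made up for the example.

```haskell
-- Primitive values shared by both sketches.
data Scalar
  = SNull
  | SBool Bool
  | SNumber Double
  | SString String
  deriving (Eq, Show)

-- JSON: string-keyed maps, arrays, primitives.
data Json
  = JScalar Scalar
  | JArray [Json]
  | JObject [(String, Json)]
  deriving (Eq, Show)

-- YAML (simplified): the same, plus references and type tags.
data Yaml
  = YScalar Scalar
  | YSeq [Yaml]
  | YMap [(String, Yaml)]
  | YAnchor String Yaml   -- "&name node": names a node for later reuse
  | YAlias String         -- "*name": refers back to an anchored node
  | YTagged String Yaml   -- "!tag node": asks for a specific target type
  deriving (Eq, Show)

-- Every JSON value has a direct counterpart in the YAML model.
embed :: Json -> Yaml
embed (JScalar s)  = YScalar s
embed (JArray xs)  = YSeq (map embed xs)
embed (JObject kv) = YMap [(k, embed v) | (k, v) <- kv]

-- The reverse only works for trees that avoid the extra constructors.
project :: Yaml -> Maybe Json
project (YScalar s)   = Just (JScalar s)
project (YSeq xs)     = JArray <$> traverse project xs
project (YMap kv)     = JObject <$> traverse (\(k, v) -> (,) k <$> project v) kv
project (YAnchor _ _) = Nothing
project (YAlias _)    = Nothing
project (YTagged _ _) = Nothing
```

In this sketch `project (embed j)` is `Just j` for every `j`, whereas anything using an anchor, alias or tag has no JSON counterpart without extra interpretation.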

> This isn't all .cabal files contain (e.g. see
> hvr's points about conditionals), but if it were true, is it really worth
> changing how Cabal works for a different color bikeshed?

It's bikeshedding if and only if interoperability is irrelevant.
However, in today's world, rejecting interoperability is insanity.
So: no bikeshedding, there are real issues.

It's still quite possible that it's simply not worth it; the cons
associated with changing the buildfile format are pretty weighty after
all, and if the Cabal people say they can fix the known problems with
that format, it's probably a better idea to see what comes of that
before pursuing alternate formats.

Sven Panne

Sep 18, 2016, 3:38:46 PM9/18/16
to Imants Cekusins, haskell Cafe
2016-09-18 19:38 GMT+02:00 Imants Cekusins <ima...@gmail.com>:
> yes, .hs AST etc must be implemented.

If we're talking about a Haskell tool, it *is* already implemented: Just look into the Cabal project on GitHub. If we're not talking about a Haskell tool and something outside the cabal/hackage/stackage ecosystem, writing an AST and a parser for it will be the least of your problems: The main problem will be how to map cabal's/stack's view of what a package/project is to your tool's view. I don't think there's a universally agreed upon notion of what a package or project is; almost every IDE out there has its own view of what those mean, each with its pros and cons (which may be heavily influenced by the package/project programming language, not the language in which the IDE is written).
 
> However implementing cabal in addition to that is more work.

Are we talking about parsing/printing here? If yes, there's already work in that direction (making the frontend, i.e. parser/printer/AST, a separate library), at least that's what I understood so far.
  
>> Distinct from what?
> from .hs.

And that's perfectly fine: A project/package description is something fundamentally different than a turing-complete general-purpose programming language. A project/package description should be a mostly static, declarative thing, perhaps with a few conditionals and/or (hygienic) macros or such for convenience/brevity, but not something which can calculate fibonacci numbers or solve differential equations.
 
>> [attention] From whom?
> IDE devs

Hmmm, so parsing some package/project description is a problem when writing an IDE? I highly doubt that this is relevant compared to the amount of work needed for an average IDE.
 
>>> It needs to be parsed, updated, validated, formatted.
>> This will be the case for whatever is being used, so again: What's the point? It doesn't matter if it's in its own .cabal syntax, in some Haskell-like syntax, JSON, YAML, or even some graphical representation.
>
> if serialized model is used,
> then parsing, update, validation, formatting are no longer necessary

Huh? What's a "serialized model" then? Whatever you do, you have to parse/validate/... any description. Even if you choose some subset of Haskell (which is probably a bad idea IMHO because it's either too general or not really Haskell anymore), there has to be *some* parser etc. Where should that come from? Neither Emacs nor VIM can e.g. parse/print Haskell out of the box, VS probably can't either.
 

>> I'm not sure what config format is meant here. If it's stack.yaml, it *must* be somehow different (even if we ignore the surface syntax), because it describes a project, not a single package.
>
> What standard package format are we trying to agree then?

stack.yaml is not a "package format", so there is nothing to agree on.
  
>>> More problems (distinct file type etc).
>>
>> What are the actual problems here?
>
> implementing each new file type in IDE is a lot of work. That is, if IDE is trying to do anything with contents of that file. Such as support syncing renamed file to config.
>
>>> More formats may follow.
>>
>> If they are for different purposes, that's OK and is to be expected.
>
> Each new format would need to be implemented. Time spent on implementing new formats is time not spent on implementing any other features. It may take nearly as long as implementing .hs support itself. Is this even thought about?

stack.yaml is not .cabal in a new syntax, there is new functionality. Even if both were e.g. written in YAML, your shiny hypothetical IDE wouldn't suddenly support reproducible multi-package builds out of the box if it couldn't do so before.
 

> If this may be avoided, why not at least consider this as an option?
>
>> .cabal files describe "what a package looks like" and a stack.yaml describes "how to build a project in a reproducible way", which are different (although related) things. What should "common" mean here?
>
> Standard package file format (as the thread is called). Isn't it about cabal and yaml?

If we are really only talking about a *package* format, there is currently only .cabal and a single format is by definition "standard". :-)
  
> well if config is expressed in terms of Haskell syntax, implemented .hs support will be enough to support editing these config files.
> Each file type (including .cabal) takes time to implement.

Again: Where is that ominous ".hs support" coming from?
 
> the problems as I see them are:
>   • users need to learn .cabal (.yaml, ...) syntax in addition to .hs syntax

As has already been mentioned by others, I *highly* doubt that the .cabal syntax itself poses the slightest problem for anyone. The semantics are a different story, but you have to learn them anyway.

>   • IDE need to implement each such syntax on top of .hs. That is, if support / sync of these configs to code files is expected.

You just update your internal view of the package/project and write out the changed part. With library support for .cabal and YAML files that's trivial.

> Am I the only one who sees these as issues that need / can be solved?
>
> Also maybe let's be more specific: what is this thread - Standard package file format - all about?

That's the central question IMHO. :-) The current discussion seems to drift towards: Do we need the current package/project dichotomy or can we throw everything together? (Note that e.g. Visual Studio distinguishes projects and solutions, too, perhaps there's a reason for that?)

Richard A. O'Keefe

Sep 18, 2016, 7:22:12 PM9/18/16
to haskel...@haskell.org
YAML and TOML are not, strictly speaking, package file formats.
They are *meta-formats*.
There is, by design, nothing about them that ties them in any
way to any kind of package system.
That means that other, even more popular, meta-formats
should be considered.
In particular, while XML and JSON are not by any means
*wonderful*, they are far better known than TOML or even YAML.

On 16/09/16 6:20 PM, Harendra Kumar wrote:
> From a developer's perspective, the major benefit of a standard and
> widely adopted format and is that people can utilize their knowledge
> acquired from elsewhere, they do not have to go through and learn
> differently looking and incomplete documentation of different tools. The
> benefit of a common config specification is that developers can choose
> tools freely without worrying about learning the same concepts presented
> in different ways.

If we are talking about *meta-formats*, this is only half
true. No amount of knowledge about YAML per se will tell
you how to use YAML to describe Haskell packages. Nor will
it let you choose tools freely if what you want is tools
that understand your *package file format* specifically.
(For example, editors that can drop in handy templates,
or validate a description.)

> * YAML (http://yaml.org/spec/1.2/spec.html) is standard and popular. A
> significant chunk of developer community is already familiar with it. It
> is being used by stack and by hpack as an alternative to cabal format.
> The complaint against it is that the specification/implementation is
> overly complex.

It's not clear what "standard" means in this context.
yaml.org *calls* it "standard", but as the joke puts it,
"CALLING a tail a leg doesn't MAKE it a leg."
XML is a standard: it's managed by a well-known body.
JSON is both an ECMA standard and an Internet RFC.

There are other complaints:
- that there is no *other* reason for most Haskell programmers
  to be aware of YAML,
- that stack and hpack do not use "YAML" but an underspecified
  subset of YAML,
- that due to YAML's complexity different implementations tend to
  implement different subsets, meaning less interoperability than
  you'd expect, and
- that the Ruby documentation for its YAML module
  http://ruby-doc.org/stdlib-1.9.3/libdoc/yaml/rdoc/YAML.html
  says "Do not use YAML to load untrusted data. Doing so is
  unsafe and could allow malicious input to execute arbitrary
  code inside your application." I must admit I'm surprised.
- ...

Could I respectfully suggest that the first step in a project
like this is to describe the *semantics* of your package management
information in a language-neutral way? I know a great language for
describing abstract data types and giving them semantics. It's
named for some logician, I think his surname was Curry. (:-)

Seriously, there seems to be an endemic problem with programmers
racing to syntax without thinking over-much about semantics. It
happened with XML. It happened again with RDF. Eventually the
semantics gets patched up, after pointless pain and suffering.

Having nutted out exactly what the issues are with the semantics,
then you can experiment with syntax.

Richard A. O'Keefe

Sep 18, 2016, 8:12:13 PM9/18/16
to haskel...@haskell.org

On 16/09/16 6:37 PM, Tobias Dammers wrote:
> Another factor in favor of YAML is that it is a superset of JSON,

Here is a simple string in JSON:

"Where's the Golden Fleece?"

Here is the same string in YAML:

--- Where's the Golden Fleece?
...

Superset? I understand "language X is a superset of language Y"
to mean that if I have a document in language Y it can be correctly
processed by a language X processor.

If you mean that any data value that can be represented in JSON
can be represented (differently!) in YAML, fine, but that's not
the same thing. There are many textual formats that generalise
JSON. Heck, even GNUSTEP Property List format does *that*.
(And no, I do not recommend adopting that for anything.)

For that matter, any JSON document can be transcoded with no
loss of structural information into XML and vice versa. That
doesn't mean that JSON is a superset of XML!

Familiarity with JSON semantics and syntax did not help me AT ALL
when faced with YAML.

Here's another meta-format worthy of consideration.
A *package* is a collection of resources with relationships
between them and relationships linking them to other things
like authors (think Dublin Core).
Is there a standard (genuinely standard) notation specifically
for describing resources and their relationships, with quite a
few tools for not just reading it and writing it but actually
reasoning with it?

Why yes. It's called RDF.
http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/
The design of RDF is intended to meet the following goals:

* having a simple data model
* having formal semantics and provable inference
* using an extensible URI-based vocabulary
* using an XML-based syntax
* supporting use of XML schema datatypes
* allowing anyone to make statements about any resource

There is a human-friendly syntax interconvertible with the XML
one, Turtle.
http://www.w3.org/TR/turtle/

Now RDF (whether XML or Turtle) is *not* designed for presenting
single data values. But that's not really what a package format
wants to do anyway.

Am I seriously recommending RDF (or possibly OWL-DL) as a good
way to describe packages? I am certainly serious that it should
be CONSIDERED. And I'm particularly serious about that for two
reasons.

(1) JSON, XML, TOML, and YAML are all about serialising *data values*.
That's all they do. Anything beyond that is up to you.
RDF and OWL are all about describing *relationships* between
*resources*. It's worth considering carefully what you want to
say in a package file format. If you want to describe
*relationships*, then something that deals with data values may
not be the right *kind* of "language".

Simply jarring people loose from the idea that a "single possibly
structured data value" language is the ONLY kind of language is
of value in itself.

(2) JSON, XML, TOML, and YAML are all about serialising *data values*.
*Single* possibly structured data values.
That's all they do. There is no sense in which there is any
standard way to *combine* data in these forms.
In contrast, RDF was *invented* to have a way of patching together
multiple sets of facts from multiple sources. Given a collection
of package descriptions in YAML, all you have is a bunch of text
files; what you do with them is *entirely* up to you. Given a
bunch of RDF/XML or RDF/Turtle files, there is a *standard* way
to write a query (SPARQL) which integrates them. It becomes
possible to write consistency-checking queries that can be processed
by multiple tools. It becomes possible to ask "if I need these,
what else do I need?" in a standard way.
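
To illustrate the kind of question meant here, the following is a small Haskell sketch over subject/predicate/object triples. It is plain Haskell, not RDF or SPARQL; the predicate name `dependsOn`, the package names, and the functions `objects`, `closure` and `whatElse` are all invented for the example.

```haskell
-- A fact in the style of an RDF statement: (subject, predicate, object).
type Triple = (String, String, String)

-- Facts from two independent sources (hypothetical package names).
sourceA :: [Triple]
sourceA =
  [ ("app", "dependsOn", "lib-json")
  , ("app", "dependsOn", "lib-text")
  ]

sourceB :: [Triple]
sourceB =
  [ ("lib-json", "dependsOn", "lib-text")
  , ("lib-json", "dependsOn", "lib-num")
  , ("lib-text", "dependsOn", "lib-base")
  , ("lib-num",  "dependsOn", "lib-base")
  ]

-- Combining sources is just taking the union of their statements.
allFacts :: [Triple]
allFacts = sourceA ++ sourceB

-- All objects related to a subject by a given predicate.
objects :: String -> String -> [Triple] -> [String]
objects s p ts = [o | (s', p', o) <- ts, s' == s, p' == p]

-- Everything reachable via "dependsOn", including the starting points.
-- The `seen` check keeps this terminating even on cyclic facts.
closure :: [Triple] -> [String] -> [String]
closure ts = go []
  where
    go seen [] = reverse seen
    go seen (x:xs)
      | x `elem` seen = go seen xs
      | otherwise     = go (x : seen) (xs ++ objects x "dependsOn" ts)

-- "If I need these, what else do I need?"
whatElse :: [Triple] -> [String] -> [String]
whatElse ts needed = filter (`notElem` needed) (closure ts needed)
```

Here `whatElse allFacts ["app"]` lists the four library names. The point of the sketch is only that, once facts are statements about resources, merging sources and asking such questions need no format-specific glue.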

Again, the idea here is to get people thinking that having a
documented semantics that can be processed by existing description
logic tools has value, so that something at a higher semantic level
than YAML or XML might be worth thinking about.

Richard A. O'Keefe

Sep 18, 2016, 8:27:26 PM9/18/16
to haskel...@haskell.org

On 17/09/16 4:47 PM, Bardur Arantsson wrote:
>>
> I was actually curious about this, and it's interesting to note that
> even JSON which was supposed to have *ONE STANDARD* now apparently has
> two, an ECMA one and an IETF RFC (seems to be more recent).

It's a long sad story. The ECMA standard exists for largely political
reasons. The RFC is the "active" one.

JSON is a textbook example of "syntax-first".

MarLinn via Haskell-Cafe

Sep 18, 2016, 11:29:25 PM9/18/16
to haskel...@haskell.org
> Here's another meta-format worthy of consideration.
> A *package* is a collection of resources with relationships
> between them and relationships linking them to other things
> like authors (think Dublin Core).

I really like this approach of thinking about packages. Apart from the obvious benefits of some concrete format like RDF (e.g. mixing with OpenDocument files), this way of thinking could open the door to some interesting ways to see existing problems. As an extension, those perspectives might lead to fruitful experiments. (After all Haskell was originally created as a language to be experimented with - so why not expand that to packaging?) Just two ideas off the top of my head:

  • One could proclaim that part of the orphan-instances-/forced-imports-ugliness stems from the fact that instances formulate a kind of relationship between declarations and are thus fundamentally different from these other declarations. So one might want to experiment with separating both by adapting the package format and see what benefits or drawbacks that brings.

  • Similarly one could proclaim that one of the things that makes the dependency purgatory difficult to navigate is that version numbers alone do not encode enough information - but a finer grained analysis might. E.g. imagine you could say "I have tested with package X in version 0.3.4. But feel free to substitute any other version as long as functions A and B haven't changed, because these are the only ones I use." There are obvious problems with such an approach, but I propose that we can only find a way forward by experimenting - including experimenting with such details of the relationship.
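
A minimal sketch of the second idea, under stated assumptions: each package version is reduced to a hypothetical "API fingerprint" that maps exported names to their type signatures as plain text, and a dependency records which of those names it actually uses. The names `Dependency` and `compatible`, and the version numbers other than 0.3.4, are invented for illustration and do not come from any existing tool.

```python
from dataclasses import dataclass

# Hypothetical model: an API fingerprint maps each exported name to its
# type signature, written as plain text.
Api = dict[str, str]


@dataclass
class Dependency:
    tested_api: Api             # fingerprint of the version the author tested with
    used_names: frozenset[str]  # the only exports the dependent package uses


def compatible(dep: Dependency, candidate: Api) -> bool:
    """True if every used name exists in the candidate with an unchanged signature."""
    return all(
        name in dep.tested_api
        and candidate.get(name) == dep.tested_api[name]
        for name in dep.used_names
    )


# "I have tested with package X in version 0.3.4 and only use A and B."
x_0_3_4 = {"A": "Int -> Int", "B": "String -> Bool", "C": "IO ()"}
x_0_4_0 = {"A": "Int -> Int", "B": "String -> Bool", "C": "Int -> IO ()"}  # only C changed
x_0_5_0 = {"A": "Integer -> Int", "B": "String -> Bool"}                   # A changed, C removed

dep = Dependency(tested_api=x_0_3_4, used_names=frozenset({"A", "B"}))

print(compatible(dep, x_0_4_0))  # True: A and B are untouched
print(compatible(dep, x_0_5_0))  # False: the signature of A differs
```

Comparing signatures is one of the obvious problems with this approach: a function can keep its type and still change its behaviour, so a check like this can only rule versions out, never prove them safe.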

Note that I'm not saying that experiments like these are impossible with a format like cabal or yaml. All I'm saying is that thinking of packages as resources in relationships makes it easier to think in these ways. An appropriate representation could be both a tool to shape our thoughts and a tool that, being specialized for the representation of relationships, makes it easier to incorporate experimental features without breaking the ecosystem. It will probably be unavoidable to "inject" some code into the solver for a specific package for such experiments, but I hope that understanding the details will make it possible to see how to do that in safe and portable ways.


MarLinn

Christopher Allen

Sep 18, 2016, 11:55:40 PM9/18/16
to Haskell Cafe
While y'all are going 'round about this, an argument parser in Rust
has its own blog, api docs, twitter account, github, and tutorial
videos.

https://clap.rs/


And it supports YAML in addition to plain old Rust code.

o...@cs.otago.ac.nz

Sep 19, 2016, 1:15:50 AM9/19/16
to Christopher Allen, Haskell Cafe
> While y'all are going 'round about this, an argument parser in Rust
> has its own blog, api docs, twitter account, github, and tutorial
> videos.
>
> https://clap.rs/

An *argument parser*?
Visits web page incredulously.
Great balls of fire, it's true.

I really don't want to use an argument parser that requires that
much documentation. Life is too short.

Joachim Durchholz

Sep 19, 2016, 3:21:58 AM9/19/16
to haskel...@haskell.org
Am 19.09.2016 um 02:12 schrieb Richard A. O'Keefe:
>
>
> On 16/09/16 6:37 PM, Tobias Dammers wrote:
>> Another factor in favor of YAML is that it is a superset of JSON,
>
> Here is a simple string in JSON:
>
> "Where's the Golden Fleece?"
>
> Here is the same string in YAML:
>
> --- Where's the Golden Fleece?
> ...
>
> Superset?

Yes. The original string is also valid in YAML if used in the position
where JSON allows a string.

> If you mean that any data value that can be represented in JSON
> can be represented (differently!) in YAML, fine, but that's not
> the same thing.

Sure, but any valid JSON is also valid YAML.
Modulo some exotic exceptions for valid-but-useless and
valid-but-probably-not-what-the-sender-intended JSON.

> Familiarity with JSON semantics and syntax did not help me AT ALL
> when faced with YAML.

Sure, YAML is a massive superset.
The advantage is more in interoperability - you can hook a YAML parser
to JSON-outputting processes and expect that it will "just work", so you
don't have to worry about syntax, so you don't need separate frontends
for YAML and JSON for your webservice.

> Am I seriously recommending RDF (or possibly OWL-DL) as a good
> way to describe packages? I am certainly serious that it should
> be CONSIDERED.

+1

> (1) JSON, XML, TOML, and YAML are all about serialising *data values*.
> That's all they do. Anything beyond that is up to you.
> RDF and OWL are all about describing *relationships* between
> *resources*. It's worth considering carefully what you want to
> say in a package file format. If you want to describe
> *relationships*, then something that deals with data values may
> not be the right *kind* of "language".
>
> Simply jarring people loose from the idea that a "single possibly
> structured data value" language is the ONLY kind of language is
> of value in itself.

It does have its advantages.
That's why everybody is using XML these days, after all. Even though XML
does have some pretty horrible properties (too much noise being the most
prominent).

> (2) JSON, XML, TOML, and YAML are all about serialising *data values*.
> *Single* possibly structured data values.
> That's all they do. There is no sense in which there is any
> standard way to *combine* data in these forms.

Yes, that's supposed to live at the semantic level, i.e. in the types.
For JSON and TOML that's a serious restriction.
In XML and YAML, you can keep type information (better standardization
for that in YAML than in XML), so you can stick user-defined semantics
into the serialization format if you want to.

I.e. you can achieve RDF in XML or YAML by writing types that handle
combinability or anything else that you want, these things aren't tied
into the language.

It is still possible that RDF is more convenient :-)

Evan Laforge

Sep 20, 2016, 2:25:45 PM9/20/16
to Harendra Kumar, Patrick Pelletier, Duncan Coutts, Haskell-community, haskell-cafe
I haven't totally followed this whole thread, so apologies if this
isn't entirely relevant, but I use shake for building, and cabal for
dependencies. The shakefile has the list of packages and required
versions, and generates the .cabal file, which is used with
--only-dependencies to get dependencies.

I think it works well. I can't do builds in cabal anyway since it
can't handle anything complicated, but even if I had a simple build
I'd prefer shake since it's so much nicer. Since it's in Haskell,
it's flexible but can't be analyzed, though I can't think of why you'd
want to analyze it.

Meanwhile, cabal is just fine at expressing packages and versions, and
is basically just a way to tell cabal-install what to download and
install. Since I generate it, I don't care much about the format, but
the existing one seems perfectly adequate.
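
The generate-then-install approach described above can be sketched roughly as follows. This is not the actual shakefile, which is written in Haskell and not shown in this thread; it is a Python illustration that assumes the build script holds a list of (package, version constraint) pairs and writes a .cabal file whose only job is to list them. The function name `render_cabal`, the package name `example`, and the constraints are all hypothetical.

```python
def render_cabal(name: str, version: str, deps: list[tuple[str, str]]) -> str:
    """Render a minimal .cabal file that only declares dependencies."""
    # An empty constraint means "any version"; strip() drops the trailing space.
    dep_lines = [f"{pkg} {constraint}".strip() for pkg, constraint in deps]
    build_depends = "\n    , ".join(dep_lines)
    return (
        "cabal-version: >=1.10\n"
        f"name: {name}\n"
        f"version: {version}\n"
        "build-type: Simple\n"
        "\n"
        "library\n"
        "  build-depends:\n"
        f"      {build_depends}\n"
        "  default-language: Haskell2010\n"
    )


# Hypothetical package list kept in the build script.
packages = [("base", ">=4.9 && <5"), ("containers", "")]
cabal_text = render_cabal("example", "0.1", packages)
print(cabal_text)

# The text would be written to example.cabal, after which
# `cabal install --only-dependencies` fetches and installs the listed packages.
```

The real build never goes through cabal in this setup; the generated file exists only so that cabal-install knows what to download.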
