Another factor in favor of YAML is that it is a superset of JSON, which eases the learning curve even more (JSON being a de facto lingua franca for cross-platform untyped data structures), and offers some extra possibilities, although I admit I can't think of any practical uses for them. The fact that both YAML and JSON can be represented as Aeson Values would also make things (arguably) easier for tool writers.
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
TOML is limited in its data types: numbers, dates, strings for
primitives, arrays and string-to-object maps.
I'd consider that too limited to ever become a universal configuration
format.
Why not adopt (a subset of) .hs AST file format to structure both project and package files?
I think package descriptions should be limited, but not syntactically. Using some specific monad might work OK.
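The "specific monad" idea can be made concrete with a small sketch. Everything below is hypothetical (names, fields, representation): a write-only description monad that can only append key/value fields, so a package file stays declarative even though it is Haskell:

```haskell
-- A minimal sketch of a restricted "description monad": the only
-- primitive is `field`, which appends a key/value pair, so package
-- files written in it cannot perform arbitrary computation.
newtype Desc a = Desc (a, [(String, String)])

instance Functor Desc where
  fmap f (Desc (x, w)) = Desc (f x, w)
instance Applicative Desc where
  pure x = Desc (x, [])
  Desc (f, w1) <*> Desc (x, w2) = Desc (f x, w1 ++ w2)
instance Monad Desc where
  Desc (x, w1) >>= k = let Desc (y, w2) = k x in Desc (y, w1 ++ w2)

field :: String -> String -> Desc ()
field k v = Desc ((), [(k, v)])

-- A package description is then just a value in this monad:
package :: Desc ()
package = do
  field "name"    "example"
  field "version" "0.1.0.0"

main :: IO ()
main = let Desc (_, fs) = package
       in mapM_ (\(k, v) -> putStrLn (k ++ ": " ++ v)) fs
```

The point of the sketch is that the monad's interface, not the syntax, is what limits expressiveness.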
A little-hyped aspect of Gradle is that it has two strictly divided
phases: Phase 1 builds the dependency model, phase 2 executes it.
Once phase 1 finishes, the dependency model becomes read-only, phase 2
is not allowed to modify it.
On the plus side, this makes it easy for tools to reason about the
model: it's static and easy to reproduce (just run phase 1 on the config
file, or even better, ask the Gradle daemon that's caching the model).
On the minus side, it's hard to make out which code in the config is
phase-1 and which is phase-2: Same syntax, no static types to guide the
intuition; essentially, you have to know which parameters of what
phase-1 library functions are closures to be executed in phase 2.
Haskell might be able to do better in this area, though I'm in no
position to make any proposals for that.
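For what it's worth, here is one hedged sketch of how Haskell's types might express that phase split; all names are made up, and it elides everything a real build model needs:

```haskell
-- Phase 1 is a pure value: it cannot perform effects, and the
-- resulting Plan cannot be modified afterwards. Only phase 2
-- lives in IO, so "which code runs in which phase" is visible
-- in the types rather than a matter of convention.
data Task = Task { taskName :: String, command :: String }

newtype Plan = Plan [Task]      -- read-only once constructed

configure :: Plan               -- phase 1: pure, hence reproducible
configure = Plan [Task "build" "ghc --make Main.hs"]

execute :: Plan -> IO ()        -- phase 2: the only place effects happen
execute (Plan ts) = mapM_ (putStrLn . command) ts

main :: IO ()
main = execute configure
```

Tools could inspect `configure` without ever running `execute`, which is exactly the property the Gradle model provides.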
_______________________________________________
Haskell-community mailing list
Haskell-...@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-community
It does matter for people who already know JSON: They can skip over the
config file syntax and dive right into the semantics.
Given that a substantial fraction of programmers knows JSON, using that
syntax would create a lower entry barrier.
The same argument can be made for YAML.
This argument cannot be made for TOML at this time, maybe never if
TOML's limitations prevent widespread adoption.
> A psychological impression of complexity? Just not
> anything I've seen evidence of. Indeed, aside from the rather painful
> many-years-long migration, the *cost* (though certainly not a prohibitive
> one) of moving to something like YAML or TOML is that they have a bit
> louder syntax, that demands more attention and feels more complex.
YAML's complexity is partly because it tries to cover everything, partly
because it is pushing hard to be both human-readable and machine-readable.
It's pretty good at this actually, though I guess 20/20 hindsight could
lead to improvements - but not enough to make a new YAML version worth
the effort.
> There is one substantial disadvantage I'd point out to the Cabal file
> format as it stands, and that's that it's pretty non-obvious how to parse
> it, so we will always struggle to interact with it from automated tools,
> unless those tools are also written in Haskell and can use the Cabal
> library. That's a real concern; pragmatic large-scale build environments
> are not tied to specific languages, and include a variety of ad-hoc
> third-party tooling that needs to be integrated, and Cabal remains opaque
> to them. But that doesn't seem to be what's motivating this conversation.
That's implicit in the "it would be nice to have a standard format"
argument, even if it hasn't been explicitly voiced yet.
I guess the overriding question I have here is: what is the PROBLEM being solved?
cabalstack yaml
...
> But if all packages had to use the new EDSL, then cross-compilation would essentially become impossible.
"All packages migrate to new format" doesn't seem really a plausible
option, as I already hinted in the text you quote.
There are multiple JVM build tools because they're interoperable (like
cabal-install and Stack): each library picks its own build tool, but
they can still be linked together.
Hpack generates cabal files, stack reuses cabal or hpack files.
In principle, option 2 just needs a non-cross-compiled program to
produce a package description—say by producing a cabal file. You just
need to runghc it, either via ghci or by compiling and running a
binary. Option 3 can be trickier depending on details, but as long
as you account for cross-compilation in the design it should be
doable. For Template Haskell the problem is deeper (see
http://blog.ezyang.com/2016/07/what-template-haskell-gets-wrong-and-racket-gets-right/),
so let's *not* use it here.
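As a sketch of what option 2 could look like under these assumptions (the Pkg record and its fields are invented for illustration, not Cabal's actual GenericPackageDescription):

```haskell
import Data.List (intercalate)

-- Hypothetical "option 2" package.hs: build a plain record and
-- print it in .cabal syntax. Running it on the build host (e.g.
-- via runghc) produces the package description, so nothing needs
-- to be cross-compiled for the target.
data Pkg = Pkg
  { pkgName      :: String
  , pkgVersion   :: String
  , buildDepends :: [String]
  }

render :: Pkg -> String
render p = unlines
  [ "name:          " ++ pkgName p
  , "version:       " ++ pkgVersion p
  , "build-depends: " ++ intercalate ", " (buildDepends p)
  ]

main :: IO ()
main = putStr (render (Pkg "example" "0.1.0.0" ["base"]))
```

The EDSL would then be whatever library functions help construct `Pkg` values; the .cabal file remains the interchange format.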
--
Paolo G. Giarrusso - Ph.D. Student, Tübingen University
http://ps.informatik.uni-tuebingen.de/team/giarrusso/
_______________________________________________
I agree "full-fledged build system" is not a possible immediate goal.
But an EDSL for expressing cabal projects (as they are today) would
still be in scope of your proposal—and I thought you liked the idea
(see quote below). Using the earlier options: option 3 is not in scope
of this thread, but option 2 is, with the only danger that the design
space is so big as to present a challenge.
Quoting from Harendra Kumar's earlier mail:
>> Why not adopt (a subset of) .hs AST file format to structure both project and package files?
> Aha, that's my preferred choice. If there is a way to restrict features and we can allow just a subset we can have a nice configuration language which is a real language. In fact, I have been toying around this. If we have to express not just a package specification but a sophisticated build configuration, we need a real language. Expressing conditionals, reuse etc becomes a compromise in a purely declarative language.
> For example make has so many built-in functions in it that it has become a full fledged language by itself. The google bazel build uses python as the build config language. Haskell will make a much better choice for such use cases. Pure declarative is a pain for such use cases.
> On 16 September 2016 at 16:08, Paolo Giarrusso <p.gia...@gmail.com>
> wrote:
>>
>> On 16 September 2016 at 12:13, Patrick Pelletier
>> <ppel...@funwithsoftware.org> wrote:
>> > On 9/16/16 2:36 AM, Paolo Giarrusso wrote:
>> >> We're talking about *three* options:
>> >> 1. syntax for pure Haskell values, which I'll call HSON (Haskell
>> >> jSON). That's just an alternative to YAML/TOML/... That would need
>> >> extensions to allow omitting optional fields entirely.
>> >> 2. a pure Haskell embedded domain-specific language (EDSL) that simply
>> >> generates cabal description records (GenericPackageDescription
>> >> values). That would allow abstraction over some patterns but not much
>> >> more. But that alone is already an argument for EDSLs—the one Harendra
>> >> already presented.
>> >> 3. a Haskell embedded domain-specific language (EDSL) designed for an
>> >> extensible build tool, like Clojure's (apparently), SBT for Scala or
>> >> many others. That would potentially be a rabbit hole leading to a
>> >> rather *different* tool—with a different package format to boot. That
>> >> can't work as long as all libraries have to be built using the same
>> >> tool. But stack and cabal are really about how to manage package
>> >> databases/GHC/external environments, while extensible build tools are
>> about (a more powerful form of) writing custom setup scripts. I
>> >> suspect some extensions might be easier if more of the actual building
>> >> was done by the setup script, but I'm not sure.
--
Paolo G. Giarrusso - Ph.D. Student, Tübingen University
http://ps.informatik.uni-tuebingen.de/team/giarrusso/
Quoting from Harendra Kumar's earlier mail:
> If we have to express not just a package specification but a sophisticated build configuration, we need a real language. Expressing conditionals, reuse etc becomes a compromise in a purely declarative language.
I think we can all agree that using a fully-fledged language for
configuration is an extremely bad idea from many perspectives.
The worst of all, IMO, is that it makes reasoning about the
configuration equivalent to the halting problem.
And god, does it hurt in practice! -- speaking as someone who has spent
a non-trivial amount of time doing exactly this in another age
and for another language.
However.
This does not mean that we cannot find a subset of the language that
would be a point of balance between the needs of expressivity,
learnability and decidability.
After all JSON was born in roughly this spirit, wasn't it?
The wins are obvious to me:
- the syntax is immediately obvious to the target audience
- minimum effort to get existing Haskell tools to work with the "new"
format at the source level -- syntax highlighting, checking, etc.
The only required additions would be restriction enforcement
- no third-party libraries need to be used as dependencies for our
core tooling
> If you can't start or modify a package without already knowing
> haskell, it is a huge barrier to entry.
I'm unconvinced that this problem cannot be resolved within the subsetting approach.
> I remember trying to get
> started in scala and having a lot of trouble with sbt because I didn't
> know their operators for lists and arrays or hash tables or whatever
> it is that they use in their files.
That is because they committed the sin of employing the whole of
Scala for the thing. Bad for them.
But also.. let's not commit the mistake of conflating the surface syntax
and the semantics.
The semantics are dictated by need -- whose sharpening effect on the
learning curve is unavoidable. I'm willing to argue that a large part
of your confusion came from the /semantics/ of sbt, not the syntax.
The syntax differences, OTOH, can and ought to be trivialized.
--
с уважениeм / respectfully,
Косырев Сергей
Let me guess (have no idea about sbt) -- unbridled Turing completeness?
Declarativity is king for configuration, and Turing completeness ain't it --
please, see my other mail about subsetting Haskell.
--
с уважениeм / respectfully,
Косырев Сергей
As a "beginner"(*), I fully agree.
However having more than one language in the mix can be confusing and complicating...
> It's fairly commonplace for beginners to be confused by the *semantics*: which fields are needed and what they mean, how package version bounds work, what flags are and how they interact with dependencies, the relationship between libraries and executables defined in the same file, etc.
It's all about the semantics - it should preferably be formalised, and ideally the relevant library/package system should be able to check/enforce rules.
> But the syntax? It's just not an issue. I'm not sure what it means to say that people have to "learn" it, because in introducing dozens of people to building things in Haskell, I've never seen that learning process even be noticeable, much less an impediment.
I quite agree
>
Andrew Butterfield
School of Computer Science & Statistics
Trinity College
Dublin 2, Ireland
(*) I've only started to use cabal recently, because a TA of mine built a cabal-based coursework grading system for me - I generally do application devpt in Haskell
and the only build command I need is ghc --make.... Currently moving quickly onto stack this year....
That's a solved problem: Generate an execution plan, which would need to
be fully evaluated in Haskell; then execute it and don't feed anything
back into it.
It's easy to reason about the plan in that scenario.
This is what Gradle does.
> And god, does it hurt in practice! -- speaking as someone who had spent
> a non-trivial amount of time on doing exactly this stuff in another age
> and for another language.
Which language?
> This does not mean that we cannot find a subset of the language that
> would be a point of balance between the needs of expressivity,
> learnability and decidability.
Subsetting makes it hard to know what works and what doesn't.
A Haskell subset would have to be strict - which raises the question of
what the point is in calling it a subset of Haskell (and even if there
is a point, it will draw ridicule along the lines of "Haskell is
unsuitable for describing its own configurations").
> After all JSON was born in roughly this spirit, wasn't it?
JSON was/is a serialization format, first and foremost.
>> If you can't start or modify a package without already knowing
>> haskell, it is a huge barrier to entry.
>
> I'm unconvinced that this problem cannot be resolved within the subsetting approach.
Actually, subsetting makes this worse: things freshly learned for
Haskell won't work in the config language, and restrictions encountered
in the config language will be unthinkingly transferred to Haskell.
Having two subtly but fundamentally different languages is about the
worst thing you can expose a learner to.
That's a big issue with Gradle.
The third problem I know Gradle for is that it makes it surprisingly
difficult to inspect the execution plan.
Haskell is indeed unsuitable for describing the package configuration,
IMO, but not because it's lazy. It's because it lacks any syntax for
long and human-readable string literals (package description, anyone?).
That also condemns every subset of Haskell.
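To illustrate the complaint: at the time of this thread, standard Haskell has no multi-line string literal syntax, so a long package description has to be assembled with unlines, string gaps, or a third-party quasiquoter. A sketch of the unlines workaround:

```haskell
-- Every line of prose needs its own quoted list element, which is
-- exactly the awkwardness being complained about for package
-- descriptions.
description :: String
description = unlines
  [ "An example package."
  , ""
  , "A human-readable description several paragraphs long becomes"
  , "a list of quoted strings rather than plain text."
  ]

main :: IO ()
main = putStr description
```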
>> After all JSON was born in roughly this spirit, wasn't it?
Yes, and JSON (and JavaScript) would suck for the very same reason.
This deficiency of JSON was a major incentive for creating YAML.
I'm mildly in favour of supporting another package format in addition
to .cabal, as long as compatibility is kept, and as long as the new
format is actually superior. I think any subset of Haskell would be a
setback from usability perspective.
One major benefit of YAML that I haven't seen mentioned is that it
could be used to replace the README.md file at the same time. Right now
a package description consists of both .cabal and (optionally) Markdown.
I suspect the latter language is actually harder for complete beginners.
I suppose they could, but that would rather defeat the purpose of using
a Haskell subset in the first place. Haskell ignores comments, package
descriptions should not be ignored.
You can annotate modules with the ANN pragma by using the module
keyword. For example:
{-# ANN module (Just "A `Maybe String' annotation") #-}
I suppose this could do, but there are some downsides:
- somewhat cumbersome syntax,
- reliance on a GHC extension, and worst of all,
- not a Haskell value.
The last point implies that the package.hs with this kind of module
annotation could not produce a proper GenericPackageDescription when
executed as a Haskell program.
> if the topic is _Standard package file format_, why not agree on e.g.
> adopting *GenericPackageDescription* or another similar haskell type
> (rather than a text-based file) as the standard?
>
> then any format (cabal, yaml, json, ...) may be used as long as a
> library exists and is maintained for each such format, which parses /
> produces the format from / to the standard type?
This makes perfect sense to me. The devil may be in the details. Would
cabal-install need to link in all these maintained libraries statically?
Or would there be some plug-in mechanism to load them on demand?
Sbt is a *build* description, *NOT* a package description format. Sbt
uses ivy.xml files for the latter. (With interop for consuming Maven
pom.xml files such that it can leverage the already-huge Maven
repositories.)
This may be somewhat heretical, but I don't actually think we need to
have a human-editable format. (Of course it should probably be
*reasonably* human-readable/editable just for debugging and such.)
Just provide simple commands to view/manipulate whatever package
settings there are. Helpfully, said commands could also sanity-check
whatever you're trying to do and perhaps provide better error messages
than a tool which only has the "final" package description to work with.
For beginners a simple GUI could be provided and IDEs could do their own
thing.
Problem solved.
This may be somewhat heretical, but I don't actually think we need to
have a human-editable format. [...]
[...] For beginners a simple GUI could be provided and IDEs could do their own
thing.
Problem solved.
- Adopt common standard for different package tools.
- Give users and packager devs a choice of config file formats / representations.
- Explore ways to simplify manual package configuration.
On 2016-09-16 at 08:20:15 +0200, Harendra Kumar wrote:
[...]
> * YAML (http://yaml.org/spec/1.2/spec.html) is standard and popular. A
> significant chunk of developer community is already familiar with it. It is
> being used by stack and by hpack as an alternative to cabal format. The
> complaint against it is that the specification/implementation is overly
> complex.
I'm not sure if this has been pointed out already, but beyond turning a
proper grammar into a stringly-typed one, shoehorning some features of
.cabal files into YAML syntax really appears like a case of the "Genius
Tailor"[1]; e.g. consider the `hpack` example
when:
  - condition: flag(fast)
    then:
      ghc-options: -O2
    else:
      ghc-options: -O0
Besides looking quite awkward IMHO (just as an exercise, try inserting a
nested if/then/else in the example above), the prospect that a standard
format like YAML would allow reuse of standard tooling/libraries for
YAML seems quite weak to me; if, for instance, you run the above through
a YAML pretty-printer, you easily end up with something like
when:
  - else:
      ghc-options: -O0
    then:
      ghc-options: -O2
    condition: flag(fast)
or any other ordering depending on how the keys are sorted/hashed.
Besides, many YAML (& JSON) parsers silently drop duplicate keys, so if
by accident you place a 2nd `else:` branch somewhere, you end up with an
ambiguous .yaml file which may either result in an error, in the first
key getting dropped (most likely variant), or in the 2nd key getting
dropped. Which one you get depends on the YAML parser implementation.
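The silent loss is easy to reproduce with any Map-based decoding target; this standalone sketch (not hpack's actual code) uses Data.Map, whose fromList keeps only the last value for a duplicated key:

```haskell
import qualified Data.Map as M

-- Many YAML/JSON decoders funnel mappings through a Map-like type.
-- Data.Map.fromList retains the last value for a repeated key, which
-- is exactly how an accidental second `else:` branch can vanish
-- without any error being reported.
main :: IO ()
main = do
  let branches = [("else", "-O0"), ("then", "-O2"), ("else", "-O9")]
  print (M.fromList branches)   -- the first "else" is silently lost
```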
I really don't understand the appeal of applying the golden hammer of
YAML, if `.cabal`'s grammar is already self-evident and concise with its
syntax:
if flag(fast)
  ghc-options: -O2
else
  ghc-options: -O0
where this if/then/else construct is encoded in the grammar proper,
rather than being merely a semantic interpretation after decoding a
general grammar designed for simpler typed data representations, one
which isn't even accurate enough (since it has additional
symmetries/freedoms) to capture the desired grammar faithfully. This
makes YAML quite error-prone for this specific application.
[1]: The "Genius Tailor" was mentioned recently in a related discussion here:
https://mail.haskell.org/pipermail/haskell-cafe/2016-September/124868.html
-- hvr
Chris Kahn writes:
> I would like to second this thought. Using Haskell for package
> descriptions needs to be thought out and executed with great care and
> attention. It's really easy to go off the rails.
>
> Scala's build system lets you do very powerful things, but it also
> makes things unnecessarily complicated and mystifying for beginners.
> At my previous work where we used Scala extensively, there were many
> times where the team simply resorted to external tools because
> figuring out how to make some seemingly trivial change to an SBT
> module was too time consuming.
Let me guess (have no idea about sbt) -- unbridled Turing completeness?
Declarativity is king for configuration, and Turing completeness ain't it --
please, see my other mail about subsetting Haskell.
> Besides, many YAML (& JSON) parsers silently drop duplicate keys, so if
> by accident you place a 2nd `else:` branch somewhere, you end up with an
> ambiguous .yaml file which may either result in an error, in the first
> key getting dropped (most likely variant), or in the 2nd key getting
> dropped. Which one you get depends on the YAML parser implementation.
I was actually curious about this, and it's interesting to note that
even JSON, which was supposed to have *ONE STANDARD*, now apparently has
two: an ECMA one and an IETF RFC (which seems to be more recent).
So I'd say JSON technically _allows_ duplicate keys, but that you cannot
reasonably expect any sane behavior in practice if you do
that.
Source: http://stackoverflow.com/a/23195243
(Didn't check up on what the situation is in YAML. YAML is too awful to
contemplate regardless.)
Regards,
[...]
> I was actually curious about this, and it's interesting to note that
> even JSON which was supposed to have *ONE STANDARD* now apparently has
> two, an ECMA one and an IETF RFC (seems to be more recent).
Btw, that's partly because ECMA and IETF weren't able to agree who
"owns" JSON, for more details see
https://www.tbray.org/ongoing/When/201x/2014/03/05/RFC7159-JSON
-- hvr
It's not about standard tooling, it's about tools written by third
parties. Tools that you didn't have the time or interest to write
yourself, but which still help make your ecosystem more useful to others.
> if, for instance, you run the above through
> a YAML pretty-printer, you easily end up with something like
>
> when:
>   - else:
>       ghc-options: -O0
>     then:
>       ghc-options: -O2
>     condition: flag(fast)
>
> or any other ordering depending on how the keys are sorted/hashed.
Only if you use a bad pretty-printer that parses the YAML, then writes
it in prettified form.
Such a pretty-printer would also lose comments.
In other words: I'd be surprised to find a pretty-printer in actual use
that works that way.
> Besides, many YAML (& JSON) parsers silently drop duplicate keys,
That's indeed a common bug/misfeature due to historical accidents.
It's easy to fix though, and libraries have started to acquire options
to get that reported as an error.
> I really don't understand the appeal of applying the golden hammer of
> YAML, if `.cabal`'s grammar is already self-evident and concise with its
> syntax:
>
> if flag(fast)
>   ghc-options: -O2
> else
>   ghc-options: -O0
>
> where this if/then/else construct is encoded in the grammar proper
> rather than being merely a semantic interpretation after decoding a
> general grammar designed for simpler typed data-representations which
> isn't even accurate enough (since it has additional symmetries/freedoms)
> to capture the desired grammar faithfully, which make YAML quite
> error-prone for this specific application.
Yeah, it isn't nice.
Changing the grammar always produces that kind of awkwardness.
However, for a fair comparison, you need to actively look for things
that work better with the alternate grammar before you conclude it's worse.
Currently both config content (let's call it a model) and representation (view: a specific config file type) are bundled. If a common model is agreed on, package tool and IDE devs could pick any view (format) that best suits their or their users' needs. Such fragmentation would not break the workflow. If someone thinks of a convenient format and believes it worth their time to write a controller for it, why not?
That's the whole point. If we could agree on a standard serializable model,
I do like YAML, but I know far too little about the various use cases to
justify any preference; it's quite possible that it's not a good fit,
but I can't really decide that.
All I can do is provide knowledge about YAML, which in some cases was
really necessary, and point out one-sided arguments such as
Herbert's; doing a review of Cabal config use cases and seeing how well
they map to YAML is, sadly, beyond my capabilities.
Contributing the best I can and all that.
On 2016-09-17 at 08:41:37 +0200, Joachim Durchholz wrote:
> Am 17.09.2016 um 00:13 schrieb Herbert Valerio Riedel:
>> the prospect that a standard format like YAML would allow to reuse
>> standard tooling/libraries for YAML seems quite weak to me;
>
> It's not about standard tooling, it's about tools written by third
> parties. Tools that you didn't have the time or interest to write
> yourself, but which still help make your ecosystem more useful to
> others.
Sure, but we don't need to throw out the baby with the bathwater to
accomplish that!
Oleg is currently working on a new parser for the cabal.config,
cabal.project & ${pkg}.cabal grammar (NB: cabal already uses one
standard unified syntax for all its configuration/description files)
which lends itself better to providing an equivalent of ghc-exactprint
(i.e. perfect roundtripping, allowing for faithful refactoring
tooling). Third parties can then use this new parser as a library.
[..]
>> I really don't understand the appeal of applying the golden hammer of
>> YAML, if `.cabal`'s grammar is already self-evident and concise with its
>> syntax:
>>
>> if flag(fast)
>>   ghc-options: -O2
>> else
>>   ghc-options: -O0
>>
>> where this if/then/else construct is encoded in the grammar proper
>> rather than being merely a semantic interpretation after decoding a
>> general grammar designed for simpler typed data-representations which
>> isn't even accurate enough (since it has additional symmetries/freedoms)
>> to capture the desired grammar faithfully, which make YAML quite
>> error-prone for this specific application.
>
> Yeah it isn't nice.
> Changing the grammar always produces that kind of awkwardnesses.
> However, for a fair comparison, you need to actively look for things
> that work better with the alternate grammar before you conclude it's
> worse.
Well, that burden of proof lies with those who argue YAML to be
superior to .cabal syntax, doesn't it?
The if/then/else awkwardness is just one aspect I pointed out
explicitly. I hinted at other issues which result from first parsing
into an inappropriate data model just for the sake of using YAML, and
then having to re-parse that interim lossy data model for real into the
actual data model we're interested in (while hoping we didn't lose some
of the essential information).
But I see no need to invest time to spell those problems out until I see
a compelling argument that e.g. YAML syntax is really preferable (to
justify the costs incurred) to the status quo in the first place.
-- hvr
I didn't see anything in the PR about exporting that parser as a
library. Do you have a reference for that?
Regardless: It will only help third party code written in Haskell. Much
as I like most userland software to be written in Haskell it won't help
e.g. IntelliJ IDEA one whit.
Regards,
Unless Haskell runs on the JVM.
Do you know whether Frege (https://github.com/Frege) is a viable option
for that? At least at the surface, it qualifies, but I don't know
whether the details (performance, Java library interoperability,
stability, availability of Haskell language extensions) work out well
enough for that.
I think people have been wishing for that for a while... some people
even worked on it, but so far nothing's come of it AFAIK.
> Do you know whether Frege (https://github.com/Frege) is a viable option
> for that?
Not in the least last time I checked. It's missing far too many of the
extensions that almost everybody uses as a matter of course.
Maybe given a few more years, but I'm not holding my breath.
Regards,
Edward
Excerpts from Harendra Kumar's message of 2016-09-17 08:05:38 +0530:
Pity.
Any idea how hard it would be to make it compile ghc?
If you're talking about more IDEs supporting Haskell, then having a
more standard package format really won't help that much.
To get good and stable support, there need to be tools that IDEs can
call. When building a Haskell project, IDEs won't read the cabal file
and call ghc themselves; they just call cabal.
The same is the case for e.g. auto-completion or any other IDE operation
that needs to consider the whole project, the configuration and all of
its dependencies.
Reimplementing cabal's logic in every IDE doesn't make much sense; in
the end it won't work that well and it will easily break.
On 17.09.2016 at 01:53, Harendra Kumar wrote:
> I agree. Supporting conditionals with YAML looks hacky!
All I have seen is a direct translation and the conclusion that it
doesn't work. I haven't seen any attempts at making it look good.
Also, while aesthetics isn't irrelevant, it's a pretty weak argument.
> Would cabal-install need to link in all these maintained libraries statically? Or would there be some plug-in mechanism to load them on demand?
Well, the libraries would need to be official and come with the packager. The formats would be perfectly interchangeable, i.e.
cabal -> standard_type -> yaml -> standard_type -> json -> standard_type -> cabal
would produce the same cabal file. There would be only one config file per package, to avoid confusion; however, if the user prefers working with format F, they can always convert the format which came with the package to F.
A .cabal file is a representation rather than a model. It is parsed into a model.
Being a distinct file type with its own AST,
it needs quite a bit of attention.
It needs to be parsed, updated, validated, formatted.
Another config format emerged.
More problems (distinct file type etc).
More formats may follow.
Is there any hope of agreeing on a common format (as per the thread title) if common content cannot be agreed on? Isn't config first of all about content? Is the common format going to contain incompatible / conflicting data items?
With common content, the display format will not matter at all; neither will the package tool nor the IDE used to work on a project. Config being a Haskell type, it would be well formed. The options would be well known. Users and IDE devs will not need to worry about indenting, commas, line breaks and other goodies.
I'm not sure what config format is meant here. If it's stack.yaml, it *must* be somehow different (even if we ignore the surface syntax), because it describes a project, not a single package.
> More problems (distinct file type etc).
What are the actual problems here?
> More formats may follow.
If they are for different purposes, that's OK and is to be expected.
.cabal files describe "what a package looks like" and a stack.yaml describes "how to build a project in a reproducible way", which are different (although related) things. What should "common" mean here?
Somehow you will always need a concrete representation of abstract notions (call them "models", "ASTs", etc.), otherwise you won't be able to process them. So you will always need to care about some kind of syntax etc., I can't see how using a "Haskell type" will help here. And you will need some semantics for the representation. Even if we used e.g. JSON (or whatever is en vogue at the moment), IDEs will not magically start understanding and supporting Haskell projects.
Again: What is the actual problem we're trying to solve? I still haven't seen a concrete use case which is hard/impossible with the current state of affairs. Personally, I would e.g. like to see some abstraction facilities to avoid repetition in .cabal files with lots of executables, but I don't care about the concrete syntax (and Cabal's internal model/AST wouldn't be affected, either).
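For what it's worth, the kind of abstraction facility mentioned (avoiding repetition across many executables) can be approximated by generating the stanzas. A minimal, hypothetical Haskell sketch, where the stanza template and the package names are made up:

```haskell
-- Sketch: generating the repetitive executable stanzas of a .cabal
-- file from a single template. The helper and the layout are invented
-- for illustration; this is not Cabal's actual API.
module Main where

-- One executable stanza, parameterised by the executable name.
executableStanza :: String -> String
executableStanza name = unlines
  [ "executable " ++ name
  , "  main-is:        " ++ name ++ ".hs"
  , "  hs-source-dirs: app"
  , "  build-depends:  base"
  ]

main :: IO ()
main =
  -- Emit one stanza per executable instead of writing each by hand.
  putStr (concatMap executableStanza ["tool-a", "tool-b", "tool-c"])
```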
This is just as correct as saying that Haskell is about functions - i.e.
superficially correct but mostly beside the point.
For JSON, it's string-to-whatever maps, arrays, and primitive types.
For YAML, it's string-to-whatever maps, arrays, primitive types,
references (so you can have shared and circular data structures), and
arbitrary types (it will use constructors to deserialize).
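To make that data-model comparison concrete, here is a toy Haskell model of the YAML graph, including anchors/aliases for the shared structures mentioned above. This is an illustrative type only, not any real YAML library's representation, and the resolver assumes the reference graph is acyclic:

```haskell
-- Toy model of the YAML data graph: maps, arrays, primitives, plus
-- anchors (&name) and aliases (*name) for shared nodes. Invented type.
module Main where

data Node
  = Scalar String             -- primitive values
  | Sequence [Node]           -- arrays
  | Mapping [(String, Node)]  -- string-to-whatever maps
  | Anchor String Node        -- named node (&name in YAML)
  | Alias String              -- reference to a named node (*name)
  deriving (Eq, Show)

-- Collect every anchored node in a document.
collect :: Node -> [(String, Node)]
collect (Anchor n v)  = (n, v) : collect v
collect (Sequence xs) = concatMap collect xs
collect (Mapping kvs) = concatMap (collect . snd) kvs
collect _             = []

-- Replace aliases by the node they name, the way a YAML loader
-- shares the referenced value. Assumes no circular aliases.
resolve :: [(String, Node)] -> Node -> Node
resolve env (Anchor _ v)  = resolve env v
resolve env (Alias n)     = maybe (Scalar "<undefined>") (resolve env) (lookup n env)
resolve env (Sequence xs) = Sequence (map (resolve env) xs)
resolve env (Mapping kvs) = Mapping [(k, resolve env v) | (k, v) <- kvs]
resolve _   s             = s

main :: IO ()
main = do
  -- Roughly:  defaults: &d {ghc-options: -Wall}
  --           lib: *d
  --           exe: *d
  let doc = Mapping
        [ ("defaults", Anchor "d" (Mapping [("ghc-options", Scalar "-Wall")]))
        , ("lib",      Alias "d")
        , ("exe",      Alias "d")
        ]
  print (resolve (collect doc) doc)
```

JSON's model is what remains after deleting `Anchor` and `Alias` (and restricting scalars), which is the sense in which YAML's model strictly extends it.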
> This isn't all .cabal files contain (e.g. see
> hvr's points about conditionals), but if it were true, is it really worth
> changing how Cabal works for a different color bikeshed?
It's bikeshedding if and only if interoperability is irrelevant.
However, in today's world, rejecting interoperability is insanity.
So: no bikeshedding, there are real issues.
It's still quite possible that it's simply not worth it; the cons
associated with changing the buildfile format are pretty weighty after
all, and if the Cabal people say they can fix the known problems with
that format, it's probably a better idea to see what comes of that
before pursuing alternate formats.
However, implementing cabal in addition to that is more work.
> Distinct from what?
From .hs.
> [attention] From whom?
IDE devs.
>> It needs to be parsed, updated, validated, formatted.
> This will be the case for whatever is being used, so again: What's the point? It doesn't matter if it's in its own .cabal syntax, in some Haskell-like syntax, JSON, YAML, or even some graphical representation.
If a serialized model is used, then parsing, update, validation and formatting are no longer necessary.
> I'm not sure what config format is meant here. If it's stack.yaml, it *must* be somehow different (even if we ignore the surface syntax), because it describes a project, not a single package.
What standard package format are we trying to agree on, then?
>> More problems (distinct file type etc).
> What are the actual problems here?
Implementing each new file type in an IDE is a lot of work. That is, if the IDE is trying to do anything with the contents of that file, such as supporting syncing a renamed file to the config.
>> More formats may follow.
> If they are for different purposes, that's OK and is to be expected.
Each new format would need to be implemented. Time spent on implementing new formats is time not spent on implementing other features. It may take nearly as long as implementing .hs support itself. Is this even being thought about?
If this can be avoided, why not at least consider it as an option?
> .cabal files describe "what a package looks like" and a stack.yaml describes "how to build a project in a reproducible way", which are different (although related) things. What should "common" mean here?
The standard package file format (as the thread is called). Isn't it about cabal and yaml?
Well, if the config is expressed in terms of Haskell syntax, existing .hs support will be enough to support editing these config files. Each file type (including .cabal) takes time to implement.
the problems as I see them are:
- users need to learn .cabal (.yaml, ...) syntax in addition to .hs syntax
- IDEs need to implement each such syntax on top of .hs. That is, if support for / syncing of these configs to code files is expected.
Am I the only one who sees these as issues that need / can be solved?
Also, maybe let's be more specific: what is this thread - Standard package file format - all about?
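For concreteness, here is a rough sketch of what a package description written as a plain Haskell value might look like. Every type and field name below is invented for illustration; this is not an actual proposal and not Cabal's types:

```haskell
-- Hypothetical sketch: the package config as a well-typed Haskell
-- value. The compiler (and any editor with .hs support) would then
-- validate it for free; no separate syntax to learn or implement.
module Main where

data Package = Package
  { name         :: String
  , version      :: [Int]
  , dependencies :: [Dependency]
  } deriving Show

data Dependency = Dependency
  { depName    :: String
  , lowerBound :: [Int]
  } deriving Show

-- The "config file" is just this value.
myPackage :: Package
myPackage = Package
  { name         = "example"
  , version      = [0, 1, 0]
  , dependencies = [Dependency "base" [4, 8]]
  }

main :: IO ()
main = print (name myPackage, version myPackage)
```

Whether tools should be allowed to evaluate arbitrary Haskell here, or only a restricted declarative subset, is exactly the phase-separation question raised elsewhere in the thread.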
On 16/09/16 6:20 PM, Harendra Kumar wrote:
> From a developer's perspective, the major benefit of a standard and
> widely adopted format is that people can utilize their knowledge
> acquired from elsewhere, they do not have to go through and learn
> differently looking and incomplete documentation of different tools. The
> benefit of a common config specification is that developers can choose
> tools freely without worrying about learning the same concepts presented
> in different ways.
If we are talking about *meta-formats*, this is only half
true. No amount of knowledge about YAML per se will tell
you how to use YAML to describe Haskell packages. Nor will
it let you choose tools freely if what you want is tools
that understand your *package file format* specifically.
(For example, editors that can drop in handy templates,
or validate a description.)
> * YAML (http://yaml.org/spec/1.2/spec.html) is standard and popular. A
> significant chunk of developer community is already familiar with it. It
> is being used by stack and by hpack as an alternative to cabal format.
> The complaint against it is that the specification/implementation is
> overly complex.
It's not clear what "standard" means in this context.
yaml.org *calls* it "standard", but as the joke puts it,
"CALLING a tail a leg doesn't MAKE it a leg."
XML is a standard: it's managed by a well-known body.
JSON is both an ECMA standard and an Internet RFC.
There are other complaints:
- that there is no *other* reason for most Haskell programmers
to be aware of YAML,
- that stack and hpack do not use "YAML" but an underspecified
subset of YAML,
- that due to YAML's complexity different implementations tend to
implement different subsets, meaning less interoperability than
you'd expect,
- that the Ruby documentation for its YAML module
http://ruby-doc.org/stdlib-1.9.3/libdoc/yaml/rdoc/YAML.html
says "Do not use YAML to load untrusted data. Doing so is
unsafe and could allow malicious input to execute arbitrary
code inside your application." I must admit I'm surprised.
- ...
Could I respectfully suggest that the first step in a project
like this is to describe the *semantics* of your package management
information in a language-neutral way? I know a great language for
describing abstract data types and giving them semantics. It's
named for some logician, I think his surname was Curry. (:-)
Seriously, there seems to be an endemic problem with programmers
racing to syntax without thinking over-much about semantics. It
happened with XML. It happened again with RDF. Eventually the
semantics gets patched up, after pointless pain and suffering.
Having nutted out exactly what the issues are with the semantics,
then you can experiment with syntax.
On 16/09/16 6:37 PM, Tobias Dammers wrote:
> Another factor in favor of YAML is that it is a superset of JSON,
Here is a simple string in JSON:
"Where's the Golden Fleece?"
Here is the same string in YAML:
--- Where's the Golden Fleece?
...
Superset? I understand "language X is a superset of language Y"
to mean that if I have a document in language Y it can be correctly
processed by a language X processor.
If you mean that any data value that can be represented in JSON
can be represented (differently!) in YAML, fine, but that's not
the same thing. There are many textual formats that generalise
JSON. Heck, even GNUSTEP Property List format does *that*.
(And no, I do not recommend adopting that for anything.)
For that matter, any JSON document can be transcoded with no
loss of structural information into XML and vice versa. That
doesn't mean that JSON is a superset of XML!
Familiarity with JSON semantics and syntax did not help me AT ALL
when faced with YAML.
Here's another meta-format worthy of consideration.
A *package* is a collection of resources with relationships
between them and relationships linking them to other things
like authors (think Dublin Core).
Is there a standard (genuinely standard) notation specifically
for describing resources and their relationships, with quite a
few tools for not just reading it and writing it but actually
reasoning with it?
Why yes. It's called RDF.
http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/
The design of RDF is intended to meet the following goals:
* having a simple data model
* having formal semantics and provable inference
* using an extensible URI-based vocabulary
* using an XML-based syntax
* supporting use of XML schema datatypes
* allowing anyone to make statements about any resource
There is a human-friendly syntax interconvertible with the XML
one, Turtle.
http://www.w3.org/TR/turtle/
Now RDF (whether XML or Turtle) is *not* designed for presenting
single data values. But that's not really what a package format
wants to do anyway.
Am I seriously recommending RDF (or possibly OWL-DL) as a good
way to describe packages? I am certainly serious that it should
be CONSIDERED. And I'm particularly serious about that for two
reasons.
(1) JSON, XML, TOML, and YAML are all about serialising *data values*.
That's all they do. Anything beyond that is up to you.
RDF and OWL are all about describing *relationships* between
*resources*. It's worth considering carefully what you want to
say in a package file format. If you want to describe
*relationships*, then something that deals with data values may
not be the right *kind* of "language".
Simply jarring people loose from the idea that a "single possibly
structured data value" language is the ONLY kind of language is
of value in itself.
(2) JSON, XML, TOML, and YAML are all about serialising *data values*.
*Single* possibly structured data values.
That's all they do. There is no sense in which there is any
standard way to *combine* data in these forms.
In contrast, RDF was *invented* to have a way of patching together
multiple sets of facts from multiple sources. Given a collection
of package descriptions in YAML, all you have is a bunch of text
files; what you do with them is *entirely* up to you. Given a
bunch of RDF/XML or RDF/Turtle files, there is a *standard* way
to write a query (SPARQL) which integrates them. It becomes
possible to write consistency-checking queries that can be processed
by multiple tools. It becomes possible to ask "if I need these,
what else do I need?" in a standard way.
Again, the idea here is to get people thinking that having a
documented semantics that can be processed by existing description
logic tools has value, so that something at a higher semantic level
than YAML or XML might be worth thinking about.
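As a toy illustration of that triple-based view (not RDF or SPARQL themselves, just their flavour in plain Haskell), package facts become subject/predicate/object statements, and the "what else do I need?" question becomes a small transitive query. All package names and predicates below are made up, and the query assumes the dependency graph is acyclic:

```haskell
-- Toy illustration: package metadata as (subject, predicate, object)
-- triples, plus a transitive "what else do I need?" query of the kind
-- SPARQL expresses over real RDF. All facts here are invented.
module Main where

import Data.List (nub)

type Triple = (String, String, String)

facts :: [Triple]
facts =
  [ ("app",  "dependsOn", "text")
  , ("text", "dependsOn", "base")
  , ("app",  "author",    "Alice")  -- non-dependency facts mix in freely
  ]

-- All direct dependencies of a subject.
deps :: String -> [String]
deps s = [o | (s', p, o) <- facts, s' == s, p == "dependsOn"]

-- Transitive closure: everything needed to build the subject.
-- Assumes an acyclic dependency graph.
needed :: String -> [String]
needed s = nub (go (deps s))
  where go []       = []
        go (x : xs) = x : go (xs ++ deps x)

main :: IO ()
main = print (needed "app")
```

The point is not this particular ten-line query engine, but that a fact-based representation makes combining descriptions from multiple sources, and querying across them, the natural mode of use.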
On 17/09/16 4:47 PM, Bardur Arantsson wrote:
>>
> I was actually curious about this, and it's interesting to note that
> even JSON which was supposed to have *ONE STANDARD* now apparently has
> two, an ECMA one and an IETF RFC (seems to be more recent).
It's a long sad story. The ECMA standard exists for largely political
reasons. The RFC is the "active" one.
JSON is a textbook example of "syntax-first".
Here's another meta-format worthy of consideration.
A *package* is a collection of resources with relationships
between them and relationships linking them to other things
like authors (think Dublin Core).
I really like this approach of thinking about packages. Apart from the obvious benefits of some concrete format like RDF (e.g. mixing with OpenDocument files), this way of thinking could open the door to some interesting ways to see existing problems. As an extension, those perspectives might lead to fruitful experiments. (After all, Haskell was originally created as a language to be experimented with - so why not expand that to packaging?) Just two ideas off the top of my head:
One could proclaim that part of the orphan-instances-/forced-imports-ugliness stems from the fact that instances formulate a kind of relationship between declarations and are thus fundamentally different from these other declarations. So one might want to experiment with separating both by adapting the package format and see what benefits or drawbacks that brings.
Similarly one could proclaim that one of the things that
makes the dependency purgatory difficult to navigate is that
version numbers alone do not encode enough information - but a
finer grained analysis might. E.g. imagine you could say "I
have tested with package X in version 0.3.4. But feel
free to substitute any other version as long as functions A
and B haven't changed, because these are the only ones
I use." There are obvious problems with such an approach, but
I propose that we can only find a way forward by experimenting
- including experimenting with such details of the
relationship.
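A minimal sketch of that finer-grained idea, assuming purely hypothetical per-function fingerprints (plain strings below standing in for hashes of a function's type or implementation):

```haskell
-- Sketch: instead of a bare version bound, record a fingerprint for
-- each function you actually use, and accept any candidate version of
-- the dependency whose fingerprints for those functions are unchanged.
-- All names and "hashes" here are invented for illustration.
module Main where

-- (function name, fingerprint) pairs exported by a candidate version.
type Fingerprints = [(String, String)]

v034, v040 :: Fingerprints
v034 = [("funA", "h1"), ("funB", "h2"), ("funC", "h3")]
v040 = [("funA", "h1"), ("funB", "h2"), ("funC", "XX")]  -- funC changed

-- We only use funA and funB, so only their fingerprints are recorded
-- when we test against version 0.3.4.
used :: Fingerprints
used = [("funA", "h1"), ("funB", "h2")]

-- A candidate is compatible if every used function is still present
-- with an unchanged fingerprint.
compatible :: Fingerprints -> Fingerprints -> Bool
compatible usedFps candidate =
  all (\(f, h) -> lookup f candidate == Just h) usedFps

main :: IO ()
main =
  -- 0.4.0 changed funC, but we don't use it, so both versions pass.
  print (compatible used v034, compatible used v040)
```

The obvious problems mentioned above remain (fingerprints don't capture behavioural changes behind an unchanged type, for one), but the sketch shows the shape of the extra information a relationship-oriented format could carry.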
Note that I'm not saying that experiments like these are impossible with a format like cabal or yaml. All I'm saying is that thinking of packages as resources in relationships makes it easier to think in these ways. An appropriate representation could be both a tool to shape our thoughts and a tool that, being specialized for the representation of relationships, makes it easier to incorporate experimental features without breaking the ecosystem. It will probably be unavoidable to "inject" some code into the solver for a specific package for such experiments, but I hope that understanding the details will make it possible to see ways how to do that in safe and portable ways.
MarLinn
And it supports YAML in addition to plain old Rust code.
An *argument parser*?
Visits web page incredulously.
Great balls of fire, it's true.
I really don't want to use an argument parser that requires that
much documentation. Life is too short.
Yes. The original string is also valid in YAML if used in the position
where JSON allows a string.
> If you mean that any data value that can be represented in JSON
> can be represented (differently!) in YAML, fine, but that's not
> the same thing.
Sure, but any valid JSON is also valid YAML.
Modulo some exotic exceptions for valid-but-useless and
valid-but-probably-not-what-the-sender-intended JSON.
> Familiarity with JSON semantics and syntax did not help me AT ALL
> when faced with YAML.
Sure, YAML is a massive superset.
The advantage is more in interoperability - you can hook a YAML parser
to JSON-outputting processes and expect that it will "just work", so you
don't have to worry about syntax, so you don't need separate frontends
for YAML and JSON for your webservice.
> Am I seriously recommending RDF (or possibly OWL-DL) as a good
> way to describe packages? I am certainly serious that it should
> be CONSIDERED.
+1
> (1) JSON, XML, TOML, and YAML are all about serialising *data values*.
> That's all they do. Anything beyond that is up to you.
> RDF and OWL are all about describing *relationships* between
> *resources*. It's worth considering carefully what you want to
> say in a package file format. If you want to describe
> *relationships*, then something that deals with data values may
> not be the right *kind* of "language".
>
> Simply jarring people loose from the idea that a "single possibly
> structured data value" language is the ONLY kind of language is
> of value in itself.
It does have its advantages.
That's why everybody is using XML these days, after all. Even though XML
does have some pretty horrible properties (too much noise being the most
prominent).
> (2) JSON, XML, TOML, and YAML are all about serialising *data values*.
> *Single* possibly structured data values.
> That's all they do. There is no sense in which there is any
> standard way to *combine* data in these forms.
Yes, that's supposed to live at the semantic level, i.e. in the types.
For JSON and TOML that's a serious restriction.
In XML and YAML, you can keep type information (better standardization
for that in YAML than in XML), so you can stick user-defined semantics
into the serialization format if you want to.
I.e. you can achieve RDF in XML or YAML by writing types that handle
combinability or anything else that you want, these things aren't tied
into the language.
It is still possible that RDF is more convenient :-)
I think it works well. I can't do builds in cabal anyway since it
can't handle anything complicated, but even if I had a simple build
I'd prefer shake since it's so much nicer. Since it's in haskell,
it's flexible but can't be analyzed, though I can't think of why you'd
want to analyze it.
Meanwhile, cabal is just fine at expressing packages and versions, and
is basically just a way to tell cabal-install what to download and
install. Since I generate it, I don't care much about the format, but
the existing one seems perfectly adequate.