named sub-expressions, n-ary functions, things and stuff

Darren Duncan

Nov 13, 2006, 5:04:37 AM

to perl6-l...@perl.org

All,

As I've continued to develop my Perl-implemented and integratable
RDBMS, a number of aspects have inspired thought for possible
improvements for the Perl 6 language design.

For context, the query and command language of my RDBMS intentionally
overlaps with Perl 6 as much as reasonable; that is, it is a subset
of Perl 6 with a very simple syntax and with domain-specific
additions; so using it should be loosely like using Perl 6. Suffice
it to say that the more of these "additions" that end up being
provided by Perl 6 itself as options or features, the easier my job
will be in making an RDBMS product that integrates easily with Perl 6.

The language has a partial profile like this:
- The type system consists of just strong types; each value and
variable is of a specific type, and all type conversions are explicit.
- The type system is explicitly finite, so no Inf etc values, and all
type generators take parameters which specify applicable limits (eg,
0 <= N < 256); a notable exception is that the Bool type is used as
is, without parameterization, because it is already a finite domain.
- There are no Undef or NaN etc values or variables.
- All type definitions include an explicit default value, eg 0 or ''.
- A failure always manifests as a thrown exception, and an exception
is the result of an operator that can't return a value within the
allowed domain, eg when one divides by zero.
- All logic is 2VL not 3+VL.
- All data types are immutable.
- All operators are prefix operators, invoked on their package name,
like with modules that don't export, and not as object methods.
- All operators and functions take exclusively named arguments, and
argument lists are always bounded in parentheses.
- All core operators and types are pure functions, with no
side-effects, except for the assignment operator, certain shorthands,
and IO-like or monad functions.
- System defined storable types/type-generators include, otherwise as
defined in Perl 6: Bool, Int, Num, Str, Blob.
- Additional system defined storable types include: DateTime etc,
spatial types, the set based concept of a Tuple type, the set based
Relation type.
- All operators that make sense in an n-ary form are declared with
just one main argument which is the list of operands; this includes:
'+', '*', '~', 'and', 'or', 'min', 'max', 'avg', 'union',
'intersection', (relational) 'join'; said operators can also double
for use in list (eg, relation) summarization.
- System defined transient (non-storable) types include: Seq, Set,
Bag; their primary purpose is to hold the operands passed as list
arguments to n-ary operators, or to serve as a shorthand for
representing a sorted query result; note that if one wants to store
the same sort of thing, they define an appropriate Relation type
instead.
- It is valid for all generic collection type values to consist of
zero elements; so eg, a Tuple can have zero attributes; zero-ary
values also happen to be the default values for their corresponding
types.
- Users can define their own types and operators.
- Operators can be recursive.
- Any collection type can be composed of any other type, including
collection types.
- Multiple update operations aka variable assignments can be
performed in a single statement, and this statement is atomic; rvalue
expressions see the same consistent system state before any
assignments, and all assignments are performed after all rvalues are
computed; I suppose like Perl's list assignment.
- Multi-level transactions are supported, where any statements within
a transaction level are collectively atomic and can succeed or fail;
any block marked as atomic, and all named routines and try-catch
blocks are atomic; in the last case, a thrown exception indicates a
failure of the block.
- A database is centrally a persistent-like collection of Relation variables.
- A database as a whole, and each of its parts by extension, is
always perceived by users as being in a consistent state, where all
of its defined constraints or business rules are satisfied; any given
mutating statement will only change it from one consistent state to
another, with no inconsistent state visible between statement
boundaries at any level (in ACID terms, it is serializable isolation).
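
To make the multi-update rule above concrete, here is a minimal sketch
in Python (used only as a stand-in; it is not the RDBMS language or
Perl 6). The names apply_statement, state and assignments are
hypothetical. It assumes a statement can be modelled as a mapping from
target names to functions of the old state: every rvalue is computed
first, and only then is the successor state built.

```python
def apply_statement(state, assignments):
    """Apply several assignments as one atomic statement (illustrative only).

    `state` maps variable names to values; `assignments` maps target names
    to functions that compute the new value from the *old* state.
    """
    # Phase 1: compute every rvalue against the same, unmodified state.
    new_values = {name: rvalue(state) for name, rvalue in assignments.items()}
    # Phase 2: build the successor state. The input state is never mutated,
    # so an exception raised in phase 1 leaves it exactly as it was.
    return {**state, **new_values}


before = {"x": 1, "y": 2}
after = apply_statement(before, {
    "x": lambda s: s["y"],        # x := y
    "y": lambda s: s["x"] + 10,   # y := x + 10, using the old x
})
```

Here after is {"x": 2, "y": 11}, because both rvalues saw the old x and
y, and before is left unchanged.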

Note that a number of the above features in combination result in a
language grammar that is extremely simple, though somewhat verbose.
But then, it is largely meant to be an explicit intermediate language
or AST that others can target.

Anyway, a few questions or suggestions about Perl 6 ...

1. I'm not sure if it is possible yet, but like Haskell et al (or
the "WITH" clause of some SQL dialects), it should be possible to write a
Perl 6 routine or program in a pure functional notation or paradigm,
such that the entire routine body is a single expression, but that
has named reusable sub-expressions.

For example, in pseudo-code:

    routine foo ($bar) {
        return
            with
                $bar * 17 -> $baz,
                $baz - 3 -> $quux,
                $baz / $quux;
    }

This is instead of either of:

    routine foo ($bar) {
        return ($bar * 17) / ($bar * 17 - 3);
    }

    routine foo ($bar) {
        my $baz = $bar * 17;
        my $quux = $baz - 3;
        return $baz / $quux;
    }

The first form, using "with", is a single expression that can be
embedded in other expressions, and any redundant parts are explicitly
coded and calculated only once.
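
For comparison, the same idea can be sketched in an existing language.
The following uses Python purely as an illustration (it is not a
proposal for Perl 6 syntax) and relies on the classic trick of binding
a name by passing a value to a lambda; the function name foo and the
names baz and quux are taken from the pseudo-code above.

```python
def foo(bar):
    # Each lambda parameter names one sub-expression; the whole body is a
    # single expression with no assignment statements.
    return (lambda baz:
            (lambda quux: baz / quux)(baz - 3)
            )(bar * 17)
```

Each named part is computed once and is visible only inside the
expression that introduces it.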

2. While it is not strictly necessary, I think it would provide a
useful syntactical short-hand to add an actual immutable "Bag" type.
In context of Synopsis 6, it could look like this:

    Seq     Completely evaluated (hence immutable) sequence
    Set     Unordered Seqs that allow no duplicates
    Bag     Unordered Seqs that do allow duplicates

Declaration of Bag values can be parameterized with 'of' etc the same
as Set or Seq or Array etc can.

A Bag type could be implemented as a Mapping of values to occurrence
counts, the latter of which are Int > 0.

Unlike a Seq, which conceptually preserves either an input order of
its elements or a specific sorting of its elements, the Bag doesn't
care to preserve them because the order doesn't matter.

Within the context of the n-ary operators I mentioned earlier, each
one would conceptually take its list of arguments as a
Seq|Set|Bag of the values:
- Seq: '~'.
- Set: 'and', 'or', 'min', 'max', 'union', 'intersection', (relational) 'join'.
- Bag: '+', '*', 'avg'.

With string concatenation, both input duplicates and the order they
appear will determine the output. With math ops like sum, product,
average, the order of the input doesn't affect the output, but any
duplicates do. With the other above ops, neither order nor
duplicates affect the output, so dups can conceptually be filtered
out first via set construction for efficiency of use.
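
As a rough illustration of the three cases, here is a sketch in Python
(a stand-in only; none of this is Perl 6 API). It assumes a Bag can be
modelled with collections.Counter, which is a mapping of values to
occurrence counts, and the function names concat, total, product and
maximum are hypothetical.

```python
from collections import Counter
from math import prod


def concat(operands):
    # Seq semantics: both order and duplicates affect the result.
    return "".join(operands)


def total(operands):
    # Bag semantics: order is irrelevant, occurrence counts are not.
    bag = Counter(operands)
    return sum(value * count for value, count in bag.items())


def product(operands):
    # Bag semantics again: each value contributes once per occurrence.
    bag = Counter(operands)
    return prod(value ** count for value, count in bag.items())


def maximum(operands):
    # Set semantics: duplicates can be filtered out before evaluating.
    return max(set(operands))
```

For example, total([2, 2, 3]) and total([3, 2, 2]) both give 7, while
concat(["a", "b", "a"]) gives "aba" and depends on the order.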

If nothing else, I note that a lot of code examples in the Synopsis
reference a Bag type and it has the generic-enough appearance to look
like a built-in.

3. I don't know if it is the case now, but there should be separate
operators (which can have the same base name) for Int and Num ops,
particularly the division; a division of 2 Int always returns an Int;
a division involving a Num will return a Num; a division of 2 weak
types that contain numbers will do the Num version even if they look
like integers, so that type alone can determine behaviour, which I
see as being more predictable and consistent.
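
A minimal sketch of this rule, written in Python only for illustration:
the function name divide is hypothetical, strings stand in for "weak
types that contain numbers", and the choice of floor rounding for the
Int case is my assumption, since the rounding direction isn't specified
above.

```python
def divide(dividend, divisor):
    # Two Ints: integer division, result is an Int.
    # Assumption: rounds toward negative infinity (Python's // operator).
    if type(dividend) is int and type(divisor) is int:
        return dividend // divisor
    # Anything else (a Num operand, or weakly typed values such as
    # numeric strings): convert and use the Num version, result is a Num.
    return float(dividend) / float(divisor)
```

So divide(7, 2) is the Int 3, divide(7.0, 2) is the Num 3.5, and
divide("6", "3") is the Num 2.0 even though both operands look like
integers.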

4. There should be floor() and ceil() functions that take a Num as
input and return an Int; likewise with round() etc. FYI, this is the
method I use for explicit Num->Int type conversion; users can specify
how the conversion is done by which function they explicitly use to
do it.
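
To show what "the function chosen is the conversion policy" looks like
in practice, here is a small Python sketch (Python's math.floor,
math.ceil, math.trunc and round all return an int for a float input).
The helper num_to_int and its "using" parameter are hypothetical.

```python
import math

num = -2.5

as_floor = math.floor(num)   # largest Int <= num
as_ceil = math.ceil(num)     # smallest Int >= num
as_trunc = math.trunc(num)   # drop the fractional part (toward zero)
as_round = round(num)        # nearest Int; Python breaks ties toward even

# Hypothetical helper: the caller must always say how to convert.
NUM_TO_INT = {
    "floor": math.floor,
    "ceil": math.ceil,
    "truncate": math.trunc,
    "round": round,
}


def num_to_int(value, *, using):
    return NUM_TO_INT[using](value)
```

Note that tie-breaking for round() differs between languages, which is
one more reason to make the choice explicit.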

5. It would help simplify my implementation tasks if all the built-in
Perl 6 types had multis for their operators such that the operators
could all be invoked exclusively with named arguments, even if there
is just 1 argument. Though if you don't want to do this, then it's
not a big deal, and I'll just subclass them with wrappers that do
provide such.
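
The kind of wrapper I have in mind could be sketched like this, in
Python for illustration only. The helper named_only and the parameter
names operand, minuend and subtrahend are hypothetical; only the
operator module functions are real Python.

```python
import operator


def named_only(op, *param_names):
    """Wrap a positional operator so it accepts only named arguments."""
    def wrapper(**kwargs):
        # Require exactly the declared names, no more and no fewer.
        if set(kwargs) != set(param_names):
            raise TypeError(f"expected exactly the named arguments {param_names}")
        # Forward the values to the wrapped operator in declared order.
        return op(*(kwargs[name] for name in param_names))
    return wrapper


negate = named_only(operator.neg, "operand")
subtract = named_only(operator.sub, "minuend", "subtrahend")
```

With this, subtract(subtrahend=3, minuend=10) gives 7 regardless of the
order the arguments are written in, and a positional call such as
negate(5) is rejected with a TypeError.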

Thank you in advance for any consideration or feedback.

-- Darren Duncan

Mark J. Reed

Nov 13, 2006, 11:00:02 AM

to Darren Duncan, perl6-l...@perl.org

On 11/13/06, Darren Duncan <dar...@darrenduncan.net> wrote:
> - There are no Undef or NaN etc values or variables.

An RDBMS language with no "null" would seem to be problematic,
although I guess you could just use 1-tuples where the empty tuple is
treated as null.

--
Mark J. Reed <mark...@mail.com>

Mark A Biggar

Nov 13, 2006, 12:35:05 PM

to Mark J. Reed, Darren Duncan, perl6-l...@perl.org

And you may be forced to deal with NaN and Inf values if you are storing raw binary float values as they are built into the bit patterns.

--
Mark Biggar
ma...@biggar.org
mark.a...@comcast.net
mbi...@paypal.com

Darren Duncan

Nov 13, 2006, 4:41:01 PM

to perl6-l...@perl.org

At 11:00 AM -0500 11/13/06, Mark J. Reed wrote:
>On 11/13/06, Darren Duncan <dar...@darrenduncan.net> wrote:
>>- There are no Undef or NaN etc values or variables.
>
>A RDBMS language with no "null" would seem to be problematic..
>although i guess you could just use 1-tuples where the empty tuple is
>treated as null.

In SQL, the "null" is used for multiple distinct meanings, including
'unknown' and 'not applicable', and having to deal with it makes an
RDBMS more complicated to implement and use by an order of magnitude.
In practice, there are multiple better ways that users can indicate
"unknown" or "not applicable" etc, and that can be done using the
other features.
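
As one possible illustration (a commonly discussed null-free design,
and only a sketch, not a description of my RDBMS), an optional
attribute can be moved into its own relation, with further relations
recording why a value is absent. The Python below models relations as
sets of tuples; all relation names and data are hypothetical.

```python
# (person_id, name): every person appears here.
people = {(1, "Alice"), (2, "Bob"), (3, "Carol")}
# (person_id, salary): only people whose salary is known.
salary_known = {(1, 50000)}
# (person_id,): people for whom a salary does not apply.
salary_not_applicable = {(2,)}
# Person 3 appears in neither relation, so the salary is simply unknown.


def salary_status(person_id):
    """Report what is recorded about a salary without any null value."""
    for pid, amount in salary_known:
        if pid == person_id:
            return ("known", amount)
    if (person_id,) in salary_not_applicable:
        return ("not applicable",)
    return ("unknown",)
```

Each distinct meaning that SQL folds into "null" gets its own explicit
representation, and all logic stays two-valued.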

At 5:35 PM +0000 11/13/06, mark.a...@comcast.net wrote:
>And you may be forced to deal with NaN and Inf values if you are
>storing raw binary float values as they are built into the bit
>patterns.

All data types in my RDBMS are boxed types that hide their
implementation from the user, so details about bit patterns used by
numbers are abstracted away; as particular implementations define it,
numbers may not even be floats at all; they could be rationals or
strings or whatever the implementer wants to use, but the user
doesn't have to care.

The only place raw bit patterns appear is in the Blob type, but those
are undifferentiated, so the bits mean nothing except to the user.

If users have a NaN or Inf they want to store, they can't do it as a
database native finite integer or number; but like with nulls, there
are other ways to record what users want to know.
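
To sketch what a boxed, finite number type might look like (in Python,
for illustration only; this is not my actual implementation), the class
below hides an exact rational representation, refuses NaN and Inf at
construction, and reports out-of-domain results as exceptions. The
class name BoundedNum and its low/high parameters are hypothetical.

```python
import math
from fractions import Fraction


class BoundedNum:
    """Boxed number restricted to low <= value < high (illustrative only)."""

    def __init__(self, value, low, high):
        # NaN and Inf are not members of any finite domain.
        if isinstance(value, float) and not math.isfinite(value):
            raise ValueError("NaN and Inf are outside the domain")
        exact = Fraction(value)  # hidden representation: an exact rational
        if not (low <= exact < high):
            raise ValueError(f"{exact} is outside [{low}, {high})")
        self._value, self._low, self._high = exact, low, high

    def __eq__(self, other):
        return isinstance(other, BoundedNum) and self._value == other._value

    def divide(self, other):
        # Fraction division raises ZeroDivisionError for a zero divisor,
        # so failure shows up as an exception rather than as Inf or NaN.
        return BoundedNum(self._value / other._value, self._low, self._high)
```

A user who really needs to record "not a number" would model that
separately, in the same way as for nulls.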

In any event, I'm interested in knowing what people think about
having named sub-expressions supported in Perl 6 and/or giving it
stronger pure functional syntax or paradigm support; pure functional
means there are no variables or assignment, as far as users are
concerned.

-- Darren Duncan

Smylers

Nov 13, 2006, 6:24:13 PM

to perl6-l...@perl.org

Darren Duncan writes:

> 1. I'm not sure if it is possible yet, but like Haskell et al ..., it
> should be possible to write a Perl 6 routine or program in a pure
> functional notation or paradigm, such that the entire routine body is
> a single expression, but that has named reusable sub-expressions.

I realize it isn't pure functional, but in Perl a C<do> block permits
arbitrary code to be treated as a single expression. Or to put it
another way round, you can use temporary variables inside the expression
that don't 'leak out' of it.

> For example, in pseudo-code:
>
> routine foo ($bar) {
> return
> with
> $bar * 17 -> $baz,
> $baz - 3 -> $quux,
> $baz / $quux;
> }
>
> This is instead of either of:
>
> routine foo ($bar) {
> return ($bar * 17) / ($bar * 17 - 3);
> }

That's obviously bad cos of the repetition.

> routine foo ($bar) {
> my $baz = $bar * 17;
> my $quux = $baz - 3;
> return $baz / $quux;
> }

But what does a functional form have over that? Or over the C<do>
version:

    my $whatever
        = do { my $baz = $bar * 17; my $quux = $baz - 3; $baz / $quux };

Sure, there are variables. But in terms of how your brain thinks about
it, is it any different from the functional version -- labels being
associated with intermediate parts of the calculation?

Smylers

Darren Duncan

Nov 13, 2006, 8:32:58 PM

to perl6-l...@perl.org

At 11:24 PM +0000 11/13/06, Smylers wrote:
>Darren Duncan writes:
> > 1. I'm not sure if it is possible yet, but like Haskell et al ..., it
>> should be possible to write a Perl 6 routine or program in a pure
>> functional notation or paradigm, such that the entire routine body is
>> a single expression, but that has named reusable sub-expressions.
>
>I realize it isn't pure functional, but in Perl a C<do> block permits
>arbitrary code to be treated as a single expression. Or to put it
>another way round, you can use temporary variables inside the expression
>that don't 'leak out' of it.

Hmm. I may have to think some more, but it appears that a C<do>
block may be sufficient for what I wanted, which was to embed
reusable named parts inside of an arbitrary larger expression. Thank
you. -- Darren Duncan

Dr.Ruud

Nov 14, 2006, 5:46:40 AM

to perl6-l...@perl.org

Smylers schreef:

> my $whatever
> = do { my $baz = $bar * 17; my $quux = $baz - 3; $baz / $quux };

($bar better not be 3/17)

Just a rewrite:

    my $whatever
        = do { my $quux = (my $baz = $bar * 17) - 3; $baz / $quux };

And maybe even something like:

    my $whatever
        = do { $.quux = ($.baz = $bar * 17) - 3; $.baz / $.quux };

(where quux and baz are topicals of the embracing do)

--
Affijn, Ruud

"Gewoon is een tijger."