Roman Numerals

David Sletten

unread,

Mar 3, 2009, 6:40:35 AM3/3/09

to clo...@googlegroups.com

Here's a little toy to practice my basic Clojure skills. I didn't see
any similar topic in the archives, so here we go...

This code performs conversions between Roman and Arabic numerals.
Roman numerals are represented as strings and Arabic numerals are
just integers. Arabic numerals in the range of 1 and 3999 inclusive
are allowed. Conventionally, 3999 is the upper limit since the Roman
numeral for 5000 (a V with a bar over it) cannot be represented in
ASCII (although it exists in Unicode).

The function roman-to-arabic-aux correctly converts any valid Roman
numeral. This can be exhaustively tested by comparing the Roman
numeral output of Common Lisp's FORMAT function for example (e.g.,
(format nil "~@R" n)). The trick, however, is weeding out bogus
strings representing invalid Roman numerals such as IVI, IXV, etc...
For this purpose the function "roman?" uses a regular expression I
cribbed from Perl's CPAN module Roman.pm (http://search.cpan.org/
~chorny/Roman-1.23/lib/Roman.pm).

arabic-to-roman performs the obvious conversion in the other direction.

Incidentally, the "cond" form in roman-to-arabic-aux seems to me
harder to read in Clojure than it would be in CL with its additional
set of grouping parentheses. When you can't fit both the predicate
and the consequent expression on the same line it gets confusing.

I'd appreciate any feedback regarding my Clojure style. Any more
natural ways to do things?

Aloha,
David Sletten

(def roman-values-map {\I 1 \V 5 \X 10 \L 50 \C 100 \D 500 \M 1000})

(defn value [roman]
(get roman-values-map (Character/toUpperCase roman)))

(defn roman? [roman-string]
(and (not (empty? roman-string))
(re-matches
#"(?:M{0,3})(?:D?C{0,3}|C[DM])(?:L?X{0,3}|X[LC])(?:V?I{0,3}|I
[VX])$"
roman-string)))

(defn roman-to-arabic-aux [roman-string]
(cond (empty? roman-string) 0
(empty? (rest roman-string)) (value (first roman-string))
(< (value (first roman-string)) (value (second roman-string)))
(- (roman-to-arabic-aux (rest roman-string))
(value (first roman-string)))
:else
(+ (value (first roman-string))
(roman-to-arabic-aux (rest roman-string)))))

(defn roman-to-arabic [roman-string]
(if (roman? roman-string)
(roman-to-arabic-aux roman-string)
(format "'%s' is not a valid Roman numeral." roman-string)))

(def arabic-values '((1000 "M") (900 "CM") (500 "D") (400 "CD")
(100 "C") (90 "XC") (50 "L") (40 "XL")
(10 "X") (9 "IX") (5 "V") (4 "IV") (1 "I")))

(defn arabic-to-roman
([n] (if (<= 1 n 3999)
(apply str (arabic-to-roman n arabic-values))
(format "%d cannot be converted." n)))
([n num-list] (cond (empty? num-list) '()
(zero? n) '()
:else (let [[[arabic roman] & tail] num-list]
(if (>= n arabic)
(cons roman (arabic-to-roman (- n
arabic)
num-list))
(arabic-to-roman n tail)))) ))

Chouser

unread,

Mar 3, 2009, 9:53:41 AM3/3/09

to clo...@googlegroups.com

On Tue, Mar 3, 2009 at 6:40 AM, David Sletten <da...@bosatsu.net> wrote:
>
> I'd appreciate any feedback regarding my Clojure style. Any more
> natural ways to do things?

Looks good, thanks for sharing.

> (defn roman? [roman-string]
> (and (not (empty? roman-string))
> (re-matches
> #"(?:M{0,3})(?:D?C{0,3}|C[DM])(?:L?X{0,3}|X[LC])(?:V?I{0,3}|I
> [VX])$"
> roman-string)))

The normal idiom in Clojure is (seq x) instead of (not (empty? x)),
and that works fine for Strings. Also, since you don't really need
the specific return value of the test expressions, 'when' might be a
better choice than 'and'.

> (defn roman-to-arabic-aux [roman-string]
> (cond (empty? roman-string) 0
> (empty? (rest roman-string)) (value (first roman-string))
> (< (value (first roman-string)) (value (second roman-string)))
> (- (roman-to-arabic-aux (rest roman-string))
> (value (first roman-string)))
> :else
> (+ (value (first roman-string))
> (roman-to-arabic-aux (rest roman-string)))))

To reduce the clutter here, you might consider using destructuring on
roman-string to get locals for the first and rest parts:

user=> (let [[a & b] "fie"] [a b])
[\f (\i \e)]

You also might consider indenting the result expressions to help them
stand out more. I like either:

(cond
test1 (expr-fn
arg1 arg2)
test2 (expr-fn2
argA argB))

or:

(cond
test1
(expr-fn arg1 arg2)
test2
(expr-fn2 argA argB))

> (def arabic-values '((1000 "M") (900 "CM") (500 "D") (400 "CD")
> (100 "C") (90 "XC") (50 "L") (40 "XL")
> (10 "X") (9 "IX") (5 "V") (4 "IV") (1 "I")))

You might consider using a vector of vectors here. There's nothing
particularly wrong with what you've got, but I've been noticing
recently that almost every list you expressed directly in idiomatic
Clojure code is a function or macro call. I think this is part of why
I've found Clojure code to be less difficult to read than CL.

It doesn't matter much in this case as the quote is clearly visible
and the literal numbers aren't likely to be confused with function
calls, but I thought I'd mention it anyway.

> (defn arabic-to-roman
> ([n] (if (<= 1 n 3999)
> (apply str (arabic-to-roman n arabic-values))
> (format "%d cannot be converted." n)))
> ([n num-list] (cond (empty? num-list) '()
> (zero? n) '()
> :else (let [[[arabic roman] & tail] num-list]
> (if (>= n arabic)
> (cons roman (arabic-to-roman (- n
> arabic)
> num-list))
> (arabic-to-roman n tail)))) ))

You might consider using 'reduce' instead of recursion here.
Alternatively, it's interesting to note that because of the ease with
which you're destructuring arabic-values, it would be no more
difficult if you had a single list (or vector) of alternating numbers
and strings rather than nesting them.

--Chouser

Michel S.

unread,

Mar 3, 2009, 5:18:32 PM3/3/09

to Clojure

Hello,

Chouser has commented extensively, so I'll just put in my two cents.

On Mar 3, 6:40 am, David Sletten <da...@bosatsu.net> wrote:
>
> Incidentally, the "cond" form in roman-to-arabic-aux seems to me
> harder to read in Clojure than it would be in CL with its additional
> set of grouping parentheses. When you can't fit both the predicate
> and the consequent expression on the same line it gets confusing.
>

I find that rather confusing too, formatting wise.

> I'd appreciate any feedback regarding my Clojure style. Any more
> natural ways to do things?
>
> Aloha,
> David Sletten
>
> (def roman-values-map {\I 1 \V 5 \X 10 \L 50 \C 100 \D 500 \M 1000})
>
> (defn value [roman]
> (get roman-values-map (Character/toUpperCase roman)))
>

One really neat thing in Clojure is that a collection can be treated
as a function that map from an index (or a key) to a value. So in this
case, this works:

(roman-values-map (Character/toUpperCase roman))

Regards,

--
Michel

David Sletten

unread,

Mar 4, 2009, 4:43:09 AM3/4/09

to clo...@googlegroups.com

On Mar 3, 2009, at 4:53 AM, Chouser wrote:
>
>> (defn roman? [roman-string]
>> (and (not (empty? roman-string))
>> (re-matches
>> #"(?:M{0,3})(?:D?C{0,3}|C[DM])(?:L?X{0,3}|X[LC])(?:V?I{0,3}|I
>> [VX])$"
>> roman-string)))
>
> The normal idiom in Clojure is (seq x) instead of (not (empty? x)),
> and that works fine for Strings. Also, since you don't really need
> the specific return value of the test expressions, 'when' might be a
> better choice than 'and'.
>

Thanks for your comments Chris and Michel.

I'll keep the "seq" idiom in mind.

On the other hand, there is a convention in Common Lisp that I think
is worth observing in Clojure as well. It relates to the distinction
between "if" and "when". My view is that "if" is an expression, and
expressions return values. This is true whether the test is true or
not, so I am not a fan of implicit "else" expressions. Namely, I
believe that "if" should always contain 3 subexpressions. Moreover,
the "then" and "else" subexpressions by default consist of single
expressions. Of course they could be wrapped in "do" forms, but they
aren't automatically.

Conversely, "when" says "side effect". This is clear for 2 reasons.
First, there is no "else" clause, so "when" doesn't really work as an
expression. Furthermore, the forms in the "when" body are evaluated
in an implicit "do". Since only the final value is returned, any of
the earlier expressions in the "when" body are useful only for their
side effects.

For this reason I would argue against using "when" above. The
"roman?" predicate simply states that a string that isn't empty AND
that matches this pattern is a Roman numeral.

>> (defn roman-to-arabic-aux [roman-string]
>> (cond (empty? roman-string) 0
>> (empty? (rest roman-string)) (value (first roman-string))
>> (< (value (first roman-string)) (value (second roman-
>> string)))
>> (- (roman-to-arabic-aux (rest roman-string))
>> (value (first roman-string)))
>> :else
>> (+ (value (first roman-string))
>> (roman-to-arabic-aux (rest roman-string)))))
>
> To reduce the clutter here, you might consider using destructuring on
> roman-string to get locals for the first and rest parts:
>
> user=> (let [[a & b] "fie"] [a b])
> [\f (\i \e)]
>

You are probably right here. I am still thinking in terms of CL's
DESTRUCTURING-BIND, which yields an error if the pattern doesn't match:
(destructuring-bind ((a b) &rest tail) '() (list a b tail))
This is a problem since I don't know whether or not roman-string is
empty going into the "cond".

[D'oh! I just figured out how to do it the right way:
(destructuring-bind (&optional a b &rest tail) '( ) (list a b tail))
=> (NIL NIL NIL)]

However, Clojure just assigns nil to extra variables, so this works:
(defn roman-to-arabic-aux [roman-string]
(let [[a b & tail] roman-string]
(cond (empty? roman-string) 0
(nil? b) (value a)
(< (value a) (value b))
(- (roman-to-arabic-aux (cons b tail))
(value a))
:else
(+ (value a)
(roman-to-arabic-aux (cons b tail)))) ))

But this introduces its own awkwardness as I have to rebuild the tail
of roman-string (cons b tail) for each recursive call.

Aha! I'm destructuring too much. I could just do this:
(defn roman-to-arabic-aux [roman-string]
(let [[a b] roman-string]
(cond (empty? roman-string) 0
(nil? b) (value a)
(< (value a) (value b))
(- (roman-to-arabic-aux (rest roman-string))
(value a))
:else
(+ (value a)
(roman-to-arabic-aux (rest roman-string)))) ))

>> (def arabic-values '((1000 "M") (900 "CM") (500 "D") (400 "CD")
>> (100 "C") (90 "XC") (50 "L") (40 "XL")
>> (10 "X") (9 "IX") (5 "V") (4 "IV") (1 "I")))
>
> You might consider using a vector of vectors here. There's nothing
> particularly wrong with what you've got, but I've been noticing
> recently that almost every list you expressed directly in idiomatic
> Clojure code is a function or macro call. I think this is part of why
> I've found Clojure code to be less difficult to read than CL.
>
> It doesn't matter much in this case as the quote is clearly visible
> and the literal numbers aren't likely to be confused with function
> calls, but I thought I'd mention it anyway.
>

Good point. I'm still getting used to recursively CDR'ing across
vectors. I don't have any problems with the readability of the nested
list structure, but I understand that many Clojure programmers view a
reduction in parentheses as helpful.

>> (defn arabic-to-roman
>> ([n] (if (<= 1 n 3999)
>> (apply str (arabic-to-roman n arabic-values))
>> (format "%d cannot be converted." n)))
>> ([n num-list] (cond (empty? num-list) '()
>> (zero? n) '()
>> :else (let [[[arabic roman] & tail] num-list]
>> (if (>= n arabic)
>> (cons roman (arabic-to-roman (- n
>> arabic)
>> num-
>> list))
>> (arabic-to-roman n tail)))) ))
>
> You might consider using 'reduce' instead of recursion here.
> Alternatively, it's interesting to note that because of the ease with
> which you're destructuring arabic-values, it would be no more
> difficult if you had a single list (or vector) of alternating numbers
> and strings rather than nesting them.
>

I don't think "reduce" would actually work here. As you see, the test
(>= n arabic) determines whether we continue working with the
original num-list or proceed to the tail. I'm not sure of how to get
that behavior with "reduce".

Aloha,
David Sletten

Chouser

unread,

Mar 4, 2009, 7:22:34 AM3/4/09

to clo...@googlegroups.com

On Wed, Mar 4, 2009 at 4:43 AM, David Sletten <da...@bosatsu.net> wrote:
>
> Conversely, "when" says "side effect". This is clear for 2 reasons.
> First, there is no "else" clause, so "when" doesn't really work as an
> expression. Furthermore, the forms in the "when" body are evaluated
> in an implicit "do". Since only the final value is returned, any of
> the earlier expressions in the "when" body are useful only for their
> side effects.

That's interesting. Perhaps I haven't looked carefully enough at when
it's idiomatic to use 'when' in Clojure, but my impression has been
that anytime that returning 'nil' is ok for the else case, using
'when' is acceptible.

> I don't think "reduce" would actually work here. As you see, the test
> (>= n arabic) determines whether we continue working with the
> original num-list or proceed to the tail. I'm not sure of how to get
> that behavior with "reduce".

You're absolutely right, 'reduce' will not work here. I missed that
the 'if' was making the decision to loop again or not, and that based
on numeric comparison.

--Chouser

Rich Hickey

unread,

Mar 4, 2009, 8:56:56 AM3/4/09

to clo...@googlegroups.com

This is not the convention in Clojure for a couple of reasons.

First, there are far fewer side effects. 'when' does not say side
effects, although as you point out, it does support them. But then, so
do function bodies and they don't inherently say side-effects.

I agree 'if' should always have 3 subexpressions, all the more reason
to use 'when' when the second is going to be nil. Idiomatic Clojure
code is full of expressions like:

(when there-is-stuff (expr stuff))

Which I read as when there is stuff, some function of it, else
nothing. There's nothing inherently side-effecting about it.

It would certainly be a waste of 'when' in Clojure to reserve it for
side effects, since so much code contains none.

Rich

David Sletten

unread,

Mar 4, 2009, 9:35:50 PM3/4/09

to clo...@googlegroups.com

On Mar 4, 2009, at 3:56 AM, Rich Hickey wrote:

>>
>> Conversely, "when" says "side effect". This is clear for 2 reasons.
>> First, there is no "else" clause, so "when" doesn't really work as an
>> expression. Furthermore, the forms in the "when" body are evaluated
>> in an implicit "do". Since only the final value is returned, any of
>> the earlier expressions in the "when" body are useful only for their
>> side effects.
>>
>
> This is not the convention in Clojure for a couple of reasons.
>
> First, there are far fewer side effects. 'when' does not say side
> effects, although as you point out, it does support them. But then, so
> do function bodies and they don't inherently say side-effects.
>

Rich,

You're obviously the wrong guy to argue with regarding what the
"right" way to do stuff in Clojure is, so I'll take your comment to
heart.

At the same time though, I still don't quite get why "when" by
definition evaluates its subexpressions in an implicit "do" (as CL's
WHEN provides an implicit PROGN). There is no point in evaluating
multiple expressions unless they produce side effects: printing to
the screen, writing to disk, etc... Since this is the defined
semantics of "when" it seems this should be a flag that governs how
"when" is used. I understand that there is far less mutability in
Clojure, but that's not to say there aren't side effects.

> I agree 'if' should always have 3 subexpressions, all the more reason
> to use 'when' when the second is going to be nil. Idiomatic Clojure
> code is full of expressions like:
>
> (when there-is-stuff (expr stuff))
>
> Which I read as when there is stuff, some function of it, else
> nothing. There's nothing inherently side-effecting about it.
>

Again here I'm not sure what you mean by "else nothing". If you're
using "when" as an expression, it has to produce a value even when
the test fails. Why not just use "if" and make that value explicit?

> It would certainly be a waste of 'when' in Clojure to reserve it for
> side effects, since so much code contains none.
>

It was 50 years ago today that John McCarthy's description of Lisp
was published in an AI Memo (AIM-8). Here's to McCarthy and Lisp. And
here's to you, RIch, for carrying on the flame.

Aloha,
David Sletten

Tom Faulhaber

unread,

Mar 5, 2009, 1:29:36 AM3/5/09

to Clojure

BTW, cl-format (my Common Lisp format function for Clojure), supports
the ~@R directive for converting Arabic to Roman (but nothing to go
the other way around. I didn't spend too much time thinking about
stylistic issues when I wrote it, but if you're interested you can
compare. It's at http://github.com/tomfaulhaber/cl-format.)

David Sletten

unread,

Mar 5, 2009, 3:41:23 AM3/5/09

to clo...@googlegroups.com

On Mar 4, 2009, at 8:29 PM, Tom Faulhaber wrote:

>
> BTW, cl-format (my Common Lisp format function for Clojure), supports
> the ~@R directive for converting Arabic to Roman (but nothing to go
> the other way around. I didn't spend too much time thinking about
> stylistic issues when I wrote it, but if you're interested you can
> compare. It's at http://github.com/tomfaulhaber/cl-format.)
>

Thanks for the link Tom. I'd read mention of your work somewhere but
hadn't had a chance to look at it yet. That's extremely cool that
you've implemented FORMAT!

Aloha,
David Sletten

Reply all

Reply to author

Forward