Naive question on how to capitalize only the first word of a string


Glenn Hoetker

Jun 19, 2017, 8:12:10 PM
to Racket Users
I'm quite new to Racket/LISP, so I hope this isn't breathtakingly obvious. Can someone please tell me the best way to capitalize just the first word in a multiword string? So, given the string "it was a dark and stormy night", I would like to get "It was a dark and stormy night". I see functions for turning everything lower case, everything uppercase or capitalizing each word, but nothing in line with what I hope to do.

> (define it "cat dog")
> (string-titlecase it)
"Cat Dog" ; So close, but not quite "Cat dog" as I want.

Many thanks.


Philip McGrath

Jun 19, 2017, 8:18:02 PM
to Glenn Hoetker, Racket Users
I don't think there's a library function that does what you want, so you'd need to define your own. Here's one way to do it:

(define (capitalize-first-letter str)
  (cond
    [(non-empty-string? str)
     (define first-letter-str
       (substring str 0 1))
     (define rest-str
       (substring str 1 (string-length str)))
     (string-append (string-upcase first-letter-str)
                    rest-str)]
    [else
     ""]))

-Philip



--
You received this message because you are subscribed to the Google Groups "Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to racket-users+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Jack Firth

Jun 19, 2017, 8:19:19 PM
to Racket Users

You'll want to split the string into a list of words and apply the capitalization function to only the first word.

(define (capitalize-first sentence-str)
  (string-join (list-update (string-split sentence-str) 0 string-titlecase)))

Disclaimer: this will do some weird things to the whitespace in sentence-str (e.g. convert newlines to spaces); you might want to tweak that.
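One possible tweak for the whitespace issue, as a minimal sketch: let a regexp pick out the first word in place instead of splitting and rejoining. The name capitalize-first/keep-whitespace is hypothetical, and like the version above it uses string-titlecase, so the rest of the first word is lowercased.

```racket
#lang racket

;; Hypothetical helper, not from the thread: title-case only the first
;; whitespace-delimited word and leave every other character untouched.
(define (capitalize-first/keep-whitespace sentence-str)
  ;; The pattern matches optional leading whitespace plus the first word.
  ;; When the insert argument is a procedure, regexp-replace calls it with
  ;; the matched text and splices the result back in.
  (regexp-replace #px"^\\s*\\S+" sentence-str string-titlecase))

(capitalize-first/keep-whitespace "it was a dark\nand  stormy night")
```

Newlines and repeated spaces survive because nothing after the first word is rebuilt. A string with no words (empty or all whitespace) does not match the pattern and is returned unchanged.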

Neil Van Dyke

Jun 19, 2017, 8:40:45 PM
to Glenn Hoetker, Racket Users
Welcome to Racket!

One intro-to-Racket-compared-to-some-other-languages thing I'll just say
upfront is that you don't want to modify the string itself, not that you
asked to. (Not all Racket strings are mutable. Plus, mutating
introduces a bunch more possibilities for bugs, and for string
operations, we usually are in the "pure functional" school of thought.
There are rare situations in which you'll want to mutate strings, but
that's an advanced topic, for after you're comfortable with idiomatic Racket.)
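A small illustration of that point, as a sketch (the names original and shouted are made up for the example): a string literal in a module is immutable, and string-upcase returns a new string rather than changing its argument.

```racket
#lang racket

(define original "cat dog")

;; String literals produced by the reader are immutable, so calling
;; (string-set! original 0 #\C) would raise a contract error.
(immutable? original)

;; string-upcase builds a new string and leaves its argument alone.
(define shouted (string-upcase original))

(list original shouted)
```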

I'm not aware of this particular procedure in base Racket, so here's one
implementation. Someone new to Racket might like to read through this
procedure and try to figure out the thinking behind how the language and
standard library were used.

(define (string-capitalize str)
  (or (string? str)
      (raise-argument-error 'string-capitalize
                            "string?"
                            str))
  (if (equal? "" str)
      str
      (let ((first-char (string-ref str 0)))
        (if (char-lower-case? first-char)
            (string-append (string (char-upcase first-char))
                           (substring str 1))
            str))))

Note that this assumes that the first character of the string is also
the first character of any first word. If there might be whitespace or
other non-alphabetic characters before a lowercase alphabetic character,
and your policy would be to find the start of the word, then you'd
probably want to document the exact policy, and then code a loop to
check successive characters (until a determining character or the end of
string is found).
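As a rough sketch of such a loop, under one assumed policy (skip anything that is not alphabetic, then upcase the first alphabetic character); both the policy and the name capitalize-first-alphabetic are chosen for illustration only:

```racket
#lang racket

;; Hypothetical policy, chosen for illustration: skip leading characters
;; that are not alphabetic, then upcase the first alphabetic one.
(define (capitalize-first-alphabetic str)
  (define len (string-length str))
  (let loop ([i 0])
    (cond
      ;; Reached the end without finding a letter: nothing to change.
      [(= i len) str]
      ;; Found the determining character: rebuild the string around it.
      [(char-alphabetic? (string-ref str i))
       (string-append (substring str 0 i)
                      (string (char-upcase (string-ref str i)))
                      (substring str (add1 i)))]
      ;; Otherwise keep scanning.
      [else (loop (add1 i))])))

(capitalize-first-alphabetic "  cat dog")
```

A different policy (for example, stopping at the first non-whitespace character whether or not it is a letter) would only change the test in the second cond clause.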

Robby Findler

Jun 19, 2017, 8:44:19 PM
to Neil Van Dyke, Glenn Hoetker, Racket Users
Here's another way to implement it. Fun. :)

#lang racket

(provide
 (contract-out
  [cap-first (-> string? string?)]))

(define (cap-first s)
  (apply
   string
   (for/list ([c (in-string s)]
              [i (in-naturals)])
     (if (= i 0)
         (char-upcase c)
         c))))

(module+ test
  (require rackunit)
  (check-equal? (cap-first "") "")
  (check-equal? (cap-first "Cat dog") "Cat dog")
  (check-equal? (cap-first "cat dog") "Cat dog"))


Robby

Jack Firth

Jun 19, 2017, 8:50:25 PM
to Racket Users
Given how many solutions everyone's giving, this would be a good Rosetta Code task :)

Jon Zeppieri

Jun 19, 2017, 8:55:18 PM
to Glenn Hoetker, Racket Users
Yet another option:

#lang racket

(define (cap-first str)
  (match (string->list str)
    [(cons x xs)
     (list->string
      (cons (char-upcase x) xs))]
    [_
     str]))

Neil Van Dyke

Jun 19, 2017, 9:38:44 PM
to Glenn Hoetker, Racket Users
Robby's answer was more idiomatic Racket, and mine was
idiomatic-Scheme-and-also-OK-Racket, by habit. :)

I'd suggest reading both implementations. As you learn more Racket,
you'll start to get a feel for your preferred linguistic style(s), and
you'll notice different people have a lot more stylistic variation than
just these two. Also, if you're coming from low-level programming:
don't worry about micro-optimizing for performance, yet -- performance
was one head-scratcher that bothered me when I first started learning,
but it's complicated, and comes later.

Robby also added unit tests, setting a proper example from the start.

David Storrs

Jun 20, 2017, 12:16:50 AM
to Neil Van Dyke, Glenn Hoetker, Racket Users
Here's another:

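(define/contract (ucfirst str)
  (-> string? string?)
  ;; (.*) rather than (.+) so that one-character strings also match
  (match (regexp-match #px"^([^a-z]*)(.)(.*)" str)
    [(list _ prefix first-letter rest-of-string)
     (~a prefix (string-upcase first-letter) rest-of-string)]
    ;; no match (e.g. the empty string): return the input unchanged
    [_ str]))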

-> (ucfirst "cat dog")
"Cat dog"
-> (ucfirst "  Cat dog")
"  CAt dog"
-> (ucfirst "1  cat dog")
"1  Cat dog"

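Modifying the contents of the prefix matcher in the regex will let you express whatever policy you want. The one that I've written above is: 'Find the first lower-case Latin character (i.e., a-z) in the string and change it to uppercase. Leave the rest of the string otherwise undisturbed.' Note that under this policy an already-capitalized string such as "Cat dog" becomes "CAt dog", since its first lower-case character is the "a".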

You could trivially extend the function so that you can change policies on the fly by passing in the pattern you want:


(define/contract (ucfirst str #:pat [prefix-pat "[^a-z]*"])
  (->* (string?) (#:pat non-empty-string?) string?)
  ;; (.*) rather than (.+) so that one-character strings also match
  (define pat (pregexp (~a "^(" prefix-pat ")(.)(.*)")))
  (match (regexp-match pat str)
    [(list _ prefix first-letter rest-of-string)
     (~a prefix (string-upcase first-letter) rest-of-string)]
    [_ str]))

Now this works:

(ucfirst "cat dog")   ; "Cat dog"
(ucfirst "Cat dog")  ; "CAt dog" (the first lower-case character is the "a")
(ucfirst "  cat dog")        ; "  Cat dog"
(ucfirst "cat dog" #:pat "[^aeiou]*") ; "cAt dog"  

That last one is a trivial example, but it shows the flexibility you could easily get.

Now, I wouldn't recommend using the actual code written above.  It's a toy example intended to demonstrate a technique, not for production.  It's begging for obscure errors and/or a regex injection attack, since it's compiling a user-supplied string into a regex and putting metacharacters around that string.  Still, it shows the flexibility -- ucfirst means 'upcase the first significant character', but the pattern parameter lets you define what 'significant' means.
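One way to keep that flexibility without compiling caller-supplied text into a regex is to take a character predicate instead. This is only a sketch of a swapped-in technique: the name ucfirst/pred and the #:significant? keyword are made up for illustration, and the default predicate mirrors the "first lower-case character" policy above (though char-lower-case? is Unicode-aware rather than limited to a-z).

```racket
#lang racket

;; Hypothetical variant: the caller passes a character predicate
;; (#:significant? is a made-up keyword) instead of regex source text.
(define/contract (ucfirst/pred str #:significant? [significant? char-lower-case?])
  (->* (string?) (#:significant? (-> char? any/c)) string?)
  ;; Index of the first character the predicate accepts, or #f if none.
  (define idx
    (for/first ([c (in-string str)]
                [i (in-naturals)]
                #:when (significant? c))
      i))
  (if idx
      (string-append (substring str 0 idx)
                     (string (char-upcase (string-ref str idx)))
                     (substring str (add1 idx)))
      str))

(ucfirst/pred "  cat dog")
(ucfirst/pred "cat dog"
              #:significant? (lambda (c) (memv c '(#\a #\e #\i #\o #\u))))
```

Because the policy is an ordinary procedure, a malformed or hostile pattern string can no longer reach the regex compiler; the trade-off is that callers write a predicate rather than a pattern.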





Glenn Hoetker

Jun 20, 2017, 1:15:42 AM
to Racket Users
Wow. In addition to getting my question answered, I learned about 6 other things. Thank you so much, everyone!

Glenn
