An Alternative String library for Arduino


Matthew Ford

May 31, 2020, 3:30:40 PM
to devel...@arduino.cc

I would like to offer the SafeString library (https://www.forward.com.au/pfod/ArduinoProgramming/SafeString/index.html),
which I have developed, for inclusion in the Arduino core as an alternative to the current String class (WString.cpp).

The current string processing options in Arduino are C character arrays (using strcat, strcpy, etc.), which are a major source of program crashes, and the Arduino standard String class (contained in the WString.cpp/.h files), which leads to program failure due to memory fragmentation and excessive memory usage, is slower, and contains bugs.

Both Sparkfun and Adafruit advise against using the Arduino String class.
Sparkfun's comment on the String class is “The String method (capital 'S') in Arduino uses a large amount of memory and tends to cause problems in larger scale projects. ‘S’trings should be strongly avoided in libraries.”
Adafruit's comment on the String class is “In most usages, lots of other little String objects are used temporarily as you perform these (String) operations, forcing the new string allocation to a new area of the heap and leaving a big hole where the previous one was (memory fragmentation).”

SafeStrings have the following features:-

SafeStrings are easy to debug. SafeStrings provide detailed error messages, including the name of SafeString in question, to help you get your program running correctly.
SafeStrings are safe and robust. SafeStrings never cause reboots and are always in a valid usable state, even if your code passes null pointers or '\0' arguments or exceeds the available capacity of the SafeString.
SafeString programs run forever. SafeStrings completely avoid the memory fragmentation that will eventually cause your program to fail, and they never make extra copies when passed as arguments.
SafeStrings are faster. SafeStrings do not create multiple copies or short lived objects nor do they do unnecessary copying of the data.


Rob Tillaart

Jun 2, 2020, 3:20:33 AM
to Arduino Developers

@Matthew

Please check this link on how to publish a library for the Library Manager - https://github.com/arduino/Arduino/wiki/Library-Manager-FAQ
You might need to make a Bitbucket/GitHub/GitLab/etc. account, but that also allows for easy issue tracking and/or enhancements.


--
You received this message because you are subscribed to the Google Groups "Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to developers+...@arduino.cc.
To view this discussion on the web visit https://groups.google.com/a/arduino.cc/d/msgid/developers/4c1834b8-5007-5b0c-a218-073ff7fae6f1%40forward.com.au.

Collin Kidder

Jun 2, 2020, 7:06:03 PM
to developers
That's a good first step. But, I think he and others would be more
interested in folding it into the core as he said. The key there, I
suppose, would be whether it works across all platforms and does not
cause any regressions. A good way to determine that might be to have
it as a library in the manager for a bit then discuss whether it makes
sense? Though, I think to gain inclusion in the core it might be nice
if it had the same syntax and class name as the existing class. That
is, it probably should only be in the core if it can be named String
and all existing code that uses strings just magically works but
better. So, that's a question for the OP: Can this new code be
directly used in place of the existing string class with no changes at
all?

Matthew Ford

Jun 2, 2020, 11:56:24 PM
to devel...@arduino.cc, Collin Kidder
>> whether it works across all platforms
I believe it does, nothing fancy here.

 >>it might be nice if it had the same syntax and class name as the existing class.

Unfortunately the syntax, while very similar, is not identical. A number of changes were necessary to make SafeStrings user proof. The differences are detailed in the tutorial 
https://www.forward.com.au/pfod/ArduinoProgramming/SafeString/index.html

The most noticeable ones are:-

To create a SafeString you need to use a macro, either at the global or method local level.
createSafeString(msgStr, 5);  // a SafeString called msgStr with space for 5 chars
createSafeString(msgStr, 10, "A0");  // space for 10 chars and initialize with "A0"
The createSafeString macro creates the char array, of the requested size, and wraps it in a SafeString object, msgStr, e.g. it expands to
 char msgStr_SAFEBUFFER[10+1]; // add 1 for terminating '\0'
 SafeString msgStr(sizeof(msgStr_SAFEBUFFER),msgStr_SAFEBUFFER,"A0","msgStr"); // adds the object name, msgStr, for error messages and debugging
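
To make the expansion above concrete, here is a minimal sketch in plain C++ (no Arduino headers) of the same pattern: a macro that declares a char array and wraps it in an object that refuses to overflow. FixedStr and MAKE_FIXED_STR are hypothetical stand-ins invented for illustration; they are not the SafeString class or its macro, and the real library may behave differently in the details.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a bounded string wrapper (not the real SafeString).
class FixedStr {
public:
  // bufSize includes the byte reserved for the terminating '\0'.
  FixedStr(std::size_t bufSize, char* buf, const char* init, const char* name)
      : buf_(buf), cap_(bufSize - 1), len_(0), name_(name) {
    buf_[0] = '\0';
    append(init);  // an initializer that does not fit leaves the string empty
  }
  // Appends only if the result fits; otherwise leaves the contents unchanged.
  bool append(const char* s) {
    if (s == nullptr) return false;
    std::size_t n = std::strlen(s);
    if (len_ + n > cap_) return false;
    std::memcpy(buf_ + len_, s, n + 1);  // copies the '\0' as well
    len_ += n;
    return true;
  }
  const char* c_str() const { return buf_; }
  std::size_t length() const { return len_; }
  std::size_t capacity() const { return cap_; }
  const char* name() const { return name_; }  // kept for error messages
private:
  char* buf_;
  std::size_t cap_;
  std::size_t len_;
  const char* name_;
};

// Declares the backing array and the wrapper in the current scope.
#define MAKE_FIXED_STR(name, size, init) \
  char name##_BUFFER[(size) + 1];        \
  FixedStr name(sizeof(name##_BUFFER), name##_BUFFER, init, #name)

void macroExample() {
  MAKE_FIXED_STR(msgStr, 10, "A0");  // space for 10 chars, initialized to "A0"
  assert(msgStr.length() == 2 && msgStr.capacity() == 10);
  assert(std::strcmp(msgStr.name(), "msgStr") == 0);
  assert(msgStr.append("12345678"));  // exactly fills the 10 chars
  assert(!msgStr.append("x"));        // would overflow: rejected
  assert(std::strcmp(msgStr.c_str(), "A012345678") == 0);
}
```

Because the macro declares two names in the enclosing scope, the wrapper lives exactly as long as the array it wraps, whether that scope is global or a function body.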


testStr1 = "a" + 10; // is not supported because it would create temporary objects. Use instead  testStr1 = "a"; testStr1 += 10;
concat() returns a SafeString& (not a bool) so you can chain concats  e.g. testStr1.concat('a').concat(10);

str[2] = .. is not supported because the '\0' error cannot be caught. In SafeStrings use setCharAt( ) instead.

In WString str[2] = '\0'; results in str having an invalid internal state, i.e.  str.length() != strlen(str.c_str())
SafeStrings are ALWAYS valid. If an operation fails the SafeString is left unchanged and an error message is output (if setOutput( ) has been called)
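
As a rough illustration of why a checked setter can keep the string valid where a raw str[2] = '\0' cannot, here is a small sketch on a plain char array. setCharAtChecked is a hypothetical free function written for this example; the signature and behaviour of the library's own setCharAt( ) may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical free-function version of a checked character write.
// Refuses to write '\0' or to write outside the current contents, so
// strlen(s) is the same before and after every call.
bool setCharAtChecked(char* s, std::size_t idx, char c) {
  if (s == nullptr || c == '\0') return false;  // would shorten the string
  if (idx >= std::strlen(s)) return false;      // outside current contents
  s[idx] = c;
  return true;
}

void setCharAtExample() {
  char str[] = "abcdef";
  assert(setCharAtChecked(str, 2, 'X'));
  assert(!setCharAtChecked(str, 2, '\0'));  // rejected, contents unchanged
  assert(!setCharAtChecked(str, 6, 'Y'));   // index 6 is the terminator
  assert(std::strcmp(str, "abXdef") == 0);
}
```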

String to number conversions actually check for a valid number and return false if invalid. The converted number is returned in the arg. e.g.
createSafeString(str, 7, " 5.3.6 ");
 float f = 0.0;
 bool rtn = str.toFloat(f);
will return false and leave f unchanged 

In WString f = str.toFloat() returns 5.3 and if it returns 0.0 you cannot tell if the string was '0.0' or '0.0a' or 'abc'
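
The difference can be sketched in plain C++ with strtod and its end pointer. parseFloatStrict is a hypothetical helper written for this example, not the library's toFloat(); it only shows one way to accept a number when the whole string is valid and to leave the output untouched otherwise.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>

// Hypothetical helper: converts s to a float only if the whole string
// (ignoring surrounding whitespace) is a valid number.
// Returns false and leaves `out` untouched otherwise.
bool parseFloatStrict(const char* s, float& out) {
  if (s == nullptr) return false;
  char* end = nullptr;
  double v = std::strtod(s, &end);  // skips leading whitespace itself
  if (end == s) return false;       // nothing was converted, e.g. "abc"
  while (std::isspace(static_cast<unsigned char>(*end))) ++end;
  if (*end != '\0') return false;   // trailing junk such as ".6" or "a"
  out = static_cast<float>(v);
  return true;
}

void parseExample() {
  float f = 1.5f;
  assert(!parseFloatStrict(" 5.3.6 ", f));  // second '.' makes it invalid
  assert(f == 1.5f);                        // f is unchanged on failure
  assert(parseFloatStrict(" 0.0 ", f));     // a genuine zero is reported as valid
  assert(f == 0.0f);
  assert(!parseFloatStrict("0.0a", f));
  assert(!parseFloatStrict("abc", f));
}
```

Note that strtod also accepts forms such as exponents and "inf", so a stricter check would be needed if those should be rejected.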

IndexOf type methods return a size_t type, i.e. always >= 0, so tests for char found idx need to be changed to 
 idx < str.length() // char found

In WString -1 (int) is returned for the idx if the char not found
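
A short sketch of the unsigned convention, on a plain c-string. indexOfChar is a hypothetical helper, and returning the string length for "not found" is an assumption made for this example; the only thing taken from the description above is that a found index satisfies idx < length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper mirroring the convention described above:
// returns the index of c in s, or strlen(s) when c is not present.
std::size_t indexOfChar(const char* s, char c) {
  std::size_t len = std::strlen(s);
  if (c == '\0') return len;  // treat '\0' as "not found"
  const char* p = std::strchr(s, c);
  return p ? static_cast<std::size_t>(p - s) : len;
}

void indexOfExample() {
  const char* str = "A0=5";
  std::size_t idx = indexOfChar(str, '=');
  assert(idx < std::strlen(str));   // char found
  assert(idx == 2);
  idx = indexOfChar(str, 'z');
  assert(idx == std::strlen(str));  // char not found
}
```

With an unsigned return type the old test idx >= 0 is always true, which is why code ported from String needs the idx < length form.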

void test(SafeString s);  is not supported because supporting it would require creating a complete copy of the string.
Arguments must be references, SafeString&.  If you try to call that method, then you get the compiler error message
    SafeString(const SafeString& other ); // You must declare SafeStrings function arguments as a reference, SafeString&,  e.g. void test(SafeString& strIn)
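
The by-reference rule, together with the chained concat() mentioned earlier, can be sketched as follows. BoundedStr and appendBang are hypothetical names used only for this example. The sketch uses a deleted copy constructor, which is one common way to get a compile-time error on pass-by-value; it is not necessarily how SafeString implements it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical wrapper, not the real SafeString: it borrows a caller-owned
// buffer and cannot be copied, so it can only be passed by reference.
class BoundedStr {
public:
  BoundedStr(char* buf, std::size_t bufSize) : buf_(buf), cap_(bufSize - 1) {
    buf_[0] = '\0';
  }
  BoundedStr(const BoundedStr&) = delete;             // no pass-by-value
  BoundedStr& operator=(const BoundedStr&) = delete;  // no whole-object copy
  // Returns *this so calls can be chained; an overflowing append is ignored.
  BoundedStr& concat(const char* s) {
    std::size_t len = std::strlen(buf_);
    std::size_t n = std::strlen(s);
    if (len + n <= cap_) std::memcpy(buf_ + len, s, n + 1);
    return *this;
  }
  const char* c_str() const { return buf_; }
private:
  char* buf_;
  std::size_t cap_;
};

static_assert(!std::is_copy_constructible<BoundedStr>::value,
              "BoundedStr must not be copyable");

// A by-value parameter, void appendBang(BoundedStr s), could be declared,
// but calling it with an existing object would fail to compile.
void appendBang(BoundedStr& s) { s.concat("!"); }  // works on the caller's object

void referenceExample() {
  char buf[8];
  BoundedStr s(buf, sizeof(buf));
  s.concat("a").concat("b");  // chained appends
  appendBang(s);
  assert(std::strcmp(s.c_str(), "ab!") == 0);
  s.concat("12345");          // 3 + 5 > capacity 7: ignored
  assert(std::strcmp(s.c_str(), "ab!") == 0);
}
```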

Extra SafeString methods:-

stoken() indexes a SafeString into tokens via delimiters, nextToken() tokenizes and removes one token at a time
prefix() and the -= operator add to the front of a SafeString
non-blocking read(Stream&) and readUntil(Stream&)
SafeString implements the Print interface so you can append text by calling the well-known print methods on a SafeString object.  This gives you all the print formatting options.
SafeString::setOutput(Print&);  enables detailed error messages and says where to send them. The debug() method dumps the SafeString to the set output.
SafeString::setVerbose(false); suppresses printing the SafeString's contents in error msgs and debug() output
Commenting out the setOutput( ) statement turns off all error messages and debug output.  Error checks are still performed and the SafeStrings are still always valid, just no messages.
Commenting out a define in SafeString.h removes all the error message code, but not the error checks, for small footprint.

I think String should be retained but deprecated and something like SafeString be included in the core as the preferred alternative.

Paul Stoffregen

Jun 3, 2020, 4:50:19 AM
to devel...@arduino.cc
On 5/31/20 12:30 PM, Matthew Ford wrote:

> SafeStrings are safe and robust. SafeStrings never cause reboots and are always in a valid usable state


This SafeString usage crashes:  (tested on Arduino Uno and Teensy 4.1, with minor edit to remove include of pgmspace.h)


#include "SafeString.h"
int count = 0;
void setup() {
  Serial.begin(9600);
  while (!Serial && millis() < 2500) ; // wait for Arduino Serial Monitor
  createSafeString(str, 40, "SafeString Test");
  Serial.println(str);
  delay(100);
}
SafeString & myMessage(int num) {
  createSafeString(str, 40, "loop count = ");
  str += num;
  return str;
}
void loop() {
  Serial.println(myMessage(count));
  count = count + 1;
  delay(500);
}


But this equivalent Arduino String usage runs properly:


int count = 0;
void setup() {
  Serial.begin(9600);
  while (!Serial && millis() < 2500) ; // wait for Arduino Serial Monitor
  String str = "Arduino String Test";
  Serial.println(str);
  delay(100);
}
String myMessage(int num) {
  String str = "loop count = ";
  str += num;
  return str;
}
void loop() {
  Serial.println(myMessage(count));
  count = count + 1;
  delay(500);
}


As you continue to develop your library, please use File > Preferences to set "Compiler warnings" to "All".

Matthew Ford

Jun 3, 2020, 7:57:27 AM
to devel...@arduino.cc, Paul Stoffregen
Nice example Paul,
Any suggestions on how to avoid this problem?

And let me work on getting rid of those warnings, thanks

Kurt Stutsman

Jun 3, 2020, 2:42:23 PM
to devel...@arduino.cc
This is not valid code. The fact that it works for Arduino's string is irrelevant. It's a dangling reference to a destroyed string object. The second code returns a copy which is valid code and should work with a SafeString, too.

Paul Stoffregen

Jun 3, 2020, 2:48:31 PM
to devel...@arduino.cc
On 6/3/20 11:42 AM, Kurt Stutsman wrote:
> This is not valid code.

Maybe you should read a bit about C++ copy constructors, specifically "return by value" use, before you call this usage invalid.


Kurt Stutsman

Jun 3, 2020, 3:02:22 PM
to devel...@arduino.cc
I have been developing in C++ for 20 years. I know about return by value. I was referring to your first snippet specifically this:


SafeString & myMessage(int num) {
  createSafeString(str, 40, "loop count = ");
  str += num;
  return str;
}

This is the well known dangling reference return error. You are returning a reference to a stack-local variable.

Matthew Ford

Jun 3, 2020, 3:52:16 PM
to devel...@arduino.cc, Paul Stoffregen
I think Paul has a valid point
 My comment "SafeStrings never cause reboots and are always in a valid usable state"  was not correct.
I missed this use case, which someone coming from using Strings might expect to 'just work' since it compiles like before.

Given this is a long standing C problem of a method passing back a pointer to a local stack variable,
perhaps the best that can be done is to tell the users to turn on All Warnings and make sure the warning about this is not hidden in a morass of others.

To this end I am working on cleaning up all the other warnings and updating the tutorial to cover this point in detail.



Kurt Stutsman

Jun 3, 2020, 4:12:04 PM
to devel...@arduino.cc, Paul Stoffregen
I think the lack of a copy constructor is problematic. Perhaps that is what drove Paul to use a reference to begin with. I'm guessing you've made the copy constructor private to prevent people from copying strings around unnecessarily and producing more code/poor performance. Not sure I agree with that, but if you really want to do that I would say provide a way of forcing a copy to return from a function. However if the users don't know about this function, then they will still be inclined to work around it such as with pointer/reference returns. Providing a copy constructor makes it work safely and intuitively albeit with perhaps poorer performance.

I think it's better to allow flexibility and document "proper" usage to avoid extra copies and dangling references.

Billy

Jun 3, 2020, 4:56:27 PM
to devel...@arduino.cc
This thread is turning into a very inefficient C++ code review.
We can't really make efficient pointwise analysis of the API if it's only distributed as a .zip file from a personal website.
Maybe throw it up on GitHub to manage Issues and Pull Requests?

I have some problems with the SafeString API as it stands.
The macros are not obviously macros, `operator-=` doesn't do something that models subtraction, etc...
but I don't think this is the place to work that out.

I would prefer it take a less generic name that indicates its salient feature and niche.



--
ǝnɥɐuop ʎllıq

Jacques

Jun 4, 2020, 4:37:53 AM
to Developers, pa...@pjrc.com, matthe...@forward.com.au, Matthe...@forward.com.au
No Matthew, his example is wrong by design (or semantic), see my answer to him. Unless I missed something, no need to modify SafeString for this.

Jacques

Jun 4, 2020, 4:37:53 AM
to Developers
Paul,

I think that the issue is not in SafeString:

- myMessage returns a reference to a stack variable in your SafeString example
- myMessage returns a value in your String example

These are definitely not the same things. The bug is in your example ;-) , not in SafeString.

William Westfield

Jun 4, 2020, 5:13:56 PM
to devel...@arduino.cc

> I think that the issue is not in SafeString:
>
> - myMessage returns a reference to a stack variable in your SafeString example
> - myMessage returns a value in your String example
>
> These are definitely not the same things. The bug is in your example ;-) , not in SafeString.

It’s a bug in the DESIGN of SafeString, because it creates a local variable in a spot where normal Strings create just a copyable object.


// createSafeString(str, 40, "test");
// expands in the pre-processor to
//   char str_SAFEBUFFER[40+1];
//   SafeString str(sizeof(str_SAFEBUFFER),str_SAFEBUFFER,"test","str");

In theory it’s only bad when mis-used.  In practice, it’s supposed to be a replacement for Strings, and having a function return a created string is a common thing…

I’m not sure that I’m convinced that we couldn’t get most of the advantages of SafeString by adding a “maximum length” parameter to the constructors (and maybe eliminating “+” ?) (although I like the debugging additions.)  AFAIK, it’s the constant reallocation of space for Strings that causes most of the problems.

BillW/WestfW

Matthew Ford

Jun 5, 2020, 7:20:35 PM
to devel...@arduino.cc
Hi BillW,

Perhaps thinking of this as SafeCStr  (i.e. a replacement for C str methods) would be a better approach.
You want the memory-footprint reliability of C str methods but with more safety and ease of use.
The approach I took was to take a C str (char[]) and wrap it in an object that would protect it from buffer overflows and ensure it was always correctly terminated.  Then add convenience methods.

Both Adafruit and Sparkfun advise against using Arduino Strings because the memory fragmentation they create eventually results in failure.  At present the only alternative is C str methods.
I code consumer devices based on Arduino that run experiments over long periods of time.  If there is a reboot the whole experiment has to be restarted and the customers get very upset.  Failure is not an option.
So until now I have completely avoided using Strings and struggled with writing 'correct' C str statements.

>> It’s a bug in the DESIGN of SafeString, because it creates a local variable
Being able to create method local variables was by design as it lets you safely process characters and then completely recover all the stack memory on the method's return.
In the proposed library objects are only created, and sized, globally for the life of the program or locally on the stack,  just like a normal char[]=" ".  Like a normal char[], if you pass back a pointer to one created on the local stack, things fail.  Fortunately there is a compiler warning for that case.

>> having a function return a created string is a common thing
Once you allow on the fly creation of copyable (in the C++ sense) objects then you are back to memory fragmentation.  In this library you pass in a reference arg to the return string from the caller.
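
The caller-supplies-the-result pattern can be sketched in plain C++. This version deliberately uses a raw char buffer and snprintf rather than the SafeString API, so it shows the ownership idea only; the function name myMessage is borrowed from Paul's example and the signature is an assumption made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// The caller owns the buffer, so nothing returned outlives its storage.
// Returns false if the message had to be truncated to fit.
bool myMessage(int num, char* out, std::size_t outSize) {
  int n = std::snprintf(out, outSize, "loop count = %d", num);
  return n >= 0 && static_cast<std::size_t>(n) < outSize;
}

void callerExample() {
  char msg[41];  // 40 chars + '\0', lives in the caller's frame
  assert(myMessage(7, msg, sizeof(msg)));
  assert(std::strcmp(msg, "loop count = 7") == 0);

  char tiny[8];
  assert(!myMessage(7, tiny, sizeof(tiny)));  // truncated, still terminated
  assert(std::strcmp(tiny, "loop co") == 0);
}
```

Nothing is allocated on the heap and no reference to a local escapes, at the cost of the caller having to choose a size up front.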

>> adding a “maximum length”
Setting a maximum length for String would not really solve the problems.  You can use reserve() to get something like that now.  The real problem with String is the dynamic heap allocations that naturally occur when using the convenience copying features of Strings.  Once you ban those, you have to ban passing by value, returning values or references, sequences of + str + str, automatic conversion of " " to String for arguments, etc.

The proposed library for inclusion in the Arduino core does ban those while keeping all the other useful features of Strings and adding detailed error messages to help the user find their problems.

The bottom line is that the present String class is not reliable for microprocessor use due to its on-the-fly creation of new String objects using malloc().  Once you ban that you will get something like what I am proposing.


Juraj Andrássy

Jun 6, 2020, 4:04:43 AM
to Developers, matthe...@forward.com.au, Matthe...@forward.com.au
Hello

The CStringBuilder class from my StreamLib uses the well-known Print class functions to build a c-string.

It is constructed with a char array as a parameter, so the user is aware of the scope of the char array variable.

char buff[150];
CStringBuilder sb(buff, sizeof(buff));

example

the library is in Library Manager

J.A

Matthew Ford

Jun 14, 2020, 12:12:02 AM
to Developers
SafeString is now on GitHub, thanks to  Va_Tech_EE, and available via the Arduino Library Manager

Matthew Ford

Jun 14, 2020, 12:13:36 AM
to Developers

Matthew Ford

Nov 12, 2020, 6:36:56 AM
to devel...@arduino.cc

V2.0.1 of the SafeString library has been released and is available from the Arduino Library Manager.  The tutorial at https://www.forward.com.au/pfod/ArduinoProgramming/SafeString/index.html has been updated.

V2 adds wrapping of existing char* and char[] data in SafeStrings so they can be safely worked on and the changes reflected directly in the underlying char[]. This avoids the need to pass SafeString references as arguments and makes it easier to integrate with third-party libraries.

V2 also adds typing shortcuts for the macros: cSF() for createSafeString(), cSFA() for createSafeStringFromCharArray(), cSFP() for createSafeStringFromCharPtr() and cSFPS() for createSafeStringFromCharPtrWithSize().

The SafeString_OBD.ino example parses OBD data, supplied from a third party library, and illustrates cSFA(), cSFP() and cSFPS( ).  Wrapping a char* in a SafeString makes it easy to process char* args.  Wrapping a char* with specified capacity, or wrapping a char[], makes it easy to build results using SafeStrings.

The SafeString_ReadFrom_WriteTo.ino example shows how to set up method 'static' SafeString variables.

The library and tutorial include multiple example sketches for parsing commands from c-strings, Arduino Monitor or telnet, including processing of backspaces and auto processing after a short typing timeout.

Because the createSafeStringFrom..() macros wrap existing c-string data, it is possible, but not advisable, to intermix calls to SafeString methods and unsafe c-string methods, like strcat( ). Typically you would do all your processing using SafeString methods, either in a method or within a code block { }.  However, if you do intermix SafeString method calls with c-string methods, the wrapped SafeStrings remain valid and safe.
Before executing each method of a wrapped SafeString:-
a) the underlying c-string is re-terminated at the SafeString capacity() and
b) strlen is called to resynchronize the SafeString length to the length of the underlying c-string.
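
Steps a) and b) can be sketched in plain C++ as follows. WrappedStr is a hypothetical class written for this example and is not the SafeString code; it only shows the re-terminate-then-strlen idea on a wrapped char[].

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical wrapper around an existing char[]; not the SafeString code.
class WrappedStr {
public:
  WrappedStr(char* buf, std::size_t bufSize)
      : buf_(buf), cap_(bufSize - 1), len_(0) { resync(); }
  std::size_t length() { resync(); return len_; }
  bool append(const char* s) {
    resync();
    std::size_t n = std::strlen(s);
    if (len_ + n > cap_) return false;  // would overflow: leave unchanged
    std::memcpy(buf_ + len_, s, n + 1);
    len_ += n;
    return true;
  }
private:
  void resync() {
    buf_[cap_] = '\0';         // a) re-terminate at capacity
    len_ = std::strlen(buf_);  // b) recompute length from the c-string
  }
  char* buf_;
  std::size_t cap_;
  std::size_t len_;
};

void resyncExample() {
  char raw[8] = "ab";
  WrappedStr s(raw, sizeof(raw));
  assert(s.length() == 2);
  std::strcat(raw, "cd");     // unsafe call behind the wrapper's back
  assert(s.length() == 4);    // length picked up on the next method call
  assert(!s.append("efgh"));  // 4 + 4 > capacity 7
  assert(s.append("efg"));
  assert(std::strcmp(raw, "abcdefg") == 0);
}
```

The resync keeps the wrapper consistent with the buffer, but it cannot undo damage if the unsafe call itself wrote past the end of the array.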

Suggestions, criticisms and bug reports welcome.
