It just seems like optimizing a case that isn't really very common, at the cost of a penalty for all allocations with elementsize == 1 (not just Vectors, but arrays of any dimension), is not the best way to go.
The copying with substrings happened whenever a substring was passed via ccall, in order to add the nul terminating byte or word. However, I *think* that now only happens if you use Cstring (but I haven't verified that yet).
Most decent C/C++ APIs avoid using nul terminated strings any more anyway (too many cases of security holes with buffer overruns, etc.), and what remains is typically for fairly short things, like names, paths, etc.
I am concerned about both poor memory utilization and *unnecessary* copying. However, there are ways of avoiding almost all of the copying without imposing a tax on all allocations with elementsize == 1.
Taking out the penalty for all elementsize == 1 allocations would not mean that you would always have to do a copy when you pass a string to C. Here is a fairly simple way of avoiding the copy in most cases:
1) Have the String type add a \0, *if* there is room in the memory allocated to the underlying Vector (because of allocating an aligned size)
2) Have the ccall Cstring conversion from String only make a copy if there is no room in the passed vector for a \0 to have been added (which would likely be rather infrequent)
3) Possibly make Cstring into a concrete type, so that you could completely avoid the checking for embedded nuls / copying if you were passing Cstrings frequently to ccall.
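To make steps 1 and 2 concrete, here is a rough C sketch of the idea. It is not the Julia runtime's code: the `bytebuf` struct, the `ALIGNMENT` value of 16, and both function names are made up for illustration, and the embedded-nul check that a Cstring conversion would also need is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a byte vector: `len` bytes in use,
 * `cap` bytes actually allocated (rounded up to ALIGNMENT). */
typedef struct {
    unsigned char *data;
    size_t len;
    size_t cap;
} bytebuf;

#define ALIGNMENT 16 /* assumed allocation granularity */

static size_t round_up(size_t n)
{
    return (n + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
}

/* Step 1: allocate only the aligned size (no extra byte reserved),
 * and write a \0 after the data only if the rounding left room. */
static int bytebuf_init(bytebuf *b, const void *src, size_t len)
{
    if (len > SIZE_MAX - ALIGNMENT)
        return -1;
    size_t cap = len ? round_up(len) : ALIGNMENT;
    unsigned char *p = malloc(cap);
    if (p == NULL)
        return -1;
    if (len)
        memcpy(p, src, len);
    if (len < cap)
        p[len] = 0; /* free terminator in the alignment slack */
    b->data = p;
    b->len = len;
    b->cap = cap;
    return 0;
}

/* Step 2: hand out the buffer itself when the terminator fits;
 * copy only when len == cap. On a copy, *tmp owns the new memory
 * and the caller must free it after the C call returns. */
static const char *bytebuf_cstring(const bytebuf *b, char **tmp)
{
    *tmp = NULL;
    if (b->len < b->cap)
        return (const char *)b->data;
    char *copy = malloc(b->len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, b->data, b->len);
    copy[b->len] = '\0';
    *tmp = copy;
    return copy;
}
```

With a 16-byte granularity, only lengths that are exact multiples of 16 would take the copying path in this sketch; every other length gets its terminator for free.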
I think there are currently some likely holes in code that expects that nul byte to be there in a Vector{UInt8}, because (as Fengyang Wang [@TotalVerb on GitHub] pointed out yesterday on Gitter) you can create a Vector{UInt16} and then reinterpret it as a Vector{UInt8}.
I've verified that in that case, you have a Vector{UInt8} that does not have a nul terminating byte.
There is also a minor bug in the code in _new_array_ (in array.c), where the extra byte is added *after* the length has been checked against the maximum (MAXINTVAL).
(I think that is not a problem in practice though, since MAXINTVAL is pretty large)
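As an illustration of the ordering issue (not the actual array.c code), here is a small C sketch. `SIZE_LIMIT` is a hypothetical stand-in for MAXINTVAL with an arbitrary value, and both function names are invented for this example.

```c
#include <stddef.h>

/* Hypothetical stand-in for the runtime's limit; not the real MAXINTVAL. */
#define SIZE_LIMIT ((size_t)1 << 20)

/* Ordering as described above: the limit is checked first, and the
 * extra byte is added afterwards, so the result can exceed the limit. */
static int size_check_then_add(size_t nel, size_t elsz, size_t *out)
{
    if (elsz != 0 && nel > SIZE_LIMIT / elsz)
        return -1; /* nel * elsz would exceed the limit */
    size_t tot = nel * elsz;
    if (elsz == 1)
        tot += 1; /* can now be SIZE_LIMIT + 1 */
    *out = tot;
    return 0;
}

/* Reordered: the extra byte is accounted for before the size is accepted. */
static int size_add_then_check(size_t nel, size_t elsz, size_t *out)
{
    if (elsz != 0 && nel > SIZE_LIMIT / elsz)
        return -1;
    size_t tot = nel * elsz;
    if (elsz == 1) {
        if (tot >= SIZE_LIMIT)
            return -1; /* no room left for the extra byte */
        tot += 1;
    }
    *out = tot;
    return 0;
}
```

The two versions differ only for a request of exactly `SIZE_LIMIT` one-byte elements, which is why this is unlikely to matter in practice.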
Bumping up to the next allocation size is small potatoes though, compared to the huge amount of memory used to store immutable strings with a structure optimized for growable, mutable, multi-dimensional arrays.
I hope that will be addressed in v0.6.
Thanks for the response!