Proposal: Introduce string limit function.


Hassan Raza

Nov 19, 2022, 11:20:21 AM
to elixir-lang-core
Hi all,
I come from the Laravel framework, which has a lot of useful functions that I miss in Elixir. One of them is the limit function, and I would like to have it in Elixir.
```
iex> String.limit("elixir", 3)
"eli..."

iex> String.limit("elixir", 7)
"elixir"

iex> String.limit("elixir", 3, "***")
"eli***"
```
This function would be really helpful with longer strings: we can limit a long string and append a trailing string such as "...".

What do you think? If you agree, what name would you suggest?

Thanks,
Hassan



Kip

Nov 19, 2022, 2:12:19 PM
to elixir-lang-core
That it comes from Laravel, not PHP core, may be an indication that it is better implemented in a library? If there is momentum towards adding it to the String module, `String.truncate` would feel more natural to me (it's also the name Ruby on Rails uses, via ActiveSupport).

It's difficult to make guarantees about the printable width, though, since characters like ZWJ (zero-width joiner) and bidirectional text mean that doing this properly is neither simple nor straightforward. For that reason I don't personally think it belongs in Elixir itself.
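
To make the width concern a bit more concrete, here is a minimal sketch (the module name `WidthDemo` and its `counts/1` function are made up for illustration) showing that bytes, codepoints, and graphemes already disagree before any rendering happens. None of these counts is a display width.

```elixir
defmodule WidthDemo do
  # Hypothetical helper: report three different "lengths" of a string.
  # None of them is the rendered width, which only the renderer knows.
  def counts(string) when is_binary(string) do
    [
      bytes: byte_size(string),
      codepoints: length(String.codepoints(string)),
      graphemes: String.length(string)
    ]
  end
end

# "e" followed by U+0301 (combining acute accent): one visible character.
IO.inspect(WidthDemo.counts("e\u0301"))
# => [bytes: 3, codepoints: 2, graphemes: 1]

# Woman + ZWJ + laptop: three codepoints. How many graphemes this counts as
# depends on the Unicode data shipped with your Elixir version.
IO.inspect(WidthDemo.counts("\u{1F469}\u200D\u{1F4BB}"))
```

A truncation function that counts graphemes is therefore still only an approximation of what a reader will see on screen.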

Ben Wilson

Nov 19, 2022, 9:45:35 PM
to elixir-lang-core
This seems reasonably straightforward to implement in your own code base:

```
def truncate(string, length, padding \\ ".") do
  string
  |> String.slice(0, length)
  |> String.pad_trailing(String.length(string), padding)
end
```

Not seeing a strong need to include it in the standard library. Just my $0.02

Zach Daniel

Nov 19, 2022, 11:43:24 PM
to elixir-l...@googlegroups.com
It would be great to come up with some kind of heuristic and/or consistent philosophy on what belongs in the standard library, to guide these discussions. Some kind of rubric could make these kinds of conversations easier or even prevent them entirely. For me, the main guiding principles are whether or not there is exactly one right way to do the thing in question, how ubiquitous the need for it is, and how obvious the implementation is (on the flip side, how much we can protect people from hidden gotchas for which they wouldn't even think to reach for a library).

For example, a correct implementation adds the padding only if the string was actually trimmed, and I'd bet there are lots of suboptimal implementations out there. Ben's above isn't quite right, since the idea is to add the ellipsis only if the string was truncated, and then to add exactly the string provided (not pad it out to the full length of the original string). Since a performant implementation may not be obvious to the less experienced (with Elixir or in general), and this seems like a relatively common operation (for rendering strings in UIs or emails or whatever), I feel like a standard library implementation could be warranted.

Something like this would probably be better, since it avoids checking the string length (a linear-time operation) and also avoids things like multiple slice operations in favor of a single traversal up to "length".

```
def truncate("", 0, _), do: ""
def truncate(_, 0, padding), do: padding

def truncate(string, length, padding) when length > 0 do
  case String.split_at(string, length) do
    {leading, ""} -> leading
    {leading, _} -> leading <> padding
  end
end
```
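
As a rough illustration of the difference described above, the sketch below wraps both versions in a module so they can be run side by side. The names `TruncateCompare`, `pad_version/3`, and `split_version/3` are made up here, the `"..."` default is an assumption, and the two zero-length clauses are left out for brevity.

```elixir
defmodule TruncateCompare do
  # Ben's version: slices, then pads back out to the ORIGINAL length.
  def pad_version(string, length, padding \\ ".") do
    string
    |> String.slice(0, length)
    |> String.pad_trailing(String.length(string), padding)
  end

  # Zach's version: one split, padding appended only when something was cut.
  # The "..." default is an assumption added for this sketch.
  def split_version(string, length, padding \\ "...") when length > 0 do
    case String.split_at(string, length) do
      {leading, ""} -> leading
      {leading, _rest} -> leading <> padding
    end
  end
end

IO.inspect(TruncateCompare.pad_version("hello world", 3))
# => "hel........"
IO.inspect(TruncateCompare.split_version("hello world", 3))
# => "hel..."
IO.inspect(TruncateCompare.split_version("elixir", 7))
# => "elixir"
```

With a six-character input the two happen to agree on `"eli..."`, which is why the padding bug is easy to miss with the examples from the original proposal.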



--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/35628c34-c8c4-4558-a985-de87ec7111d3n%40googlegroups.com.

Kip

Nov 20, 2022, 2:59:30 AM
to elixir-lang-core
> for rendering strings in UIs

This is the bit that concerns me about user expectations. Rendering to a specific width can really only be done by the rendering engine, since characters, kerning and spacing are variable width. The widest character I know of is the single grapheme "﷽". And that doesn't even account for the Unicode characters that take up no space at all.

A minor nit: I think the truncation length is expected to be the total length of the string including the padding. So `leading <> padding` would first need to trim the padding length from `leading`.
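
One way to read that nit, as a sketch only: count the suffix towards the limit, so the result is never longer than `limit` graphemes. The module name `TotalWidthSketch`, the `limit` and `suffix` parameter names, and the choice to cut the suffix itself when it does not fit are all assumptions made for this example.

```elixir
defmodule TotalWidthSketch do
  # Hypothetical variant: the result never exceeds `limit` graphemes,
  # with the suffix counted towards the limit.
  def truncate(string, limit, suffix \\ "...")
      when is_binary(string) and is_integer(limit) and limit >= 0 do
    case String.split_at(string, limit) do
      # Nothing remains past `limit`, so the string already fits.
      {_fits, ""} ->
        string

      {_leading, _rest} ->
        # Reserve room for the suffix; never go below zero.
        keep = max(limit - String.length(suffix), 0)
        # If the suffix alone is longer than the limit, it is cut as well.
        String.slice(string, 0, keep) <> String.slice(suffix, 0, limit)
    end
  end
end

IO.inspect(TotalWidthSketch.truncate("elixir lang", 6))
# => "eli..."
IO.inspect(TotalWidthSketch.truncate("elixir", 6))
# => "elixir"
```

This trades the single-traversal property for the total-length guarantee, since `String.length(suffix)` and the second slice add extra passes.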

José Valim

Nov 20, 2022, 3:33:58 AM
to elixir-l...@googlegroups.com
Good shout on using String.split_at/2 in the implementation, Zach. It is one of the concerns I raised in the original PR and your solution is quite elegant.

Which also brings another point: if the implementation is 6 LOC (I believe the first two clauses are not strictly necessary), then there is even less reason to add it to Elixir.

Zach Daniel

Nov 20, 2022, 10:25:49 AM
to elixir-l...@googlegroups.com
This is the kind of thing I mean when I say having a rubric for standard library candidacy would be useful. I think the number of lines of code shouldn't really be the metric; it should be more of a subjective measure of how difficult it would be for someone else to provide the right implementation on the fly. Regardless of what the rubric looks like, I feel like it could guide lots of discussions on this mailing list.




José Valim

Nov 20, 2022, 10:46:28 AM
to elixir-l...@googlegroups.com
The general rubric is outlined here:

However we can give more leeway to functions compared to features.

Zach Daniel

Nov 20, 2022, 10:56:09 AM
to elixir-l...@googlegroups.com
Yeah, it makes sense that features and functions would have different leeway, and I think functions maybe deserve their own rubric. Primarily because "providing a standard set of utilities to do common and basic things" is already a feature of core, so this isn't really about whether or not that feature set should exist, but about what falls under the blanket of things that can be included in the feature.

I'd put up something like:

Rubric for standard library candidacy:
  1. Does it bring important concepts/features to the community in a way that its effect can only be maximized or leveraged by making it part of the language?
  2. Is this a relatively common use case, and/or do you find yourself repeating this piece of code across multiple code bases?
  3. Is the proper solution non-obvious, i.e., does implementing the function in a performant way involve understanding language internals to a high degree?
If you answered yes to one of the questions above, then your function likely belongs in a library. If you answered yes to two or more, then it likely belongs in the standard library.

We could grade various things against this rubric, like `String.equivalent?/2`, which is only one line of code. But that function is infinitely more useful than a guide somewhere explaining that actually checking string equivalence requires normalizing each string. The solution is very non-obvious, and many wouldn't even think to seek out something better than string1 == string2. So that would pass #2 and #3 on the rubric.

Just some ideas.
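
For readers who haven't run into the `String.equivalent?/2` case, a small sketch of why `==` is not enough. The variable names are illustrative, and the last comparison is only a rough picture of the idea, not a claim about how the standard library implements it.

```elixir
composed = "\u00E9"      # "é" as a single codepoint
decomposed = "e\u0301"   # "e" plus a combining acute accent

# Byte-wise comparison sees two different binaries.
IO.inspect(composed == decomposed)
# => false

# Canonical equivalence treats them as the same text.
IO.inspect(String.equivalent?(composed, decomposed))
# => true

# Roughly the idea: compare the strings after normalizing both.
IO.inspect(String.normalize(composed, :nfd) == String.normalize(decomposed, :nfd))
# => true
```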




Christopher Keele

Nov 21, 2022, 1:53:25 PM
to elixir-lang-core
> Yeah, it makes sense that features and functions would have different leeway, and I think functions maybe deserve their own rubric.

It'd be interesting to see a PR that introduces a new .md file in the language repo (or adds to DEVELOPMENT.md) that provides some guidance here. Then we can continue the discussion about what exactly belongs there at the PR level?


José Valim

Nov 21, 2022, 3:34:57 PM
to elixir-l...@googlegroups.com
We already cover the relevant policies in our README: https://github.com/elixir-lang/elixir

As of now, it also includes a link to the Development page. I don't think a new document would help as it would lead to duplication.

Regarding functions, I think it can be even simpler than what Zach proposed. We have generally accepted any enhancement to existing functions and modules that is straightforward and without corner cases. After all, the biggest issue with truncate is the corner cases across all scripts. For example, I would expect some languages to truncate on the left? That by itself already reframes the conversation towards having both truncate_suffix and truncate_prefix. There are probably more corner cases. So it can be 6 LOC in your app, with several assumptions, but likely many more LOC in Elixir.
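
A minimal sketch of what the left-hand variant might look like, assuming it keeps the last `limit` graphemes and puts a marker in front. The names `LeftTruncateSketch` and `truncate_leading/3` and the `"..."` default are made up here, and this works on logical order only; it says nothing about how bidirectional text is displayed.

```elixir
defmodule LeftTruncateSketch do
  # Hypothetical: keep the last `limit` graphemes and put `marker` in front
  # when something was dropped. `limit` must be positive, because a split
  # position of -0 is just 0 and would keep the whole string.
  def truncate_leading(string, limit, marker \\ "...")
      when is_binary(string) and is_integer(limit) and limit > 0 do
    # A negative position counts from the end of the string.
    case String.split_at(string, -limit) do
      {"", kept} -> kept
      {_dropped, kept} -> marker <> kept
    end
  end
end

IO.inspect(LeftTruncateSketch.truncate_leading("elixir", 3))
# => "...xir"
IO.inspect(LeftTruncateSketch.truncate_leading("elixir", 7))
# => "elixir"
```

Even this small variant already needs its own guard and its own decision about where the marker goes, which is the kind of corner case being described.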

Zach Daniel

Nov 21, 2022, 4:14:47 PM
to elixir-lang-core
Agreed, a new document is not necessary; I was just thinking of an addition to the text about features.

I don’t personally care much about this specific function; my point is that this is a class of conversation we have constantly on the mailing list and, as you yourself highlighted earlier, there are different standards for standard library contributions versus feature contributions in terms of what might be accepted. It would help to have something to guide these discussions, like exactly what you mentioned just now about a lack of corner cases. That sounds useful to have in the README or somewhere else. Then the conversation starts with “what kind of corner cases are there?” and continues along other guidelines, and now we’re all rowing in the same direction.

On this specific case, that’s a good point about leading/trailing, but I think it should still be up for consideration with those two names. None of the other functions are protective of different script sets in any specific way, right?

