Proposal: New function on List module

97 views
Skip to first unread message

Julian Somoza

unread,
Jul 15, 2022, 7:58:59 AM7/15/22
to elixir-lang-core
Hi everyone, I was working on a suggestion to add a new function on the List module in order to find out if a given sublist is inside another list.

We currently have List.starts_with?/2 but this only match the beginning:

iex(1)> List.starts_with?([1, 2, 3], [1, 2])
true

Besides, we have MapSet.subset, but it ignores the elements order:

iex(1)> MapSet.subset?(MapSet.new([1, 2]), MapSet.new([1, 2, 3]))
true

iex(1)> MapSet.subset?(MapSet.new([2, 1]), MapSet.new([1, 2, 3]))        
true

I suggest a new function that works like List.starts_with?/2 but matches the entire list, either at the beginning, middle, or end of the list.

iex(1)> List.includes_sublist?([1,2,3,4,5], [2,3])
true

iex(1)> List.includes_sublist?([1,2,3,4,5], [1,2])
true

iex(1)> List.includes_sublist?([1,2,3,4,5], [3,4,5])
true

iex(1)> List2.includes_sublist?([1,2,3,4,5], [1,5])  
false

I'm sharing with you also the function code:


I aim to complement the List.starts_with?/2. This function could be the same or even more useful because it finds the ordered sublist also in the middle or in the end.

Looking forward for your comments,
BR,
Julian Somoza.


eksperimental

unread,
Jul 15, 2022, 8:07:46 AM7/15/22
to elixir-l...@googlegroups.com
Hi Julian.
It think it is a good addition.
It is also analog to String.contains?/2

On Fri, 15 Jul 2022 04:58:59 -0700 (PDT)
Julian Somoza <julian...@gmail.com> wrote:

> Hi everyone, I was working on a suggestion to add a new function on
> the List module in order to find out if a given sublist is inside
> another list.
>
> We currently have *List.starts_with?/2* but this only match the
> beginning:
>
> iex(1)> List.starts_with?([1, 2, 3], [1, 2])
> true
>
> Besides, we have *MapSet.subset*, but it ignores the elements order:
>
> iex(1)> MapSet.subset?(MapSet.new([1, 2]), MapSet.new([1, 2, 3]))
> true
>
> iex(1)> MapSet.subset?(MapSet.new([2, 1]), MapSet.new([1, 2, 3]))
> true
>
> I suggest a new function that works like *List.starts_with?/2* but

Aleksei Matiushkin

unread,
Jul 15, 2022, 8:38:34 AM7/15/22
to elixir-l...@googlegroups.com
I have not benchmarked it, but I believe the recursive approach would be better

```
defmodule MyList do
 def contains?(list, sublist, strict? \\ false)

 def contains?(_, [], _), do: true
 def contains?([], _, _), do: false
 def contains?([h | t], [hh | _] = sub, strict?) when h != hh, do: not strict? and contains?(t, sub, strict?)
 def contains?([h | t], [h | tt], strict?), do: contains?(t, tt, true) or contains?(t, [h | tt], strict?)
end
```


>
> I aim to complement the List.starts_with?/2. This function could be
> the same or even more useful because it finds the ordered sublist
> also in the middle or in the end.
>
> Looking forward for your comments,
> BR,
> Julian Somoza.
>
>



--
Aleksei MatiushkinSoftware Engineer - R&D
 
 


8 Devonshire Square, London, EC2M 4PL, United Kingdom
Torre Mapfre, Planta 22, Marina, 16-18, 08005 Barcelona, Spain

  








LinkedIn    Twitter    YouTube
 
Kantox Limited is a UK private company with registered company number 07657495 and registered address at 8 Devonshire Square, London EC2M 4PL, United Kingdom. We are authorised with the UK Financial Conduct Authority (FCA) under the Payment Service Regulation 2017 as a Payments Institution (FRN 580343) for the provision of payment services and with HMRC as a Money Service Business Registration No.12641987.
Kantox European Union, S.L.  is a Spanish private company with tax ID number B67369371 and registered address at Torre Mapfre, Planta 22, Marina, 16-18, 08005 Barcelona, Spain. Kantox is authorized by the Bank of Spain, with registration number 6890, which is the supervisor of the Spanish banking system along with the European Central Bank. Additionally, we are supervised by SEPBLAC, the Supervisory Authority for the prevention of money laundering and terrorist financing in Spain.
KANTOX is the Controller for the processing of data in accordance with the GDPR and LOPDGDD for the purpose of maintaining a commercial relationship. You may exercise your rights of access and rectification, portability, restriction and opposition by writing to KANTOX to the email: gd...@kantox.com. You have your right to make a complaint at www.aepd.es.  

Julian Somoza

unread,
Jul 15, 2022, 9:32:10 AM7/15/22
to elixir-l...@googlegroups.com
Thanks for your replies,

yes, it could be an String.contains? analog:

[1,2,"3",4] |> Enum.join |> String.contains?("34")
[1,:a,"3",4] |> Enum.join |> String.contains?("a3")

About the recursive version, it seems to have best performance:

image.png

image.png

José Valim

unread,
Jul 15, 2022, 10:16:25 AM7/15/22
to elixir-lang-core
Thanks for the proposal.

The issue with contains? is that it is slightly ambiguous in that it is unclear if List.contains?(list, [1, 2]) is meant to be positive for [[1, 2]] or [1, 2]. In Strings there is no such ambiguity. So I would personally avoid contains? in favor of something clearer. Therefore I would love to see if other languages have this function and what they call it. :)

Regarding the implementation, I also think one of the several string lookup techniques could be used here. The recursive approach is better but it has O(m*n) worst case complexity. The "bad character rule" from Boyer–Moore shouldn't add too much complexity and can provide a meaningful improvement.

Zach Daniel

unread,
Jul 15, 2022, 10:21:37 AM7/15/22
to elixir-l...@googlegroups.com
An interesting question that should be defined that highlights the main difference between MapSet and List: what is the result of `List.includes_sublist?([1, 2, 3], [1, 1, 1])`? If the answer is `true` then I think the function should still be called `List.subset` because that is what its really determining (i.e the implication is that this operation treats the lists as sets). Alternatively, `List.all_contained?(subject, search)` may be clearer, and lends itself to further additions like `List.any_contained?` and `List.none_contained?`.


On Fri, Jul 15, 2022 at 10:16 AM, José Valim <jose....@dashbit.co> wrote:
Thanks for the proposal.

The issue with contains? is that it is slightly ambiguous in that it is unclear if List.contains?(list, [1, 2]) is meant to be positive for [[1, 2]] or [1, 2]. In Strings there is no such ambiguity. So I would personally avoid contains? in favor of something clearer. Therefore I would love to see if other languages have this function and what they call it. :)

Regarding the implementation, I also think one of the several string lookup techniques could be used here. The recursive approach is better but it has O(m*n) worst case complexity. The "bad character rule" from Boyer–Moore shouldn't add too much complexity and can provide a meaningful improvement.

On Fri, Jul 15, 2022 at 3:32 PM Julian Somoza <julian.somoza@gmail.com> wrote:
Thanks for your replies,

yes, it could be an String.contains? analog:

[1,2,"3",4] |> Enum.join |> String.contains?("34")
[1,:a,"3",4] |> Enum.join |> String.contains?("a3")

About the recursive version, it seems to have best performance:





El vie, 15 jul 2022 a las 9:38, Aleksei Matiushkin (<aleksei.matiushkin@kantox.com>) escribió:
I have not benchmarked it, but I believe the recursive approach would be better

```
defmodule MyList do
 def contains?(list, sublist, strict? \\ false)

 def contains?(_, [], _), do: true
 def contains?([], _, _), do: false
 def contains?([h | t], [hh | _] = sub, strict?) when h != hh, do: not strict? and contains?(t, sub, strict?)
 def contains?([h | t], [h | tt], strict?), do: contains?(t, tt, true) or contains?(t, [h | tt], strict?)
end
```

To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-core+unsubscribe@googlegroups.com.


--
Aleksei MatiushkinSoftware Engineer - R&D
 
 


8 Devonshire Square, London, EC2M 4PL, United Kingdom
Torre Mapfre, Planta 22, Marina, 16-18, 08005 Barcelona, Spain

  








LinkedIn    Twitter    YouTube
 
Kantox Limited is a UK private company with registered company number 07657495 and registered address at 8 Devonshire Square, London EC2M 4PL, United Kingdom. We are authorised with the UK Financial Conduct Authority (FCA) under the Payment Service Regulation 2017 as a Payments Institution (FRN 580343) for the provision of payment services and with HMRC as a Money Service Business Registration No.12641987.
Kantox European Union, S.L.  is a Spanish private company with tax ID number B67369371 and registered address at Torre Mapfre, Planta 22, Marina, 16-18, 08005 Barcelona, Spain. Kantox is authorized by the Bank of Spain, with registration number 6890, which is the supervisor of the Spanish banking system along with the European Central Bank. Additionally, we are supervised by SEPBLAC, the Supervisory Authority for the prevention of money laundering and terrorist financing in Spain.
KANTOX is the Controller for the processing of data in accordance with the GDPR and LOPDGDD for the purpose of maintaining a commercial relationship. You may exercise your rights of access and rectification, portability, restriction and opposition by writing to KANTOX to the email: gdpr@kantox.com. You have your right to make a complaint at www.aepd.es.  

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-core+unsubscribe@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-core+unsubscribe@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-core+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/CAGnRm4JYnayA0QnDTqwJrYUT4b8tokk8bpD2UD9%2BGP0LX%3DgDQw%40mail.gmail.com.

Julian Somoza

unread,
Jul 15, 2022, 11:07:29 AM7/15/22
to elixir-l...@googlegroups.com
Hi Zach, absolutely no.

List.includes_list?([1, 2, 3], [1, 1, 1])
false

I like the idea behind any_contained? and none_contained? 

I think all_contained? is not clear about the strict order of the elements...

Probably think on the function name is hardest than the logic behind... :)



El vie, 15 jul 2022 a las 11:21, Zach Daniel (<zachary....@gmail.com>) escribió:
An interesting question that should be defined that highlights the main difference between MapSet and List: what is the result of `List.includes_sublist?([1, 2, 3], [1, 1, 1])`? If the answer is `true` then I think the function should still be called `List.subset` because that is what its really determining (i.e the implication is that this operation treats the lists as sets). Alternatively, `List.all_contained?(subject, search)` may be clearer, and lends itself to further additions like `List.any_contained?` and `List.none_contained?`.


On Fri, Jul 15, 2022 at 10:16 AM, José Valim <jose....@dashbit.co> wrote:
Thanks for the proposal.

The issue with contains? is that it is slightly ambiguous in that it is unclear if List.contains?(list, [1, 2]) is meant to be positive for [[1, 2]] or [1, 2]. In Strings there is no such ambiguity. So I would personally avoid contains? in favor of something clearer. Therefore I would love to see if other languages have this function and what they call it. :)

Regarding the implementation, I also think one of the several string lookup techniques could be used here. The recursive approach is better but it has O(m*n) worst case complexity. The "bad character rule" from Boyer–Moore shouldn't add too much complexity and can provide a meaningful improvement.

On Fri, Jul 15, 2022 at 3:32 PM Julian Somoza <julian...@gmail.com> wrote:
Thanks for your replies,

yes, it could be an String.contains? analog:

[1,2,"3",4] |> Enum.join |> String.contains?("34")
[1,:a,"3",4] |> Enum.join |> String.contains?("a3")

About the recursive version, it seems to have best performance:





To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.


--
Aleksei MatiushkinSoftware Engineer - R&D
 
 


8 Devonshire Square, London, EC2M 4PL, United Kingdom
Torre Mapfre, Planta 22, Marina, 16-18, 08005 Barcelona, Spain

  








LinkedIn    Twitter    YouTube
 
Kantox Limited is a UK private company with registered company number 07657495 and registered address at 8 Devonshire Square, London EC2M 4PL, United Kingdom. We are authorised with the UK Financial Conduct Authority (FCA) under the Payment Service Regulation 2017 as a Payments Institution (FRN 580343) for the provision of payment services and with HMRC as a Money Service Business Registration No.12641987.
Kantox European Union, S.L.  is a Spanish private company with tax ID number B67369371 and registered address at Torre Mapfre, Planta 22, Marina, 16-18, 08005 Barcelona, Spain. Kantox is authorized by the Bank of Spain, with registration number 6890, which is the supervisor of the Spanish banking system along with the European Central Bank. Additionally, we are supervised by SEPBLAC, the Supervisory Authority for the prevention of money laundering and terrorist financing in Spain.
KANTOX is the Controller for the processing of data in accordance with the GDPR and LOPDGDD for the purpose of maintaining a commercial relationship. You may exercise your rights of access and rectification, portability, restriction and opposition by writing to KANTOX to the email: gd...@kantox.com. You have your right to make a complaint at www.aepd.es.  

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/l5mjnis5.050a125c-ff42-494a-a8c5-27e5415c052e%40we.are.superhuman.com.

Zach Daniel

unread,
Jul 15, 2022, 3:50:09 PM7/15/22
to elixir-l...@googlegroups.com
Ah, sorry I misunderstood. The names i proposed are not good :)


Julian Somoza

unread,
Jul 19, 2022, 12:19:29 PM7/19/22
to elixir-l...@googlegroups.com
I'm working on the Boyer-Moore-Horspool algorithm but Aleksei's Naïvi algorithm solution still being faster as they don't need to use module functions like Enum.at nor Map.has_index?

I'm refactoring my Boyer-Moore code to avoid any module function.

image.png

Reply all
Reply to author
Forward
0 new messages