"substring" is a special case of "subsequence".
"OAR" is a subsequence of "cOronA viRus" but not a substring of it.
"OAR" is a substring of "bOARs" but not a prefix of it.
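To make the three relations concrete, here is a minimal Python sketch.
The function names are illustrative, not taken from any library, and
the comparison is done on code points, which is itself an assumption
(see the next paragraph).

```python
def is_subsequence(needle, haystack):
    """True if the items of needle occur in haystack in order, gaps allowed."""
    remaining = iter(haystack)
    # Each `ch in remaining` consumes the iterator up to the first match,
    # so later characters can only match later positions.
    return all(ch in remaining for ch in needle)


def is_substring(needle, haystack):
    """True if needle occurs in haystack as one contiguous run."""
    return needle in haystack


def is_prefix(needle, haystack):
    """True if needle is a contiguous run starting at position 0."""
    return haystack.startswith(needle)


print(is_subsequence("OAR", "cOronA viRus"))  # subsequence
print(is_substring("OAR", "cOronA viRus"))    # but not a substring
print(is_substring("OAR", "bOARs"))           # substring
print(is_prefix("OAR", "bOARs"))              # but not a prefix
```

Every prefix is a substring and every substring is a subsequence; the
converses fail, which is what the two examples above show.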
I am still in the dark about whether you want to view these things
as sequences of BYTEs, as encoded sequences of CODEPOINTS,
as encoded sequences of base-character+combining characters,
as sequences of grapheme clusters, or quite possibly as sequences
of tokens.
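The choice matters because the same text has a different length, and
different "parts", at each level. A rough Python sketch, assuming UTF-8
as the encoding: the helper combining_groups is hypothetical, and it
only attaches combining marks to the preceding base character, which
is an approximation and not full grapheme-cluster segmentation.

```python
import unicodedata


def combining_groups(text):
    """Group each base character with the combining marks that follow it."""
    groups = []
    for ch in text:
        if groups and unicodedata.combining(ch) != 0:
            groups[-1] += ch      # combining mark: attach to previous base
        else:
            groups.append(ch)     # base character: start a new group
    return groups


word = "cafe\u0301"               # "café" written as e + COMBINING ACUTE ACCENT

as_bytes = word.encode("utf-8")   # assumption: UTF-8 encoding
as_codepoints = list(word)
as_groups = combining_groups(word)

print(len(as_bytes), len(as_codepoints), len(as_groups))

# Is "e" part of the word? It depends on the level you look at.
print("e" in as_codepoints)       # the bare code point is present
print("e" in as_groups)           # but no group is a bare "e"
```

So even a question as simple as "does this contain an e?" has no answer
until the level is fixed.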
A lesson I learned back in 1980 is that "STRINGS ARE WRONG".
As Alan Perlis said, "The string is a stark data structure and
everywhere it is passed there is much duplication of process.
It is a perfect vehicle for hiding information." Since then experience
has only reinforced this: if you are looking inside a string, then you
probably should not be using a string.
Take a step backwards.
What do these strings (binaries) *represent*?
What is the logical structure of the things you are representing?
Why do you want an operation that is completely insensitive
to that logical structure?
Can two different strings represent the same abstract concept?
What, if anything, does *part* of a string represent?
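As one illustration of these questions, suppose (purely as an
assumption, since I do not know what your data is) that the strings
represent calendar dates. A small Python sketch, with date formats
picked only for the example:

```python
from datetime import date, datetime

iso_text = "2020-03-31"
dmy_text = "31/03/2020"

# Two different strings, one abstract value.
from_iso = date.fromisoformat(iso_text)
from_dmy = datetime.strptime(dmy_text, "%d/%m/%Y").date()

print(iso_text == dmy_text)   # the strings differ
print(from_iso == from_dmy)   # the dates they represent do not

# A perfectly good substring of the text that represents no part of a date.
fragment = iso_text[2:6]      # "20-0"
print(fragment)

# The meaningful parts come from the structure, not from character positions.
print(from_iso.year, from_iso.month, from_iso.day)
```

Once the text is converted to a value with real structure, operations
such as "same date?" or "which month?" are asked of the structure, and
the substring question mostly stops coming up.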
On Tue, 31 Mar 2020 at 02:45, I Gusti Ngurah Oka Prinarjaya