How to escape emoji?


Necro mancer

Mar 5, 2016, 6:34:49 AM
to Tasker
Hi,

I intercept messages using Tasker and would like to remove any emojis present.
Not "text" emoji like
:-)
but these:
👍😟

Is there any way to do so? It's a real pain in the a$$.

Juergen Gruen

Mar 5, 2016, 8:22:06 AM
to Tasker

Hi,

Emoticons are in the Unicode range U+1F600 to U+1F64F, so there should be a way to filter them with JavaScript...

Juergen.

Necro mancer

Mar 5, 2016, 6:22:26 PM
to Tasker
Do you know a proper way to do so?

From Google, it seems the emoji range is outside the BMP, and there is no easy way around that.
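
To illustrate what "outside the BMP" means in practice, here is a minimal sketch. It only assumes that JavaScript strings are sequences of UTF-16 code units; the variable names are made up for the example.

```javascript
// A code point above U+FFFF is stored as two UTF-16 code units (a surrogate pair).
var face = "\uD83D\uDE1F"; // U+1F61F WORRIED FACE, written with escapes

var unitCount = face.length;                                  // 2, not 1
var highUnit = face.charCodeAt(0).toString(16).toUpperCase(); // "D83D"
var lowUnit = face.charCodeAt(1).toString(16).toUpperCase();  // "DE1F"

// A BMP character such as "A" occupies a single code unit.
var bmpCount = "A".length; // 1
```

This is why a single \uXXXX escape cannot address an emoji directly: the escape takes exactly four hex digits.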

Juergen Gruen

Mar 6, 2016, 2:47:41 AM
to Tasker

Interesting Topic.

Maybe not the most proper way:

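var emoji = "😟 Hello 😟";

// Match surrogate pairs, i.e. every character outside the BMP.
// (\x takes only two hex digits, so a class like [^\x00-\xFFFF] would also remove all BMP characters above U+00FF.)
emoji = emoji.replace(/[\uD800-\uDBFF][\uDC00-\uDFFF]/g, "");

alert(emoji);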



Working with the surrogate pairs would be way more complicated, I guess. There should be JS libraries for this(?).

On the other hand: isn't it dangerous to strip the emojis? Messages can take on a totally different meaning with different emojis, imho.

Also, I often use just a single "thumbs up" as an answer...


easiuser

Mar 6, 2016, 2:56:55 PM
to Tasker
I don't want to hijack this thread, but I would like to be able to substitute text for them. I have a Tasker profile that reads my messages out loud while driving. With emojis I just get silence.

I put my texts through a series of Search and Replace statements for common abbreviations before announcing them, but can't figure out how to do the same with emojis.

It would be nice to announce "thumbs up", or see you soon "happy face", "kisses", etc.

I would think a speech engine would be better equipped to handle this, but I haven't found one.
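
One way this substitution could be sketched in JavaScript is shown below. The lookup table, its descriptions, and the function name describeEmoji are all hypothetical; a real table would need many more entries.

```javascript
// Hypothetical lookup table: the descriptions are illustrative, not an official list.
var EMOJI_NAMES = {
  "\uD83D\uDC4D": "thumbs up",  // U+1F44D
  "\uD83D\uDE03": "happy face", // U+1F603
  "\uD83D\uDE18": "kisses"      // U+1F618
};

// Replace every surrogate pair: known emoji become their description,
// unknown ones become a space. Runs of whitespace are then collapsed.
function describeEmoji(text) {
  return text.replace(/[\uD800-\uDBFF][\uDC00-\uDFFF]/g, function (pair) {
    var name = EMOJI_NAMES.hasOwnProperty(pair) ? EMOJI_NAMES[pair] : "";
    return name ? " " + name + " " : " ";
  }).replace(/\s+/g, " ").trim();
}
```

The result could then be handed to the speech action instead of the raw message. Note that collapsing whitespace also flattens line breaks, which is probably acceptable for spoken output.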

Juergen Gruen

Mar 9, 2016, 3:43:24 PM
to Tasker
I'm not sure if my approach with the regex is correct (it deletes everything that is not in the BMP (?)).

Here is another suggestion:

A1 Variable Set  [ Name: %text To: 😟 Hello 😃]

There is no other way than to examine each character in the string and look for high and low surrogates, I guess:

    var hs = 0xD800; // start of the high-surrogate range, binary 1101100000000000
    var ls = 0xDC00; // start of the low-surrogate range,  binary 1101110000000000

    var high;
    var low;
    var code;

    for (var i = 0; i < text.length; i++)
    {
        high = text.charCodeAt(i);

        // Is this char code a high surrogate, and is there another char code after it?
        // Unicode characters above U+FFFF are represented by two char codes in JS.
        if (((high >> 10) == 54) && ((i + 1) < text.length))
        {
            // get the next char code
            low = text.charCodeAt(i + 1);

            // Is that char code a low surrogate?
            if ((low >> 10) == 55)
            {
                // calculate the code point:
                // subtract the high-surrogate base and shift ten bits to the left
                high = (high - hs) << 10;
                // subtract the low-surrogate base
                low = low - ls;

                code = high + low + 0x10000;

                // is it an emoticon?
                if ((code >= 0x1F600) && (code <= 0x1F64F))
                    alert("Emoji at " + i + ": " + code.toString(16).toUpperCase());
            }
        }
    }




Needs some more testing...
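
The loop above only reports emoticons. A variant that removes them could look like the sketch below; it uses the same surrogate arithmetic, and the function name stripEmoticons is made up for the example.

```javascript
// Sketch: rebuild the string, skipping surrogate pairs in U+1F600..U+1F64F.
function stripEmoticons(text) {
  var out = "";
  for (var i = 0; i < text.length; i++) {
    var high = text.charCodeAt(i);
    // High surrogates are 0xD800-0xDBFF, low surrogates 0xDC00-0xDFFF.
    if (high >= 0xD800 && high <= 0xDBFF && i + 1 < text.length) {
      var low = text.charCodeAt(i + 1);
      if (low >= 0xDC00 && low <= 0xDFFF) {
        var code = ((high - 0xD800) << 10) + (low - 0xDC00) + 0x10000;
        i++; // the pair is consumed either way
        if (code >= 0x1F600 && code <= 0x1F64F) {
          continue; // emoticon: drop both code units
        }
        out += text.charAt(i - 1) + text.charAt(i); // keep other non-BMP characters
        continue;
      }
    }
    out += text.charAt(i);
  }
  return out;
}
```

Characters outside the emoticon block, such as the thumbs up at U+1F44D, are left untouched by this version.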

Juergen Gruen

Mar 9, 2016, 3:55:53 PM
to Tasker

... and there are three more smileys in the "Miscellaneous Symbols" block, U+2600 to U+26FF:

U+2639, U+263A, U+263B
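
Since these three code points lie inside the BMP, each is a single code unit and a plain character class is enough. A minimal sketch, with a made-up function name:

```javascript
// U+2639 (frowning face), U+263A (smiling face), U+263B (black smiling face)
// are single UTF-16 code units, so no surrogate handling is needed.
function stripBmpSmileys(text) {
  return text.replace(/[\u2639-\u263B]/g, "");
}

var cleaned = stripBmpSmileys("Hi \u263A!"); // "Hi !"
```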


Necro mancer

Mar 12, 2016, 3:51:41 PM
to Tasker


On Wednesday, March 9, 2016 at 10:43:24 PM UTC+2, Juergen Gruen wrote:
I'm not sure if my approach with the regex is correct (it deletes everything that is not in the BMP (?)).


Oh, I thought that was intentional.

Thanks for your effort, I'll try that code.

Necro mancer

Mar 12, 2016, 3:59:50 PM
to Tasker
Eh, that didn't work out.

I've set the %text variable and copy-pasted your code into a JavaScriptlet action; when I run it, it gets stuck on the scriptlet action.

:/

Can you link an XML file I can import directly?

Juergen Gruen

Mar 12, 2016, 5:53:50 PM
to Tasker
...
JavaSriptlet.tsk.xml

Necro mancer

Mar 12, 2016, 6:40:59 PM
to Tasker
Thanks

So the problem is that the string I entered did NOT have an emoji.

It spits out an uncaught error ILLEGAL at line 37.

Juergen Gruen

Mar 13, 2016, 3:12:51 AM
to Tasker
Hi,

It works here, also without emojis. Line 37 is the last line of the script. What string did you test it with?

I prefer using JavaScript instead of JavaScriptlet. See the attached script (some error handling included).
unicode.js

Necro mancer

Mar 13, 2016, 4:02:54 PM
to Tasker
Seems to work fine.

It takes about 2 seconds to run, though. Is it possible to compile this chunk of code and run a precompiled block to help with the running time?

Juergen Gruen

Mar 13, 2016, 9:56:49 PM
to Tasker
Hi,

I finally found a proper regex, I guess:

text = text.replace(/\uD83D[\uDE00-\uDE4F]/g, "");

Should be way faster.

Does anybody know how to make this work in the Tasker Variable Search Replace action?
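
For reference, the surrogate values in that regex follow from the standard UTF-16 arithmetic. A minimal sketch (the helper name toSurrogatePair is made up) that derives them for the two ends of the emoticon range:

```javascript
// Convert a code point above U+FFFF into its UTF-16 surrogate pair (as hex strings).
function toSurrogatePair(codePoint) {
  var offset = codePoint - 0x10000;
  var high = 0xD800 + (offset >> 10);   // top 10 bits of the offset
  var low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits of the offset
  return [high.toString(16).toUpperCase(), low.toString(16).toUpperCase()];
}

var first = toSurrogatePair(0x1F600); // ["D83D", "DE00"]
var last = toSurrogatePair(0x1F64F);  // ["D83D", "DE4F"]
```

Both ends share the high surrogate D83D, which is why a single prefix followed by one low-surrogate range is enough for this block.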


Necro mancer

Mar 14, 2016, 6:40:14 AM
to Tasker
I was trying that, but the prefix \uD83D didn't match.
I tried splitting the emoji into its high and low chars (surrogates), but that didn't work: I always matched either the entire emoji or nothing.

So I started suspecting that the emoji is represented as a whole and not by two separate chars. I tried \u1F601, which obviously didn't work.
Then I tried matching against the emoji itself, and it worked. It also accepts the emoji inside a character class: [😁]

Then I tried to make a class from the first to the last emoji, [😁-🕧], but this spits out an error. This is probably because the range runs backwards: 😁 is U+1F601 and 🕧 is U+1F567, so the start is above the end, just as [Z-0] would be rejected.
A range in ascending order does work: [😁-🙏]



This class covers the entire emoji range

[[😁-🙏][✂-➰][🚀-🛀][Ⓜ-🉑][©-🗿][😀-😶][🚁-🛅][🌍-🕧]]

based on the emoji list from


tested with some random emojis
Task description:

♨A1: Variable Set [ Name:%text To:😁😁
😁😁😥😭😢😜☺🎄🎥🔮🎒#⃣🔤#⃣♡😜📷🍶🍪🍑🐼🐵🌋✈🚁 Do Maths:Off Append:Off ] 
A2: Variable Search Replace [ Variable:%text Search:[[😁-🙏][✂-➰][🚀-🛀][Ⓜ-🉑][©-🗿][😀-😶][🚁-🛅][🌍-🕧]] Ignore Case:Off Multi-Line:Off One Match Only:Off Store Matches In: Replace Matches:On Replace With:k ] 
A3: Flash [ Text:%text Long:Off ] 


*Some emojis aren't supported (on Android?) and were listed as #⃣ as a result.
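
When building ranges like these, it can help to check that the endpoints are in ascending code point order. The sketch below assumes the regex engine compares range endpoints by code point; the helper names are made up for illustration.

```javascript
// Return the code point at the start of s, decoding a surrogate pair if present.
function firstCodePoint(s) {
  var high = s.charCodeAt(0);
  if (high >= 0xD800 && high <= 0xDBFF && s.length > 1) {
    var low = s.charCodeAt(1);
    if (low >= 0xDC00 && low <= 0xDFFF) {
      return ((high - 0xD800) << 10) + (low - 0xDC00) + 0x10000;
    }
  }
  return high;
}

// A range [from-to] is only usable when from does not come after to.
function isAscendingRange(from, to) {
  return firstCodePoint(from) <= firstCodePoint(to);
}

var grin = "\uD83D\uDE01";   // U+1F601
var clock = "\uD83D\uDD67";  // U+1F567
var hands = "\uD83D\uDE4F";  // U+1F64F

var grinToClock = isAscendingRange(grin, clock); // false: U+1F601 > U+1F567
var grinToHands = isAscendingRange(grin, hands); // true
```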



Necro mancer

Mar 14, 2016, 6:55:49 AM
to Tasker


easiuser,

for a way to replace emojis with their text descriptions, check this

Juergen Gruen

Mar 14, 2016, 8:30:00 AM
to Tasker
text = text.replace(/\uD83D[\uDE00-\uDE4F]/g, "");

To clarify: that works with JavaScript...

Necro mancer

Mar 19, 2016, 2:35:51 PM
to Tasker
Yeah, but how do I make it work in Tasker?

Note that there are some letters (from other languages) between
U+00AE (®) and U+203C (‼).
Splitting this group
[©-🗿]
into
[©-®][‼-🗿]

fixes it.

If characters from your language are being removed, work out which two emoji characters those letters fall between, and split the group at those two emoji characters.
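
To narrow down which group is eating a letter, the ranges can be checked numerically. This is only a sketch: the ranges are written out by hand as code point pairs, and the function name coveringRanges is made up.

```javascript
// Return the ranges (as [from, to] code point pairs) that contain codePoint.
function coveringRanges(codePoint, ranges) {
  var hits = [];
  for (var i = 0; i < ranges.length; i++) {
    if (codePoint >= ranges[i][0] && codePoint <= ranges[i][1]) {
      hits.push(ranges[i]);
    }
  }
  return hits;
}

var before = [[0x00A9, 0x1F5FF]];                  // [©-🗿]
var after = [[0x00A9, 0x00AE], [0x203C, 0x1F5FF]]; // [©-®][‼-🗿]

var eAcute = 0x00E9; // "é", a letter between ® and ‼
var hitsBefore = coveringRanges(eAcute, before).length; // 1: the letter would be removed
var hitsAfter = coveringRanges(eAcute, after).length;   // 0: the letter survives
```

A character above U+203C, for example the CJK ideograph U+4E2D, still falls inside [‼-🗿], so that group would need a further split for such scripts.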

The task is attached.

Remove_Emoji.tsk.xml

Juergen Gruen

Mar 19, 2016, 2:55:25 PM
to Tasker


On Saturday, March 19, 2016 at 1:35:51 PM UTC-5, Necro mancer wrote:
Yeah, but how to make it work in Tasker?

Action->Code->JavaScript or JavaScriptlet

Necro mancer

Mar 20, 2016, 6:50:29 AM
to Tasker
I've put it as a scriptlet and it does not seem to work.