Python itself handles this well; you do not need a tokenizer to split text into letters.
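1. Make sure you are using Python 3, not Python 2.7. In Python 3 a str is a sequence of Unicode characters, whereas a plain Python 2.7 str is a sequence of bytes.
2. Read in the file with the correct encoding (often UTF-8, but check how your file was saved).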
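3. If s is a string containing Arabic, any iteration or list operation will process one character (Unicode code point) at a time. Note that diacritics are separate code points, so they come out as their own items. E.g.,
letters = list(s)
or
for x in s:
    print(x)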
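
The three steps above can be sketched roughly as follows. This is only an illustration: the helper name read_text, the sample word, and the temporary file are made up for the demo, and UTF-8 is an assumption. If your file was saved in another encoding (for example Windows-1256, which Python calls cp1256), pass that instead.

```python
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Read a whole text file, decoding it with an explicit encoding.

    read_text is a hypothetical helper; UTF-8 is an assumed default.
    """
    with open(path, encoding=encoding) as f:
        return f.read()


# Demo setup: write a small sample file so the example is self-contained.
sample = "سلام"
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", suffix=".txt", delete=False
) as tmp:
    tmp.write(sample)
    path = tmp.name

s = read_text(path)
os.remove(path)

# A str in Python 3 is a sequence of code points, so both of these
# go through the text one character at a time.
letters = list(s)
for x in s:
    print(x)
```

With your own data, replace the demo setup with s = read_text("yourfile.txt"), passing the encoding your file actually uses.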