On 4 Sep., 01:54, Jonathan Hsieh <j...@cloudera.com> wrote:
> Daniel,
>
> We have tried to keep everything as byte arrays to avoid character encoding
> problems, but it looks like we may have missed some spots.
>
> I've looked at the RawOutputFormat and it doesn't look like the culprit.
>
> I think the bug is in TailSource: it reads lines using a method (readLine)
> which does character set interpretation.
>
> Can you file this as a bug in the JIRA? (issues.cloudera.org)
>
> Thanks,
> Jon.
>
> This sounds like a bug.
>
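
To illustrate the kind of problem Jon describes, here is a minimal Java sketch. It is not Flume's TailSource code: the class RawLineDemo and both helper methods are hypothetical, and the choice of UTF-8 as the decoding charset is an assumption for the demo. It reads the same bytes once as raw bytes split on '\n' and once through a character-decoding readLine, then compares the results.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of Flume.
public class RawLineDemo {

    // Reads one line as raw bytes, up to but not including '\n'.
    // Returns null when the stream is exhausted. No charset is involved.
    static byte[] readRawLine(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                return buf.toByteArray();
            }
            buf.write(b);
        }
        return buf.size() > 0 ? buf.toByteArray() : null;
    }

    // Reads one line through a character decoder (UTF-8 assumed here),
    // then encodes the resulting String back to bytes.
    static byte[] readDecodedLine(InputStream in) throws IOException {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8));
        String line = reader.readLine();
        return line == null ? null : line.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // 0xE9 followed by 'a' is not a valid UTF-8 sequence.
        byte[] data = {(byte) 0xE9, 'a', '\n'};

        byte[] raw = readRawLine(new ByteArrayInputStream(data));
        byte[] decoded = readDecodedLine(new ByteArrayInputStream(data));

        System.out.println("raw bytes preserved:     "
                + Arrays.equals(raw, new byte[] {(byte) 0xE9, 'a'}));
        System.out.println("decoded bytes preserved: "
                + Arrays.equals(decoded, raw));
    }
}
```

The raw path hands back exactly the bytes that were in the input, while the decoded path does not survive the round trip for input that is not valid in the assumed charset. That is why keeping event bodies as byte arrays end to end avoids the issue.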