Matching to the beginning and end of the stream


Andrew Liles

Nov 16, 2012, 3:03:37 AM
to streamfly...@googlegroups.com
I have a requirement to strip off the beginning & end of a stream.
The content is JSON and might be variable.

I imagine that I could easily achieve the head of the stream filter since the source JSON is going to be like this:

{"responseHeader":{"status":0,"QTime":1,"params":{"wt":"json","q":"*:*"}},"response":{"numFound":17,"start":0,"docs":[

so with a cunning regex I am highly unlikely to find this in the middle of the stream. I then set the RegexModifier to replace it with the empty string. Easy.
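For illustration, here is a minimal sketch of such a header regex, written against plain java.util.regex on an in-memory String rather than against Streamflyer. The class name, the method name and the exact pattern are assumptions modelled on the sample header above; the pattern is anchored with \A so it can only match at the very start of the input, and the params object is matched loosely because its content may vary.

```java
import java.util.regex.Pattern;

class HeaderStripSketch {

    // Hypothetical pattern derived from the sample header in this post.
    // \A pins the match to the start of the input; .*? lazily skips the
    // contents of "responseHeader" up to the start of "response".
    static final Pattern HEADER = Pattern.compile(
            "\\A\\{\"responseHeader\":\\{.*?\\},\"response\":\\{"
                    + "\"numFound\":\\d+,\"start\":\\d+,\"docs\":\\[");

    // Removes the header if (and only if) it sits at the start of the input.
    static String stripHeader(String json) {
        return HEADER.matcher(json).replaceFirst("");
    }

    public static void main(String[] args) {
        String json = "{\"responseHeader\":{\"status\":0,\"QTime\":1,"
                + "\"params\":{\"wt\":\"json\",\"q\":\"*:*\"}},"
                + "\"response\":{\"numFound\":17,\"start\":0,\"docs\":["
                + "{\"id\":1},{\"id\":2}]}}";
        System.out.println(stripHeader(json));
    }
}
```

The same pattern string could presumably be handed to a RegexModifier, but how Streamflyer treats \A at the start of a stream is not something this sketch verifies.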

The more difficult part is the footer, which looks like this:

]}}

which is very likely to exist in the middle of the stream.

You imply that RegexModifier supports all of the regex syntax, but what about "$"? What does it mean in this case? If it meant "end of stream" that would be great, but I imagine that would have been hard to implement.

 more.


Andrew.

rw...@gmx.de

Mar 8, 2013, 2:49:26 PM
to streamfly...@googlegroups.com
Don't worry, "$" works exactly as expected. If the Pattern.MULTILINE flag is set, "$" matches at each line terminator as well as at the end of the stream. If the flag is not set, it matches only at the end of the stream.
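For comparison, this is how the same pattern behaves in plain java.util.regex on an in-memory String. It is only a reference sketch (the class and method names are made up for illustration) and does not exercise Streamflyer itself.

```java
import java.util.regex.Pattern;

class DollarAnchorSketch {

    // Removes "]}}" wherever it is directly followed by a "$" match.
    // flags is either 0 or Pattern.MULTILINE.
    static String stripFooter(String input, int flags) {
        Pattern p = Pattern.compile(Pattern.quote("]}}") + "$", flags);
        return p.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String input = "a]}}\nb]}}\nc]}}";

        // Without MULTILINE only the "]}}" at the very end is removed.
        System.out.println(stripFooter(input, 0));

        // With MULTILINE the "]}}" at the end of every line is removed.
        System.out.println(stripFooter(input, Pattern.MULTILINE));
    }
}
```

One caveat that applies to plain java.util.regex: without MULTILINE, "$" also matches just before a line terminator that ends the input, so "x]}}\n" becomes "x\n". If the footer must sit at the absolute end, \z is the stricter anchor there. Whether Streamflyer treats a trailing line terminator the same way is not covered by this sketch.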

At the end of my post you will find two unit tests that document the behaviour.

I very much apologize for the late reply. Your post was inadvertently moved to the wrong folder. By the way, I will add your question to a new FAQ wiki page.

Rod

    public void testRemovalAtTheEndOfStream_notUsingMultiLineFlag()
            throws Exception {

        // Three long lines, each ending in "]}}"; only the last one ends the stream.
        String startOfStream = StringUtils.repeat("]}}", 10000);
        String entireStartOfString = startOfStream + "]}}\n" + startOfStream
                + "]}}\n" + startOfStream;
        Reader originalReader = new StringReader(entireStartOfString + "]}}");
        // Flags = 0 (no MULTILINE): "$" must match at the end of the stream only.
        Modifier myModifier = new RegexModifier(Pattern.quote("]}}") + "$", 0,
                "", 0, 3);
        Reader modifyingReader = new ModifyingReader(originalReader, myModifier);
        String output = IOUtils.toString(modifyingReader);
        // Only the final "]}}" is removed; those before "\n" stay.
        assertEquals(entireStartOfString, output);
    }

    public void testRemovalAtTheEndOfLine_usingMultiLineFlag() throws Exception {

        // Three lines, each ending in "]}}".
        String startOfStream = StringUtils.repeat("]}}", 3);
        Reader originalReader = new StringReader(startOfStream + "]}}\n"
                + startOfStream + "]}}\n" + startOfStream + "]}}");
        // With MULTILINE, "$" matches at every line end and at the end of the stream.
        Modifier myModifier = new RegexModifier(Pattern.quote("]}}") + "$",
                Pattern.MULTILINE, "", 0, 3);
        Reader modifyingReader = new ModifyingReader(originalReader, myModifier);
        String output = IOUtils.toString(modifyingReader);
        // The trailing "]}}" of each of the three lines is removed.
        assertEquals(startOfStream + "\n" + startOfStream + "\n"
                + startOfStream, output);
    }

rw...@gmx.de

Mar 10, 2013, 10:26:55 AM
to streamfly...@googlegroups.com
Just a word about matching the beginning of a stream: Look-behind constructs like ^ are handled differently by Streamflyer and Java's String.replaceAll(). 

I have just pointed out the difference on the webpage.

As most users would like to use Streamflyer as a memory-optimized alternative to Java's String.replaceAll(), the next release of Streamflyer will no longer differ from String.replaceAll() in this respect.
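As a reference point for the String.replaceAll() side of that comparison, here is a small sketch of how "^" behaves in plain Java. The class and method names are made up for illustration, and the sketch says nothing about how the current Streamflyer release handles the same patterns.

```java
class CaretAnchorSketch {

    // Without MULTILINE, "^" matches only at the beginning of the input.
    static String dropLeadingBracket(String s) {
        return s.replaceAll("^\\[", "");
    }

    // With the embedded (?m) flag, "^" also matches after each line terminator.
    static String dropBracketOnEveryLine(String s) {
        return s.replaceAll("(?m)^\\[", "");
    }

    public static void main(String[] args) {
        String input = "[a\n[b\n[c";
        System.out.println(dropLeadingBracket(input));
        System.out.println(dropBracketOnEveryLine(input));
    }
}
```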

Cheers
Rod

