readable streams, stack usage and unshift()

53 views
Skip to first unread message

Марк Коренберг

unread,
Jun 8, 2014, 12:40:03 PM6/8/14
to nod...@googlegroups.com
Example:
=========================
{createServer} = require 'net'

log = console.log.bind(console)

srv = createServer (s)->
  buffered_text = ''
  _try_read_line = ->
    log '_try_read_line called'

    # effect is the same if we setup 'data' event...
    binary_data = s.read()
    if binary_data is null
      return

    # count how many bytes=chars we have in temporary buffer
    incomplete_line_length = buffered_text.length
    buffered_text += binary_data.toString('ascii')
    separator_index = buffered_text.indexOf('\r\n')
    if separator_index is -1
      # read incomplete line next time
      return
    line = buffered_text.slice(0, separator_index)
    log 'Handling line', line
    buffered_text = ''
    # we cannot push text back to buffer, at least because ascii converts \x00 to \x20.
    unread_binary_part = binary_data.slice(separator_index + 2 - incomplete_line_length)
    log 'calling unshift'
    s.unshift(unread_binary_part)
    log 'unshift complete'

  s.on 'readable', _try_read_line
srv.listen(8001)
====================
Now, suppose, someone send 4 lines with on .send() syscall. So, we received it as one big chunk. And console output will be:
====================
_try_read_line called
Handling line LINE1
calling unshift
_try_read_line called
Handling line LINE2
calling unshift
_try_read_line called
Handling line LINE3
calling unshift
_try_read_line called
Handling line LINE4
calling unshift
_try_read_line called
unshift complete
unshift complete
unshift complete
unshift complete
====================

This mean that after calling .unshift(), _try_read_line() is called immediatelly. So, hacker may send many short lines in one big chunk and gain stack overflow.

If I do some tricks with process.onNextTick(), results will not be stable, it is not guaranteed order of calling such scheduled jobs and IO handlers, so my stream may be corrupted if my postponed unshift() will be called AFTER  _try_read_line, triggered by IO event.
Now, the question: How I can re-write this program to be stack-friendly ? Also, note, that after reading some specific line, we should unshift() and read unshifted data in another function. And also, I don not want re-make binary data buffer by hands. Also, example above is simplified function without any anti-dos checks.

Denys Khanzhyiev

unread,
Jun 9, 2014, 2:33:25 PM6/9/14
to nodejs


--
Job board: http://jobs.nodejs.org/
New group rules: https://gist.github.com/othiym23/9886289#file-moderation-policy-md
Old group rules: https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines
---
You received this message because you are subscribed to the Google Groups "nodejs" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nodejs+un...@googlegroups.com.
To post to this group, send email to nod...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/nodejs/c795ffc2-a300-483d-bc8d-858a82e33609%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

greelgorke

unread,
Jun 10, 2014, 3:32:37 AM6/10/14
to nod...@googlegroups.com
here is the official example of using unshift in a parser: http://nodejs.org/api/stream.html#stream_readable_unshift_chunk

recomendation here is to implement a Transform stream instead, if you are unshifting often. There are some stream implementations that handle splitting already, i.E. https://www.npmjs.org/package/split
Reply all
Reply to author
Forward
0 new messages