[PSR-17] Define where the file position indicator (cursor) ends up to ensure Stream compatibility

258 views
Skip to first unread message

Martijn van der Ven

unread,
Dec 28, 2018, 7:07:31 PM12/28/18
to PHP Framework Interoperability Group
There seems to be some disagreement on where the file position indicator (cursor) ends up upon the creation of a Stream instance. I have done some minor testing and put the results on GitHub. This can lead to incompatibilities when doing something like:

echo (new StreamFactory)->createStream('Hello!')->read(6);

  • createStream() seems to be the most contested. The cursor goes at the start or end of the string. Even more surprising (to me) is that Diactoros will result in a different state depending on whether you use its own factory or the tuupola one. (Again showing how compatibility may be an issue here!)
  • createStreamFromResource() seems to be agreed upon by all implementations: keep the cursor wherever it was at in the original resource. Even without specifically being defined by the PSR this makes a lot of sense to me, and I will propose adding a test for this to the (official) unit tests.
  • createStreamFromFile() has one of the tested implementations do its own thing: berlioz/http-message copies the contents of the file into a new stream and puts the cursor at the end. Every other implementations puts the cursor at the start, matching where it would be if fopen() is called on a file.
Of course the (official) unit tests that go with the PSR-17 interfaces do not have tests for any of this as it isn’t covered by the PSR itself.

The default for createStream() is a coin toss to me. I see no clear argument for either position. But I do think the PSR should define a location. This does not sound like something that can just be errata’d in, and defining behaviour now will lead to breaking changes in implementations :( Not sure what the way is to start addressing this.

For *FromResource() there seem to be good reasons to just leave the cursor be where it is, and I believe this can be mentioned by the PSR. As all implementations do this already, maybe this can be an errata of some kind?

For *FromFile() I don’t know what can be defined, other than saying it should follow what fopen() does by default. For files this will mean the cursor position is set to the start of the stream, I am not sure what the cursor does for other protocols/wrappers/schemes.

Ideas?


Woody Gilk

unread,
Dec 29, 2018, 11:31:57 AM12/29/18
to PHP Framework Interoperability Group
Good report, Martijn!

From my perspective, it would make sense that the position would be always be set to the end of the stream, as if the stream had just been written, mirroring the fopen(), fwrite() sequence that would normally be done.

I'm not quite sure on the bylaws about level of recommendation, but an errata to the following would cover this:

Streams SHOULD have their internal pointer set to the end of the underlying resource.

FIG, would having this as an errata be acceptable?

--
You received this message because you are subscribed to the Google Groups "PHP Framework Interoperability Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to php-fig+u...@googlegroups.com.
To post to this group, send email to php...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/php-fig/508ab231-63e2-485c-9a9a-0ff3554815ca%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Martijn van der Ven

unread,
Dec 29, 2018, 12:22:08 PM12/29/18
to PHP Framework Interoperability Group
On Saturday, 29 December 2018 17:31:57 UTC+1, Woody Gilk wrote:
From my perspective, it would make sense that the position would be always be set to the end of the stream, as if the stream had just been written, mirroring the fopen(), fwrite() sequence that would normally be done.

I'm not quite sure on the bylaws about level of recommendation, but an errata to the following would cover this:

Streams SHOULD have their internal pointer set to the end of the underlying resource.

Do you imagine this for all Stream instances or just those created with createStream()?

I can see a problem with trying to move the cursor within others as the underlying file/resource may not be seekable (thus something like fseek($handle, 0, SEEK_END) wouldn’t work) or because for a true streaming resource the end of the data stream may not be known yet (e.g. live streaming). This is why my initial thought was that all three methods may require separate cursor behaviour.

Larry Garfield

unread,
Dec 29, 2018, 2:34:13 PM12/29/18
to php...@googlegroups.com
I think this may technically be a PSR-7 issue, not PSR-17, or perhaps it's an
area that overlaps both. I ran into this issue myself when working on
ApiProblem. and it took me a while to figure out that the cursor was not where
I expected it to be. :-)

A SHOULD errata sounds like the right solution, I agree. What that SHOULD
should be, I'm not sure yet; I largely defer to the PSR-7/17 WGs and Editors.

--Larry Garfield

On Saturday, December 29, 2018 10:31:17 AM CST Woody Gilk wrote:
> Good report, Martijn!
>
> From my perspective, it would make sense that the position would be always
> be set to the end of the stream, as if the stream had just been written,
> mirroring the fopen(), fwrite() sequence that would normally be done.
>
> I'm not quite sure on the bylaws about level of recommendation, but an
> errata to the following would cover this:
>
> Streams SHOULD have their internal pointer set to the end of the underlying
> resource.
>
> FIG, would having this as an errata be acceptable?
> --
> Woody Gilk
> https://shadowhand.me
>
>
> On Fri, Dec 28, 2018 at 6:07 PM Martijn van der Ven <
>
> martijn....@gmail.com> wrote:
> > There seems to be some disagreement on where the file position indicator
> > (cursor) ends up upon the creation of a Stream instance. I have done some
> > minor testing and put the results on GitHub
> > <https://github.com/Zegnat/php-psr17-test-cursor>. This can lead to
> > incompatibilities when doing something like:
> >
> > echo (new StreamFactory)->createStream('Hello!')->read(6);
> >
> > - createStream() seems to be the most contested. The cursor goes at
> > the start or end of the string. Even more surprising (to me) is that
> > Diactoros will result in a different state depending on whether you use
> > its
> > own factory or the tuupola one. (Again showing how compatibility may be
> > an
> > issue here!)
> > - createStreamFromResource() seems to be agreed upon by all
> > implementations: keep the cursor wherever it was at in the original
> > resource. Even without specifically being defined by the PSR this makes
> > a
> > lot of sense to me, and I will propose adding a test for this to the
> > (official) unit tests.
> > - createStreamFromFile() has one of the tested implementations do its
> > own thing: berlioz/http-message copies the contents of the file into a
> > new
> > stream and puts the cursor at the end. Every other implementations puts
> > the
> > cursor at the start, matching where it would be if fopen() is called on
> > a
> > file.
> >
> > Of course the (official) unit tests
> > <https://github.com/http-interop/http-factory-tests> that go with the
signature.asc

Martijn van der Ven

unread,
Dec 30, 2018, 3:26:09 PM12/30/18
to PHP Framework Interoperability Group
Just to continue on my earlier report, I have created a project to run the http-factory-tests against as many PSR-17 providers as I could find. In the end there were 14 packages I found testable, resulting in 18 different PSR-17 & PSR-7 combinations to actually test. See php-psr17-http-factory-tests-runner for details.

By running a modified version of the tests, one that includes tests for StreamInterface::tell(), I can now confirm the following:
  1. All implementations keep the cursor wherever it is in the resource when StreamFactoryInterface::createStreamFromResource() is used.
  2. berlioz/http-message is the only implementation (out of 14) to move the cursor to the end when StreamFactoryInterface::createStreamFromFile() is used. All other implementations have the cursor at the start of the Stream. (This is probably because this is where fopen() puts it.)
  3. Implementations are very split on where the cursor goes when StreamFactoryInterface::createStream() is used. Depending on how you count the wrapper implementations by http-interop and tuupola the split is almost 50/50.

The following implementations put the cursor at the start (position 0) of the stream for StreamFactoryInterface::createStream():
  • bulldog/http-factory (Guzzle)
  • chillerlan/php-httpinterface
  • http-interop/http-factory-diactoros
  • http-interop/http-factory-guzzle
  • http-interop/http-factory-slim
  • sunrise/http-factory
  • viserio/http-factory
  • zendframework/zend-diactoros

The following implementations put the cursor at the end (position 29 in the test) of the stream for StreamFactoryInterface::createStream():
  • berlioz/http-message
  • chiron/http
  • nyholm/psr7
  • rancoud/http
  • tuupola/http-factory (Guzzle, Nyholm, Slim, Diactoros)

Any implementations not on either list had other test failures keeping them from performing correctly.

I am happy to further assist the WG wherever I can!

Rasmus Schultz

unread,
Jan 7, 2019, 3:33:32 AM1/7/19
to PHP Framework Interoperability Group
I'm pretty sure this is a PSR-17 issue, as it deals with creation.

PSR-7 does not specify constructors, so it probably shouldn't specify anything about creation.

Rasmus Schultz

unread,
Jan 7, 2019, 3:57:00 AM1/7/19
to PHP Framework Interoperability Group
> Streams SHOULD have their internal pointer set to the end of the underlying resource.

I have to strongly disagree here, and would suggest:

Stream MUST have their internal pointer set of the start of the underlying resource.

Please consider existing implementations here.

Diactoros rewinds streams created from strings:


It leaves streams created from a file path with the pointer at zero:



Emitters of existing libraries do not rewind streams:




Introducing a requirement to move the file-pointer to the end when creating a stream from a file path is a serious breaking change versus any existing emitter implementation I could find - so mandating a change at this point could create serious problems.

It also has a negative impact on performance in the most common scenario, where you're just opening a file so you can attach it as the body of a Response - you'll end up unnecessarily seeking to the end of the file, just so you can rewind it again and then read it. So this could be harmful to performance of the majority use-case.

Based on this, I'd suggest the following amendments to the specification of StreamFactoryInterface:

1. createStream() MUST return a stream with the internal file-pointer at 0.
2. createStreamFromFile() MUST return a stream with the internal file-pointer at 0.
3. createStreamFromResource() MUST NOT modify the internal file-pointer.

Both createStream() and createStreamFromFile() create Stream instances from existing content, so they should work consistently, and the majority use-case for opening a stream with existing content is to read/transmit that content, e.g. in the body of a Response.

As for createStreamFromResource(), we can't/shouldn't make any assumptions - if the user decides to open the stream on their own, rather than using one of the other two methods, we should assume they know what they're doing. The resource could be a streaming resources from another HTTP server, for example, so doing anything with the file-pointer could either break something (if the file stream isn't repeatable) and/or have a potentially critical performance impact (such as accidentally reading the whole stream twice.)

Rasmus Schultz

unread,
Jan 7, 2019, 10:56:03 AM1/7/19
to PHP Framework Interoperability Group
> This does not sound like something that can just be errata’d in, and defining behaviour now will lead to breaking changes in implementations

I have to disagree with this assertion.

There's no breaking change versus a previous, incorrect specification - the problem here is a simple omission, not an error in the specification.

The interface won't change, so this isn't a breaking change to the package either, and should be versioned either as a bugfix (1.0.1) or a feature (1.1.0).

Yes, this will render some existing implementations incompatible with the specification - that's okay, these packages will need bugfix releases, and will then be compatible with both versions of the PSR package.

No, you won't be able to look at the package and tell from it's requirements whether it's been updated with regards to the corrected specification - but that's probably okay too, as this will be a quick and easy fix for most packages. (Packages that don't satisfy the updated specification simply had bugs, we just didn't know before, and that's normal - that's what bugfix releases are for.)

I understand you strictly could make an argument that this is a breaking change - but it's much easier and simpler for everyone to just treat this as a bugfix.

The only alternative is an entirely new PSR, which is much harder and messier for all vendors and consumers involved and just sounds like a meaningless waste of time.

Some of the existing implementations will be rendered "buggy", but bugfix releases will need to be published either way, and a bugfix is a much smoother transition for everyone. Again, there's no breaking change versus a previous specification - it's an omission, not an error.

Lastly, it would surely be better if everyone gets these fixes automatically with a composer update - as opposed to forcing everyone to do the whole dependency song-and-dance just to get what is essentially a simple bugfix.

Let's correct this mistake and move on, please.

Tobias Nyholm

unread,
Jan 7, 2019, 3:44:34 PM1/7/19
to PHP Framework Interoperability Group
PSR-7 nor PSR-17 specify where the cursor should be positioned and I don't think it should either.

Sure, one would think it makes sense to have the pointer at the beginning of the stream because "that is why I will start reading". But that does not take middleware into consideration. A middleware that write stuff to the end of the stream would never like the cursor to be at the beginning...

Other edge cases to consider are non-seekable streams or just non-rewindable?

I think we should not try to amend a PSR and specify a "default behavior" because:
- Since specifying a "default behavior" for this is close to impossible (just see Woody's and Rasmus' short discussion)
- Neither currently PSR mention it
- Thanks to Martijn research we can see that the implementations are rather spitted as well

The consumer of a stream must always make sure to move the cursor before the are using a stream. (Or use __toString())

Rasmus Schultz

unread,
Jan 9, 2019, 2:45:07 PM1/9/19
to PHP Framework Interoperability Group
> Sure, one would think it makes sense to have the pointer at the beginning of the stream because "that is why I will start reading". But that does not take middleware into consideration. A middleware that write stuff to the end of the stream would never like the cursor to be at the beginning...

That's a marginal use-case.

It's bordering on fiction, really - which middleware would try to append to the body of an existing stream? You can't just append to either HTML, JSON or most other body-types - you'd just end up with a corrupted response. Any middleware that's going to write to a stream is most likely going to create an empty stream itself.

The majority use-case is the body goes to an emitter, which is going to read the stream, end-to-end.

We have to optimize for that.

Two of the three stream factory-methods defined by PSR-17 *create* the stream, and therefore it's important to know what state they're created in - the resource is just another object, which happens to have an offset property, and for mutable objects such as streams, what use is a factory that returns objects in an undefined/unpredictable state?

Note that an emitter can't work around this issue, because not all streams are seekable, and some aren't event repeatable. An emitter can't make any assumptions - it has to assume the stream is ready to ready.

For the last of the PSR-17 stream factory methods, where the resource handle is provided to the factory-method, it's safe to assume you know what state your resource is in, since you created it yourself - in every case, a predictable result is an absolute must.

The stream factory is largely unusable as it is.

> specifying a "default behavior" for this is close to impossible

For PSR-7 it is, because no part of the spec deals with creation.

For PSR-17, it's very easy to specify as I did for the three stream-methods.


> Thanks to Martijn research we can see that the implementations are rather spitted as well

That's the whole point - that's why we need to address this.

> The consumer of a stream must always make sure to move the cursor before the are using a stream.  (Or use __toString())

Neither of those are real options.

As said, you may not be able to move the cursor at all - not all streams are seekable. Some streams aren't repeatable. And __toString() is absolutely not an option in the case of an emitter, if you're going to send large binary files.

There's no way around it.

This was a serious oversight, and we have to amend.

Martijn van der Ven

unread,
Jan 10, 2019, 7:09:12 AM1/10/19
to PHP Framework Interoperability Group
It sounds like an errata is possible still? That is good to hear! I am not particularly familiar with the bylaws.

[Diactoros] leaves streams created from a file path with the pointer at zero:

This is a bit of an overstatement IMO. It does not set the pointer to zero, instead it leaves the pointer untouched from where fopen() puts it. For local files this seems to be at zero, but I am not sure if this is actual documented behaviour. I could not find it in the PHP documentation, but it may just be default handling by C’s fopen()?

It may also be worth noting that many implementations simply map createStreamFromFile() to fopen() and thus allow URLs that have registered protocol handlers (PSR-17 calls these “stream URI”). Can we confirm that these always allow the pointer to be set to position zero? (Or always default to that position?) If we cannot, requiring that implementations MUST put the pointer at zero will simultaniously mean createStreamFromFile can no longer map straight to fopen().

It is because of this mapping, and the many references PSR-17 already makes to fopen(), that I wondered in my very first email whether the cursor position should be defined as having to match fopen()’s cursor position. This is also how I proposed testing it: https://github.com/http-interop/http-factory-tests/pull/39/commits/ca49679370c6b22bf8b3544c71e0a4a338c9cb8a

Emitters of existing libraries do not rewind streams: […]


Introducing a requirement to move the file-pointer to the end when creating a stream from a file path is a serious breaking change versus any existing emitter implementation I could find - so mandating a change at this point could create serious problems.

None of the emitters linked would be affected by changes to the pointer position as none of them call read(), instead having chosen to depend on the defined behaviour of __toString from PSR-7. This definition includes the fact it “MUST attempt to seek to the beginning of the stream before reading data” so calling rewind() isn’t neccessary from those emitters’ perspective.

No matter what cursor position ends up in a possible PSR-17 errata, those emitters will continue to function without issue.


And __toString() is absolutely not an option in the case of an emitter, if you're going to send large binary files.

If you have come across emitters in your search that support this, and thus depend on read() calls rather than __toString(), would you mind sharing them? Those will obviously be affected by any change that might be decided upon. I have not personally been able to find any.

Two of the three stream factory-methods defined by PSR-17 *create* the stream, and therefore it's important to know what state they're created in - […] what use is a factory that returns objects in an undefined/unpredictable state?

Agreed. The unpredictable outcome of a factory method is the main reason I started researching this behaviour, and why I wanted to prioritise getting tests into http-interop/http-factory-tests.

There is one more creation point that may need consideration:

(new ResponseFactory)->createResponse()->getBody()->write()

I have seen code like that in many places, and it will offload the creation of a Stream to the message factories or even the underlying (PSR-7) message implementations. This seems to be done mostly as a way of skipping the StreamFactory completely. So maybe it needs to be defined that the creation of a default stream for these cases always matches createStream('')? From code I have seen the expectation is for it to be an empty and writeable stream, which matches the default createStream().

All together I would think of changes more along the lines of:

  1. StreamFactoryInterface::createStream() MUST return a stream with the internal file-pointer at 0.
  2. StreamFactoryInterface::createStreamFromFile() MUST return a stream with the internal file-pointer matching fopen()’s internal file-pointer, alternatively at 0.
  3. StreamFactoryInterface::createStreamFromResource() MUST NOT modify the internal file-pointer.
  4. Streams created as defaults for MessageInterface::getBody() MUST match StreamFactoryInterface::createStream('').

Cheers,
Martijn

Rasmus Schultz

unread,
Jan 11, 2019, 4:26:44 AM1/11/19
to PHP Framework Interoperability Group
> Can we confirm that these always allow the pointer to be set to position zero?

I assumed this was just common knowledge?

I've never heard of a stream-wrapper that doesn't open streams with the file-pointer at zero - I'm pretty sure that's implied, if it isn't specified. (and I'd be surprised if it doesn't have test-coverage in the code-base somewhere.)

> None of the emitters linked would be affected by changes to the pointer position as none of them call read()

Here are just two that I happen to know of:



I'm pretty sure there are more - I couldn't find it, but I'm pretty sure I've seen an emitter for ReactPHP as well.

Basically anything that doesn't emit by the usual means (standard output) will have to read the stream.

I've opened a PR with my proposed amendments here:


Regarding the 4th change you're proposing, that would be an amendment to PSR-7, which I believe goes beyond the scope of this discussion and my proposed amendment.

I am also not convinced it makes sense to propose an amendment to PSR-7 that references the PSR-17 specification?

I don't want to conflate my own proposed amendment with a PSR-7 issue - if you want to propose an amendment to address the issue with PSR-7, I think that would make sense, but I think it needs to be described independently of PSR-17 and I'd suggest you open a separate discussion and a separate PR, since, while the topics are related, the changes do not pertain to the same PSR.

Rasmus Schultz

unread,
Jan 11, 2019, 4:37:07 AM1/11/19
to PHP Framework Interoperability Group
I accidentally opened the PR against the wrong project!

Sorry about that.

Here's the correct PR:

Message has been deleted

bo...@jfmedier.dk

unread,
Jan 11, 2019, 5:17:48 AM1/11/19
to PHP Framework Interoperability Group
I would agree on this behavior: 
 
1. createStream() MUST return a stream with the internal file-pointer at 0.
2. createStreamFromFile() MUST return a stream with the internal file-pointer at 0.
3. createStreamFromResource() MUST NOT modify the internal file-pointer.
 
I have experienced some issue where I thought this would be the behavior, and used time on figuring out why I didn't get any output until I figured out I needed to rewind the point on my own. If we have different behavior on this, I can't change the implementation, which is the idea of the standards, where some I need to rewind and others I don't...

Larry Garfield

unread,
Jan 11, 2019, 11:25:25 AM1/11/19
to php...@googlegroups.com
On Friday, January 11, 2019 3:37:07 AM CST Rasmus Schultz wrote:
> I accidentally opened the PR against the wrong project!
>
> Sorry about that.
>
> Here's the correct PR:
>
> https://github.com/php-fig/fig-standards/pull/1134

Changing the body of the spec itself is a no-go. "Addresses a problem with
the existing specification" is what Errata are for, and what should be used
here.

I'm still mostly agnostic on the clarification itself (I haven't really dug
into the problem space enough to say what the preferred behavior should be),
but the only way forward is to add an Errata section to the metadoc that
includes appropriate "SHOULD" statements and not touch the spec file itself.

This PR as filed is unmergable.

--Larry Garfield
signature.asc

Rasmus Schultz

unread,
Jan 12, 2019, 6:36:45 AM1/12/19
to PHP Framework Interoperability Group
Errata is very loosely described, so I was just taking a best guess to get this thing moving.

Where should an errata section go? In the meta (e.g. an added section 9) or in the main document?

(in RFCs, I think amendments are usually referenced in the main document using some kind of
annotation/identifier referencing an errata section at the end of the document?)

Thanks

Larry Garfield

unread,
Jan 12, 2019, 2:54:54 PM1/12/19
to php...@googlegroups.com
On Saturday, January 12, 2019 5:36:44 AM CST Rasmus Schultz wrote:
> Errata is very loosely described, so I was just taking a best guess to get
> this thing moving.
>
> Where should an errata section go? In the meta (e.g. an added section 9) or
> in the main document?
>
> (in RFCs, I think amendments are usually referenced in the main document
> using some kind of
> annotation/identifier referencing an errata section at the end of the
> document?)
>
> Thanks

1) Please be thread-consistent in posting. If someone bottom posts, keep
bottom posting. Thanks.

2) Errata go in the metadoc. See:

https://www.php-fig.org/psr/psr-6/meta/

for an example.

--Larry Garfield
signature.asc

Rasmus Schultz

unread,
Jan 16, 2019, 8:03:02 AM1/16/19
to PHP Framework Interoperability Group
I've added the meta-section.

How do we proceed from here?

Can we bring this to a vote?

Rasmus Schultz

unread,
Jan 29, 2019, 2:43:43 AM1/29/19
to PHP Framework Interoperability Group
Okay, so we're not done - there are more issues with stream state.


--
You received this message because you are subscribed to a topic in the Google Groups "PHP Framework Interoperability Group" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/php-fig/S5YIw-Pu1yM/unsubscribe.
To unsubscribe from this group and all its topics, send an email to php-fig+u...@googlegroups.com.

To post to this group, send email to php...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages