I checked the stream created by CreateStreamOnHGLOBAL
(http://msdn.microsoft.com/en-us/library/aa378980.aspx) and it did not
behave like described. The differences where:
1) If NULL was passed as pstatstg (obviously pointer not valid)
E_INVALIDARG was returned instead of expected STG_E_INVALIDPOINTER (even
E_POINTER seems better).
2) If uninitialized pointer was passed as pstatstg (having value 0xcccccccc,
obviously pointer not valid) access violation happens and
STG_E_INVALIDPOINTER is not returned (or any value in fact).
I know that COM allows to return error codes different then those declared
on the interface (as opposed to success codes). (See
http://msdn.microsoft.com/en-us/library/ms688560.aspx and
http://msdn.microsoft.com/en-us/library/ms693442.aspx.) This is however a
special case in which the interface describes that particular kind of error
and assigns a code to it while implementation returns a different code.
So the stream object did some checks if it discovered the NULL value. It did
not however check other popular invalid values (like the 0xcccccccc value).
Also it did not check for access violation while it could try to use SEH
(http://msdn.microsoft.com/en-us/library/ms680657.aspx) here - if I am not
mistaken.
Why is it so? Is it just poor implementation? Or is it normal for COM
objects to not check for access violation? Doesn't it make the object
vulnerable to attacks?
Also another question arises:
3) Why introduce errors like STG_E_INVALIDPOINTER and use the in common
error cases (like invalid pointer argument) instead of the generic ones
(like E_POINTER)?
Adam Badura
Why is this important? Consider passing a valid pointer, then the issue
will become moot.
> So the stream object did some checks if it discovered the NULL value.
> It did not however check other popular invalid values (like the
> 0xcccccccc value).
How often do you check for 0xcccccccc (and other "popular" invalid
values, whatever these are) in your programs?
> Why is it so? Is it just poor implementation? Or is it normal for COM
> objects to not check for access violation?
Quite normal. Personally, I'd much rather the application blows in my
face during testing, than the exception being swallowed. Access
violation is a valuable debugging tool.
> Doesn't it make the object
> vulnerable to attacks?
Please describe the exploit. What exactly would this behavior of the
object allow an attacker to do that he wouldn't be able to do otherwise?
If the attacker's goal is to crash his own application, he can just do
int* p = NULL;
*p = 42;
> Also another question arises:
>
> 3) Why introduce errors like STG_E_INVALIDPOINTER and use the in
> common error cases (like invalid pointer argument) instead of the
> generic ones (like E_POINTER)?
If I had to guess, I'd say there are probably historical reasons.
Compound storage actually predates COM - there are vestiges of the old
system still remaining in the form of OleConvertIStorageToOLESTREAM and
OleConvertOLESTREAMToIStorage functions. Perhaps similar codes were
defined there, then ported over to the new model.
--
With best wishes,
Igor Tandetnik
With sufficient thrust, pigs fly just fine. However, this is not
necessarily a good idea. It is hard to be sure where they are going to
land, and it could be dangerous sitting under them as they fly
overhead. -- RFC 1925
I don't agree. You could as well say that any "logical error" checking is
pointless. Why to bother checking validity of an index when accessing vector
if we could just supply a correct index. And note that if it was not
important then why to check the NULL pointer since we could pass non-NULL
and issue would become moot...
>> So the stream object did some checks if it discovered the NULL value.
>> It did not however check other popular invalid values (like the
>> 0xcccccccc value).
>
> How often do you check for 0xcccccccc (and other "popular" invalid values,
> whatever these are) in your programs?
Personally? Never.
>> Why is it so? Is it just poor implementation? Or is it normal for COM
>> objects to not check for access violation?
>
> Quite normal. Personally, I'd much rather the application blows in my face
> during testing, than the exception being swallowed. Access violation is a
> valuable debugging tool.
Yes. I agree that during debugging I would rather have the exception then
return value. But if I am not mistaken then debugger gets chance of handling
exception prior to the SEH.
And please consider release - since in this mode application should work
most of the time. I would rather have an error return value and deal with it
(as usually you are expecting error return values also due to truly runtime
reasons - not only logical errors) then ExitProcess due to access violation.
>> Doesn't it make the object
>> vulnerable to attacks?
>
> Please describe the exploit. What exactly would this behavior of the
> object allow an attacker to do that he wouldn't be able to do otherwise?
I am not sure. That is why this was a question. However using SEH I would
receive control upon access violation. I would also receive some machine
state data. I could manipulate them and continue execution to abuse the
mis-implemented object's module. But that is just a guess.
Adam Badura
The object does check for NULL, it just returns a different error code
than the one you expect. Perhaps a minor annoyance, but no more than
that.
As to non-NULL values, it's in general impossible to verify that a
pointer is valid. Once you start passing random garbage values to the
method, I personally would be concerned more about a pointer that just
accidentally happens to point to valid memory and thus results in silent
data corruption, than one that causes the application to crash
immediately in an obvious way. The former would be much more difficult
to debug.
> And
> note that if it was not important then why to check the NULL pointer
> since we could pass non-NULL and issue would become moot...
I wouldn't mind if it just crashed when given a NULL pointer.
>>> So the stream object did some checks if it discovered the NULL
>>> value. It did not however check other popular invalid values (like
>>> the 0xcccccccc value).
>>
>> How often do you check for 0xcccccccc (and other "popular" invalid
>> values, whatever these are) in your programs?
>
> Personally? Never.
Thought so. Why aren't you worried that it makes your applications
vulnerable to attacks?
>>> Why is it so? Is it just poor implementation? Or is it normal for
>>> COM objects to not check for access violation?
>>
>> Quite normal. Personally, I'd much rather the application blows in
>> my face during testing, than the exception being swallowed. Access
>> violation is a valuable debugging tool.
>
> Yes. I agree that during debugging I would rather have the exception
> then return value. But if I am not mistaken then debugger gets chance
> of handling exception prior to the SEH.
>
> And please consider release - since in this mode application should
> work most of the time.
Again - I'd much rather the application crashed (even in release) than
silently corrupted random memory and continued working. There's no
telling what state it is in at this point: at best, it'll crash some
time later; at worst, it will lose user's data.
Also - do you really see yourself writing code like this:
HRESULT hr = pStream->Stat(pStatStg, 0);
if (hr == E_POINTER) {
// Now what?
}
I understand writing code to protect against run-time failures (e.g.
disk I/O error), but to protect against logical errors? Now, this would
be more reasonable:
assert(pStatStg != NULL);
// or even, for exceptionally paranoid
assert(!IsBadWritePtr(pStatStg, sizeof(*pStatStg));
HRESULT hr = pStream->Stat(pStatStg, 0);
Further, with IStream::Stat specifically, most of the time you'd
actually have something like this:
STATSTG statstg = {0};
HRESULT hr = pStream->Stat(&statstg, 0);
and there's absolutely no change that the pointer is NULL or invalid. If
you find yourself allocating STATSTG structure on the heap, you should
stop and ask why you are doing that.
> I would rather have an error return value and
> deal with it
How would you deal with this particular failure?
> (as usually you are expecting error return values also
> due to truly runtime reasons - not only logical errors)
Those are a different kettle of fish entirely. Of course they should be
accurately reported by the component.
>>> Doesn't it make the object
>>> vulnerable to attacks?
>>
>> Please describe the exploit. What exactly would this behavior of the
>> object allow an attacker to do that he wouldn't be able to do
>> otherwise?
>
> I am not sure. That is why this was a question. However using SEH I
> would receive control upon access violation.
And if the method just returns, you wouldn't? In fact, you had it before
calling the method.
> I would also receive
> some machine state data.
You would receive bits from memory in your own process' address space.
You could happily read all that memory before.
> I could manipulate them and continue
> execution to abuse the mis-implemented object's module.
Which would result in what, exactly, that you couldn't just do directly?
Code in ole32.dll doesn't have any special privileges that your own code
doesn't also have.
You can just jump to any address inside ole32.dll if you are so
inclined. Microsoft publishes debug symbols for all Windows DLLs, to
help you find out what function is at what address. All you'd achieve is
mess up your own process, and there are easier ways to do that if that's
your goal.
> As to non-NULL values, it's in general impossible to verify that a pointer
> is valid. Once you start passing random garbage values to the method, I
> personally would be concerned more about a pointer that just accidentally
> happens to point to valid memory and thus results in silent data
> corruption, than one that causes the application to crash immediately in
> an obvious way. The former would be much more difficult to debug.
Yes. Note however that just because we cannot prevent all memory access
errors it doesn't mean we should not try to detect those we can. Again just
because you cannot handle all real run-time errors (like you cannot handle
error of rapid lack of power or instant processor burn-out) it does not mean
that you skip detection of any run-time errors.
Access violation errors can be detected with SEH. So in my opinion they
should. Obviously if a caller passes a pointer to his/hers data but still
not a valid pointer then we cannot detect it without additional (unavailable
in this case) layer of memory abstraction. But again even with such layer we
would only know that the pointer is to memory block of valid size and type.
But still we would not know whether this is the "correct" block - callers
algorithm might be incorrect and causes writing to a different variable then
it should.
So to sum up we cannot detect all errors but we should at least try to
detect does we can.
>>> How often do you check for 0xcccccccc (and other "popular" invalid
>>> values, whatever these are) in your programs?
>>
>> Personally? Never.
>
> Thought so. Why aren't you worried that it makes your applications
> vulnerable to attacks?
I am. That is way I started to investigate SEH (still haven't added it to
the application). But I am doing end-users applications. If I were doing
general usage components then making them stable even when used incorrectly
would be even more important.
> Again - I'd much rather the application crashed (even in release) than
> silently corrupted random memory and continued working. There's no telling
> what state it is in at this point: at best, it'll crash some time later;
> at worst, it will lose user's data.
This is not correct. Note that I do not say that the object should have
ignored the access violation error. No way! But I would prefer it to return
error code instead of terminate my process. Why should some hypothetical
system critical process be terminated by an access violation error in some
minor side task of it (like in outputting some log info)?
Detection of invalid pointers by catching SEH access violation errors will
not cause random memory corruption. And the state is at the violating
function quite easy to determine and by proper error returning/handling
could be determined on the entire call stack.
> Also - do you really see yourself writing code like this:
>
> HRESULT hr = pStream->Stat(pStatStg, 0);
> if (hr == E_POINTER) {
> // Now what?
> }
No. But code like this:
HRESULT hr = pStream->Stat( pStatStg, 0 );
if ( FAILED( hr ) ) {
// if this operation was critical then return error
// else assume stat information are not available and continue without
them
}
>> I am not sure. That is why this was a question. However using SEH I
>> would receive control upon access violation.
>
> And if the method just returns, you wouldn't? In fact, you had it before
> calling the method.
>
>> I would also receive
>> some machine state data.
>
> You would receive bits from memory in your own process' address space. You
> could happily read all that memory before.
>
>> I could manipulate them and continue
>> execution to abuse the mis-implemented object's module.
>
> Which would result in what, exactly, that you couldn't just do directly?
> Code in ole32.dll doesn't have any special privileges that your own code
> doesn't also have.
I still don't know much about SEH. But if access violation error was caught
by the object itself and translated to E_POINTER return value the caller can
do nothing more then to acknowledge that the pointer is invalid (regardless
if it is invalid by mistake or on purpose). But if the error is not caught
by the object it could be caught my abusing caller. He will receive CONTEXT
data which might (I don't know really - I am just guessing) allow him/her to
change the state as he wishes and continue execution in the object. But just
continuation is dangerous as know the objects state is really undetermined
as it did not end nicely.
Again don't be to much distracted with this IStream/Stat/ole32.dll example.
This is just an example. It might happen in much more critical object of
much more critical processes and modules. My question comes from that that
since this particular object behaves like that then maybe generally all COM
objects have to be assumed to not deal with access violation errors.
Adam Badura
The OS does a very good job of detecting those, by throwing access
violation. What more do you want? If you are so inclined, surround every
COM call with __try/__except. That's not much different than checking
every return value for E_POINTER (and in my humble opinion, both
approaches would be a waste of time).
> Access violation errors can be detected with SEH. So in my opinion
> they should.
So do it, what's stopping you?
> I am. That is way I started to investigate SEH (still haven't added
> it to the application). But I am doing end-users applications. If I
> were doing general usage components then making them stable even when
> used incorrectly would be even more important.
Swallowing an exception and masking it by an error code is doing your
caller a disservice, not a favor. An application with a logical error is
best crashed as soon as possible, rather than continuing to hobble along
in a crippled, indeterminate state doing unpredictable things.
>> Again - I'd much rather the application crashed (even in release)
>> than silently corrupted random memory and continued working. There's
>> no telling what state it is in at this point: at best, it'll crash
>> some time later; at worst, it will lose user's data.
>
> This is not correct. Note that I do not say that the object should
> have ignored the access violation error. No way! But I would prefer
> it to return error code instead of terminate my process. Why should
> some hypothetical system critical process be terminated by an access
> violation error in some minor side task of it (like in outputting
> some log info)?
System-critical process shouldn't be passing NULLs to components that
expect valid pointers. If the code is so broken as to do that, what
confidence do you have that it would correctly check the error code and
successfully recover? And wouldn't the effort spend on writing detection
and recovery code be better spent on fixing the original bug?
Further, if that process wants to recover anyway, it can do so by
handling the exception. Most of the time, you don't want to do that, but
in those hypothetical cases that you do, the option is available to you.
> Detection of invalid pointers by catching SEH access violation errors
> will not cause random memory corruption.
So, feel free to go ahead and do that in your program.
> And the state is at the
> violating function quite easy to determine and by proper error
> returning/handling could be determined on the entire call stack.
Your program is _already_ in a bad state, by definition - otherwise it
wouldn't be passing invalid pointers around. How exactly do you plan to
get back to a good state?
>> Also - do you really see yourself writing code like this:
>>
>> HRESULT hr = pStream->Stat(pStatStg, 0);
>> if (hr == E_POINTER) {
>> // Now what?
>> }
>
> No. But code like this:
>
> HRESULT hr = pStream->Stat( pStatStg, 0 );
> if ( FAILED( hr ) ) {
> // if this operation was critical then return error
> // else assume stat information are not available and continue
> without them
> }
In other words, you are happy to mask a bug in your program forever.
> I still don't know much about SEH. But if access violation error was
> caught by the object itself and translated to E_POINTER return value
> the caller can do nothing more then to acknowledge that the pointer
> is invalid (regardless if it is invalid by mistake or on purpose).
> But if the error is not caught by the object it could be caught my
> abusing caller. He will receive CONTEXT data which might (I don't
> know really - I am just guessing) allow him/her to change the state
> as he wishes and continue execution in the object. But just
> continuation is dangerous as know the objects state is really
> undetermined as it did not end nicely.
So the bottom line is, your process can execute arbitrary machine
instructions and set CPU registers to arbitary values. What else is new?
You don't need CONTEXT for this - you could write a piece of assembly
that sets CPU registers to whatever values you want, then jumps into the
middle of IStream implementation's code. I still don't see what you as
an attacker hope to achieve by this.
> My question comes from that that since this particular object behaves
> like that then maybe generally all COM objects have to be assumed to
> not deal with access violation errors.
Yes, they have to be so assumed. And IMHO that's a good thing. Again, if
you in your application want to "deal" with access violation errors,
please feel free to do so. I prefer to spend that time to make sure my
programs don't trigger access violations in the first place - and
actually having access violations helps me in this task.
Actually no. It requires more code. Code dependent on compiler (as SEH
syntax is compiler specific according to MSDN). While I do check for
returned errors anyway. Yes, on each call - maybe I am weird but I do.
And I would not check for E_POINTER specifically (in most cases) because: 1)
this will unlikely change further control flow as I would unlikely prepare
any special handling for it (like with most other errors), 2) the value
cannot be dependent on since as I showed with the example the object
actually returned E_INVALIDARG and I guess I could find objects returning
other values as well.
But I always thought that COM requires caller to be prepared for ANY error
code to be returned. Which in most cases will lead to propagating them (you
just clean up and return that error code) or returning E_UNEXPECTED, E_FAIL
or something similar (with clean-up as well).
>> Access violation errors can be detected with SEH. So in my opinion
>> they should.
>
> So do it, what's stopping you?
Generally I think that consistency is important. If COM objects generally do
not catch access violation then if my object did it would be less general.
Since its users would expect it to behave like other objects while it would
not.
>> I am. That is way I started to investigate SEH (still haven't added
>> it to the application). But I am doing end-users applications. If I
>> were doing general usage components then making them stable even when
>> used incorrectly would be even more important.
>
> Swallowing an exception and masking it by an error code is doing your
> caller a disservice, not a favor. An application with a logical error is
> best crashed as soon as possible, rather than continuing to hobble along
> in a crippled, indeterminate state doing unpredictable things.
> Further, if that process wants to recover anyway, it can do so by handling
> the exception. Most of the time, you don't want to do that, but in those
> hypothetical cases that you do, the option is available to you.
That is a good point.
However I considered example like a running application which logs some
events to its internal log for servicing support. Now that code is incorrect
and causes access violation resulting in terminating entire application
(which might be of much importance to the system, or just contain a lot of
unsaved user's work). However this is in fact a minor side task, not even
required to use the application. If this is repeatable the application could
stop being usable at all.
Obviously in such case application should cache such errors by itself end
prevent premature termination. However I thought this would be easier if
objects returned invalid pointer error instead of causing access violation.
Especially that propagation of SEH exception does not cause to do "language
cleanup" (like calling destructors in C++) and thus if you do not cache such
error early likely after catching it you would have to quit anyway as the
state indeed will no longer be valid. But catching them early requires
wrapping almost all calls in __try __except which obviously leads to
unmaintanable code.
>> HRESULT hr = pStream->Stat( pStatStg, 0 );
>> if ( FAILED( hr ) ) {
>> // if this operation was critical then return error
>> // else assume stat information are not available and continue
>> without them
>> }
>
> In other words, you are happy to mask a bug in your program forever.
No. I am not masking them. I just change their form. During development I
will likely detect that the function returned truly unexpected error (likely
because I will assert upon such). But that is a minor issue here.
Consider code like
if ( ptr == NULL )
throw std::logic_error( "ptr == NULL" );
*ptr = ...
Now do you consider it a masking of logic error? I do not. But obviously it
does prevent access violation. The COM code earlier did the same but for COM
you propagate errors by returning HRESULT instead of throwing C++ exceptions
(it is less usable and more likely misused but I guess this is required to
support binary compatibility).
The problem with access violation is that it causes SEH exception. But SEH
exceptions unlike C++ exceptions do not support correct stack unwinding and
thus if not caught early lead to application state destruction, resources
leaking and so on.
> Yes, they have to be so assumed. And IMHO that's a good thing. Again, if
> you in your application want to "deal" with access violation errors,
> please feel free to do so. I prefer to spend that time to make sure my
> programs don't trigger access violations in the first place - and actually
> having access violations helps me in this task.
Basically this answers my original questions so thank you.
I still think that special treatment of NULL pointer is an inconsistency of
the objects behaviour. And I still think that catching access violation
errors and returning error codes would lead to better (and easier
achievable) robustness. But those are personal opinions largely dependent on
ways one works (and likes to work).
Adam Badura
Note that there is even better example. IStream::Seek states explicitly
that "STG_E_INVALIDPOINTER" "Indicates that the [out] parameter
plibNewPosition points to invalid memory, because plibNewPosition is not
read." I think we can assume that there is a bug in the documentation and it
meant that the memory cannot be written (and not read) since the function
must write that memory and has no reasons to read it.
So to conform to that specification I actually have to catch structured
exception when accessing *plibNewPosition and convert it to
STG_E_INVALIDPOINTER return value. On the other hand this discussion so far
convinced me that this might not be a good thing after all. But if so then
why does specification require it?
Adam Badura