There are some problems that seem to have no good solution, and since this is one of them, I decided to ask here rather than think too hard about it myself. :)
I have a framework where I send strings between two nodes on a network, serializing the strings through a Socket object:
Socket socket; string s;
socket << s;
The obvious implementation of serializing a string is to have the source first send the count of characters in the string, then the characters themselves. The target will allocate a buffer to hold "count" characters, then fill in the buffer with the actual characters as they arrive from the source.
An attacker can wreak havoc with this model by injecting bogus packets into the network to arrive at the target and present a "count" as a very large number, say, 100,000,000. The target will unwittingly invoke:
char *buffer = new char[100000000];
The attempt to allocate will either succeed or fail. If it succeeds, 100MB of virtual memory will be lost, which is, in a sense, worse than if it fails.
I do have security mechanisms in my framework that eliminate this problem, but there are scenarios where the user of my framework will deliberately and necessarily choose not to enable the security feature.
What then can I do to stop this problem?
I considered placing an artificial limit on allocation of memory for a string or any other free-store-consuming object. I also considered placing the entire thread that would invoke operator new() on a kind of free-store limit, so that any attempt to breach that limit would result in an exception being thrown. Neither of these solutions feels right.
My gut feeling is that I will eventually discover that no solution feels right, but thought I would ask before giving up.
On 27 May, 06:42, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
> The obvious implementation of serializing a string is to have the > source first send the count of characters in the string, then the > characters themselves. The target will allocate a buffer to hold > "count" characters, then fill in the buffer with the actual characters > as they arrive from the target.
> An attacker can wreak havoc with this model by injecting bogus packets > into the network to arrive at the target and present a "count" as a > very large number
[...]
> What then can I do to stop this problem?
Use a hard limit on this count field and reject messages that do not comply. Document this hard limit as part of the communication protocol. For those who want to take the responsibility on themselves, allow them to modify this limit by parameterizing the communication library (macro definitions, configuration files, constructor parameters, etc.).
The following library is an example of using this strategy:
In general, don't think that you should allow everybody to do everything - there is absolutely no need to do so. Just set up your rules and reject everything that looks strange. If there is a genuine need for sending longer packets, users can either reconfigure the library by using the limit parameter (and then it's *their* business if they get into a DoS) or they can send long content by chopping it into smaller parts. Another solution might be to use normal (short) messages to negotiate opening a new dedicated and temporary channel for long content. This provides a nice hook for authentication and permission checks as well.
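A minimal sketch of the "hard limit as a constructor parameter" idea described above. The class name LimitedReader, the 4-byte big-endian count, and the default of 4096 are assumptions made for illustration; none of them come from the framework or the library discussed in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical reader for a length-prefixed string. The wire format
// (4-byte big-endian count, then the characters) is an assumption.
class LimitedReader {
public:
    // The limit is a constructor parameter, so users who accept the risk
    // can raise it; everyone else gets the documented default.
    explicit LimitedReader(std::size_t max_string_length = 4096)
        : max_string_length_(max_string_length) {}

    std::string read_string(std::istream& in) const {
        unsigned char header[4];
        if (!in.read(reinterpret_cast<char*>(header), 4))
            throw std::runtime_error("truncated length header");
        const std::uint32_t count = (std::uint32_t(header[0]) << 24) |
                                    (std::uint32_t(header[1]) << 16) |
                                    (std::uint32_t(header[2]) << 8) |
                                    std::uint32_t(header[3]);
        // Reject the message before any buffer is allocated.
        if (count > max_string_length_)
            throw std::length_error("declared length exceeds protocol limit");
        std::string result(count, '\0');
        if (count > 0 && !in.read(&result[0], count))
            throw std::runtime_error("truncated string body");
        return result;
    }

private:
    std::size_t max_string_length_;
};
```

A caller that genuinely needs longer strings would construct the reader with a larger limit, for example LimitedReader(1 << 20), and thereby takes on the risk explicitly.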
> What then can I do to stop this problem?
Don't allocate the whole buffer immediately; realloc in chunks as packets arrive. Let them send all 100 MB ;) You can also request an ack from the client for every chunk sent, say every 512 bytes. Since the originating IP will probably be spoofed, the protocol will break on the first ack.
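A rough sketch of the chunked approach, so that memory grows only with data that has actually arrived rather than with the declared count. ReadFn is a hypothetical stand-in for a socket read call, and read_in_chunks is an illustrative name; the per-chunk ack is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a socket read: fills buf with up to max bytes
// and returns how many were written (0 means the peer closed the link).
using ReadFn = std::function<std::size_t(char* buf, std::size_t max)>;

// Reads declared_count bytes, growing the buffer one chunk at a time.
std::vector<char> read_in_chunks(const ReadFn& read_some,
                                 std::size_t declared_count,
                                 std::size_t chunk_size = 512) {
    std::vector<char> data;
    while (data.size() < declared_count) {
        const std::size_t old_size = data.size();
        const std::size_t want = std::min(chunk_size, declared_count - old_size);
        data.resize(old_size + want);   // grow by at most one chunk
        const std::size_t got = read_some(data.data() + old_size, want);
        data.resize(old_size + got);    // keep only what really arrived
        if (got == 0)
            throw std::runtime_error("peer closed before sending declared count");
    }
    return data;
}
```

With this shape, a forged count of 100,000,000 costs the receiver at most one extra chunk beyond the bytes the attacker really sends (plus whatever slack the vector's growth policy adds).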
Maciej Sobczak wrote: > On 27 Maj, 06:42, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
>> The obvious implementation of serializing a string is to have the >> source first send the count of characters in the string, then the >> characters themselves. The target will allocate a buffer to hold >> "count" characters, then fill in the buffer with the actual characters >> as they arrive from the target.
>> An attacker can wreak havoc with this model by injecting bogus packets >> into the network to arrive at the target and present a "count" as a >> very large number > [...]
>> What then can I do to stop this problem?
> Use hard limit on this count field and reject messages that do not > comply. Document this hard limit as part of the communication > protocol. > For those who want to take the responsibility on them, allow to modify > this limit by parameterizing the communication library (macro > definitions, configuration files, constructor parameters, etc.).
> The following library is an example of using this strategy:
> In general, don't think that you should allow everybody to do > everything - there is absolutely no need to do so. Just set up your > rules and reject everything that looks strange. If there is a genuine > need for sending longer packets, users can either reconfigure the > library by using the limit parameter (and then it's *their* business > if they get into DOS) or they can send long content by chopping it > into smaller parts.
I absolutely agree. Another thing you may want to consider is to get away from the idea of a "secure mode" and offer several "tunable" parameters. If the user wants to mess with the hard limit, they can do so, at their own risk, but can leave the other parameters at their defaults that help secure your protocol.
On May 27, 8:24 pm, John Moeller <fishc...@gmail.com> wrote:
> Maciej Sobczak wrote: > > Use hard limit on this count field and reject messages that do not > > comply. Document this hard limit as part of the communication > > protocol. > > For those who want to take the responsibility on them, allow to modify > > this limit by parameterizing the communication library (macro > > definitions, configuration files, constructor parameters, etc.).
So you are saying for all 190 C++ classes that I have that are serializable, I should find a way to specify hard limits on the size of what is being serialized, including not only strings, but a family of containers that includes at least 30 containers? Do I specify a maximum number of elements that can be serialized to/from the container?
> > The following library is an example of using this strategy:
> > In general, don't think that you should allow everybody to do > > everything - there is absolutely no need to do so. Just set up your > > rules and reject everything that looks strange. If there is a genuine > > need for sending longer packets, users can either reconfigure the > > library by using the limit parameter (and then it's *their* business > > if they get into DOS) or they can send long content by chopping it > > into smaller parts.
Let's say that I specify the hard limit for class String<> to be 4096 characters, an arbitrary but reasonable value. Let's say also that I specify the number of elements in a List<> to be 65,536 elements, again, an arbitrary but reasonable value. Calculating the maximum amount of memory that can be consumed by a DoS attacker, we get 2^16*2^12 = 2^28 bytes = 256 MB. So an attacker, using the system in a "safe" mode, could easily break the model.
It should be intuitively obvious that it is impossible to have both generalized plurality and defense against this type of attack simultaneously. Even with parameterization of how much memory can be allocated, there is still the question of where (and how) the user of my library should specify what limits should be set. It should also be intuitively apparent that there comes a point where, if the user of the library is so busy putting checks in the code to limit this type of attack, the ease-of-use is destroyed. And again, whatever values were chosen would be arbitrary, and because objects are hierarchical, with an unpredictable level of nesting, the whole thing would quickly turn into a monstrous mess.
I would be very curious to know what Boost Serialization does in this situation, if anyone knows.
> I absolutely agree. Another thing you may want to consider is to get > away from the idea of a "secure mode" and offer several "tunable" > parameters. If the user wants to mess with the hard limit, they can do > so, at their own risk, but can leave the other parameters at their > defaults that help secure your protocol.
The security mechanisms were not created for this problem. They were created for the generalized problem of proving security in the Internet (in a research sense). It is only by fortune that, if the nodes are connected over a secure channel, the deliberate attacks are no longer possible.
On 28 May, 15:34, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
> > > Use hard limit on this count field and reject messages that do not > > > comply. Document this hard limit as part of the communication > > > protocol. > > > For those who want to take the responsibility on them, allow to modify > > > this limit by parameterizing the communication library (macro > > > definitions, configuration files, constructor parameters, etc.).
> So you are saying for all 190 C++ classes that I have that are > serializable, I should find a way to specify hard limits on the size > of what is being serialized, including not only strings, but a family > of containers that includes at least 30 containers? Do I specify a > maximum number of elements that can be serialized to/from the > container?
In this case you can avoid DoS attacks by using message headers with authentication information and setting up a "circle of trust" in the system. In other words, you need to be able to tell whether the message is valid or bogus *before* you come to the point where you dynamically allocate buffers. Alternatively, you can play tricks with both approaches - put a hard limit (even a very small one) on messages that do not authenticate ("guests") and "no limit" on messages from trusted sources.
Le Chaud Lapin wrote: > My gut feeling is that I will eventually discover that no solution > feels right, but thought I would ask before giving up.
IMHO you have to use some kind of digital signature. Corrupted sequences will also be filtered out. -- With all respect, Sergey. http://ders.stml.net/ mailto : ders at skeptik.net
On 2007-05-27, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
> very large number, say, 100,000,000. The target will unwittingly > invoke:
> char *buffer = new char[100000000];
> The attempt to allocate will either succeed or fail. If it succeeds, > 100MB of virtual memory will be lost, which is, in a sense, worse than > if it fails.
-cut-
> What then can I do to stop this problem?
Leave the decision to the user. Make the user implement a function with a signature like
void *allocate(int type, int nitems, int size);
where type is the object type being allocated (list, simple struct, ...), nitems is the number of child items (if that applies to the given type), and size is the number of bytes to be allocated in this call. So the user can collect statistics on each individual allocation type if he cares about memory usage (and has the opportunity to say at some point "enough!" by returning a NULL pointer), or (if he's lazy) he can just implement it like
void *allocate(int, int, int size)
{
    return malloc(size);
}
Or, as others have suggested, cryptography. You don't need DSIG, I believe that an HMAC would be sufficient.
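For comparison with the lazy version, here is a sketch of what a non-lazy allocate() hook might look like: it keeps per-type statistics and says "enough!" once a total budget is used up. The type tags, the AllocStats struct, and the 1 MB budget are all assumptions for illustration; only the function signature comes from the post above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical type tags; the post only says "list, simple struct, ...".
enum AllocType { ALLOC_STRING = 0, ALLOC_LIST = 1, ALLOC_TYPE_COUNT = 2 };

// Hypothetical bookkeeping; invariant: total_bytes <= budget.
struct AllocStats {
    std::size_t bytes_by_type[ALLOC_TYPE_COUNT];
    std::size_t total_bytes;
    std::size_t budget;
};

static AllocStats g_stats = {{0, 0}, 0, static_cast<std::size_t>(1) << 20};

// User-supplied hook with the signature from the post. nitems is accepted
// but unused in this sketch.
void* allocate(int type, int nitems, int size) {
    (void)nitems;
    if (type < 0 || type >= ALLOC_TYPE_COUNT || size < 0)
        return nullptr;
    const std::size_t n = static_cast<std::size_t>(size);
    if (n > g_stats.budget - g_stats.total_bytes)
        return nullptr;                       // "enough!"
    void* p = std::malloc(n);
    if (p != nullptr) {
        g_stats.bytes_by_type[type] += n;
        g_stats.total_bytes += n;
    }
    return p;
}
```

A real hook would presumably also need a matching release function so that the statistics go back down when objects are destroyed; that part is left out here.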
On May 30, 8:28 am, "Sergey P. Derevyago" <non-exist...@iobox.com> wrote:
> Le Chaud Lapin wrote: > > My gut feeling is that I will eventually discover that no solution > > feels right, but thought I would ask before giving up.
> IMHO you have to use some kind of digital signature. Corrupted > sequences will > also be filtered out.
We have no problems with our secure links, which we already have. With such links, there is nothing that a perpetrator can do to alter or inject bogus packets into the communication stream to trick the recipient of the packet into doing a massive new [], because the security mechanisms, which include digital signatures, will cause the packet to be dropped.
The problem is when the link is insecure. It ruins the entire serialization framework. Note that ruin happens not just for strings, but for any situation where there is a vector of elements, and the source of an object is about to convey to the target the size of that vector before serializing the individual elements of that vector.
Note again that this is a framework here, not a specific application, so I cannot, for example, in the context of each serializable class, specify an arbitrary limit on the number of elements involved, because it would be, well...arbitrary. This is true especially if the class contains a vector template, as the size of each element in the vector would not be known, so even if some arbitrary limit were set for the size of the array, say 65,536, if each element of the array is an object with multiple members, it is conceivable that one of those members would be an array itself. This problem presents itself recursively, so that, if N is the limit on the number of elements allowed to be serialized for vector V, then recursing L levels, there would be an exponential explosion in the memory space required for new[] against a Foo vector[N], equal to N^L, so that even for L=4 and N=65,536, N^L is 2^64, and we're back where we started.
But aside from the details, it should be intuitively apparent that trying to put these artificial limits ruins the regularity of the entire model, which again, is a framework and not a specific application. As we all know, arbitrariness is a red-flag in good design principles.
Consider defining the serialization function for a List<>:
Socket s;
List<Foo> l;
s << l;
One would not be able to specify a limit on the count of l without knowing how much space Foo will take up. Big Foo, small limit. Small Foo, large limit. Foo itself could contain members that contain List<>, and so on, recursively.
The more I think about this problem, the more I am beginning to believe that it is better to leave the classes themselves alone and focus on the memory management itself. At least the regularity would be preserved.
In that case, there are two possible "solutions", one that will not work, the other that might:
The solution of putting a limit on the "archive" object (Socket in this case) won't work because that will be meaningless for a long-duration application that was meant to acquire and release terabytes of memory throughout its natural life.
That leaves memory allocation against the thread itself. At any given instant, on a server machine with 4GB of RAM and 500 client connections, if one server thread is hogging 2.5GB for itself, there is probably a breach in progress. In that case, the memory allocation should fail with an exception, the server thread will hard-abort, the evil client connection will be broken, and the only entity unhappy at that point will be the evil client.
Unfortunately, memory allocation quotas on most OSes, if I am not mistaken, are applied on a per-process, not per-thread, basis.
In article <1180546963.456914.57...@w5g2000hsg.googlegroups.com>, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
> The problem is when the link is insecure. It ruins the entire > serialization framework.
This problem has nothing to do with serialization per se (or even C++, for that matter). You have input from an untrusted source. You have to validate the heck out of it before you use it. Period.
-- Nevin ":-)" Liber <mailto:ne...@eviloverlord.com> 773 961-1620
Discussion subject changed to "Preventing Denial of Service Attack In IPC Serialization CITIZENS! YOUR CARELESSNESS PLAYS INTO THE HANDS OF THIEVES!" by Sergey P. Derevyago
Le Chaud Lapin wrote: > > IMHO you have to use some kind of digital signature. Corrupted > > sequences will also be filtered out.
> We have no problems with our secure links, which we already have. > With such links, there is nothing that a perpetrator can do to alter > or inject bogus packets into the communication stream to trick the > recipient of the packet in doing a massive new [], because the > security mechanisms, which includes digital signatures, will cause > packet to be dropped.
> The problem is when the link is insecure.
Also the problem is when the link is unstable and therefore can deliver corrupted Count members. The signature will address both of these issues. -- With all respect, Sergey. http://ders.stml.net/ mailto : ders at skeptik.net
You're conflating serialization with transmission, and on the other end, deserialization with reception. You need to serialize your data:
std::vector<char> data;
serialize(l, data);
and then send it:
s.send(data);
And on the other end, receive the data:
std::vector<char> data;
s.receive(data);
and then deserialize it:
deserialize(l, data);
That gives you control over how much data moves in and out of your application.
In the Socket::receive() function you put a limit on the number of bytes you are willing to read off the network, and if you're still suspicious, you allocate memory in smaller chunks and realloc as the data comes in.
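One consequence of deserializing from an already received buffer can be sketched as follows: every count read from the buffer can be checked against the number of bytes that are really left, so a forged count never triggers a large allocation. BufferReader and its wire format (4-byte big-endian counts) are assumptions for illustration, and the sketch assumes that Socket::receive() has already capped the message size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical cursor over a fully received message. It holds a reference,
// so the vector must outlive the reader.
class BufferReader {
public:
    explicit BufferReader(const std::vector<char>& data) : data_(data), pos_(0) {}

    std::size_t remaining() const { return data_.size() - pos_; }

    std::uint32_t read_u32() {
        if (remaining() < 4) throw std::runtime_error("truncated integer");
        std::uint32_t v = 0;
        for (int i = 0; i < 4; ++i)
            v = (v << 8) | static_cast<unsigned char>(data_[pos_++]);
        return v;
    }

    std::string read_string() {
        const std::uint32_t count = read_u32();
        // A string cannot be longer than the bytes still in the buffer.
        if (count > remaining())
            throw std::length_error("string count exceeds received bytes");
        std::string s(data_.data() + pos_, count);
        pos_ += count;
        return s;
    }

    std::vector<std::string> read_string_list() {
        const std::uint32_t count = read_u32();
        // Each element needs at least its own 4-byte length prefix.
        if (count > remaining() / 4)
            throw std::length_error("element count exceeds received bytes");
        std::vector<std::string> out;
        for (std::uint32_t i = 0; i < count; ++i)
            out.push_back(read_string());
        return out;
    }

private:
    const std::vector<char>& data_;
    std::size_t pos_;
};
```

Under these assumptions, the memory used while deserializing stays within a constant multiple of the received message size, and that size is whatever limit receive() enforces. It does not answer the question of how large that limit should be.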
On May 31, 8:21 am, "Nevin :-] Liber" <n...@eviloverlord.com> wrote:
> This problem has nothing to do with serialization per se (or even C++, > for that matter). You have input from an untrusted source. You have to > validate the heck out of it before you use it. Period.
Actually, it does. If you have ever created a serialization framework, you'd probably know that validation is not possible.
I am surprised at some of the responses here, to be honest. It is intuitively obvious to me that "validation" is not possible, and the grab-and-realloc method faces the same issue - when is too much memory too much? And given that class objects can be nested, per-object limitations on allocated memory are completely arbitrary. This should be evident from a container of objects.
This is a reasonable piece of code in terms of the model it implies. One can imagine how serialization for map<> might be implemented. If Foo contains a string that is say, 800 bytes long, that is a reasonable value for some strings. If the count of elements in a map<> is 3500, that is a reasonable value for some map<>'s. But I can take this structure and easily make its overall memory consumption on the order of Gigabytes.
Then what? Should the internal map<>'s be artificially-intelligent and say, "Uh oh...I detected that I am inside of something big and bad going on..." Obviously they cannot. This will blow up at the server.
Furthermore, there is another problem that is insurmountable, I think. It is highly reasonable that one legitimate client might induce the server to allocate, say, 20 megabytes, on behalf of the client. That 20 megabytes does not have to be allocated in one array-chunk. It could be distributed over, say, the last 2000 objects created by the server on behalf of the client.
I am going to repeat myself here because I can feel that, when I write the last sentence, there are some reading this who are thinking, "Just put limits on what's done."
I cannot do that. :) This is a serialization framework. I must be able to provide a means of serializing an object, then hands off. If I parameterize the allocation size, the size becomes arbitrary. And given the nested map<>'s up above, it should be evident that, if I make the per-array size too small, I deny legitimate clients. If I make it too large, the malicious client can successfully attack. If I choose something "reasonable", a malicious client can still attack.
And to reiterate, I have a secure-mode of operation where this issue is not a problem.
The problem is when the link is insecure. And there are cases where it is a legitimate necessity that the link be insecure.
I am beginning to think that a poor "solution" might be for the kernel of the OS to allow per-thread quotas on allocated pages.
For those who keep saying, "Put a limit..", I encourage you to write C++ code to show how you would serialize a List<> template, and tell me what limits you would use.
On May 31, 1:56 pm, "Sergey P. Derevyago" <non-exist...@iobox.com> wrote:
> Also the problem is when the link is unstable and therefore can deliver > corrupted Count members. > The signature will address both of these issues. > --
Signatures are a form of security.
> > The problem is when the link is insecure.
What does one do when the link is insecure? (Asked in the spirit of exploration.)
On Jun 1, 1:55 am, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
> On May 31, 8:21 am, "Nevin :-] Liber" <n...@eviloverlord.com> wrote:
> > This problem has nothing to do with serialization per se (or even C++, > > for that matter). You have input from an untrusted source. You have to > > validate the heck out of it before you use it. Period.
> Actually, it does. If you have ever created a serialization > framework, you'd probably know that validation is not possible.
> I am surprised at some of the responses here to be honest. It is > intuitively obvious to me that "validation" is not possible, and the > grab-and-realloc method facings the same issues - when is too much > memory too much?
The realloc method will prevent an attacker from allocating too much memory in the server by injecting packets (if they can somehow pass the router). Since these packets will break the protocol and the attacker cannot establish a connection, I can't see an issue here. So I assume that you are talking about legitimate, connected clients that are trying to DoS the server, or that your application accepts connections from any source and is not hidden behind a router. If that is the case, you can't prevent DoS without imposing allocation limits, just as you can't prevent users of any library from allocating all available memory.
.......
> The problem is when the link is insecure. And there are cases where > it is a legitimate necessity that the link be insecure.
In other words you have to allow connections from any source?
> I am beginning to think that a poor "solution" might be for the kernel > of the OS to allow per-thread quotas on allocated pages.
Or you can write a per-thread memory allocator.
> For those who keep saying, "Put a limit..", I encourage you to write C+ > + code to show how you would serialize a List<> template, and tell me > what limits you would use.
There is always the limit of available RAM. This is not about serialization or security, but a problem of memory allocation: whether or not you will allow all available memory to be used. You can limit that by writing custom allocators; you don't even have to limit per thread, but, say, per request. Just use an allocator per request that limits the memory available to the request to some reasonably large value.
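A minimal sketch of what an allocator with a per-request budget could look like, assuming one request is handled per thread. RequestBudget, current_budget(), BudgetAllocator and the two aliases are hypothetical names, not part of any framework discussed here. Because nested containers all draw from the same budget, the limit applies to the request as a whole rather than to each container separately.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <new>
#include <string>
#include <vector>

// Hypothetical per-request budget. It is thread_local on the assumption
// that each request is served by exactly one thread.
struct RequestBudget {
    std::size_t limit = std::numeric_limits<std::size_t>::max();
    std::size_t used = 0;
};

inline RequestBudget& current_budget() {
    thread_local RequestBudget budget;
    return budget;
}

// Minimal C++11 allocator that charges every allocation to the budget of
// the calling thread. Freeing on a different thread than the one that
// allocated would corrupt the counters; this sketch does not handle that.
template <class T>
struct BudgetAllocator {
    using value_type = T;

    BudgetAllocator() = default;
    template <class U>
    BudgetAllocator(const BudgetAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        RequestBudget& b = current_budget();
        if (b.used > b.limit || n > (b.limit - b.used) / sizeof(T))
            throw std::bad_alloc();           // request is over its budget
        T* p = static_cast<T*>(::operator new(n * sizeof(T)));
        b.used += n * sizeof(T);
        return p;
    }

    void deallocate(T* p, std::size_t n) noexcept {
        current_budget().used -= n * sizeof(T);
        ::operator delete(p);
    }
};

template <class T, class U>
bool operator==(const BudgetAllocator<T>&, const BudgetAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const BudgetAllocator<T>&, const BudgetAllocator<U>&) { return false; }

using BudgetString =
    std::basic_string<char, std::char_traits<char>, BudgetAllocator<char>>;
template <class T>
using BudgetVector = std::vector<T, BudgetAllocator<T>>;
```

The value of limit is still a number somebody has to choose; the sketch only moves that choice from each container to one place per request.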
On May 31, 6:55 pm, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
> On May 31, 8:21 am, "Nevin :-] Liber" <n...@eviloverlord.com> wrote:
> > This problem has nothing to do with serialization per se (or even C++, > > for that matter). You have input from an untrusted source. You have to > > validate the heck out of it before you use it. Period.
> Actually, it does. If you have ever created a serialization > framework, you'd probably know that validation is not possible.
Both of you are correct. The issue is not one of "validation" (if by "validation" you mean sanity checks of the serialized data), but of who is sending you that data (validating the sender, not the data stream). Even IF the serialized data was correctly formatted and not blatantly out of range, if it didn't come from the proper source, you still have an avenue of attack. It might not result in a "denial of service" attack, but the consequences of random data injection can be equally bad.
> It is > intuitively obvious to me that "validation" is not possible, and the > grab-and-realloc method facings the same issues - when is too much > memory too much?
The answer to that question lies somewhere outside of your code. It might be safe to serialize 1 GB of objects to a server, but not to a PDA.
> The problem is when the link is insecure. And there are cases where > it is a legitimate necessity that the link be insecure.
> I am beginning to think that a poor "solution" might be for the kernel > of the OS to allow per-thread quotas on allocated pages.
If it's insecure, then that's your answer: it's insecure. That means injection attacks are possible, whether it's an attempt to force your deserialization code to malloc too much, or something more subtle, like bogus objects. Per-thread quotas on allocated pages is just an attempt to move your heuristic sanity checks down into the OS. Those sanity checks are not a substitute for validating your source and preventing injection attacks. Use SSL tunneling or something similar.
On Jun 1, 12:37 pm, Nominal Pro <majorsc...@gmail.com> wrote:
> If it's insecure, then that's your answer: it's insecure. That means > injection attacks are possible, whether it's an attempt to force your > deserialization code to malloc too much, or something more subtle, > like bogus objects. Per-thread quotas on allocated pages is just an > attempt to move your heuristic sanity checks down into the OS. Those > sanity checks are not a substitute for validating your source and > preventing injection attacks. Use SSL tunneling or something similar.
Nice response, and I agree.
This leads us to a simple conclusion, one I was somewhat sure of when I wrote the OP but am now certain of: one cannot have his cake and eat it. Generalized serialization frameworks, the kind that many C++ programmers write, fail in the face of insecure IPC channels.
Being a researcher in computer networking, this is very troubling to me. It means that the most wonderful feature of serialization, obviation of microscopic attention to marshalling of data across the channel, fails completely. On an insecure channel, every single element must be range-checked, etc.
This means that if one wants to avoid DoS attacks, either through excessive memory allocation or simply causing the server to choke on bad data, one really should not use serialization at all over an insecure channel.
On Jun 1, 12:36 pm, Branimir Maksimovic <b...@hotmail.com> wrote:
> > The problem is when the link is insecure. And there are cases where > > it is a legitimate necessity that the link be insecure.
> In other words you have to allow connections from any source?
Yes, that's what I keep saying. I have an IPC channel that has both a secure mode and an un-secure mode. The secure mode provides rock-solid security, in both directions. The un-secure mode provides nothing. There are situations (just as exist in the Internet today), where the un-secure mode is a necessary mode, but still provides some value. It is the un-secure mode where there is a problem. My contention is that using serialization on a socket that has not-yet-been-secured is a bad idea, which is extremely unfortunate, IMO, as it forces one to revert to picking apart every single vector whose size is dynamic and potentially unlimited.
> > I am beginning to think that a poor "solution" might be for the kernel > > of the OS to allow per-thread quotas on allocated pages.
> Or you can write per thread memory allocator.
Yes, but then that would ruin the serialization framework. I am too lazy to prove this here, but think about how you would serialize an object under Boost or MFC serialization or any other serialization, and it should become clear very quickly that the code would become intractable by providing a specialized memory allocator for every serialized object, in addition to knowing just how much each object should consume.
If this is not clear, think about it some more. :)
> There is always limit of available ram. > This is not about serialization nor security, but problem of > memory allocation. Whether you will allow all available > memory to be used or not. You can limit that by writing > custom allocators, don;t even have to limit per thread > but say per request. Just use allocator per request > that will limit available memory for request > to some reasonably large value.
See above. Per request will render the serialization framework intractable.
> > An to reiterate, I have a secure-mode of operation where this issue > > is not a problem.
> > The problem is when the link is insecure. And there are cases where > > it is a legitimate necessity that the link be insecure.
> So, basically you're saying that:
> - You want to avoid unauthorised clients inducing the server to > allocate lots of resources, which would constitute a DoS attack.
> - You want to let authorised clients induce the server to allocate > lots of resources without impediment.
> - You can't authenticate clients to differentiate between the two > cases.
> I suggest magic.
This is a most beautiful response. :) This is *exactly* what I have been trying to say.
It is evident to me that, with no authentication, you cannot have your cake and eat it. What you wrote above is inevitable.
What this means is that, any serialization framework, not just mine, that claims that, "you can use it against sockets just as well as files", is actually being somewhat dishonest. Again, I am curious to know how Boost handles serialization of strings. What happens if I want to serialize a 10,000-character string over a socket using Boost's archive method.
Why is this important?
It means that all the applications on the Internet that use unprotected serialization of the kind provided by Boost, etc., are vulnerable to DoS attack.
All one has to do is super-saturate the server with bogus resource consumption (memory allocation), and linger.
The most important observation, which I keep repeating, is that it should also be evident that anything short of a secure (authenticated) connection won't work. It will result in quick and massive degradation of the framework itself. For example, someone might propose that the IP address of the client be checked, and if it makes too many connections within a specified period, limit its memory allocation. Or whatever.
It should be obvious that:
1. You are back to the original problem, which is "How much is too much?"
2. There are legitimate cases for multiple connections.
One cannot have his cake and eat it without authentication.
If I were an evil person, I'd go hunting around the Internet finding servers that use serialization against general-public links and do naughty things to them. ;)
On Jun 2, 1:17 am, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
> On Jun 1, 12:36 pm, Branimir Maksimovic <b...@hotmail.com> wrote:
> > > I am beginning to think that a poor "solution" might be for the kernel > > > of the OS to allow per-thread quotas on allocated pages.
> > Or you can write per thread memory allocator.
> Yes, but then that would ruin the serialization framework. I am too > lazy to prove this here, but think about how you would serialize an > object under Boost or MCF serialization or any other serialization, > and it should become clear very quickly that the code would become > intractable by providing a specialized memory allocator for every > serialized object, in addition to knowing just how much each object > should consume.
I am not talking about anything I haven't already done. I have implemented serialization along the lines described in a PDF document (I think it was the first to present serialization with separate readers and writers, so that serialization is completely transparent). Since it is transparent, I use a streambuf to send packets via sockets and to receive them at the other end with an internal buffer protocol. But the allocator is even more transparent than that. It uses thread-specific storage to implement per-thread allocation and can easily be limited to some maximum amount of memory. When one thread allocates and another frees, it simply switches blocks. Since it replaces the default global new, I cannot see an issue here?
> If this is not clear, think about it some more. :)
It is not clear, since a per-thread allocator (or the way of writing to and reading from sockets, for that matter) doesn't have anything to do with serialization. If a single request of yours requires all available RAM, that means the server will be DoSed by legitimate clients sooner or later, by bugs or who knows what.
{ Please confine responses to standard C++ or libraries of general interest such as Boost. Thanks, -mod }
On Jun 2, 5:20 am, Branimir Maksimovic <b...@hotmail.com> wrote:
> On Jun 2, 1:17 am, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
> But allocator is even more transparent then that. It uses thread specific storage to implement per thread allocation and can easily be limited to some maximum memory. When one thread allocates and other frees, it simply switches blocks. Since it replaces default global new, I cannot see an issue here?
That's interesting. How hard was it to make the per thread allocator?
On Jun 1, 5:13 pm, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
> On Jun 1, 12:34 pm, Lourens Veen <lour...@rainbowdesert.net> wrote:
> Again, I am curious to know how Boost handles serialization of strings. What happens if I want to serialize a 10,000-character string over a socket using Boost's archive method.
> Why is this important?
> I means that, for all the applications on the Internet that uses unprotected serialization of the kind provided by Boost,/etc...they are all vulnerable to DoS attack.
What is low class IMO is criticizing other attempts when you have not published anything. I think the Boost library has some weaknesses, but one nice thing about it is you can use it. Do you plan to make available what you have been describing?
On Jun 3, 12:44 pm, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
> { Please confine responses to standard C++ or libraries of general interest such as Boost. Thanks, -mod }
> On Jun 2, 5:20 am, Branimir Maksimovic <b...@hotmail.com> wrote:
> > On Jun 2, 1:17 am, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
> > But allocator is even more transparent then that. It uses thread specific storage to implement per thread allocation and can easily be limited to some maximum memory. When one thread allocates and other frees, it simply switches blocks. Since it replaces default global new, I cannot see an issue here?
> That's interesting. How hard was it to make the per thread allocator?
There is some work, but it is not that hard. Just implement an ordinary allocator, then make it thread-specific. Construct one either on thread creation or on first allocation, and destroy it on thread exit. Keep a global map/vector of pairs of thread IDs and their allocators for transferring blocks. The transfer is pretty straightforward: first look up in the map the allocator to which the block belongs; if that thread is no longer there, take ownership of the block or return it to the global allocator. Each allocator is lock-free, except when transferring cached blocks or allocating/freeing blocks from the global allocator. You can limit allocation by keeping track, on each alloc/free operation, of how much memory is allocated, since each thread-specific allocator has state.
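To make the bookkeeping part concrete, here is a minimal sketch of a per-thread budget. It is not my actual allocator: the name ThreadQuota and the 1 MiB limit are made up for illustration, it simply forwards to malloc/free, and it leaves out the global map and the block transfer between threads entirely, so a block must be freed by the thread that allocated it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical per-thread memory budget (names and limit are illustrative).
class ThreadQuota {
public:
    static constexpr std::size_t kLimit = 1 << 20;  // 1 MiB per thread, arbitrary

    // Allocate n bytes, charging them to the calling thread's budget.
    static void* allocate(std::size_t n) {
        if (n > kLimit - used_) throw std::bad_alloc();  // would exceed the budget
        void* p = std::malloc(n ? n : 1);
        if (!p) throw std::bad_alloc();
        used_ += n;
        return p;
    }

    // Release a block of n bytes previously obtained on this same thread.
    static void deallocate(void* p, std::size_t n) noexcept {
        assert(n <= used_);  // cross-thread frees are not handled in this sketch
        std::free(p);
        used_ -= n;
    }

    static std::size_t used() noexcept { return used_; }

private:
    static thread_local std::size_t used_;  // one counter per thread, no locking
};

thread_local std::size_t ThreadQuota::used_ = 0;
```

Because the counter is thread_local, the fast path needs no lock. A request that would push the thread past its budget fails with std::bad_alloc before any memory is touched.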
{ I'm sorry, but I see no C++-related content here. If there is, please just repost with explanation. -mod }
For the mods: just like the rest of this thread, this post concerns the application of generalized C++ serialization frameworks, like the one in Boost, to IPC applications.
> The most important observation, which I keep repeating, is that it should also be evident that anything beyond a secure (authenticated) connection won't work.
You're making a mountain out of a molehill here, Mr Rabbit :)
IPC systems commonly use the concept of a message, with a header and a payload. Among other things, the header contains the length of the payload. When the server receives a message from a client, it reads the header and checks the payload length against a preset limit. After that, it proceeds with deserialization of the payload, and because it already knows the length of the payload, e.g. 4196 bytes, it knows that it should not accept, e.g., a single-byte string claiming a length of 5000, or a double-byte string claiming a length of 2500, etc.
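As a rough sketch of what I mean (the class name PayloadReader, the 64 KiB cap, and the 4-byte big-endian length prefix are all assumptions for illustration, not anything taken from Boost), the deserializer can check every claimed length against the bytes remaining in the current payload before it allocates anything:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical cap on a whole message payload; the value is illustrative.
constexpr std::uint32_t kMaxPayload = 64 * 1024;

// Reads fields out of a payload whose total length is already known
// from the message header.
class PayloadReader {
public:
    PayloadReader(const char* data, std::size_t size)
        : data_(data), size_(size), pos_(0) {
        if (size > kMaxPayload) throw std::length_error("payload exceeds limit");
    }

    // Read a 4-byte big-endian unsigned integer.
    std::uint32_t read_u32() {
        require(4);
        std::uint32_t v = 0;
        for (int i = 0; i < 4; ++i)
            v = (v << 8) | static_cast<unsigned char>(data_[pos_++]);
        return v;
    }

    // Read a length-prefixed string; the claimed count is validated
    // against the remaining payload before the string is allocated.
    std::string read_string() {
        std::uint32_t count = read_u32();
        require(count);
        std::string s(data_ + pos_, count);
        pos_ += count;
        return s;
    }

private:
    void require(std::size_t n) const {
        assert(pos_ <= size_);
        if (n > size_ - pos_)
            throw std::length_error("field longer than remaining payload");
    }

    const char* data_;
    std::size_t size_;
    std::size_t pos_;
};
```

With this shape, a string that claims 100,000,000 characters inside a 5-byte payload is rejected by require() and no large buffer is ever requested.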
Now if your serialization code blindly allocates buffers of arbitrary size, then you obviously have a problem in your serialization code. You need to improve it to be aware of the payload length of the current message being processed. I'm curious as to why you think that is a big deal?