How can I get hold of a RepeatedField object to manage a repeated field
in C++.
In RProtoBuf, we do a lot of :
for( int i=0; i<size; i++){
INTEGER(res)[i] = (int) ref->GetRepeatedInt32( *message, fieldDesc, i ) ;
}
where essentially the INTEGER macro gives a pointer to the beginning of
the int array we are filling.
I'd like to replace this using e.g std::copy
RepeatedField field ;
std::copy( field.begin(), field.end(), INTEGER(res) ) ;
but I can't find how to actually get hold of a RepeatedField object.
Is it possible ?
Romain
--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/NrTG : Rcpp 0.7.5
|- http://tr.im/MPYc : RProtoBuf: protocol buffers for R
`- http://tr.im/KfKn : Rcpp 0.7.2
--
You received this message because you are subscribed to the Google Groups "Protocol Buffers" group.
To post to this group, send email to prot...@googlegroups.com.
To unsubscribe from this group, send email to protobuf+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/protobuf?hl=en.
On the documentation it says :
"
Most users will not ever use a RepeatedField directly; they will use the
get-by-index, set-by-index, and add accessors that are generated for all
repeated fields
"
What if I do want to use RepeatedField ?
Romain
On 02/11/2010 06:50 PM, Jason Hsueh wrote:
> No, there isn't a way to get the RepeatedField from the reflection
> interface. You can only do so via the generated interface.
>
> On Thu, Feb 11, 2010 at 5:57 AM, Romain Francois
> <romain.francoi...@gmail.com
Thanks for the answers.
Maybe I should give some more background on why this is of interest to
me. In RProtoBuf, we essentially only use the reflection api so that we
can dynamically load new proto message types at runtime, etc ... we
don't use protoc and therefore have no access to the generated classes.
In the class "GeneratedMessageReflection", there are templates such as :
template <typename Type>
inline Type GetRepeatedField(const Message& message,
const FieldDescriptor* field,
int index) const;
but they are private ? Why ?
From what I can read of the code, methods like GetRepeatedInt32 get
expanded out of :
PASSTYPE GeneratedMessageReflection::GetRepeated##TYPENAME( \
const Message& message,
\
const FieldDescriptor* field, int index) const {
\
USAGE_CHECK_ALL(GetRepeated##TYPENAME, REPEATED, CPPTYPE);
\
if (field->is_extension()) {
\
return GetExtensionSet(message).GetRepeated##TYPENAME(
\
field->number(), index);
\
} else {
\
return GetRepeatedField<TYPE>(message, field, index);
\
}
\
}
\
so doing things like this code
for( int i=0; i<size; i++){
INTEGER(res)[i] = (int) ref->GetRepeatedInt32( *message,
fieldDesc,
i ) ;
}
is going to be not as efficient as if I could directly iterate over the
repeated field using RepeatedField::iterator
Instead of extending the Reflection interface, what about making the
templates public in GeneratedMessageReflection and then maybe use some
sort of trait to indicate whether that the instance of Reflection I have
access to has these templates.
Something like :
template <typename _T, _T _V> struct integral_constant {
static const _T value = _V;
typedef _T value_type;
typedef integral_constant<_T,_V> type;
};
typedef integral_constant<bool, true> true_type;
typedef integral_constant<bool, false> false_type;
struct reflection_has_templates{} ;
struct reflection_no_templates{} ;
template <typename T> struct has_templates : public false_type{};
template<> struct has_templates<GeneratedMessageReflection> : public
true_type{} ;
So that using this trait, I can dispatch between :
- suboptimal implementation when the templates are not there
- optimal implementation using RepeatedField::iterator when they are
available.
It adds nothing to the Reflection interface.
Romain
On 02/12/2010 10:37 PM, Kenton Varda wrote:
> Yeah, it would add a lot of complication and reduce the flexibility of
> the Reflection interface. We don't want to require repeated fields to
> be implemented in terms of RepeatedField or RepeatedPtrField.
>
> On Fri, Feb 12, 2010 at 12:11 PM, Jason Hsueh <jas...@google.com
> <mailto:jas...@google.com>> wrote:
>
> +kenton
>
> Kenton may have a better answer, but I surmise that it's to avoid
> tying the Reflection interface to implementation details. A Message
> implementation might not use RepeatedField at all. The original
> version of protobufs used a different class to represent repeated
> fields, so it wouldn't have been possible to implement Reflection
> for the original version if the interface required access to
> RepeatedField. So maybe the reason is historical. Recent changes
> have been made that would make this technically possible. However,
> adding methods to get direct access to the RepeatedField would still
> expand the Reflection interface quite a bit. I'll defer to Kenton on
> that.
>
> On Fri, Feb 12, 2010 at 2:51 AM, Romain Francois
> <romain.francoi...@gmail.com
> <mailto:romain.francoi...@gmail.com>> wrote:
>
>
> Why not ? It seems reasonnable to want to use e.g. std::copy and
> friends.
>
> On the documentation it says :
>
> "
> Most users will not ever use a RepeatedField directly; they will
> use the get-by-index, set-by-index, and add accessors that are
> generated for all repeated fields
> "
>
> What if I do want to use RepeatedField ?
>
> Romain
>
>
> On 02/11/2010 06:50 PM, Jason Hsueh wrote:
>
> No, there isn't a way to get the RepeatedField from the
> reflection
> interface. You can only do so via the generated interface.
>
> On Thu, Feb 11, 2010 at 5:57 AM, Romain Francois
> <romain.francoi...@gmail.com
> <mailto:romain.francoi...@gmail.com>
> <mailto:romain.francoi...@gmail.com
On 02/16/2010 10:44 PM, Kenton Varda wrote:
> GeneratedMessageReflection is an internal class used by the protobuf
> implementation. Currently, users are not allowed to use it directly,
> because we may change it at any time. You're suggesting that we promote
> it to a public interface, which has implications with regard to
> maintenance costs and implementation agility.
I understand that.
I'm not necessarily suggesting to make this particular class public, but
am looking for some way to iterate over the elements of a repeated field.
I see that this is available for classes that are generated by protoc,
which is what is mostly used, but unfortunately we don't/can't really do
that with R.
On the same score, if I want to increase the number of elements in a
repeated field, I have to do it one by one right ? I can't do something
like first reserve space, and then fill the generated space. Does that
mean that memory is reallocated each time ?
> I'm open to considering making this change for performance purposes.
> However, even them I'm hesitant to expose Repeated[Ptr]Field
> references directly via this interface. I'd like to see what happens if
> we simply make all of the existing public accessor methods "inline", so
> you could then do something like:
>
> int size = reflection->FieldSize(message, field);
> for (int i = 0; i < size; i++) {
> const Message& sub_message =
>
> reflection->GeneratedMessageReflection::GetRepeatedMessage(message,
> field, i);
> // Do something with sub_message.
> }
>
> If GetRepeatedMessage is inline then I believe the above loop would be
> nearly as efficient as iterating directly over a RepeatedPtrField. Note
> that the funny syntax for the method call avoids making a virtual call.
>
> BTW, the "has_templates" template you suggest would not work as you
> think -- how would you actually use it?
implement some sort of TMP dispatch :
template <typename T>
foo( T t ){
foo_dispatch( t, typename has_templates<T>::value_type() ) ;
}
and then have
template <typename T>
foo( T t , false_type){
/* the less efficient version */
}
template <typename T>
foo( T t, true_type){
/* the more efficient version using the RepeatedField */
}
This is similar to how e.g std::distance dispatches depending on whether
it deals with random access iterator or some other iterator category.
> What you would really need to
> use is dynamic_cast. My golden rule of dynamic_cast is that it should
> only be used for optimization, and you must provide an implementation
> for the case where dynamic_cast always returns NULL.
But dynamic_cast is a runtime thing, where TMP dispatch happens at
compile time.
> In your case, you
> are doing this, so that should be fine.
>
> On Sat, Feb 13, 2010 at 1:16 AM, Romain Francois
> <romain.francoi...@gmail.com
> <mailto:jas...@google.com <mailto:jas...@google.com>>> wrote:
>
> +kenton
>
> Kenton may have a better answer, but I surmise that it's to
> avoid
> tying the Reflection interface to implementation details. A
> Message
> implementation might not use RepeatedField at all. The original
> version of protobufs used a different class to represent
> repeated
> fields, so it wouldn't have been possible to implement
> Reflection
> for the original version if the interface required access to
> RepeatedField. So maybe the reason is historical. Recent changes
> have been made that would make this technically possible.
> However,
> adding methods to get direct access to the RepeatedField
> would still
> expand the Reflection interface quite a bit. I'll defer to
> Kenton on
> that.
>
> On Fri, Feb 12, 2010 at 2:51 AM, Romain Francois
> <romain.francoi...@gmail.com
> <mailto:romain.francoi...@gmail.com>
|- http://tr.im/OcQe : Rcpp 0.7.7
|- http://tr.im/O1wO : highlight 0.1-5
`- http://tr.im/O1qJ : Rcpp 0.7.6
Thanks for the feedback.I understand that.
On 02/16/2010 10:44 PM, Kenton Varda wrote:
GeneratedMessageReflection is an internal class used by the protobuf
implementation. Currently, users are not allowed to use it directly,
because we may change it at any time. You're suggesting that we promote
it to a public interface, which has implications with regard to
maintenance costs and implementation agility.
I'm not necessarily suggesting to make this particular class public, but am looking for some way to iterate over the elements of a repeated field.
On the same score, if I want to increase the number of elements in a repeated field, I have to do it one by one right ? I can't do something like first reserve space, and then fill the generated space. Does that mean that memory is reallocated each time ?
But dynamic_cast is a runtime thing, where TMP dispatch happens at compile time.What you would really need to
use is dynamic_cast. My golden rule of dynamic_cast is that it should
only be used for optimization, and you must provide an implementation
for the case where dynamic_cast always returns NULL.
Sure. I meant "STL algorithms" iterating.
> BTW, have you observed an actual performance problem or are you just
> speculating that this performance difference may be a problem for you?
In similar (non protobuf-related) settings, we have observed quite a bit
of difference between using plain loops and accessors as opposed to
iterators.
But this is mainly speculation on the performance.
Also, this looks nicer :
double sum = std::accumulate( y.begin(), y.end(), 0.0 ) ;
than this:
double x = 0.0 ;
for( int i=0; i<n; i++){
x += ref.GetRepeatedDouble( message, field, i ) ;
}
... but I suppose we can wrap these in our own iterators, but then I
guess we'd have to resort to some sort of proxy to deal with
GetRepeatedDouble and SetRepeatedDouble.
> On the same score, if I want to increase the number of elements in a
> repeated field, I have to do it one by one right ? I can't do
> something like first reserve space, and then fill the generated
> space. Does that mean that memory is reallocated each time ?
>
>
> No, adding an element is O(1), even if you haven't reserved space. It
> works the same way std::vector works.
But when you use std::vector, it is best to first reserve the target
size as opposed to create an empty vector and push_back each element.
> What you would really need to
> use is dynamic_cast. My golden rule of dynamic_cast is that it
> should
> only be used for optimization, and you must provide an
> implementation
> for the case where dynamic_cast always returns NULL.
>
>
> But dynamic_cast is a runtime thing, where TMP dispatch happens at
> compile time.
>
>
> GetReflection() returns Reflection*, NOT GeneratedMessageReflection*.
> Therefore the compile has no idea which type was returned, therefore
> you *must* use a runtime check. That's my point -- your template
> dispatch doesn't accomplish anything in this context.
Ah ok. Sorry about that.
> In your case, you
> are doing this, so that should be fine.
>
> On Sat, Feb 13, 2010 at 1:16 AM, Romain Francois
> <romain.francoi...@gmail.com
> <mailto:romain.francoi...@gmail.com>
> <mailto:romain.francoi...@gmail.com
> <mailto:romain.francoi...@gmail.com>>> wrote:
>
> Hello,
>
On 02/17/2010 12:51 AM, Kenton Varda wrote:
The Reflection interface already provides a way -- FieldSize() andGetRepeatedX(). The only problem is that it's a bit slower than the
generated accessors because these methods aren't inlineable.
Sure. I meant "STL algorithms" iterating.
In similar (non protobuf-related) settings, we have observed quite a bit of difference between using plain loops and accessors as opposed to iterators.
BTW, have you observed an actual performance problem or are you just
speculating that this performance difference may be a problem for you?
But this is mainly speculation on the performance.
But when you use std::vector, it is best to first reserve the target size as opposed to create an empty vector and push_back each element.