Deleting one of a repeated field?

10,524 views
Skip to first unread message

ojwl...@googlemail.com

unread,
Aug 12, 2008, 3:11:02 PM8/12/08
to Protocol Buffers
Given an object containing a repeated field, is it possible to remove
some arbitrary element from that list?

Kenton Varda

unread,
Aug 12, 2008, 3:12:13 PM8/12/08
to ojwl...@googlemail.com, Protocol Buffers
In what language?

ojwl...@googlemail.com

unread,
Aug 12, 2008, 3:22:12 PM8/12/08
to Protocol Buffers
Python or C++

On Aug 12, 8:12 pm, "Kenton Varda" <ken...@google.com> wrote:
> In what language?
>

Kenton Varda

unread,
Aug 12, 2008, 4:05:44 PM8/12/08
to ojwl...@googlemail.com, Protocol Buffers
For C++, use the RepeatedField interfaces:

You can get a mutable pointer to a RepeatedField or RepeatedPtrField that represents a field named "foo" by calling the message's "mutable_foo()" accessor.

For Python, I think you currently have to clear the field and repopulate it.

Alexandru Mosoi

unread,
Aug 13, 2008, 5:16:42 AM8/13/08
to Protocol Buffers
for python it would be nice to support slice assignment.

Kenton Varda

unread,
Aug 13, 2008, 2:04:05 PM8/13/08
to Alexandru Mosoi, Petar Petrov, Protocol Buffers
Yes, that sounds like a good idea.  Petar, what do you think?

Petar Petrov

unread,
Aug 13, 2008, 5:07:09 PM8/13/08
to Kenton Varda, Alexandru Mosoi, Protocol Buffers
On Wed, Aug 13, 2008 at 11:04 AM, Kenton Varda <ken...@google.com> wrote:
Yes, that sounds like a good idea.  Petar, what do you think?

This feature sounds good, but If we add it, it will only make sense for repeated scalar fields (int, string, bytes, etc) and not for repeated composite fields (groups and sub-messages).

One concern is that in future we may not be able to build a Python API wrapping the C++ API that can be used the same way. That is one of the ideas to drastically boost Python API performance.
Slice assignment is a linear combination of inserts and deletes, however looking at the C++ API it seems possible to translate slice assignments.

ojwl...@googlemail.com

unread,
Sep 16, 2008, 11:27:17 AM9/16/08
to Protocol Buffers
OK, rephrase that question. Say you have a releated field containing
10 items. You wish to delete the 2nd item so that the list contains 9
items and foo_size() returns 9. Is that possible?

I can't see an erase() function in mutable_foo(), and I can't see
anything in RepeatedField that decrements current_size_ other than
RemoveLast() which presumably will only remove the _last_ item?


Currently the only way to remove elements from a list seems to be
awful hacks like a.parse(a.serialise()) after clearing the items you
want to delete


On Aug 12, 9:05 pm, "Kenton Varda" <ken...@google.com> wrote:
> For C++, use the RepeatedField interfaces:http://code.google.com/apis/protocolbuffers/docs/reference/cpp/google...
>
> You can get a mutable pointer to a RepeatedField or RepeatedPtrField that
> represents a field named "foo" by calling the message's "mutable_foo()"
> accessor.
>
> For Python, I think you currently have to clear the field and repopulate it.
>

ojwl...@googlemail.com

unread,
Sep 16, 2008, 12:28:26 PM9/16/08
to Protocol Buffers
reply to self: I guess you copy the item from the end of the list into
the space formerly occupied by the item you want to delete, then call
RemoveLast()...

Kenton Varda

unread,
Sep 16, 2008, 1:12:22 PM9/16/08
to ojwl...@googlemail.com, Protocol Buffers
On Tue, Sep 16, 2008 at 9:28 AM, <ojwl...@googlemail.com> wrote:

reply to self: I guess you copy the item from the end of the list into
the space formerly occupied by the item you want to delete, then call
RemoveLast()...

Yes, that's what you should do.

I didn't want to provide a Remove(int) because it would be O(n).  Version 1 of protocol buffers had such a function, and we frequently saw people writing loops like:

for (int i = 0; i < field.size(); i++) {
  if (ShouldFilter(field[i])) {
    field.Remove(i);
    --i;
  }
}

This loop is O(n^2), which is bad, but it's hard to tell that it is O(n^2).  The idea behind only providing RemoveLast() is to force you to either do something clever (like swapping the element with the last element first, as the documentation suggests) or write your own loop which makes the time complexity of your code obvious.
Reply all
Reply to author
Forward
0 new messages