These are actually very good questions, but I guess I do not really
know much more than you. Polygon data is different from other shape
data, since one polygon has multiple vertices. NA-separated vectors are
one way to specify polygon data, and I think the other common spec is
the one in the sp package. In the former case, NA is used only as a
separator. It is not carried over to the final polygon data,
e.g. c(1, 2, 3, NA, 4, 5, 6) will eventually become c(1, 2, 3) and
c(4, 5, 6).
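To make the NA-separated convention concrete, here is a rough sketch in base R. It is only an illustration of the idea, not the code the package runs internally, and split_by_na() is a made-up helper name.

```r
# Split an NA-separated numeric vector into one vector per polygon.
# split_by_na() is a hypothetical helper, not a function from any package.
split_by_na <- function(x) {
  is_sep <- is.na(x)
  # Each NA bumps the counter, so vertices between separators share a group
  group <- cumsum(is_sep)
  # Drop the NA separators themselves, then split the rest by group
  unname(split(x[!is_sep], group[!is_sep]))
}

lng <- c(1, 2, 3, NA, 4, 5, 6)
pieces <- split_by_na(lng)
str(pieces)
```

The NA only marks the boundary; none of the resulting pieces contains it.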
One polygon only needs one layerId. It is not necessary to assign an ID
to each of its vertices, so your use of unique() is absolutely
appropriate. It is just that you forgot to remove NA from the IDs,
whereas we removed the NAs from the polygon data internally. That led to
the mismatch (three polygons, but four layerId values).
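The mismatch can be reproduced with a small sketch like the following. The lng and id vectors are invented for illustration and are not your actual data; the point is only that unique() keeps NA as a value of its own.

```r
# Hypothetical per-vertex data: three polygons separated by NA entries.
lng <- c(1, 2, 3, NA, 4, 5, 6, NA, 7, 8, 9)
id  <- c("a", "a", "a", NA, "b", "b", "b", NA, "c", "c", "c")

# unique() treats NA as a distinct value, so NA survives as a fourth "ID"
ids_with_na <- unique(id)
length(ids_with_na)

# Dropping NA before unique() leaves one ID per polygon
layer_ids <- unique(id[!is.na(id)])

# Number of polygons = separators + 1 (assumes no leading or trailing NA)
n_polygons <- sum(is.na(lng)) + 1
stopifnot(length(layer_ids) == n_polygons)
```

Passing layer_ids rather than ids_with_na as the layerId should then line up with the polygons, assuming the IDs appear in the same order as the polygons.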
I'm not a spatial expert, so I cannot comment on the "best practice"
of storing polygon data. In fact, personally I'm also interested in
knowing the answer. In particular, I'm looking for a data structure for
polygons so that I can easily assign/store IDs in it, get the IDs,
and query the polygon data by ID. sp::Polygons() has an ID argument,
but at the moment I do not see a helper function to extract the IDs
from the data object it creates.
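Just to illustrate the three operations I have in mind (store IDs, list IDs, look up by ID), a plain named list in base R would look roughly like this. This is only a sketch with made-up coordinates and names; it is not an sp structure and I am not suggesting it as the best practice.

```r
# A plain named list keyed by polygon ID; all names and values are made up.
polygons <- list(
  a = cbind(lng = c(1, 2, 3), lat = c(1, 2, 1)),
  b = cbind(lng = c(4, 5, 6), lat = c(1, 2, 1))
)

# Assign/store a new polygon under an ID
polygons[["c"]] <- cbind(lng = c(7, 8, 9), lat = c(1, 2, 1))

# Get all the IDs
ids <- names(polygons)

# Query the polygon data by ID
poly_b <- polygons[["b"]]
```

It carries no spatial metadata (projection, holes, etc.), which is presumably where a proper spatial class would be needed.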
Regards,
Yihui