Hashmap, associative arrays in R

danoomis...@gmail.com

Jun 18, 2013, 8:45:43 PM
to shiny-...@googlegroups.com
Is there a way to implement this functionality in R?

Alex Brown

Jun 18, 2013, 8:47:17 PM
to shiny-...@googlegroups.com
Lists.

a <- list(yes = "y", no = "n")

a[["yes"]] == "y"
# [1] TRUE

a$yes == "y"
# [1] TRUE
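
For the other usual map operations (insert, membership test, delete) on a list, a minimal sketch might look like the following. The key "maybe" is made up for illustration.

```r
a <- list(yes = "y", no = "n")

# Insert or overwrite a key
a[["maybe"]] <- "m"

# Membership test: check the names, not the value
has_maybe <- "maybe" %in% names(a)

# Delete a key by assigning NULL
a[["no"]] <- NULL

# Looking up a key that is not present returns NULL rather than an error
missing_value <- a[["no"]]

remaining_keys <- names(a)
```

Note that a[["key"]] matches names exactly, while a$key allows partial matching on lists, so [[ is the safer choice when keys share prefixes.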

Joe Cheng

Jun 18, 2013, 11:22:31 PM
to shiny-...@googlegroups.com
Or environments...

a <- new.env()

a[["yes"]] <- "y"   # or equivalently, a$yes <- "y"

a[["yes"]] == "y"
# [1] TRUE

a$yes == "y"
# [1] TRUE
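
A rough sketch of the remaining map operations on an environment, using only base R functions. The variable name h and the keys are made up for illustration. inherits = FALSE is passed to exists() because, unlike [[ and $, exists() searches parent environments by default.

```r
h <- new.env()
h[["yes"]] <- "y"
h[["no"]] <- "n"

# Membership test, restricted to this environment only
has_yes <- exists("yes", envir = h, inherits = FALSE)
has_maybe <- exists("maybe", envir = h, inherits = FALSE)

# List the keys (sorted here, since environments do not keep insertion order)
keys <- sort(ls(h))

# Delete a key
rm("no", envir = h)
keys_after_rm <- ls(h)

# A missing key returns NULL when accessed with [[
missing_value <- h[["no"]]
```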


Alex Brown

Jun 19, 2013, 1:57:46 AM
to shiny-...@googlegroups.com
Or named vectors ...
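
For illustration, a small sketch of a named character vector used as a lookup table. The keys are made up, and all values must share one type because this is an atomic vector.

```r
v <- c(yes = "y", no = "n")

# Single lookup: [[ drops the name and returns the bare value
one <- v[["yes"]]

# Vectorised lookup of several keys at once
several <- unname(v[c("no", "yes")])

# Add a new key
v["maybe"] <- "m"
n_keys <- length(v)
```

Unlike a list, v[["missing"]] on an atomic vector raises a "subscript out of bounds" error instead of returning NULL, so check names(v) first if a key may be absent.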

Alex Brown

Jun 19, 2013, 2:08:05 AM
to shiny-...@googlegroups.com
So, note that lists in R are substantially more powerful than hash maps in other languages.

1) they are ordered and can be iterated over and indexed in that order
2) they can contain a mix of named and unnamed elements
3) they can contain repeated names; a lookup by name returns the first match
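
A quick sketch of the three points above, with made-up values:

```r
# One list with a named element, an unnamed element, and a repeated name
x <- list(a = 1, 2, a = 3)

# 1) Ordered: positions are stable and can be indexed directly
second <- x[[2]]

# 2) Mixed naming: the unnamed element gets an empty-string name
nms <- names(x)

# 3) Repeated names: lookup by name returns the first match
first_a <- x[["a"]]

# Iterating visits the elements in order
total <- 0
for (value in x) {
  total <- total + value
}
```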

They have compact addressing for path accesses, but I have not used this yet

> z<-list(a=list(b=list(c=1)))
> z[[c("a","b","c")]]
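
As far as I can tell, passing a character vector to [[ walks the nested lists one level per element, so the following three forms should be equivalent:

```r
z <- list(a = list(b = list(c = 1)))

# Recursive indexing: one name per nesting level
by_path <- z[[c("a", "b", "c")]]

# The same access written out step by step
by_chain <- z[["a"]][["b"]][["c"]]
by_dollar <- z$a$b$c
```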

-Alex

On Tuesday, June 18, 2013 10:57:46 PM UTC-7, Alex Brown wrote:
Or named vectors ...

Winston Chang

Jun 19, 2013, 11:00:59 AM
to shiny-...@googlegroups.com
One drawback to lists in R is that they're copied on write, so they can be slow if you're modifying a large list in-place. (I don't know exactly which parts are copied on write -- I don't think every item is copied on write, but the references to them are copied.)


Here's an example that shows how slow it can be to modify a list in-place using a for-loop:

# Generate a bunch of names to use as keys
item_names <- paste0("item", 1:20000)

# Measure the time to evaluate the expression
system.time( {
  x <- list()
  for (name in item_names) {
    x[[name]] <- 0
  }
})
#  user  system elapsed 
# 4.928   0.764   5.711 

Assigning 20,000 items to a list this way takes about 5 seconds.


What about environments? Unlike lists, environments aren't copied on write, so they don't slow down when modifying them in-place:

system.time( {
  e <- new.env()
  for (name in item_names) {
    e[[name]] <- 0
  }
})
#  user  system elapsed 
# 0.054   0.000   0.054 

# environments don't preserve order, so they won't be identical
identical(x, as.list(e))
# [1] FALSE
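
The difference in copy behaviour is also visible from the caller's side. In this small sketch (the function names set_key_list and set_key_env are made up), a function that modifies a list only changes its own copy, while a function that modifies an environment changes the caller's object:

```r
# Modifies its local copy of the list; the caller's list is untouched
set_key_list <- function(l) {
  l[["k"]] <- 1
  invisible(NULL)
}

# Modifies the environment itself; no copy is made
set_key_env <- function(e) {
  e[["k"]] <- 1
  invisible(NULL)
}

l <- list()
set_key_list(l)
list_size <- length(l)   # still 0

e <- new.env()
set_key_env(e)
env_value <- e[["k"]]    # now 1
```

This is convenient for a shared mutable map, but it also means an environment passed to a function can be changed without any assignment in the calling code.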


You can actually generate a list in one go using lapply instead of a for loop, and this is much faster since the list isn't being modified in place (and thus being copied many times):

system.time( {
  #  Create a named vector to use as the source of the lapply
  vec <- numeric(length(item_names))
  names(vec) <- item_names

  # Generate a new list using lapply
  y <- lapply(vec, function(x) 0)
})
#  user  system elapsed 
# 0.005   0.000   0.005 

# Exactly the same result as the original version
identical(x, y)
# [1] TRUE
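
If a loop is still needed, one common workaround is to create the list at its final size first so that it never has to grow. This is only a sketch and has not been timed here. The names z and w are made up; item_names is the same vector as above.

```r
item_names <- paste0("item", 1:20000)

# Preallocate a list of the right length, name it, then fill by position
z <- vector("list", length(item_names))
names(z) <- item_names
for (i in seq_along(z)) {
  z[[i]] <- 0
}

# Or build the same list in one step without any loop
w <- setNames(rep(list(0), length(item_names)), item_names)

same <- identical(z, w)
```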


You might think, maybe it's the lapply that makes this last example really fast -- perhaps the for-loop used in the first two versions is slower than the lapply used in the third. But actually, lapply is slightly slower than a for-loop. The speed advantage of an lapply is that it collects the output into a list -- if you instead do the list assignments in a loop, it can be an expensive operation, as shown in the first example.

To illustrate: the next example uses an environment (like the second example) and it uses lapply (like the third example) but the lapply is simply being used to assign items in an environment; it's not being used to collect the output.

# Using lapply instead of a for loop is slightly slower
system.time({
  e2 <- new.env()
  # Iterate over item_names; lapply is used only for its side effect here
  lapply(item_names, function(name) {
    e2[[name]] <- 0
  })
})
#  user  system elapsed 
# 0.072   0.000   0.072 



-Winston


