New backend in dplyr - how to create separate package

232 views
Skip to first unread message

Tomáš Greif

unread,
Feb 25, 2015, 6:09:21 AM2/25/15
to manip...@googlegroups.com
Hi,

I am trying to create separate package which will support Oracle as additional backend in dplyr. I thought it is wise not to copy entire dplyr - it would be smart to have only as little as needed in my package so it is easier to maintain.

I started by copying the following files: tbl-sql, dbi-s3, utils. Added src-oracle. Added some packages to depends and imports. 

Now the process was
 1) added translate-sql after "object 'all_names' not found" when using document
 2) I can connect, but "batting <- tbl(my_db, "Batting")" resuts in " could not find function "is.sql_variant". Added translate-sql-helpers.r to my package
 3) Another error: Error in as.list(from) : object 'base_symbols' not found ,... added translate-sql-base.r
 4) .......STOP :/

Apparently my process will continue until I copy entire dplyr codebase to my project (something I tried to avoid). MonetDB.R (another dplyr backend) has quite few files in codebase, but I am not sure how they made it work.

Any hints on this?

Tomas


This what I have in description:

Depends:
    dplyr,
    DBI (>= 0.3.1),
    digest (>= 0.6.4),
    methods,
    ROracle,
    R (>= 3.1.1)
Imports:
    assertthat

Hadley Wickham

unread,
Feb 25, 2015, 8:19:18 AM2/25/15
to Tomáš Greif, manipulatr
Hi Tomas,

You shouldn't need to copy in any code from dplyr. I'd recommend a
description that looks like this:

Imports:
dplyr,
DBI (>= 0.3.1),
ROracle,
assertthat

And then add

#' @imports dplyr ROracle DBI

so you get the following lines in your NAMESPACE

import(dplyr)
import(ROracle)
import(DBI)

Then when testing your code, you'll need to load both dplyr and your package.

Hadley
> --
> You received this message because you are subscribed to the Google Groups
> "manipulatr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to manipulatr+...@googlegroups.com.
> To post to this group, send email to manip...@googlegroups.com.
> Visit this group at http://groups.google.com/group/manipulatr.
> For more options, visit https://groups.google.com/d/optout.



--
http://had.co.nz/

Tomáš Greif

unread,
Feb 25, 2015, 9:33:38 AM2/25/15
to manip...@googlegroups.com, greif...@gmail.com
Hi Hadley,

thank you for your response, I really appreciate your help. I must be still missing something obvious, still getting complaints about missing functions (%||% this time). I have followed your advise, but here is what I get (https://github.com/tomasgreif/dplyOracle/:

> library(devtools)
> library(dplyr)
> library(ROracle)
>
> document()
Updating dplyrOracle documentation
Loading dplyrOracle
Writing NAMESPACE
Writing dplyrOracle.Rd
Writing src_oracle.Rd
> load_all()
Loading dplyrOracle
>
> host <- "localhost"
> port <- 1521
> sid <- "siddb"
> connect.string <- paste(
+   "(DESCRIPTION=",
+   "(ADDRESS=(PROTOCOL=tcp)(HOST=", host, ")(PORT=", port, "))",
+   "(CONNECT_DATA=(SID=", sid, ")))", sep = "")
>
>
> my_db <- src_oracle(username = "test", password = "strong",
+                     dbname = connect.string)
Error in src_oracle(username = "test", password = "strong", dbname = connect.string) :
  could
not find function "%||%"

I have now in NAMESPACE (among other lines):

import(DBI)
import(ROracle)
import(assertthat)
import(dplyr)

I (again) tried adding functions from utils (%||%, is_travis), but then ended up with missing db_disconnector (dbi-s3.r).

Is this because these functions are not exported from dplyr or did I get you wrong?

Tomas

Dne středa 25. února 2015 14:19:18 UTC+1 Hadley Wickham napsal(a):

Tomáš Greif

unread,
Feb 25, 2015, 9:44:58 AM2/25/15
to manip...@googlegroups.com, greif...@gmail.com
After adding "%||%" , in_travis, db_disconnector and referencing "build_query" in my own head method for oracle using dplyr:::build_query it seems to work now.

Dne středa 25. února 2015 15:33:38 UTC+1 Tomáš Greif napsal(a):

Einar Hjorleifsson

unread,
Mar 26, 2015, 9:07:03 AM3/26/15
to manip...@googlegroups.com, greif...@gmail.com
hi tomas and the rest.

could you please clarify: "After adding "%||%" , in_travis, db_disconnector and referencing "build_query" in my own head method for oracle using dplyr:::build_query it seems to work now."

say this in light of that i am getting "could not find function "%||%"" like you had already reported on before in your thread:
```
# A. The conventional way, just to show it works
> require(ROracle)
Loading required package: ROracle
Loading required package: DBI
> con <- dbConnect(dbDriver('Oracle'),user='ops$einarhj',password=peeWord,dbname='mar')
> con
User name:             ops$einarhj
Connect string:        mar
Server version:        11.2.0.3.0
Server type:           Oracle RDBMS
Results processed:     0
OCI prefetch:          FALSE
Bulk read:             1000
Statement cache size:  0
Open results:          0
> query <- "SELECT SYNIS_ID, SKIP FROM fiskar.stodvar WHERE leidangur='TB1-2012'"
> output <- dbGetQuery(con,query)
> dbDisconnect(con)
[1] TRUE
> head(output)
  SYNIS_ID SKIP
1   379305 1278
2   379306 1278
3   379307 1278
4   379308 1278
5   379309 1278
6   379310 1278

# VIA dplyr-approach
> require(dplyrOracle)
Loading required package: dplyrOracle
Loading required package: digest
> my_db <- src_oracle(username = 'ops$einarhj', password = peeWord, dbname='mar')
Error in src_oracle(username = "ops$einarhj", password = peeWord, dbname = "mar") :
  could not find function "%||%"
```

regards,
einar

Tomáš Greif

unread,
Mar 26, 2015, 9:27:30 AM3/26/15
to Einar Hjorleifsson, manip...@googlegroups.com
Hi Einar,


I've tried to make dplyr work for Oracle. Unfortunately, my R programming skills are not good enough to make it work. I pushed last version of dplyrOracle to my github account right now. It works for some basic tasks, however it won't work for anything where window functions are needed. I would like to warn you that the code there is experimental and it is not a good idea to use it anywhere but for testing.

Let me know if you would like to contribute to the package development.

Tomas

Tomáš Greif

unread,
Mar 26, 2015, 11:36:15 AM3/26/15
to Einar Hjorleifsson, manip...@googlegroups.com
OK. I know why you get this error. It is because db_query_fields used db_query_fields.DBIConnection method and it just won't work on OraResult. I commited another version where this is fixed. Right now, the following works on my computer (my_db is src object created using src_oracle):

#db_list_tables(con)
db_list_tables(my_db$con)

#db_create_table(con, table, types, temporary = FALSE, ...)
db_create_table(my_db$con, 'DPLYR_TEST', c(a = 'number',  b = 'varchar(20)'))

#db_has_table(con, table)
db_has_table(my_db$con, 'DPLYR_TEST')

#db_data_type(con, fields)
db_data_type(my_db$con, iris)

#db_save_query(con, sql, name, temporary = TRUE, ...)

#db_begin(con, ...)
db_begin(my_db$con)

#db_commit(con, ...)
db_commit(my_db$con)

#db_rollback(con, ...)
db_rollback(my_db$con)

#db_insert_into(con, table, values, ...)
df <- data.frame(a = runif(5), b = LETTERS[1:5])
db_insert_into(my_db$con, 'DPLYR_TEST', df)

#db_create_index(con, table, columns, name = NULL, ...)
db_create_index(my_db$con, 'DPLYR_TEST', 'a')

#db_drop_table(con, table, force = FALSE, ...)
db_drop_table(my_db$con, 'DPLYR_TEST')

#db_analyze(con, table, ...)
# tbd....

#db_explain(con, sql, ...)
# works, but does not return explain plan (it is stored in plan_table)
db_explain(my_db$con, sql('select * from DPLYR_TEST'))

#db_query_fields(con, sql, ...)
db_query_fields(my_db$con, sql("DPLYR_TEST"))

#db_query_rows(con, sql, ...)
db_query_rows(my_db$con, sql('select * from DPLYR_TEST'))

On 26 March 2015 at 15:20, Einar Hjorleifsson <einar.hjo...@gmail.com> wrote:
hi tomas

thanx for this - moved one error message out of the way. but i hit a new one (Error in (function (classes, fdef, mtable)  :   unable to find an inherited method for function ‘dbListFields’ for signature ‘"OraResult", "missing"’) - not asking you to dig into this one, i think it has to do with me connecting to a central Oracle server, need to dig into that at this end.

with respect to R programming skills i guess we are on the same boat :-)

with respect to contribution to package development: fine, if there is anything that i can provide additionally. i will inform you if i make any progress at my end. just don't put your hopes up high.

einar

Einar Hjorleifsson

unread,
Mar 26, 2015, 2:09:25 PM3/26/15
to Tomáš Greif, manip...@googlegroups.com
thanks for a nice work.

even managed the <schema>.<table> syntax that we have on our institutional oracle database.

to proof the last point, check out http://rpubs.com/fishvise/68961  [just shows the station location taken so far in the year 2015 by the marine research institute, reykjavík, iceland].

regards,
einar

Reply all
Reply to author
Forward
0 new messages