Fwd: rtexttools


Timothy Jurka

Aug 4, 2011, 2:51:36 PM
to rtextto...@googlegroups.com
We had a successful install/operation at the University of Maryland Institute for Advanced Computer Studies. Looks like we're ready to send out the CAP announcement...

Places RTextTools has been publicized:

Best,
Tim

Begin forwarded message:

From: P Resnik <psresnik+...@gmail.com>
Date: August 4, 2011 11:33:51 AM PDT
To: Timothy Jurka <tpj...@ucdavis.edu>
Subject: Re: rtexttools

Thanks for the explanation, which is reassuring!  Yup, on my mac that directory is in just the same place.  Happy to provide feedback, at the cost of possibly bugging you with more questions in the future. ;)

Best regards,

  Philip

On Thu, Aug 4, 2011 at 2:31 PM, Timothy Jurka <tpj...@ucdavis.edu> wrote:
Hi Philip,

To make the demo run quickly, it is trained on a very small dataset. In this case, GLMNET classifies a lot of documents as #19 (leaving the other classes with no predictions), and SVM doesn't classify any documents as #4. With no predictions for a class, precision works out to 0/0, so those fields are filled with either NaN (precision and F-score) or 0.00 (recall). I agree that we should clarify this in the documentation.
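
To make the NaN entries concrete, here is a minimal sketch of per-class precision, recall, and F-score. It is written in Python for illustration and is not RTextTools code; the function name `per_class_scores` and the toy labels are hypothetical, and returning NaN for a 0/0 ratio is an assumption chosen to mirror the output shown in this thread.

```python
import math

def per_class_scores(actual, predicted, label):
    """Return (precision, recall, F-score) for one label.

    A ratio with a zero denominator is reported as NaN (assumed convention).
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == label and p == label)  # true positives
    fp = sum(1 for a, p in pairs if a != label and p == label)  # false positives
    fn = sum(1 for a, p in pairs if a == label and p != label)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else math.nan
    recall = tp / (tp + fn) if (tp + fn) else math.nan

    if math.isnan(precision) or math.isnan(recall) or precision + recall == 0:
        fscore = math.nan
    else:
        fscore = 2 * precision * recall / (precision + recall)
    return precision, recall, fscore

# Toy data: the classifier labels every document as 19 and never predicts 4.
actual = [4, 4, 19, 19, 19]
predicted = [19, 19, 19, 19, 19]

print(per_class_scores(actual, predicted, 4))   # label 4 is never predicted
print(per_class_scores(actual, predicted, 19))
```

For label 4 there are no true or false positives, so precision is 0/0 (NaN), recall is 0, and the F-score is NaN, which is the same pattern as row 4 of the demo output.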

On my Mac the data files live at /Library/Frameworks/R.framework/Versions/2.13/Resources/library/RTextTools/, in the data/ directory. This brings up another point: I should probably include the raw .tar.gz source file on the installation page.

Thank you for your feedback!
Tim

--
Timothy P. Jurka
Graduate Student
Department of Political Science
University of California, Davis
www.timjurka.com

On Aug 4, 2011, at 11:25 AM, P Resnik wrote:

Ah, I see!  I read the example before executing it, and assumed that 'data' must be a missing subdirectory of the current directory.  Your code was smarter than I was. :)

This does raise the question of where the installed package and particularly the data directory actually live on my Mac, if you happen to know (please forgive my ignorance!)?

The simple_demo.R did run to completion, though I'm not sure if the NaNs below indicate something's not right.  Perhaps you might consider also including example output files in the package, so that it's possible to compare one's own output and feel confident that things are running correctly?

Thanks again for your time,

  Philip


> head(results@algorithm_summary)
  SVM_PRECISION SVM_RECALL SVM_FSCORE GLMNET_PRECISION GLMNET_RECALL
1          0.50       0.45       0.47              NaN             0
2          0.43       0.19       0.26              NaN             0
3          0.50       0.56       0.53              NaN             0
4           NaN       0.00        NaN              NaN             0
5          0.69       0.64       0.66              NaN             0
6          0.71       0.77       0.74              NaN             0
  GLMNET_FSCORE MAXENTROPY_PRECISION MAXENTROPY_RECALL MAXENTROPY_FSCORE
1           NaN                 0.69              0.82              0.75
2           NaN                 0.71              0.31              0.43
3           NaN                 0.65              0.72              0.68
4           NaN                  NaN              0.00               NaN
5           NaN                 0.64              0.50              0.56
6           NaN                 0.65              0.85              0.74
  SVM_ACCURACY GLMNET_ACCURACY MAXENTROPY_ACCURACY
1     45.45455               0            81.81818
2     18.75000               0            31.25000
3     55.55556               0            72.22222
4      0.00000               0             0.00000
5     64.28571               0            50.00000
6     76.92308               0            84.61538


On Thu, Aug 4, 2011 at 2:17 PM, Timothy Jurka <tpj...@ucdavis.edu> wrote:
I see... the comments said to change the file path. That's an artifact of the older version of R, but now the datasets are bundled. I'll correct the instructions ASAP.

The example should run as-is on your computer. If it doesn't, please let me know!


Best,
Tim

--
Timothy P. Jurka
Department of Political Science
University of California, Davis
www.timjurka.com

On Aug 4, 2011, at 11:08 AM, P Resnik wrote:

Hi!  Trying out RTextTools... But I can't find pointers to example data in the quick start guide, documentation, or example scripts.   For example, simple_demo.R refers to data/NYTimes.csv.gz but ...  Ah, ok, I did some Google searching, which led to http://dirk.eddelbuettel.com/cranberries/, which mentions that file, which led me to the .tgz file at http://cran.r-project.org/web/packages/maxent/index.html.  So I've got the file and the demo ran to completion. 

So, pulling back from that little stream-of-consciousness meandering... :)   I guess my general suggestion is to provide pointers to any of the .csv files you refer to in your distribution (or better yet, include them with the distribution).  Also, thanks for putting these tools together; I look forward to playing with them!

All best,

  Philip


Philip Resnik, Professor
Dept of Linguistics and Institute for Advanced Computer Studies
University of Maryland
http://umiacs.umd.edu/~resnik/
res...@umd.edu







Loren Collingwood

Aug 4, 2011, 2:58:34 PM
to rtextto...@googlegroups.com
Excellent. It has also been publicized on facebook\lcollingwood! I'll publicize it on our grad student listserv too.
-Loren
Loren Collingwood
Ph.D. Candidate
Department of Political Science
University of Washington
http://staff.washington.edu/lorenc2
lor...@uw.edu



Amber Boydstun

Aug 4, 2011, 3:09:50 PM
to rtextto...@googlegroups.com
Great!

~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Amber E. Boydstun
Assistant Professor

Department of Political Science
University of California, Davis
One Shields Ave
Davis, CA 95616

http://psfaculty.ucdavis.edu/boydstun/
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Loren Collingwood

Aug 4, 2011, 3:22:56 PM
to rtextto...@googlegroups.com
I'm going to also post to polmeth -- but let's wait a few days to see if we get other feedback...

On Aug 4, 2011, at 11:51 AM, Timothy Jurka wrote:


Loren Collingwood
Ph.D. Candidate
Department of Political Science
University of Washington
http://staff.washington.edu/lorenc2
lor...@uw.edu


