Problem with generated html and Rmarkdown


Frank Harrell

Mar 30, 2015, 11:32:57 AM
to kn...@googlegroups.com
I have some new features in the R Hmisc package that use the htlatex command from the system tex4ht package to convert advanced LaTeX tables to html. The following script works when I'm not using R markdown:

require(Hmisc)  # uses a version of Hmisc not yet on CRAN

getHdata(pbc)
s <- summaryM(bili + albumin + stage + protime + sex + age + spiders ~ drug,
              data=pbc, test=TRUE)
w <- latex(s, npct='slash', file='s.tex')
z <- html(w)
browseURL(z$file)

d <- describe(pbc)
w <- latex(d, file='d.tex')
z <- html(w)
browseURL(z$file)

Here is the version of the script using Rmarkdown from within RStudio:

```{r, results='asis'}
require(Hmisc)
getHdata(pbc)
s <- summaryM(bili + albumin + stage + protime + sex + age + spiders ~ drug,
              data=pbc, test=TRUE)
w <- latex(s, npct='slash', file='s.tex')
html(w, file='')
```

```{r, results='asis'}
d <- describe(pbc)
w <- latex(d, file='d.tex')
z <- html(w)
```

Note that html(), which runs htlatex, produces a .css file that the generated html references. file='' means that the generated html (but not the .css file) is written to the console.

When I run the script I get http://biostat.mc.vanderbilt.edu/tmp/test.html which does not render correctly; it looks as if the generated .css file is being ignored.

I would appreciate any pointers.
Thanks
Frank


Yihui Xie

Mar 30, 2015, 4:15:45 PM
to Frank Harrell, knitr
It is a little tricky to get HTML rendering right in R Markdown in
your case. There are two things you need to do:

1. Let rmarkdown know your HTML output has dependencies (a CSS
stylesheet in this case). This can be done via
htmltools::htmlDependency and attachDependencies.

2. Make sure Pandoc does not touch the HTML code you generated, and
this can be done via htmltools::htmlPreserve(). In your case, some
HTML code was indented by four spaces, and that was treated as <pre>
blocks by Pandoc since four spaces is the syntax of Markdown to
generate <pre>.
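
A minimal sketch of those two steps, assuming the htmltools package is installed. The HTML fragment, the dependency name 'tex4ht-demo', the version, and the file name 's.css' are placeholders for illustration and are not taken from the test.Rmd attachment:

```r
library(htmltools)

# Hypothetical fragment standing in for htlatex output; the leading four
# spaces are what Pandoc would otherwise read as an indented code block.
fragment <- '    <div class="tabular"><table><tr><td>x</td></tr></table></div>'

# Step 1: declare the stylesheet as a dependency.
# The name, version, and 's.css' are made up for this sketch.
dep <- htmlDependency(
  name = "tex4ht-demo", version = "0.0.1",
  src = getwd(), stylesheet = "s.css"
)

# Step 2: mark the fragment as literal HTML and attach the dependency.
out <- attachDependencies(HTML(fragment), dep)

# Inspect what was attached.
htmlDependencies(out)
```

As far as I know, when `out` is auto-printed from a chunk (without results='asis'), the knit_print methods in htmltools apply htmlPreserve() and pass the dependency on to rmarkdown, so htmlPreserve() only needs to be called by hand when the HTML is written out some other way.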

Regards,
Yihui
--
Yihui Xie <xiey...@gmail.com>
Web: http://yihui.name

Frank Harrell

Mar 30, 2015, 6:15:47 PM
to kn...@googlegroups.com, harr...@gmail.com
This is very helpful, Yihui.  Thank you.  It looks very advanced.  Can you think of an application that has had to do something like this so that I can see the function calls in action?
Frank

Yihui Xie

Mar 30, 2015, 6:57:51 PM
to Frank Harrell, knitr
I just prepared a quick example as attached. Please let me know if you
have further questions. I understand the whole thing may sound
complicated the first time you see it.

Regards,
Yihui
--
Yihui Xie <xiey...@gmail.com>
Web: http://yihui.name


test.Rmd

Frank Harrell

Mar 30, 2015, 10:56:35 PM
to kn...@googlegroups.com, harr...@gmail.com
Thank you for taking the time Yihui!  I'll study that and try to translate it to my situation.  That's extremely helpful.
Sincerely,
Frank

Frank Harrell

May 25, 2015, 9:56:12 PM
to kn...@googlegroups.com, harr...@gmail.com
Yihui,

I've been trying to understand this.  The html generated by htlatex uses a pretty complicated css style with lots of divs.  I attached a sample .html and .css file produced by htlatex if you have time to glance at it.  I don't understand which divs I need to set up and I don't understand the first argument to htmlDependency or the version argument.  Thanks for any further pointers.  

Frank
s-enclosed.css
s-enclosed.html

Yihui Xie

May 26, 2015, 1:03:10 AM
to Frank Harrell, knitr
The first two arguments of htmlDependency() are not important. Just
pick a name you like and a version number that makes sense to
you, e.g. htmlDependency('tex4ht', '0.1.2'). The important
argument is the path to the css file.

It does not matter if the HTML generated by htlatex is complicated.
You can just wrap up all the HTML content inside htmltools::HTML().
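
To illustrate what the name and version are for: as far as I know they mostly act as labels that htmltools uses to de-duplicate dependencies. A small sketch, assuming htmltools is installed; the file name 's.css' and both version numbers are invented for the example:

```r
library(htmltools)

# Two declarations of the same (hypothetical) stylesheet under one name.
old <- htmlDependency("tex4ht", "0.1.1", src = getwd(), stylesheet = "s.css")
new <- htmlDependency("tex4ht", "0.1.2", src = getwd(), stylesheet = "s.css")

# resolveDependencies() keeps one entry per name, preferring the highest version.
kept <- resolveDependencies(list(old, new))
length(kept)
kept[[1]]$version
```

So any name and version will do, as long as the same stylesheet is always declared under the same name.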

I may be able to send you a pull request on Github later this week (in
the Hmisc repository?) if you still cannot figure it out.

Regards,
Yihui
--
Yihui Xie <xiey...@gmail.com>
Web: http://yihui.name


Frank Harrell

May 26, 2015, 10:01:35 AM
to kn...@googlegroups.com, harr...@gmail.com
Hi Yihui - I'm getting closer.  I'm able to see a complex table partially correctly rendered when I run R Markdown with knitr.  The main structure is OK but font changes and boldface are ignored.  Part of my confusion is what to put for class= in tags$div.  My self-contained test script is attached, if you have TeX4ht installed so that the htlatex command exists on your system.  I've also attached the .css and .html files that are generated by running the script just in case.  One other issue is that

div list(class = "whatever I call the class") list(" 

is rendered in the html report
Thanks -Frank
summarym-enclosed.css
summarym-enclosed.html
test3.Rmd

Frank Harrell

Aug 29, 2015, 11:10:55 AM
to knitr, harr...@gmail.com
Yihui I hope you'll have time to get to this - thanks very much
Frank

Yihui Xie

Aug 29, 2015, 4:13:21 PM
to Frank Harrell, knitr
I just took a look at your example, and your main issue was that you
have to separate the HTML <body> from the <head>. Here is a hackish
way to achieve that (using regular expressions, which is not
reliable):

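ehtml = function(content) {
  # keep only what is inside <body>...</body> and mark it as literal HTML
  content = htmltools::HTML(gsub('^.*?<body\\s*>|</body>.*$', '', content))
  # declare the stylesheet written by htlatex as a dependency
  d = htmltools::htmlDependency(
    'TeX4ht', '1.0.0', src = getwd(), stylesheet = 'summarym-enclosed.css')
  htmltools::attachDependencies(content, d)
}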

Note that summarym-enclosed.css was hard-coded in the function, and you
had better replace it with a filename extracted automatically from the
HTML. I'm not sure if there is a way for htlatex to produce the body
and css separately, so you don't have to hack at the generated HTML.
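
One possible way to pull the stylesheet name out of the generated HTML, sketched in base R. It assumes htlatex writes a `<link ... href="*.css">` element in the `<head>`; the sample page and the helper name css_from_html are hypothetical, and being regex-based it is as fragile as the function above:

```r
# Hypothetical htlatex-style page; only the <link> element matters here.
page <- paste(
  '<html><head><title>s</title>',
  '<link rel="stylesheet" type="text/css" href="summarym-enclosed.css">',
  '</head><body><p>table goes here</p></body></html>',
  sep = "\n"
)

# Return the href of the first <link> that points at a .css file,
# or NA if no such element is found.
css_from_html <- function(html) {
  m <- regmatches(html, regexpr('<link[^>]*href="[^"]+\\.css"', html, perl = TRUE))
  if (length(m) == 0) return(NA_character_)
  sub('.*href="([^"]+)"', "\\1", m)
}

css_from_html(page)
```

The result could then be passed as the stylesheet argument of htmlDependency() in place of the hard-coded name.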

Regards,
Yihui
--
Yihui Xie <xiey...@gmail.com>
Web: http://yihui.name


test3.rmd

Frank Harrell

Aug 29, 2015, 4:37:09 PM
to Yihui Xie, knitr
It worked!  Thanks so much Yihui.  You had the best reason for not responding sooner!

I'll need to study this more to understand your last comment about the file name.

I'm also going to test an option to use latexml to go latex -> xml -> html.  That approach would handle equations better.

Frank