I'm enclosing the relevant links for your amusement - there's a video
that shows the doc in action.
http://www.broadinstitute.org/cancer/software/genepattern/grrd/AddIn.html
- Has all the relevant links available.
In particular - even if you don't have a Science subscription you can
get to the paper from there.
The video for a quick peek is at
http://www.broadinstitute.org/cancer/software/genepattern/grrd/WordAddInDemo.mov
Best,
J
--
Jill P. Mesirov, Ph.D.
Associate Director and Chief Informatics Officer
Director, Computational Biology and Bioinformatics
Broad Institute of MIT and Harvard
7 Cambridge Center
Cambridge MA 02142
phone: 617-714-7070
fax : 617-714-8991
email: mes...@broad.mit.edu
I'd personally love to see a companion plugin for Adobe Acrobat and/or Reader to enable...the Word plugin would embed the necessary information into the produced PDF which could be picked up in Acrobat/Reader and enable the same views, reruns, etc.
Leonard
The ideas of Utopia are excellent, but their implementation isn’t (IMO) the right approach. The PDF itself doesn’t contain any of that rich information, so that it can be used/mined/extracted – instead, it appears to be sitting in one (or more) databases or data repositories online that Utopia is able to “magically” locate and then enable.
I’d prefer to see the same user experience (which is quite well done!) applied to a PDF with that type of rich semantics embedded…
Leonard
There are definitely two different, but potentially related, items here…
1 – richer material included natively into the PDF at the time of publication
This is where the actual data (XML, XLS, etc.) would be “attached” to the visual table/graph/chart in the PDF, or the MathML or ChemML (or whatever) associated with a given equation or molecular structure, etc. This would enable the type of extended UI that Utopia presents on various elements in the PDF – which are indeed the types of things you want to be able to do, whether you are connected to the internet (or some subset thereof) or not.
2 – annotations, added after publication
At Adobe, because we believe that a PDF should be “self-contained”, we’ve approached document collaboration via a “synchronization model”. Everyone can work on their own copies of the document, and their comments are submitted (when they want) up to a “repository”. At any time, each person can synch their comments, either manually or automatically, with all others in the repository. This gives you the “best of both worlds” as you get individual copies of documents, private and public comments, collaboration on comments (replies, etc.) AND they can also live in the PDF itself for offline viewing/processing.
That also takes us to another issue: archiving (esp. long-term archiving). This is another reason that the above solutions work well: when the document needs to be “archived off” (be it for personal use, organizational use or submission to something like NARA or LOC – or even submission to the FDA), you already have all the necessary pieces.
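To make the synchronization model described above a bit more concrete, here is a minimal sketch in Python. It is not Adobe's implementation: the class names (Comment, Repository, LocalCopy), the ID scheme and the merge rule are all assumptions made for illustration. Each reviewer keeps comments in their own copy, pushes only the public ones, and pulls everyone else's back so they are available offline.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Comment:
    comment_id: str        # assumed to be globally unique (hypothetical scheme)
    author: str
    text: str
    private: bool = False  # private comments never leave the local copy


@dataclass
class Repository:
    """Shared store that collects published comments from all reviewers."""
    comments: dict = field(default_factory=dict)


@dataclass
class LocalCopy:
    """One reviewer's own copy of the document, with comments kept inside it."""
    owner: str
    comments: dict = field(default_factory=dict)

    def add(self, comment):
        self.comments[comment.comment_id] = comment

    def sync(self, repo):
        # Push: publish this copy's public comments to the repository.
        for cid, c in self.comments.items():
            if not c.private:
                repo.comments.setdefault(cid, c)
        # Pull: merge everyone else's comments into the local copy,
        # so they remain available for offline viewing.
        for cid, c in repo.comments.items():
            self.comments.setdefault(cid, c)


repo = Repository()
alice, bob = LocalCopy("alice"), LocalCopy("bob")
alice.add(Comment("a1", "alice", "Figure 2 needs error bars."))
alice.add(Comment("a2", "alice", "Note to self: recheck units.", private=True))
bob.add(Comment("b1", "bob", "Agree with the conclusion."))

alice.sync(repo)  # publishes a1, keeps a2 local
bob.sync(repo)    # publishes b1, receives a1
alice.sync(repo)  # receives b1
```

After these three calls both copies hold the public comments a1 and b1, while the private note a2 exists only in alice's copy. A real system would also need to handle edits, replies and deletions, which this sketch ignores.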
Leonard
This "in the PDF" approach is all good if you are talking only about contributions / annotations of a single person, or about something that is both completely authoritative and public. But there is a strong use case for multiple, shareable perspectives - for example within a lab or collaboration. This is why Steve with Utopia and our MGH + NIF with Annotation Framework use a standoff metadata model, and why we (speaking for myself but I believe Steve is likely to be in agreement) advocate standardizing and opening the model of metadata, which can be done using a fairly simple ontology model.
I believe that if we were to achieve agreement on a model of annotation metadata that could exist in the same form within the PDF, or outside the PDF, or both, that would be ideal.
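As a rough illustration of "the same form within the PDF, or outside the PDF, or both", here is a minimal Python sketch. It is not the Annotation Framework or Utopia model: the field names, the example DOI and the example.org URI are hypothetical, and the two dictionaries merely stand in for a slot inside a PDF and an external repository. The point is only that one canonical serialization can be stored in either place and read back identically.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Annotation:
    target_doc: str   # identifier of the annotated article (hypothetical DOI)
    page: int         # page the annotated text is on
    exact_text: str   # the quoted text the annotation points at
    entity_uri: str   # what the text refers to (hypothetical URI)
    creator: str


def serialize(annotation):
    """One canonical form, whether stored in the PDF or beside it."""
    return json.dumps(asdict(annotation), sort_keys=True)


def deserialize(payload):
    return Annotation(**json.loads(payload))


note = Annotation(
    target_doc="doi:10.1234/example",
    page=3,
    exact_text="BRCA1",
    entity_uri="http://example.org/protein/BRCA1",
    creator="reviewer-1",
)

payload = serialize(note)
embedded_store = {"annotations": [payload]}    # stand-in for a slot inside the PDF
standoff_store = {note.target_doc: [payload]}  # stand-in for an external repository

# The same record comes back regardless of where it was kept.
from_pdf = deserialize(embedded_store["annotations"][0])
from_repo = deserialize(standoff_store[note.target_doc][0])
```

Because both stores hold the identical payload, a reader could merge embedded and standoff annotations without any format conversion; agreeing on the actual vocabulary is the part that would need standardizing.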
I have no problem with there being external references and other information in a PDF – but as you note, there is the related issue of standardizing where/how such information is referenced. What format? Where in the PDF? Etc. If we could all agree on what goes in and how, then we have interoperability and that’s the most important aspect.
Today there is no question that PDF is just the ‘view’ of the MVC model. However, our goal going forward is to add the ‘model’ to that – so that not only do you have a specific view, but you also have all the necessary pieces to go back to “edit mode” with the model and perhaps even recreate a view. (This may be the entire PDF or just some subsection of it.) PDF already has all the necessary components (and many nice-to-have optional ones) for doing this – but it’s all about standardizing how it gets done and then building the tooling…
Leonard
In particular, I'm thinking about the provenance of the work. Provenance
by its very nature goes beyond the pdf itself and can be much, much
bigger than the pdf. For example, we've done some work where we maintain
a reproducible representation of the results of an astronomy workflow by
maintaining a virtual machine image along with the workflow itself.
Obviously, this is an extreme case, but I doubt people are going to want
to embed a 3GB virtual machine image in their pdfs.
Essentially, we need both embedding in the pdf and linking to the
outside and we need some nice guidance for how to do this.
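
One possible shape for such guidance, as a minimal Python sketch: embed small provenance artifacts directly, and link to large ones while pinning the link with a checksum so a reader can verify what was fetched. The size cutoff, the field names and the example.org URLs are assumptions for illustration, not an existing convention.

```python
import base64
import hashlib

EMBED_LIMIT_BYTES = 1_000_000  # hypothetical cutoff between "embed" and "link"


def provenance_entry(name, data, external_url):
    """Embed small artifacts; link large ones, pinned by a checksum."""
    digest = hashlib.sha256(data).hexdigest()
    entry = {"name": name, "sha256": digest, "size": len(data)}
    if len(data) <= EMBED_LIMIT_BYTES:
        # Small enough to travel inside the document.
        entry["embedded"] = base64.b64encode(data).decode("ascii")
    else:
        # Too large to embed: keep only a pointer plus the checksum above.
        entry["link"] = external_url
    return entry


workflow = b"<workflow><step name='calibrate'/></workflow>"
vm_image = bytes(2_000_000)  # small stand-in for a multi-gigabyte VM image

small = provenance_entry("workflow.xml", workflow, "http://example.org/workflow.xml")
large = provenance_entry("vm.img", vm_image, "http://example.org/vm.img")
```

Here the workflow description ends up embedded, while the stand-in VM image is recorded only as a link with its size and SHA-256 digest.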
cheers,
Paul
>> annotation metadata that could exist in the same form _within the
>> PDF_, or _outside the PDF_, or _both_, that would be ideal.
>
> My view is that metadata for the Article of Record goes in the PDF, size
> permitting, but also that links are kept to data outside the PDF, which
> can be resolved at 'read time' to make sure that the PDF is kept as
> both an Article of Record (JV's 'minutes of science') and as a 'Living
> Document' with links to up-to-date data, comments etc. (and referring to
> my previous whitterings on the subject, as much as I like PDFs and I
> think they make an excellent 'View', I don't believe they make good
> vehicles for storing an article's 'Model')
>
> Best wishes
>
> Steve
>
>
>> Also - ideally when I open a PDF that contains annotation referencing
>> some entity that is commonly studied or used outside the document
>> itself - e.g. a protein, a database, a reagent, a computational tool,
>> a workflow - my Web browser should just natively be able to connect to
>> all other sources of information about that entity wherever they are
>> on the Web, and use these connections to enhance the information I see
>> without jumping all over the place. Annotation itself is or at least
>> should be, an independently sharable boundary object.
>>
>> Best
>>
>> Tim
>>
>>
>> On Nov 14, 2010, at 9:51 AM, Leonard Rosenthol wrote:
>>
>>> The ideas of Utopia are excellent, but their implementation isn’t
>>> (IMO) the right approach. The PDF itself doesn’t contain any of that
>>> rich information, so that it can be used/mined/extracted – instead,
>>> it appears to be sitting in one (or more) databases or data
>>> repositories online that Utopia is able to “magically” locate and
>>> then enable.
>>> I’d prefer to see the same user experience (which is quite well
>>> done!) applied to a PDF with that type of rich semantics embedded…
>>> Leonard
>>> From: beyond-...@googlegroups.com [mailto:beyond-...@googlegroups.com]
>>> On Behalf Of Jodi Schneider
>>> Sent: Sunday, November 14, 2010 5:19 AM
>>> To: beyond-...@googlegroups.com
>>> Subject: Re: capturing workflows and embedding in word documents