Monday, February 22, 2010

Document Preview in DSpace, using Google Docs Viewer

    

I've made a modification that lets a DSpace repository embed a preview of a document in the item page, so the user never has to leave the page to get to the good stuff.

Adding a Preview link to the item's bitstream.

Then showing the document with the Google Docs Viewer. It is an easy way to make your content more accessible: visitors don't need MS Word or Adobe Reader installed to view Word or PDF files, and the document loads right on the page.

How it works
The Google Docs Viewer works much like a user downloading the document to their computer and having Adobe Reader display it. The difference is that Google downloads the document and renders it, so the user can see it immediately on the item page.
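
As a rough illustration of the idea, here is a minimal sketch of how a page could build the address that the viewer loads. The `viewer` endpoint and the `url` and `embedded` parameters are the ones that appear in the links in this post and its comments; the function name and the example file URL are made up for illustration, and the service may have changed since this was written.

```javascript
// Base endpoint of the hosted viewer, as linked elsewhere in this post.
const VIEWER_BASE = "https://docs.google.com/viewer";

// Build the URL an iframe would load to preview a publicly reachable file.
// `fileUrl` is a hypothetical absolute URL of a bitstream on your server.
function buildViewerUrl(fileUrl) {
  // The file URL travels as a query parameter, so it must be percent-encoded.
  return VIEWER_BASE + "?url=" + encodeURIComponent(fileUrl) + "&embedded=true";
}

// Example usage with a placeholder host.
console.log(buildViewerUrl("http://dspace.example.com/bitstream/1234/5678/1/thesis.pdf"));
```

Encoding the file URL matters because bitstream links often carry their own query string (for example `?sequence=1`), which would otherwise be read as parameters of the viewer itself.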

It involves modifying ItemTag.java to add the link and a preview box, adding four JavaScript functions, and using some jQuery. Going forward, I think this builds on Stuart Lewis' idea of a "preview anything" feature. This is essentially a response (for documents) to:
Perhaps we should make this is into a pluggable system for DSpace 1.6 where you can register classes that can render file types, and then make a configurable option to register viewers to filetypes?

UPDATE: The code is available in a writeup at the DSpace Wiki: Document Preview with Google Docs Viewer

47 comments:

  1. Hi,

    It sounds great! Is there a way to get the code?

  2. The code is now available on the DSpace Wiki: http://wiki.dspace.org/index.php/Document_Preview_with_Google_Docs_viewer

  3. Hello Peter,

    perfect idea! Congratulations!

    I will try it asap!

    Greetings from Greece,

    Costas--

  4. Hi, Peter
    That's nice. But where do I make the changes?
    Could you write a step-by-step guide?
    Please help me.

    Cheers, Zaya

  5. Hi, this is a great post, but can you make a step-by-step tutorial? I tried to follow it but had no success.

  6. There is a full writeup on the DSpace Wiki.
    http://wiki.dspace.org/index.php/Document_Preview_with_Google_Docs_viewer

  7. I saw it and tried to follow it, but when I run mvn package I get this:


    [INFO] WEB-INF/web.xml already added, skipping
    [INFO] ------------------------------------------------------------------------
    [INFO] Building DSpace JSP-UI
    [INFO] task-segment: [package]
    [INFO] ------------------------------------------------------------------------
    [INFO] [enforcer:enforce {execution: enforce-maven}]
    [INFO] [site:attach-descriptor {execution: default-attach-descriptor}]
    [INFO] ------------------------------------------------------------------------
    [INFO] Building DSpace JSP-UI :: API and Implementation
    [INFO] task-segment: [package]
    [INFO] ------------------------------------------------------------------------
    [INFO] [enforcer:enforce {execution: enforce-maven}]
    [INFO] [resources:resources {execution: default-resources}]
    [INFO] Using 'UTF-8' encoding to copy filtered resources.
    [INFO] skip non existing resourceDirectory /usr/local/dspace-1.7.1-src-release/dspace-jspui/dspace-jspui-api/src/main/resources
    [INFO] [compiler:compile {execution: default-compile}]
    [INFO] Compiling 1 source file to /usr/local/dspace-1.7.1-src-release/dspace-jspui/dspace-jspui-api/target/classes
    [INFO] ------------------------------------------------------------------------
    [ERROR] BUILD FAILURE
    [INFO] ------------------------------------------------------------------------
    [INFO] Compilation failure

    /usr/local/dspace-1.7.1-src-release/dspace-jspui/dspace-jspui-api/src/main/java/org/dspace/app/webui/jsptag/ItemTag.java:[719,8] 'try' without 'catch' or 'finally'

    /usr/local/dspace-1.7.1-src-release/dspace-jspui/dspace-jspui-api/src/main/java/org/dspace/app/webui/jsptag/ItemTag.java:[972,13] class, interface, or enum expected

    /usr/local/dspace-1.7.1-src-release/dspace-jspui/dspace-jspui-api/src/main/java/org/dspace/app/webui/jsptag/ItemTag.java:[973,9] class, interface, or enum expected

    /usr/local/dspace-1.7.1-src-release/dspace-jspui/dspace-jspui-api/src/main/java/org/dspace/app/webui/jsptag/ItemTag.java:[978,8] class, interface, or enum expected

    /usr/local/dspace-1.7.1-src-release/dspace-jspui/dspace-jspui-api/src/main/java/org/dspace/app/webui/jsptag/ItemTag.java:[981,4] class, interface, or enum expected

    /usr/local/dspace-1.7.1-src-release/dspace-jspui/dspace-jspui-api/src/main/java/org/dspace/app/webui/jsptag/ItemTag.java:[987,4] class, interface, or enum expected

    /usr/local/dspace-1.7.1-src-release/dspace-jspui/dspace-jspui-api/src/main/java/org/dspace/app/webui/jsptag/ItemTag.java:[995,8] class, interface, or enum expected

    /usr/local/dspace-1.7.1-src-release/dspace-jspui/dspace-jspui-api/src/main/java/org/dspace/app/webui/jsptag/ItemTag.java:[998,8] class, interface, or enum expected


    and so on

  8. This comment has been removed by the author.

  9. I'm not very good with Java. I'm doing this as a project for my faculty. Can you please help me?

  10. It's best to contact the dspace-tech mailing list for support. But I went ahead and added this change to my GitHub clone of DSpace, so you can see the full diff yourself at: https://github.com/peterdietz/DSpace/commit/93b89f43a0cde69741880b971f195b2018298d59

  11. Thank you for your help, but I must ask: with what editor do I have to open those files to get that layout (two columns of numbers (addresses) on the left side)?

  12. Hi Igor,
    That code is tracked in Git (a version control system an order of magnitude better than Subversion).

    Steps to Generate a patch
    git clone git://github.com/peterdietz/DSpace.git
    cd DSpace
    git diff 285b3552d9..93b89f43a0
    ## or ##
    git diff 285b3552d9..93b89f43a0 > ~/google-preview.patch


    I suppose the simplest way to get the patch is to append ".diff" to the URL of the commit.
    https://github.com/peterdietz/DSpace/commit/93b89f43a0cde69741880b971f195b2018298d59.diff

    You could also grab the full text of each file, with the document preview modification already applied.

    ItemTag.java
    https://github.com/peterdietz/DSpace/blob/93b89f43a0cde69741880b971f195b2018298d59/dspace-jspui/dspace-jspui-api/src/main/java/org/dspace/app/webui/jsptag/ItemTag.java

    header-default.jsp
    https://github.com/peterdietz/DSpace/blob/93b89f43a0cde69741880b971f195b2018298d59/dspace-jspui/dspace-jspui-webapp/src/main/webapp/layout/header-default.jsp

    utils.js
    https://github.com/peterdietz/DSpace/blob/93b89f43a0cde69741880b971f195b2018298d59/dspace-jspui/dspace-jspui-webapp/src/main/webapp/utils.js

    Or, just download a zip file of the entire project -- https://github.com/peterdietz/DSpace/zipball/master (~65MB)

  13. Can you please tell me: after making changes to ItemTag.java, header-default.jsp and utils.js, do I need to run mvn package again for the changes to take effect? I have already run mvn package and copied the jspui folder under Tomcat. Thanks in advance.

  14. Hi Peter, I need your help. After making changes to ItemTag.java, header-default.jsp and utils.js, do I need to run mvn package again for the changes to take effect? I already ran mvn package when I last installed, about 2 months ago, and copied the jspui folder under Tomcat. Thanks in advance.

  15. @ Laljaseh:

    You have to run mvn package and then ant update.

    Peter, thank you very, very much. I got it working. Unfortunately, I didn't set up the handle server, so it's not working yet, but no errors have occurred. Thanks one more time. By the way, are you maybe from Serbia, or Serbian? I see that you have a text about Pećka patrijaršija.

    Replies
    1. This comment has been removed by the author.

    2. What if I don't want to subscribe to CNRI's handle service? Will I still be able to make the Google Docs Viewer work?

    3. Alyssa, you don't have to use the CNRI handle service. It only helps to resolve handles, which are links on the page, e.g. handle.net/handle/1234/2345. For actual file downloads, it is likely to be pulling from something like dspace.example.com/handle/123456789/2233/1/thesis.pdf?sequence=1, or perhaps from dspace.example.com/bitstream/id/7890/thesis.pdf. (It's been a while since I've used this doc viewer.)

      I've found an old branch on GitHub that I had working with this document viewer:
      See: https://github.com/peterdietz/DSpace/commit/b00081b22239a273d88c1a9cf4caa334c0f78f9a

      git clone https://github.com/peterdietz/DSpace.git peterdietzDSpace
      cd peterdietzDSpace
      git checkout doc-viewer

    4. The file formats supported by the Google Docs Viewer are listed at the link below and, unfortunately, bitstreams are not one of them. Please enlighten me on this. Thanks.

      https://support.google.com/drive/answer/2423485?hl=en&p=docs_viewer&rd=1

    5. A bitstream is a generic term that DSpace uses to refer to uploaded files. The actual file that you uploaded to your item, such as PDF or Word Doc, should be supported by the Google Docs Viewer.

      Also, I tried to visit your DSpace instance by that IP address, but it is not accessible from my location.

      You can test whether it is publicly accessible with something like: http://webcab.de/fetchpage.htm
      For your site, I got:
      500 Can't connect to xxx.xxx.xxx.xxx:8081 (connect: timeout)

      But for an item from the Longsight TryDSpace site ( https://trydspace.longsight.com/handle/123456789/83 ) I get a successful response.

      You could always temporarily create a tunnel/proxy to test the Google Docs Viewer with something like: http://localtunnel.me/
      But you would eventually need to make your server publicly accessible.

  16. @Igor
    If you're testing this on a computer that Google can't reach, look at modifying: https://github.com/peterdietz/DSpace/blob/master/dspace-jspui/dspace-jspui-webapp/src/main/webapp/utils.js#L289

    For testing, you could change "source" to be your production server, which Google is able to download from. You'd also have to change if(false) to if(true).

    I'm not Serbian, but our DSpace has a special collection of Serbian photographs. https://kb.osu.edu/dspace/handle/1811/37112
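
The testing tweak described in this comment could look roughly like the sketch below. It is not the code from utils.js: the function name, the `useProduction` flag and the example origin are all hypothetical, and the only idea taken from the comment is swapping the local host for a production host that Google can reach.

```javascript
// Hypothetical public origin that Google is able to download from.
const PRODUCTION_ORIGIN = "https://dspace.example.com";

// Return the URL to hand to the viewer for a bitstream on the current page.
// Pass useProduction = true only while testing from a machine Google cannot reach.
function resolveSource(localUrl, useProduction) {
  if (!useProduction) {
    return localUrl;
  }
  const parsed = new URL(localUrl);
  // Keep the path and query string; swap only the scheme, host and port.
  return PRODUCTION_ORIGIN + parsed.pathname + parsed.search;
}

// Example usage with a placeholder local URL.
console.log(resolveSource("http://localhost:8080/jspui/bitstream/1/thesis.pdf?sequence=1", true));
```

This only helps if the same item and bitstream exist at the same path on the production server.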

  17. Hi, I've just adapted this code to DSpace 1.7.0 and it works great!!! Thanks a lot to Peter for his great work.
    Regards

  18. Hi, I need your help. I've run mvn package and the build succeeded. Unfortunately, the build fails when I run ant fresh_install. Please help me solve this problem; I need it urgently.

  19. Hi Peter, I've installed successfully, but when I create an administrator I get the following error: "fail to startup dspace service manager"

  20. @thexrocks - Send your questions to the DSpace mailing list: https://lists.sourceforge.net/lists/listinfo/dspace-tech/

  21. @thexrocks - If you already have the installation completed, you don't have to run ant fresh_install; run ant update.

    Peter, I have one more question. I have made a translation to the Serbian language, and it's working nicely, but I have one problem: instead of "Srpski English", I get "Српски English". How can I change it from Cyrillic to Latin letters? I have tried different codes: sr (output is Cyrillic), sr_yu (nothing), rs, and so on.

  22. Thanks a lot, now I can access DSpace and have created an administrator. I couldn't create the administrator because of firewall protection. I've got one more problem: when a client opens PDF files in DSpace, the PDF will only open if he or she has Adobe Reader installed. I can't use the Google Docs Viewer, since this service will be offline all the time and is accessible by only a few people in the office, so I can't install the required software on their systems. The same goes for video: it should open in a media player installed on the client computer. So I want code for this; please help me. Thanks a lot for your help.

  24. Hi Peter, if we use the Google Docs Viewer, do we need to be online to view documents in DSpace? I get an error saying it is unable to retrieve the document, even though I am online.

  25. It's not enough to just be online; your DSpace instance has to be on a public-facing web server, since it relies on Google to convert the PDF to a web page.

    So, test that your bitstream (.doc, .xls) is viewable by the world / Google, by testing it at: http://docs.google.com/viewer

  26. @Igor, if your question is about how to customize your site to support a different locale, then look in the DSpace documentation, try some Google searches, or, lastly, ask on the dspace-tech mailing list.

  27. Hi Peter, please, how can I remove a PDF file from Google Docs that was loaded from my DSpace site? I have deleted it from my DSpace site, but the PDF file is still available from Google Docs.

  28. Hi Peter Dietz,

    Your contribution is great

  29. This is my error, can you help me ?
    http://ni2.upanh.com/b3.s13.d3/47f830da1a382ecba1a4748994daa448_41259502.preview.png

  30. If it's not loading your file, try testing your bitstream directly with the viewer: https://docs.google.com/viewer

    It's possible that the issue could be:
    - Spaces in file name
    - File Format (it might not want to do .doc)
    - URL, I think it wants to see something.doc or something.pdf in the filename, not resource/1234?parameter
    - Firewall. Your server must be publicly accessible to the internet, thus Google's server must be able to download the bitstream from your server.
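
The checklist above can be turned into a quick pre-check before handing a URL to the viewer. This is only a sketch under assumptions: the function name and warning texts are made up, the file format item is reduced to "has some extension", and the private-host test is a simple heuristic (localhost and the common private IPv4 ranges), not a real reachability test.

```javascript
// Return a list of warnings for a candidate bitstream URL.
// The checks mirror the list above; they are heuristics, not rules published by Google.
function diagnoseBitstreamUrl(fileUrl) {
  let parsed;
  try {
    parsed = new URL(fileUrl);
  } catch (e) {
    return ["not an absolute URL"];
  }

  const warnings = [];

  // Last path segment; the URL parser percent-encodes spaces as %20.
  const segments = parsed.pathname.split("/");
  const fileName = segments[segments.length - 1];

  if (/%20/.test(fileName)) {
    warnings.push("file name contains spaces");
  }
  if (!/\.[A-Za-z0-9]+$/.test(fileName)) {
    warnings.push("path does not end in a file name with an extension");
  }

  // Hosts that only resolve inside a local network cannot be fetched by Google.
  const host = parsed.hostname;
  const looksPrivate =
    host === "localhost" ||
    /^127\./.test(host) ||
    /^10\./.test(host) ||
    /^192\.168\./.test(host) ||
    /^172\.(1[6-9]|2\d|3[01])\./.test(host);
  if (looksPrivate) {
    warnings.push("host looks private, so Google cannot download from it");
  }

  return warnings;
}

// Example usage with a placeholder URL.
console.log(diagnoseBitstreamUrl("http://localhost:8080/resource/1234?parameter"));
```

An empty list does not prove the file is reachable; a firewall can still block a public-looking host, so testing the URL directly in the viewer remains the real check.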

  31. Hello Peter,

    Currently trying to implement your code to our dspace repository at the University of Arts in Tirana, Albania.

    My question is can you give me detailed instructions where exactly to add the code found on this page
    https://wiki.duraspace.org/display/DSPACE/Document+Preview+with+Google+Docs+viewer

    as I've tried several times with no luck. I created the googleviewer.js file and tried patching the theme.xls file, which gives me an error. I tried adding it to item-view.xsl; still an error.
    By the way, I easily implemented the audio-visual pseudo streaming players with no problem. It just seems as though I am not finding the right way to utilize your code. I would appreciate very much if you could write me a dummies guide on how to do this.

    Thanks in advance

  32. This comment has been removed by the author.

  33. Finally, after some experimenting, I managed to embed the Google Docs Viewer by simply adding an iframe to item-view.xsl. I did this to stream PDF documents saved in our repository through the Google document viewer. It is a pretty practical solution; the only problems are the file size limit and other restrictions that Google applies. Below are the modifications I used:

    In webapp/themes/theme/lib/xsl/aspect/artifactbrowser/item-view.xsl find the line:

    <xsl:apply-templates select="./mets:dmdSec/mets:mdWrap[@OTHERMDTYPE='DIM']/mets:xmlData/dim:dim" mode="itemSummaryView-DIM"/>

    Immediately after the above line add the following code:

    <!-- Preview pdf with google document viewer -->

    <xsl:if test="./mets:fileSec/mets:fileGrp[@USE='CONTENT']/mets:file[@MIMETYPE='application/pdf']">
    <hr/>
    <left>
    <h2>Item Preview</h2>
    </left>
    <xsl:for-each select="./mets:fileSec/mets:fileGrp[@USE='CONTENT']/mets:file[@MIMETYPE='application/pdf']">

    <div style="text-align:left;">

    <iframe height="400" width="680" src="http://docs.google.com/gview?url=yourProductionServerURL/{mets:FLocat/@xlink:href}/something.pdf&amp;embedded=true">

    <!-- be careful to enter your server URL above, so that the Google Docs Viewer can reach your server to download and convert the PDF files -->

    <xsl:choose>
    <xsl:when test="./mets:fileSec/mets:fileGrp[@USE='CONTENT' or @USE='ORIGINAL']">
    <xsl:apply-templates select="./mets:fileSec/mets:fileGrp[@USE='CONTENT' or @USE='ORIGINAL']">
    <xsl:with-param name="context" select="."/>
    <xsl:with-param name="primaryBitstream" select="./mets:structMap[@TYPE='LOGICAL']/mets:div[@TYPE='DSpace Item']/mets:fptr/@FILEID"/>
    </xsl:apply-templates>
    </xsl:when>

    <xsl:when test="./mets:fileSec/mets:fileGrp[@USE='ORE']">
    <xsl:apply-templates select="./mets:fileSec/mets:fileGrp[@USE='ORE']"/>
    </xsl:when>
    <xsl:otherwise>
    <h2><i18n:text>xmlui.dri2xhtml.METS-1.0.item-files-head</i18n:text></h2>
    <table class="ds-table file-list">
    <tr class="ds-table-header-row">
    <th><i18n:text>xmlui.dri2xhtml.METS-1.0.item-files-file</i18n:text></th>
    <th><i18n:text>xmlui.dri2xhtml.METS-1.0.item-files-size</i18n:text></th>
    <th><i18n:text>xmlui.dri2xhtml.METS-1.0.item-files-format</i18n:text></th>
    <th><i18n:text>xmlui.dri2xhtml.METS-1.0.item-files-view</i18n:text></th>
    </tr>
    <tr>
    <td colspan="4">
    <p><i18n:text>xmlui.dri2xhtml.METS-1.0.item-no-files</i18n:text></p>
    </td>
    </tr>
    </table>
    </xsl:otherwise>
    </xsl:choose>


    </iframe>

    </div>
    </xsl:for-each>
    <hr/>
    <br />
    </xsl:if>

    That is basically it.
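
For comparison, the same iframe can be generated from JavaScript. The sketch below is an assumption-laden illustration, not part of the XSL above: the function names are invented, it uses the `viewer` endpoint mentioned elsewhere in this thread rather than `gview`, and it percent-encodes the file URL, which the XSL version does not do.

```javascript
// Escape the characters that are unsafe inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build iframe markup that previews a publicly reachable file.
// The default size matches the 680x400 frame used in the XSL above.
function buildPreviewIframe(fileUrl, width = 680, height = 400) {
  const src =
    "https://docs.google.com/viewer?url=" +
    encodeURIComponent(fileUrl) +
    "&embedded=true";
  return (
    '<iframe width="' + width + '" height="' + height +
    '" src="' + escapeAttr(src) + '"></iframe>'
  );
}

// Example usage with a placeholder file URL.
console.log(buildPreviewIframe("http://dspace.example.com/bitstream/1/thesis.pdf"));
```

The `&amp;` in the generated markup plays the same role as the `&amp;` in the XSL attribute: the browser decodes it back to a plain `&` before requesting the page.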

    Replies
    1. Hi Erjon, I'm confused. The URL you are using is different from Peter's URL. You are using a link that has the handle in the URL, while Peter used the bitstream ID. Which is appropriate? I'm really stuck on this. An iframe appears on my page but doesn't load anything. Please help me with this.

  34. Although the sample above is a practical approach, it is still not a solution. I will try experimenting with FlexPaper.

  35. This comment has been removed by the author.

  36. It is even simpler if we insert the iframe above without the Google viewer code, like so:

    <iframe height="400" width="680" src="{mets:FLocat/@xlink:href}">

    It should appear like the following link

    UART Dspace Sample

  37. Hi,

    Can I pass a file as a stream to Google Docs for document preview?
    Any help would be greatly appreciated,thank you very much
    Thanks,
    Sathish

  38. This comment has been removed by the author.

  39. Hi, an iframe appears on my page but does not load anything. How can I fix it?

  40. You can also take a look at the Duraspace wiki page on this. https://wiki.duraspace.org/display/DSPACE/Document+Preview+with+Google+Docs+viewer

    Since this write-up started about 5 years ago, the DSpace code may have changed, so the steps might be different or no longer compatible.

