Re: Google Scholar


DSpace Community

Jul 25, 2023, 4:35:03 PM
to DSpace Community
Hi Santiago,

I suspect you need to add both X-Forwarded-Proto and X-Forwarded-Host headers to your proxy.

Tim

On Tuesday, July 25, 2023 at 2:52:12 PM UTC-5 santilo...@gmail.com wrote:
Hello,

I am having a problem: for some reason, Google Scholar isn't indexing my website. The website generates almost all the meta tags correctly. For example, for this article:

    <meta name="citation_title" content="Revisión sistemática sobre psicoterapias efectivas y/o tratamientos combinados con pacientes con severidad y comorbilidad">
    <meta name="citation_author" content="Scherb, Elena Diana">
    <meta name="citation_publication_date" content="2022-05">
    <meta name="citation_issn" content="2602-8379">
    <meta name="citation_language" content="es">
    <meta name="citation_keywords" content="PSICOTERAPIA; PSICOPATOLOGIA">
    <meta name="citation_abstract_html_url" content="https://hdl.handle.net/20.500.14340/909">
    <meta name="citation_publisher" content="Universidad Estatal de Milagro">

The only wrong tag, and possibly the one causing indexing to fail, is the last one. How can I fix it? I could hardcode it in the frontend file, but there must be a better solution. I am using an nginx proxy with the X-Forwarded-* headers set correctly.

Do you know if this is what prevents Google Scholar from indexing our website?

Regards,
Santiago.


DSpace Community

Jul 26, 2023, 1:02:13 PM
to DSpace Community
Hi Santiago,

I receive regular updates from the Google Scholar team on DSpace indexing. I tend to talk with them a few times a year & receive updates about any common issues they've found with indexing DSpace sites.  They have never mentioned that the date format in "citation_publication_date" is an issue.  If they ever do, we'd treat it as a bug and get it fixed.

However, I can verify that the "citation_publication_date" field simply uses the same date as your "dc.date.issued" metadata field on the Item.  So, if you modify the "dc.date.issued" that will change the value in your "citation_publication_date" meta tag.   But, I do not believe that is necessary for Google Scholar to index your site.

Similarly, I've not heard of any issues with "citation_abstract_html_url" being the handle.  This field takes its value from the "dc.identifier.uri" metadata field on your Item.  So, if the Item has a different value in that field, it will be used in the "citation_abstract_html_url".

Overall, it is possible to modify the behavior of these Google Scholar tags in DSpace 7.  But, you have to modify the behavior of the corresponding "setCitation*Tag()" method in the "metadata.service.ts" file in the UI.  You'd then need to recompile and restart the UI.  For instance, here's the method that sets the "citation_abstract_html_url" tag value: https://github.com/DSpace/dspace-angular/blob/main/src/app/core/metadata/metadata.service.ts#L286

Tim
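
As a rough illustration of the kind of change described above, here is a minimal sketch of a date helper that a customized "setCitation*Tag()" method could call. The name `toScholarDate` is hypothetical and not part of dspace-angular, and the sketch assumes "dc.date.issued" holds an ISO-style value such as "2022", "2022-05" or "2022-05-12". It follows the quoted Google guideline literally: a full date becomes "2010/5/12" style, anything less precise falls back to the year alone.

```typescript
// Hypothetical helper, NOT part of dspace-angular.
// Assumes the input looks like "YYYY", "YYYY-MM" or "YYYY-MM-DD"
// (anything after the day, such as a time, is ignored).
function toScholarDate(issued: string): string | undefined {
  const match = /^(\d{4})(?:-(\d{1,2})(?:-(\d{1,2}))?)?/.exec(issued.trim());
  if (!match) {
    // Not a recognizable date: let the caller decide what to do.
    return undefined;
  }
  const [, year, month, day] = match;
  if (month !== undefined && day !== undefined) {
    // Full date: emit year/month/day without leading zeros.
    return `${year}/${Number(month)}/${Number(day)}`;
  }
  // Year only, or year and month: fall back to the year alone.
  return year;
}
```

With this sketch, `toScholarDate("2022-05")` returns "2022" and `toScholarDate("2010-05-12")` returns "2010/5/12". Whether Google Scholar actually needs this is a separate question; as noted above, the existing format has not been reported as a problem.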

On Wednesday, July 26, 2023 at 2:02:35 AM UTC-5 santilo...@gmail.com wrote:
Also, `citation_date` is not formatted as required by Google. Is this a problem?

I don't know if we are obliged to follow this format:

Provide full dates in the "2010/5/12" format if available; or a year alone otherwise.

And the last question: is it okay for the citation_abstract_html_url to be a handle URL (handle.net)?

I really don't know why we are not indexed by Google.

Sorry to bother you with all these questions.

Regards,
Santiago.

On Wednesday, July 26, 2023 at 8:51:57 AM UTC+2 Santiago Lo Coco wrote:
I fixed the problem by adding this line:

proxy_set_header Host $host;

Regards,
Santiago.
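
One way to confirm that a proxy change like this took effect is to scan the rendered page for citation_* tags whose URL points at an unexpected host. The sketch below is only an illustration: the function names are made up, it assumes the tags are written as `<meta name="..." content="...">` in that attribute order (as in the example earlier in the thread), and it assumes the faulty tag carried a URL with the wrong host, which the thread suggests but does not show.

```typescript
// Illustrative sketch; names are hypothetical, not from DSpace.
interface CitationTag {
  name: string;
  content: string;
}

// Collect <meta name="citation_*" content="..."> tags from an HTML string.
// Assumes double-quoted attributes with name before content.
function extractCitationTags(html: string): CitationTag[] {
  const tags: CitationTag[] = [];
  const pattern = /<meta\s+name="(citation_[^"]+)"\s+content="([^"]*)"\s*\/?>/g;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(html)) !== null) {
    tags.push({ name: m[1], content: m[2] });
  }
  return tags;
}

// Return the tags whose content is an http(s) URL on a host not in allowedHosts.
function tagsWithUnexpectedHost(tags: CitationTag[], allowedHosts: string[]): CitationTag[] {
  return tags.filter((tag) => {
    if (!/^https?:\/\//.test(tag.content)) {
      return false; // not a URL, nothing to check
    }
    try {
      return !allowedHosts.includes(new URL(tag.content).hostname);
    } catch {
      return true; // looks like a URL but does not parse: flag it
    }
  });
}
```

Passing the page source and a list such as the public hostname plus `hdl.handle.net` should return an empty array once the proxy headers are right.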

On Wednesday, July 26, 2023 at 8:34:30 AM UTC+2 Santiago Lo Coco wrote:
Thank you Tim.

The problem is that I have already done that.

This is my nginx config for the frontend:

location / {
    proxy_set_header X-Forwarded-Proto https;
    proxy_set_header X-Forwarded-Host $host;
    proxy_set_header X-Forwarded-Server $host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_pass http://localhost:4000;
}

Do you know if there is a mistake? 

I also added some `add_header` directives for debugging, and they are working perfectly.

This is the output of `curl -v`:

X-Forwarded-Host: repositorio.uflo.edu.ar
X-Forwarded-Proto: https


Regards,
Santiago.