Re: [dspace-tech] Robots.txt in DSpace 7x


mw...@iu.edu

Jan 6, 2025, 1:27:11 PM
to dspac...@googlegroups.com
On Fri, Dec 20, 2024 at 05:43:08PM +0000, Brian Keese wrote:
> I am having trouble understanding what I need to do to ensure our robots.txt is being placed in the right location.
>
> Our site's URL is https://scholarworks.iu.edu/dspace
>
> I see robots.txt.ejs in the ui's code at dist/browser/assets.

That is where ours is. Also at 'dist/server/assets'.

> I've read here (https://wiki.lyrasis.org/display/DSDOC7x/Search+Engine+Optimization#SearchEngineOptimization-Createagoodrobots.txt) that the robots.txt needs to be at https://scholarworks.iu.edu/robots.txt

Yes. And you have one there. I just fetched it. Is the content not
what you were expecting?

> What I can't find anywhere is instructions for how to make that happen. I can use our nginx proxy to redirect https://scholarworks.iu.edu/robots.txt to something, but I don't know what to redirect it to.

There is specific code in 'server.ts' to handle requests for
'/robots.txt':

/**
 * Serve the robots.txt ejs template, filling in the origin variable
 */
server.get('/robots.txt', (req, res) => {
  res.setHeader('content-type', 'text/plain');
  res.render('assets/robots.txt.ejs', {
    'origin': req.protocol + '://' + req.headers.host,
  });
});
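The 'origin' passed to the template is just the request scheme plus the Host header, so the rendered robots.txt follows whatever hostname your proxy forwards. A quick illustration with hypothetical request values:

```javascript
// Sketch of how the handler above builds `origin`.
// The request object here is a stand-in, not a real Express request.
const req = { protocol: 'https', headers: { host: 'scholarworks.iu.edu' } };
const origin = req.protocol + '://' + req.headers.host;
console.log(origin); // https://scholarworks.iu.edu
```

This is why it matters that nginx forwards the original Host header: otherwise the sitemap URLs in the rendered file point at the backend address.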

The template at 'src/robots.txt.ejs' is copied into 'dist' by webpack and
should be found there by 'server.ts'.
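So rather than redirecting, the nginx proxy only needs to pass '/robots.txt' through to the Node server running the UI; the handler in server.ts then renders the template itself. A minimal sketch, assuming the UI's Node server listens on localhost:4000 (substitute your actual host and port):

```nginx
# Sketch only: proxy /robots.txt to the DSpace UI's Node server
# so the server.ts handler can render robots.txt.ejs.
# localhost:4000 is an assumed default, not taken from this thread.
location = /robots.txt {
    proxy_pass http://localhost:4000/robots.txt;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-Proto $scheme;
}
```

Forwarding the Host header is what lets the handler build the correct origin for the rendered file.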

--
Mark H. Wood
Lead Technology Analyst

University Library
Indiana University Indianapolis
755 W. Michigan Street
Indianapolis, IN 46202
317-274-0749
library.indianapolis.iu.edu