On Fri, Dec 20, 2024 at 05:43:08PM +0000, Brian Keese wrote:
> I am having trouble understanding what I need to do to ensure our robots.txt is being placed in the right location.
>
> Our site's url is https://scholarworks.iu.edu/dspace
>
> I see robots.txt.ejs in the ui's code at dist/browser/assets.
That is where ours is. Also at 'dist/server/assets'.
> I've read here (
> https://wiki.lyrasis.org/display/DSDOC7x/Search+Engine+Optimization#SearchEngineOptimization-Createagoodrobots.txt)
> that the robots.txt needs to be at
> https://scholarworks.iu.edu/robots.txt
Yes. And you have one there. I just fetched it. Is the content not
what you were expecting?
> What I can't find anywhere is instructions for how to make that happen. I
> can use our nginx proxy to redirect
> https://scholarworks.iu.edu/robots.txt to something, but I don't know what
> to redirect it to.
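You don't need to redirect it to anything special: just proxy '/robots.txt'
through to the UI's Node server like the rest of the site, and the UI will
render it. A minimal nginx sketch, assuming the UI listens on
127.0.0.1:4000 (your host/port may differ):

```nginx
# Hypothetical fragment: pass /robots.txt to the DSpace UI's Node SSR
# server, which renders robots.txt.ejs itself.
location = /robots.txt {
    proxy_pass http://127.0.0.1:4000/robots.txt;
    proxy_set_header Host $host;
    # Forward the original scheme so the rendered 'origin' uses https:
    proxy_set_header X-Forwarded-Proto $scheme;
}
```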
There is specific code in 'server.ts' to handle requests for
'/robots.txt':
    /**
     * Serve the robots.txt ejs template, filling in the origin variable
     */
    server.get('/robots.txt', (req, res) => {
      res.setHeader('content-type', 'text/plain');
      res.render('assets/robots.txt.ejs', {
        'origin': req.protocol + '://' + req.headers.host,
      });
    });
The template at 'src/robots.txt.ejs' is copied into 'dist' by webpack and
is found there by 'server.ts'.
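To illustrate what that handler does, here is a tiny self-contained sketch
(not actual DSpace code): build 'origin' from the request and substitute it
into an EJS-style '<%= origin %>' placeholder. The template text and the
buildOrigin/renderTemplate helpers are illustrative stand-ins for Express
and EJS.

```javascript
// Build the origin the same way the handler does, from the request's
// protocol and Host header. (Behind a proxy, Express needs 'trust proxy'
// enabled for req.protocol to reflect X-Forwarded-Proto.)
function buildOrigin(req) {
  return req.protocol + '://' + req.headers.host;
}

// Simplified stand-in for res.render() with EJS: replace <%= name %>
// tokens with the matching variable.
function renderTemplate(template, vars) {
  return template.replace(/<%=\s*(\w+)\s*%>/g, (_, name) => String(vars[name] ?? ''));
}

// Hypothetical template text; the real content lives in robots.txt.ejs.
const template = 'User-agent: *\nSitemap: <%= origin %>/sitemap_index.xml\n';
const req = { protocol: 'https', headers: { host: 'scholarworks.iu.edu' } };

console.log(renderTemplate(template, { origin: buildOrigin(req) }));
// -> User-agent: *
//    Sitemap: https://scholarworks.iu.edu/sitemap_index.xml
```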
--
Mark H. Wood
Lead Technology Analyst
University Library
Indiana University Indianapolis
755 W. Michigan Street
Indianapolis, IN 46202
317-274-0749
library.indianapolis.iu.edu