When I try to access the parent directory, say localhost/parent, it gives me 403 Forbidden. However, if I access the sub-directory content directly, say localhost/parent/index.html, it goes through. I believe it's a config issue, but could anyone walk me through it a bit? I tried to change apache2.conf as many people suggested, but it doesn't work (shown below).
You have to pass the -np/--no-parent option to wget (in addition to -r/--recursive, of course), otherwise it will follow the link in the directory index on my site to the parent directory. So the command would look like this:
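The command itself appears to have been lost in the copy; based on the flags just described, and using the localhost/parent URL from the question as a stand-in, it would look something like this:

```shell
wget -r -np http://localhost/parent/
```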
Afterwards, stripping the query params from URLs like main.css?crc=12324567 and running a local server (e.g. via python3 -m http.server in the dir you just wget'ed) to run JS may be necessary. Please note that the --convert-links option kicks in only after the full crawl has completed.
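For the query-param cleanup, a small sketch (run from the directory you downloaded into; back up first, since a file whose bare name already exists would be overwritten):

```shell
# Hypothetical cleanup: wget can save assets with the query string in the
# filename, e.g. "main.css?crc=12324567". Rename each such file, dropping
# everything from the first "?" onwards.
find . -name '*\?*' -type f | while IFS= read -r f; do
  mv "$f" "${f%%\?*}"
done
```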
It sounds like you're trying to get a mirror of your site. While wget has some interesting FTP and SFTP uses, a simple mirror should work. Just a few considerations to make sure you're able to download the files properly.
Ensure that any /robots.txt file in your public_html, www, or configs directory does not prevent crawling. If it does, you need to instruct wget to ignore it by adding the following option to your wget command:
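The switch generally cited for this is wget's `-e robots=off`, which executes the `robots = off` .wgetrc command before the download starts, e.g.:

```shell
wget -e robots=off -r https://example.com/
```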
Additionally, wget must be instructed to convert links to point to the downloaded files. If you've done everything above correctly, you should be fine here. The easiest way I've found to get all files, provided nothing is hidden behind a non-public directory, is using the mirror command.
Using -m instead of -r is preferred as it doesn't have a maximum recursion depth and it downloads all assets. Mirror is pretty good at determining the full depth of a site; however, if you have many external links you could end up downloading more than just your site, which is why we use -p -E -k. All prerequisite files needed to render the page, plus a preserved directory structure, should be the output. -k converts links to local files. Since you should have a link set up, you should get your configs folder with the /.vim file.
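Put together, the mirror invocation described above comes out roughly as (substitute your own domain):

```shell
wget -mpEk https://example.com/
```

Here -m implies -r with no depth limit plus timestamping, -p pulls page prerequisites (CSS, images, scripts), -E (--adjust-extension) adds matching file extensions, and -k rewrites links to the local copies.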
Depending on the size of the site you are doing a mirror of, you're sending many calls to the server. In order to prevent you from being blacklisted or cut off, use the wait option to rate-limit your downloads.
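wget's --wait (seconds between retrievals), --random-wait, and --limit-rate options are the usual levers here; a politer variant of the mirror command might look like:

```shell
wget -mpEk --wait=2 --random-wait --limit-rate=200k https://example.com/
```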
That's it. It will download into the following local tree: ./example.com/configs/.vim. However, if you do not want the first two directories, then use the additional flag --cut-dirs=2 as suggested in earlier replies:
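One nuance worth hedging: --cut-dirs counts path components but never the hostname directory, so to end up with just ./.vim you would pair -nH (omit the example.com/ directory) with --cut-dirs=1 (drop configs/); without -nH, --cut-dirs=2 would leave the example.com/ directory in place. A sketch:

```shell
wget -mpEk -nH --cut-dirs=1 https://example.com/configs/.vim
```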
Anyone have a smart workaround for me? Should I create /images/index.html and redirect it to the homepage? I hate to create a bunch of empty useless pages. I submitted to Google Webmaster URL removals but I am afraid someone else is linking to these directories and it will just cause it to get indexed again.
If these are literally directory index pages that are being auto generated (and you don't have access to .htaccess) then yes, you'll need to create an index document in these directories. Either index.html or index.php if that works for you.
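If there are many such directories, a sketch like this (run from the document root; back up first) drops a blank index.html into every directory that lacks one:

```shell
# Drop an empty index.html into every directory that doesn't have one,
# so the server stops auto-generating listings for them.
find . -type d -exec sh -c '
  for d; do
    [ -e "$d/index.html" ] || touch "$d/index.html"
  done
' sh {} +
```

Existing index files are left untouched.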
Or, if you can't send custom headers, then simply a soft-404 would be a "make do" (i.e. a page that simply states "Not Found" but returns a 200 OK status) - it's unlikely to appear in search results, and contains nothing meaningful if it does. Or even just a blank page!?
If you click on "test" in the temp directory it properly displays the index.php file inside that folder (the test directory does NOT contain a WP installation). This implies to me the parent redirect issue is WordPress related.
I am wondering if an S3 backend with js-s3-explorer, simply proxied through our Netlify sites, would work.
Or a serverless function that would do the S3 interaction, like GitHub - juvs/s3-bucket-browser (an AWS S3 Bucket Browser based on the AWS JavaScript API), to serve a dynamic directory index.
I think you can use a static site generator like Hugo. It has a function called readDir. You can achieve what you want in a few lines of Go templating code. So, any more PDF files that you add would automatically get a link to them on the page. If you want, I can put together a working example.
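As a sketch of what that template code could look like (the static/pdfs path and the surrounding layout are assumptions, not from the original post), Hugo's readDir returns the directory's entries, and a range loop turns them into links:

```go-html-template
<ul>
  {{ range readDir "static/pdfs" }}
    {{ if not .IsDir }}
      <li><a href="/pdfs/{{ .Name }}">{{ .Name }}</a></li>
    {{ end }}
  {{ end }}
</ul>
```

Any PDF dropped into static/pdfs would then show up in the list on the next build.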
The DirectoryCheckHandler directive determines whether mod_dir should check for directory indexes or add trailing slashes when some other handler has been configured for the current URL. Handlers can be set by directives such as SetHandler or by other modules, such as mod_rewrite during per-directory substitutions.
In releases prior to 2.4, this module did not take any action if any other handler was configured for a URL. This allows directory indexes to be served even when a SetHandler directive is specified for an entire directory, but it can also result in some conflicts with modules such as mod_rewrite.
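A hedged illustration (the handler name is a placeholder, not a real handler) of forcing the index check despite a SetHandler:

```apache
<Directory "/var/www/example">
    # "special-handler" stands in for whatever handler is configured
    SetHandler special-handler
    # Still look for DirectoryIndex files and fix up trailing slashes
    DirectoryCheckHandler On
</Directory>
```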
The DirectoryIndex directive sets the list of resources to look for, when the client requests an index of the directory by specifying a / at the end of the directory name. Local-url is the (%-encoded) URL of a document on the server relative to the requested directory; it is usually the name of a file in the directory. Several URLs may be given, in which case the server will return the first one that it finds. If none of the resources exist and the Indexes option is set, the server will generate its own listing of the directory.
A single argument of "disabled" prevents mod_dir from searching for an index. An argument of "disabled" will be interpreted literally if it has any arguments before or after it, even if they are "disabled" as well.
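As a generic example (the file names are the conventional defaults, not taken from the text above):

```apache
# Try index.html first, then fall back to index.php;
# with the Indexes option set, a generated listing is the last resort
DirectoryIndex index.html index.php
```

And `DirectoryIndex disabled` is the single-argument form that switches the search off entirely.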
Turning off the trailing slash redirect may result in an information disclosure. Consider a situation where mod_autoindex is active (Options +Indexes) and DirectoryIndex is set to a valid resource (say, index.html) and there's no other special handler defined for that URL. In this case a request with a trailing slash would show the index.html file. But a request without trailing slash would list the directory contents.
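In configuration terms, the trailing-slash redirect being discussed is controlled by mod_dir's DirectorySlash directive, and the risky situation described corresponds roughly to:

```apache
<Directory "/var/www/docs">
    Options +Indexes
    DirectoryIndex index.html
    # With the redirect disabled, a request without the trailing slash
    # can expose a generated listing even though the request with the
    # slash serves index.html
    DirectorySlash Off
</Directory>
```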
It is frequently desirable to have a single file or resource handle all requests to a particular directory, except those requests that correspond to an existing file or script. This is often referred to as a 'front controller.'
A fallback handler (in the above case, /blog/index.php) can access the original requested URL via the server variable REQUEST_URI. For example, to access this variable in PHP, use $_SERVER['REQUEST_URI'].
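A minimal configuration of that shape, reusing the /blog/index.php example from the text (the filesystem path is an assumption):

```apache
# Requests under /blog that don't correspond to an existing file
# are handled by the front controller
<Directory "/var/www/html/blog">
    FallbackResource /blog/index.php
</Directory>
```

Inside index.php, $_SERVER['REQUEST_URI'] then holds the originally requested path.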
I just installed Moodle on a local Apache24 server. If I go to localhost/CELO ("CELO" being the name of the project) or localhost/CELO/index.php using my browser, all that returns is the folder directory, a la
cPanel enables you to specify how directories on your web site are displayed. By default, if an index file is not in a directory, the directory's contents are listed in the user's browser. This is usually not recommended and is a potential security issue.
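Under the hood this setting effectively corresponds to Apache's Options directive; disabling listings for a directory amounts to an .htaccess line like:

```apache
# Disable automatic directory listings
Options -Indexes
```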
I have an Ubuntu 14.10 computer that is used for local website testing; it is not serving to the internet. On it, I have seven websites set up. However, when I access two of the seven, I get the Apache2 Ubuntu Default Page instead of my own index page.
As far as I can tell, I set up all seven using the exact same process, so I don't know what these two are missing. Also, in my Apache logs directory, I have two log files, error and access, for each of the two misbehaving sites, but all of them are empty. When I restart the apache2 service, there are no errors. I have retraced my steps multiple times and I cannot see any difference between the working sites and the non-working sites.