Dispatcher Guide

Novella Poinsett

Aug 5, 2024, 9:04:32 AM
to bertlonvifer
You can use environment variables in string-valued properties in the dispatcher.any file instead of hard-coding the values. To include the value of an environment variable, use the format ${variable_name}.
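
As a minimal sketch of this substitution, the fragment below reads a render host name and a cache location from the environment. The variable name PUBLISH_IP, the render name /rend01, and the port are assumptions chosen for illustration; PWD is the usual shell working-directory variable.

```
/renders
  {
  # /rend01 and PUBLISH_IP are hypothetical names
  /rend01
    {
    /hostname "${PUBLISH_IP}"
    /port "4503"
    }
  }

/cache
  {
  # Resolves relative to the directory Dispatcher was started from
  /docroot "${PWD}/cache"
  }
```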

The /farms property is a top-level property in the configuration structure. To define a farm, add a child property to the /farms property. Use a property name that uniquely identifies the farm within the Dispatcher instance.


For example, a Dispatcher instance that handles page activation requests for publish instances requires the PATH header in the /clientheaders section. The PATH header enables communication between the replication agent and the Dispatcher.
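
A short sketch of a /clientheaders section that passes the PATH header through to the render. The other header names are an illustrative subset rather than a complete or recommended list.

```
/clientheaders
  {
  # Illustrative subset; a real farm usually lists more headers
  "referer"
  "user-agent"
  "authorization"
  "content-type"
  "content-length"
  "host"
  "cookie"
  # Needed when this Dispatcher handles page activation requests
  "PATH"
  }
```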


The /virtualhosts property defines a list of all hostname and URI combinations that Dispatcher accepts for this farm. You can use the asterisk (*) character as a wildcard. Values for the /virtualhosts property use the following format:
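
As a sketch of that format, each value follows the pattern [scheme]host[uri][*], where the bracketed parts are optional as far as the Dispatcher documentation describes it. The host names below are placeholders, not values from a real installation.

```
/virtualhosts
  {
  # Host name only
  "www.myCompany.com"
  # Wildcard at the end of the host name
  "www.mySubDivision.*"
  # Scheme, host, and URI prefix with a trailing wildcard
  "https://www.myCompany.ch/products/*"
  }
```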


When Dispatcher receives an HTTP or HTTPS request, it finds the virtual host value that best matches the host, uri, and scheme headers of the request. Dispatcher evaluates the values in the /virtualhosts property in the following order:


Create a secure session for access to the render farm so that users must log in to access any page in the farm. After logging in, users can access pages in the farm. See Creating a Closed User Group for information about using this feature with CUGs. Also, see the Dispatcher Security Checklist before going live.


How the session information is encoded. Use md5 for encryption using the md5 algorithm, or hex for hexadecimal encoding. If you encrypt the session data, a user with access to the file system cannot read the session contents. The default is md5.


The name of the HTTP header or cookie that stores the authorization information. If you store the information in the HTTP header, use HTTP:<header-name>. To store the information in a cookie, use Cookie:<header-name>. If you do not specify a value, HTTP:authorization is used.
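
The session properties described above can be sketched as one /sessionmanagement block. The directory path is a placeholder, and the /directory and /timeout properties are not described in the text above; they are included on the assumption that the section needs a storage location and an expiry in seconds.

```
/sessionmanagement
  {
  # Placeholder path where session files are stored (assumption)
  /directory "/usr/local/apache/.sessions"
  # md5 (the default) or hex
  /encode "md5"
  # Where the authorization information is read from
  /header "HTTP:authorization"
  # Session expiry in seconds (assumed value)
  /timeout "800"
  }
```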


If the timeout is reached while parsing response headers, an HTTP status of 504 (Gateway Timeout) is returned. If the timeout is reached while the response body is read, the Dispatcher returns the incomplete response to the client. It also deletes any cached files that might have been written.


Specifies whether Dispatcher uses the getaddrinfo function (for IPv6) or the gethostbyname function (for IPv4) for obtaining the IP address of the render. A value of 0 causes getaddrinfo to be used. A value of 1 causes gethostbyname to be used. The default value is 0.


The getaddrinfo function returns a list of IP addresses. Dispatcher iterates the list of addresses until it establishes a TCP/IP connection. The ipv4 property is therefore important when the render hostname is associated with multiple IP addresses and the host, in response to the getaddrinfo function, always returns the list of IP addresses in the same order. In this situation, use the gethostbyname function so that the IP address that Dispatcher connects with is randomized.
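
A hedged sketch of a render definition that sets the response timeout and forces gethostbyname. The render name, host name, port, and timeout value are placeholders, and /receiveTimeout is assumed to be the property that the timeout description above refers to, with its value in milliseconds.

```
/renders
  {
  # /myRenderer and the host name are hypothetical
  /myRenderer
    {
    /hostname "aem.myCompany.com"
    /port "4503"
    # Time to wait for a response, in milliseconds (assumed property and value)
    /receiveTimeout "600000"
    # 1 = use gethostbyname, 0 (default) = use getaddrinfo
    /ipv4 "1"
    }
  }
```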


Use the /filter section to specify the HTTP requests that Dispatcher accepts. All other requests are sent back to the web server with a 404 error code (page not found). If no /filter section exists, all requests are accepted.


The /filter section consists of a series of rules that either deny or allow access to content according to patterns in the request-line part of the HTTP request. Use an allowlist strategy for your /filter section:


Element of the Request Line: Include /method, /url, /query, or /protocol together with a pattern, so that requests are filtered according to specific parts of the request line in the HTTP request. Filtering on elements of the request line (rather than on the entire request line) is the preferred filter method.


Advanced Elements of the Request Line: Starting with Dispatcher 4.2.0, four new filter elements are available: /path, /selectors, /extension, and /suffix. Include one or more of these items to further control URL patterns.


When creating your filter rules, use double quotation marks "pattern" for simple patterns. If you are using Dispatcher 4.2.0 or later and your pattern includes a regular expression, you must enclose the regex pattern '(pattern1|pattern2)' within single quotation marks.
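
The quoting rules and the allowlist approach can be sketched together as follows. The rule names and the allowed paths and extensions are illustrative assumptions, not a recommended production filter.

```
/filter
  {
  # Deny everything first, then allow specific requests
  /0001 { /type "deny" /url "*" }
  # Simple pattern: double quotation marks
  /0002 { /type "allow" /method "GET" /url "/content*" }
  # Regular expression (Dispatcher 4.2.0 or later): single quotation marks
  /0003 { /type "allow" /method "GET" /extension '(css|gif|ico|js|png|jpe?g)' }
  }
```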


This example is based on the default configuration file that is provided with Dispatcher and is intended as an example for use in a production environment. Items prefixed with # are deactivated (commented out). Care should be taken if you decide to activate any of these items (by removing the # on that line). Doing so can have a security impact.


Depending on your installation, there might be more resources under /libs, /apps or elsewhere, that must be made available. You can use the access.log file as one method of determining resources that are being accessed externally.


Since Dispatcher version 4.1.5, use the /filter section to restrict query strings. It is recommended that you explicitly allow query strings and exclude generic allowance through allow filter elements.


A single entry can have either glob or some combination of method, url, query, and version, but not both. The following example allows the a=* query string and denies all other query strings for URLs that resolve to the /etc node:
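
A sketch of such a pair of rules, under the assumption that a request is first denied for the /etc subtree and then allowed only when its query string matches a=*. The rule names are arbitrary.

```
/filter
  {
  # Deny POST requests below /etc
  /0001 { /type "deny" /method "POST" /url "/etc/*" }
  # Allow GET requests below /etc only with a query string matching a=*
  /0002 { /type "allow" /method "GET" /url "/etc/*" /query "a=*" }
  }
```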


Dispatcher filters should block access to the following pages and scripts on AEM publish instances. Use a web browser to attempt to open the following pages as a site visitor would and verify that a code 404 is returned. If any other result is obtained, adjust your filters.


When access to vanity URLs is enabled, Dispatcher periodically calls a service that runs on the render instance to obtain a list of vanity URLs. Dispatcher stores this list in a local file. When a request for a page is denied due to a filter in the /filter section, Dispatcher consults the list of vanity URLs. If the denied URL is on the list, Dispatcher allows access to the vanity URL.
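
A minimal sketch of a /vanity_urls section at farm level. The property names are given as I understand them from the Dispatcher documentation; the local file path and the delay are placeholder values.

```
/vanity_urls
  {
  # Service on the render instance that returns the list of vanity URLs
  /url "/libs/granite/dispatcher/content/vanityUrls.html"
  # Local file where Dispatcher stores the list (placeholder path)
  /file "/tmp/vanity_urls"
  # Seconds between calls to the service (placeholder value)
  /delay 300
  }
```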


The /serveStaleOnError property controls whether Dispatcher returns invalidated documents when the render server returns an error. By default, when a statfile is touched and invalidates cached content, the Dispatcher deletes the cached content the next time it is requested.


If /serveStaleOnError is set to "1", Dispatcher does not delete invalidated content from the cache unless the render server returns a successful response. A 5xx response from AEM or a connection timeout causes the Dispatcher to serve the outdated content and respond with an HTTP Status of 111 (Revalidation Failed).


By default, requests that include this authentication information are not cached because authentication is not performed when a cached document is returned to the client. This configuration prevents Dispatcher from serving cached documents to users who do not have the necessary rights.


On Apache web servers, you can compress the cached documents. Compression allows Apache to return the document in a compressed form if so requested by the client. Compression is done automatically by enabling the Apache module mod_deflate, for example:


When a file at a certain level is invalidated, all .stat files from the docroot to the level of the invalidated file or the configured statfileslevel (whichever is smaller) are touched.


Only resources along the path to the invalidated file are affected. Consider the following example: a website uses the structure /content/myWebsite/xx/. If you set statfileslevel as 3, a .stat file is created as follows:


When a file in /content/myWebsite/xx is invalidated, every .stat file from the docroot down to /content/myWebsite/xx is touched. This applies only to /content/myWebsite/xx and not, for example, to /content/myWebsite/yy or /content/anotherWebSite.
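
The example above can be sketched as a /cache fragment, with the resulting .stat file locations listed in comments. The docroot path is a placeholder; the listed locations assume that level 0 is the docroot itself and that each further level adds one folder.

```
/cache
  {
  # Placeholder docroot
  /docroot "/opt/dispatcher/cache"
  /statfileslevel "3"
  # With this setting, .stat files exist in these folders:
  #   <docroot>                          (level 0)
  #   <docroot>/content                  (level 1)
  #   <docroot>/content/myWebsite        (level 2)
  #   <docroot>/content/myWebsite/xx     (level 3)
  }
```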


With automatic invalidation, Dispatcher does not delete cached files after a content update, but checks their validity when they are next requested. Documents in the cache that are not auto-invalidated remain in the cache until a content update explicitly deletes them.


Automatic invalidation is typically used for HTML pages. HTML pages often contain links to other pages, making it difficult to determine whether a content update affects a page. To make sure that all relevant pages are invalidated when content is updated, automatically invalidate all HTML pages. The following configuration invalidates all HTML pages:
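
A minimal sketch of such an /invalidate section inside /cache: it first denies auto-invalidation for everything and then allows it for HTML files. The rule names are arbitrary.

```
/invalidate
  {
  # Nothing is auto-invalidated unless a later rule allows it
  /0000 { /glob "*" /type "deny" }
  # Auto-invalidate all HTML pages
  /0001 { /glob "*.html" /type "allow" }
  }
```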


The AEM integration with Adobe Analytics delivers configuration data in an analytics.sitecatalyst.js file in your website. The example dispatcher.any file that is provided with Dispatcher includes the following invalidation rule for this file:


This method can be used to cover several different use cases, for example, to invalidate other application-specific caches, or to handle cases where the externalized URL of a page, and its place in the docroot, do not match the content path.


When a parameter is ignored for a page, the page is cached the first time that the page is requested. Subsequent requests for the page are served the cached page, regardless of the value of the parameter in the request.
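
A sketch of an /ignoreUrlParams section, assuming that an "allow" rule marks a parameter as ignored for caching and a "deny" rule keeps it significant. The parameter name q is a hypothetical example.

```
/ignoreUrlParams
  {
  # By default, no parameter is ignored, so requests with parameters are not cached
  /0001 { /glob "*" /type "deny" }
  # Ignore the q parameter: pages are cached and served regardless of its value
  /0002 { /glob "q" /type "allow" }
  }
```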


The /headers property lets you define the HTTP header types that Dispatcher is going to cache. On the first request to an uncached resource, all headers matching one of the configured values (see the configuration sample below) are stored in a separate file, next to the cache file. On subsequent requests to the cached resource, the stored headers are added to the response.
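
A configuration sample for the /headers property might look like the following sketch. The selection of header names is illustrative; list only the headers that your site needs to replay from the cache.

```
/cache
  {
  /headers
    {
    # Response headers stored next to the cache file on first request
    "Cache-Control"
    "Content-Disposition"
    "Content-Type"
    "Expires"
    "Last-Modified"
    "X-Content-Type-Options"
    }
  }
```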


The mode property specifies what file permissions are applied to new directories and files in the cache. The umask of the calling process restricts this setting. It is an octal number constructed from the sum of one or more of the following values:
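
As a worked sketch, and assuming the values follow the standard POSIX permission bits (0400, 0200, 0100 for the owner; 0040, 0020, 0010 for the group; 0004, 0002, 0001 for others), the commonly used value 0755 is built up like this:

```
/cache
  {
  # owner:  0400 (read) + 0200 (write) + 0100 (search/execute) = 0700
  # group:  0040 (read) + 0010 (search/execute)                = 0050
  # others: 0004 (read) + 0001 (search/execute)                = 0005
  # total:  0700 + 0050 + 0005                                 = 0755
  /mode "0755"
  }
```

The effective permissions can still be narrower than this value, because the umask of the calling process is applied on top of it.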


The /gracePeriod property defines the number of seconds a stale, auto-invalidated resource may still be served from the cache after the last occurring activation. The property can be used in a setup where a batch of activations would otherwise repeatedly invalidate the entire cache. The recommended value is 2 seconds.
