Php Input Validation

0 views

Skip to first unread message

Stephani Kapnick

unread,

Aug 5, 2024, 8:55:02 AM8/5/24

to turnoharpe

Inputvalidation is performed to ensure only properly formed data is entering the workflow in an information system, preventing malformed data from persisting in the database and triggering malfunction of various downstream components. Input validation should happen as early as possible in the data flow, preferably as soon as the data is received from the external party.

Data from all potentially untrusted sources should be subject to input validation, including not only Internet-facing web clients but also backend feeds over extranets, from suppliers, partners, vendors or regulators, each of which may be compromised on their own and start sending malformed data.

Input Validation should not be used as the primary method of preventing XSS, SQL Injection and other attacks which are covered in respective cheat sheets but can significantly contribute to reducing their impact if implemented properly.

It is always recommended to prevent attacks as early as possible in the processing of the user's (attacker's) request. Input validation can be used to detect unauthorized input before it is processed by the application.

It is a common mistake to use denylist validation in order to try to detect possibly dangerous characters and patterns like the apostrophe ' character, the string 1=1, or the tag, but this is a massively flawed approach as it is trivial for an attacker to bypass such filters.

If it's well structured data, like dates, social security numbers, zip codes, email addresses, etc. then the developer should be able to define a very strong validation pattern, usually based on regular expressions, for validating such input.

It's also free-form text input that highlights the importance of proper context-aware output encoding and quite clearly demonstrates that input validation is not the primary safeguards against Cross-Site Scripting. If your users want to type apostrophe ' or less-than sign

When designing regular expression, be aware of RegEx Denial of Service (ReDoS) attacks. These attacks cause a program using a poorly designed Regular Expression to operate very slowly and utilize CPU resources for a very long time.

The type of encoding is specific to the context of the page where the user controlled data is inserted. For example, HTML entity encoding is appropriate for data placed into the HTML body. However, user data placed into a script would need JavaScript specific output encoding.

The upload feature should be using an allowlist approach to only allow specific file types and extensions. However, it is important to be aware of the following file types that, if allowed, could result in security vulnerabilities:

The biggest caveat on this is that although the RFC defines a very flexible format for email addresses, most real world implementations (such as mail servers) use a far more restricted address format, meaning that they will reject addresses that are technically valid. Although they may be technically correct, these addresses are of little use if your application will not be able to actually send emails to them.

As such, the best way to validate email addresses is to perform some basic initial validation, and then pass the address to the mail server and catch the exception if it rejects it. This means that the application can be confident that its mail server can send emails to any addresses it accepts. The initial validation could be as simple as:

Semantic validation is about determining whether the email address is correct and legitimate. The most common way to do this is to send an email to the user, and require that they click a link in the email, or enter a code that has been sent to them. This provides a basic level of assurance that:

In some cases, users may not want to give their real email address when registering on the application, and will instead provide a disposable email address. These are publicly available addresses that do not require the user to authenticate, and are typically used to reduce the amount of spam received by users' primary email addresses.

Blocking disposable email addresses is almost impossible, as there are a large number of websites offering these services, with new domains being created every day. There are a number of publicly available lists and commercial lists of known disposable domains, but these will always be incomplete.

If these lists are used to block the use of disposable email addresses then the user should be presented with a message explaining why they are blocked (although they are likely to simply search for another disposable provider rather than giving their legitimate address).

If it is essential that disposable email addresses are blocked, then registrations should only be allowed from specifically-allowed email providers. However, if this includes public providers such as Google or Yahoo, users can simply register their own disposable address with them.

Sub-addressing allows a user to specify a tag in the local part of the email address (before the @ sign), which will be ignored by the mail server. For example, if that example.org domain supports sub-addressing, then the following email addresses are equivalent:

Some users will use a different tag for each website they register on, so that if they start receiving spam to one of the sub-addresses they can identify which website leaked or sold their email address.

Because it could allow users to register multiple accounts with a single email address, some sites may wish to block sub-addressing by stripping out everything between the + and @ signs. This is not generally recommended, as it suggests that the website owner is either unaware of sub-addressing or wishes to prevent users from identifying them when they leak or sell email addresses. Additionally, it can be trivially bypassed by using disposable email addresses, or simply registering multiple email accounts with a trusted provider.

Semantic validity includes only accepting input that is within an acceptable range for the given application functionality and context. For example, a start date must be before an end date when choosing date ranges.

ImportantWhen building secure software, allowlisting is the recommended minimal approach. Denylisting is prone to error and can be bypassed with various evasion techniques and can be dangerous when depended on by itself. Even though denylisting can often be evaded, it can often be useful to help detect obvious attacks. So while allowlisting helps limit the attack surface by ensuring data is of the right syntactic and semantic validity, denylisting helps detect and potentially stop obvious attacks.

Input validation must always be done on the server-side for security. While client side validation can be useful for both functional and some security purposes, it can often be easily bypassed. This makes server-side validation even more fundamental to security. For example, JavaScript validation may alert the user that a particular field must consist of numbers but the server side application must validate that the submitted data only consists of numbers in the appropriate numerical range for that feature.

Care should be exercised when creating regular expressions. Poorly designed expressions may result in potential denial of service conditions (aka ReDoS ). Various tools can test to verify that regular expressions are not vulnerable to ReDoS.

Regular expressions are just one way to accomplish validation. Regular expressions can be difficult to maintain or understand for some developers. Other validation alternatives involve writing validation methods programmatically which can be easier to maintain for some developers.

Some frameworks support automatic binding of HTTP requests parameters to server-side objects used by the application. This auto-binding feature can allow an attacker to update server-side objects that were not meant to be modified. The attacker can possibly modify their access control level or circumvent the intended business logic of the application with this feature.

Consider an application that needs to accept HTML from users (via a WYSIWYG editor that represents content as HTML or features that directly accept HTML in input). In this situation validation or escaping will not help.

Possible bug report or maybe I'm looking at this the wrong way.

I have a few text inputs that have the validation "Required" and a button that triggers a javascript query where I check the validation of those inputs. If the user interacts with each of the text inputs everything is fine, but if they skip this and click the submit button validation hasn't triggered yet and allows them to continue when it shouldn't (despite me manually triggering validation for these inputs in that query).

I made a small app to demonstrate. I dragged a textbox into the new project and set the required validation.

image730929 38.8 KB

I then added the following javascript query. The first time it runs it says it is valid (I'm not sure if the "await" is appropriate here, but the behavior is the same with or without it) despite it not being valid, the second time it runs it does work properly though.

image1220756 62.6 KB

I also tried making a custom validation and had the same issues.

The workaround I've found so far is triggering textInput1.validate() and having all my validations hidden (so the user isn't greeted with a bunch of errors) on launch of the app, then have a javascript query un-hide the validation when a confirmation button is pressed.

One suggestion would be to automatically enable it when the JS query uses variables that are then subject to this issue. Or possibly, display a notice that you may need to enable this property if the query is using variables. I use this parameter frequently and I still forget at times.

Yes, 100% agree! We have a feature request internally to add a warning that could conditionally show in the query editor depending on whether the code would be impacted by the setting. I'll add another +1 to that request & will follow up here when it ships