Subsections of Getting Started
CRS Installation
This guide aims to get a CRS installation up and running. This guide assumes that a compatible ModSecurity engine is already present and working. If unsure then refer to the extended install page for full details.
Downloading the Rule Set
The first step is to download CRS. The CRS project strongly recommends using a supported version.
Official CRS releases can be found at the following URL: https://github.com/coreruleset/coreruleset/releases.
For production environments, it is recommended to use the latest release, which is v4.0.0. For testing the bleeding edge CRS version, nightly releases are also provided.
Verifying Releases
Note
Releases are signed using the CRS project’s GPG key (fingerprint: 3600 6F0E 0BA1 6783 2158 8211 38EE ACA1 AB8A 6E72). Releases can be verified using GPG/PGP compatible tooling.
To retrieve the CRS project’s public key from public key servers using gpg, execute: gpg --keyserver pgp.mit.edu --recv 0x38EEACA1AB8A6E72 (this ID should be equal to the last sixteen hex characters in the fingerprint).
It is also possible to use gpg --fetch-key https://coreruleset.org/security.asc to retrieve the key directly.
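The relationship between the fingerprint and the key ID can be checked locally before contacting a key server. The following is a minimal sketch; the variable names are illustrative and only standard tools (tr, tail) are assumed:

```shell
# Fingerprint as published in the note above.
fingerprint="3600 6F0E 0BA1 6783 2158 8211 38EE ACA1 AB8A 6E72"

# Remove the spaces, then keep the last 16 hex characters (the long key ID).
compact=$(printf '%s' "$fingerprint" | tr -d ' ')
long_key_id=$(printf '%s' "$compact" | tail -c 16)

echo "0x${long_key_id}"   # prints 0x38EEACA1AB8A6E72
```

If the printed value differs from the ID passed to --recv, the key being requested is not the one described by the fingerprint.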
The following steps assume that a *nix operating system is being used. Installation is similar on Windows but likely involves using a zip file from the CRS releases page.
To download the release file and the corresponding signature:
wget https://github.com/coreruleset/coreruleset/archive/refs/tags/v4.0.0.tar.gz
wget https://github.com/coreruleset/coreruleset/releases/download/v4.0.0/coreruleset-4.0.0.tar.gz.asc
To verify the integrity of the release:
gpg --verify coreruleset-4.0.0.tar.gz.asc v4.0.0.tar.gz
gpg: Signature made Wed Jun 30 10:05:48 2021 -03
gpg: using RSA key 36006F0E0BA167832158821138EEACA1AB8A6E72
gpg: Good signature from "OWASP CRS <security@coreruleset.org>" [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: 3600 6F0E 0BA1 6783 2158 8211 38EE ACA1 AB8A 6E72
If the signature was good then the verification succeeds. If a warning is displayed, like the above, it means the CRS project’s public key is known but is not trusted.
To trust the CRS project’s public key:
gpg --edit-key 36006F0E0BA167832158821138EEACA1AB8A6E72
gpg> trust
Your decision: 5 (ultimate trust)
Are you sure: Yes
gpg> quit
The result when verifying a release will then look like so:
gpg --verify coreruleset-4.0.0.tar.gz.asc v4.0.0.tar.gz
gpg: Signature made Wed Jun 30 15:05:48 2021 CEST
gpg: using RSA key 36006F0E0BA167832158821138EEACA1AB8A6E72
gpg: Good signature from "OWASP CRS <security@coreruleset.org>" [ultimate]
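For scripted installations, the exit status of gpg --verify can be used instead of reading the output by eye, since gpg exits with a non-zero status when the signature is bad or cannot be checked. The sketch below assumes a hypothetical helper named verify_release, which is not part of CRS:

```shell
# Hypothetical helper (not part of CRS): verify a detached signature and
# report the result through the exit status.
verify_release() {
  signature="$1"   # e.g. coreruleset-4.0.0.tar.gz.asc
  archive="$2"     # e.g. v4.0.0.tar.gz
  if gpg --verify "$signature" "$archive"; then
    echo "signature OK: $archive"
  else
    echo "signature check FAILED: $archive" >&2
    return 1
  fi
}

# Usage:
# verify_release coreruleset-4.0.0.tar.gz.asc v4.0.0.tar.gz || exit 1
```

Stopping the installation when the helper fails avoids extracting an archive that was never verified.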
Installing the Rule Set
Once the rule set has been downloaded and verified, extract the rule set files to a well known location on the server. This will typically be somewhere in the web server directory.
The examples presented below demonstrate using Apache. For information on configuring Nginx or IIS see the extended install page.
Note that while it’s common practice to make a new modsecurity.d folder, as outlined below, this isn’t strictly necessary. The path scheme outlined is common on RHEL-based operating systems; the Apache path used may need to be adjusted to match the server’s installation.
mkdir /etc/crs4
tar -xzvf v4.0.0.tar.gz --strip-components 1 -C /etc/crs4
Now all the CRS files will be located below the /etc/crs4 directory.
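A quick sanity check after extraction can catch a wrong --strip-components value or a wrong target directory. The following sketch assumes a hypothetical helper named check_crs_layout; it only looks for the example setup file and the rules directory referred to in this guide:

```shell
# Hypothetical helper: confirm that a directory looks like an extracted CRS release.
check_crs_layout() {
  crs_dir="$1"
  [ -f "$crs_dir/crs-setup.conf.example" ] || { echo "missing crs-setup.conf.example" >&2; return 1; }
  [ -d "$crs_dir/rules" ] || { echo "missing rules/ directory" >&2; return 1; }
  echo "CRS files found in $crs_dir"
}

# Usage:
# check_crs_layout /etc/crs4
```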
Setting Up the Main Configuration File
After extracting the rule set files, the next step is to set up the main OWASP CRS configuration file. An example configuration file is provided as part of the release package, located in the main directory: crs-setup.conf.example.
Note
Other aspects of ModSecurity, particularly engine-specific parameters, are controlled by the ModSecurity “recommended” configuration rules, modsecurity.conf-recommended. This file comes packaged with ModSecurity itself.
In many scenarios, the default example CRS configuration will be a good enough starting point. It is, however, a good idea to take the time to look through the example configuration file before deploying it to make sure it’s right for a given environment.
Warning
In particular, response rules are enabled by default. Depending on the responses your application sends back to clients, this may leave you vulnerable to RFDoS attacks. You could be vulnerable if responses from your application can contain user input. If an attacker can submit input that is returned as part of a response, the attacker can craft that input so that the response rules of the WAF block responses containing it for all clients. For example, a blog post might no longer be accessible because of the contents of a comment on the post. See this blog post about the problems you could face.
There is an experimental scanner that uses nuclei to find out if you are affected. So if you are unsure, first test your application before enabling the response rules, or risk accidentally blocking some valid responses.
Since CRS version 4.10.0, response rules can be easily disabled by uncommenting the rule with id 900500 in the crs-setup.conf file.
The CRS team believes that, in general, the damage that can be caused by webshells and information leakage outweighs the damage of RFDoS attacks. Thus, the response rules remain active in the default configuration for now.
Once any settings have been changed within the example configuration file, as needed, it should be renamed to remove the .example portion, like so:
cd /etc/crs4
mv crs-setup.conf.example crs-setup.conf
Include-ing the Rule Files
The last step is to tell the web server where the rules are. This is achieved by include-ing the rule configuration files in the httpd.conf file. Again, this example demonstrates using Apache, but the process is similar on other systems (see the extended install page for details).
echo 'IncludeOptional /etc/crs4/crs-setup.conf' >> /etc/httpd/conf/httpd.conf
echo 'IncludeOptional /etc/crs4/plugins/*-config.conf' >> /etc/httpd/conf/httpd.conf
echo 'IncludeOptional /etc/crs4/plugins/*-before.conf' >> /etc/httpd/conf/httpd.conf
echo 'IncludeOptional /etc/crs4/rules/*.conf' >> /etc/httpd/conf/httpd.conf
echo 'IncludeOptional /etc/crs4/plugins/*-after.conf' >> /etc/httpd/conf/httpd.conf
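Running the echo commands above a second time appends the same directives again, so the rules would be included twice, which typically makes the engine complain about duplicate rule IDs. A minimal sketch of an idempotent alternative follows; append_once is a hypothetical helper, not part of CRS or Apache:

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present (-x matches whole lines, -F treats '*' literally).
append_once() {
  line="$1"
  file="$2"
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage (same directives as above):
# append_once 'IncludeOptional /etc/crs4/crs-setup.conf' /etc/httpd/conf/httpd.conf
# append_once 'IncludeOptional /etc/crs4/rules/*.conf' /etc/httpd/conf/httpd.conf
```

The order of the calls still matters: the directives need to end up in the file in the same order as in the echo commands.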
Now that everything has been configured, it should be possible to restart and begin using the OWASP CRS. The CRS rules typically require a bit of tuning with rule exclusions, depending on the site and web applications in question. For more information on tuning, see false positives and tuning.
systemctl restart httpd.service
Alternative: Using Containers
Another quick option is to use the official CRS pre-packaged containers. Docker, Podman, or any compatible container engine can be used. The official CRS images are published on Docker Hub and GitHub Container Repository. The image most often deployed is modsecurity-crs (owasp/modsecurity-crs from Docker Hub or ghcr.io/coreruleset/modsecurity-crs from GHCR): it already has everything needed to get up and running quickly.
The CRS project pre-packages both Apache and Nginx web servers along with the appropriate corresponding ModSecurity engine. More engines, like Coraza, will be added at a later date.
To protect a running web server, all that’s required is to get the appropriate image and set its configuration variables so that the WAF receives requests and proxies them to your backend server.
Below is an example docker compose file that can be used to pull the container images. If you don’t have compose installed, please read the installation instructions. All that needs to be changed is the BACKEND variable so that the WAF points to the backend server in question:
services:
  modsec2-apache:
    container_name: modsec2-apache
    image: owasp/modsecurity-crs:apache
    # if you are using Linux, you will need to uncomment the below line
    # user: root
    environment:
      SERVERNAME: modsec2-apache
      BACKEND: http://<backend server>
      PORT: "80"
      MODSEC_RULE_ENGINE: DetectionOnly
      BLOCKING_PARANOIA: 2
      TZ: "${TZ}"
      ERRORLOG: "/var/log/error.log"
      ACCESSLOG: "/var/log/access.log"
      MODSEC_AUDIT_LOG_FORMAT: Native
      MODSEC_AUDIT_LOG_TYPE: Serial
      MODSEC_AUDIT_LOG: "/var/log/modsec_audit.log"
      MODSEC_TMP_DIR: "/tmp"
      MODSEC_RESP_BODY_ACCESS: "On"
      MODSEC_RESP_BODY_MIMETYPE: "text/plain text/html text/xml application/json"
      COMBINED_FILE_SIZES: "65535"
    volumes:
    ports:
      - "80:80"
That’s all that needs to be done. Simply starting the container described above will instantly provide the protection of the latest stable CRS release in front of a given backend server or service. There are lots of additional variables that can be used to configure the container image and its behavior, so be sure to read the full documentation.
Verifying that the CRS is active
Always verify that CRS is installed correctly by sending a ‘malicious’ request to your site or application, for instance:
curl 'https://www.example.com/?foo=/etc/passwd&bar=/bin/sh'
Depending on your configured thresholds, this should be detected as a malicious request. If you use blocking mode, you should receive an Error 403. The request should also be logged to the audit log, which is usually in /var/log/modsec_audit.log.
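When this check is repeated after every configuration change, it can help to look only at the HTTP status code. The sketch below assumes that blocked requests are answered with 403, as described above, and uses a hypothetical helper named classify_status; a deployment with a custom blocking action would need a different comparison:

```shell
# Hypothetical helper: interpret the HTTP status returned for the test request.
# Assumes the WAF answers blocked requests with 403.
classify_status() {
  case "$1" in
    403) echo "blocked" ;;
    *)   echo "not blocked (HTTP $1)" ;;
  esac
}

# Usage (replace the URL with your own site):
# status=$(curl -s -o /dev/null -w '%{http_code}' 'https://www.example.com/?foo=/etc/passwd&bar=/bin/sh')
# classify_status "$status"
```

In detection-only mode the request is not blocked, so the audit log remains the place to confirm that the rules matched.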
Upgrading
Upgrading from CRS 3.x to CRS 4
The most impactful change is the removal of application exclusion packages in favor of a plugin system. If you had activated the exclusion packages in CRS 3, you should download the plugins for them and place them in the plugins subdirectory. We maintain the list of plugins in our Plugin Registry. You can find detailed information on working with plugins in our plugins documentation.
In terms of changes to the detection rules, the number of changes is smaller than in the changeover from CRS 2 to CRS 3. Most rules have only evolved slightly, so it is recommended that you keep any existing custom exclusions that you have made under CRS 3.
We recommend starting over by copying our crs-setup.conf.example to crs-setup.conf with a copy of your old file at hand, and redoing the customizations that you had under CRS 3.
Please note that we added a large number of new detections, and any new detection brings a certain risk of false alarms. Therefore, we recommend testing first before going live.
Upgrading from CRS 2.x to CRS 3
In general, you can update by unzipping our new release over your older one, and updating the crs-setup.conf file with any new settings. However, CRS 3.0 is a major rewrite, incompatible with CRS 2.x. Key setup variables have changed their names, and new features have been introduced. Your former modsecurity_crs_10_setup.conf file is thus no longer usable. We recommend starting with a fresh crs-setup.conf file.
Most rule IDs have been changed to reorganize them into logical sections. This means that if you have written custom configuration with exclusion rules (e.g. SecRuleRemoveById, SecRuleRemoveTargetById, ctl:ruleRemoveById or ctl:ruleRemoveTargetById) you must renumber the rule IDs in that configuration. You can do this using the supplied utility util/id_renumbering/update.py or find the changes in util/id_renumbering/IdNumbering.csv.
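Before renumbering, it helps to know where the old exclusion directives live. The following sketch searches a configuration directory for the four directives named above; find_exclusions is a hypothetical helper and the directory in the usage line is only an assumption about where custom configuration is kept:

```shell
# Hypothetical helper: list exclusion directives that reference rule IDs,
# with file name and line number for each match.
find_exclusions() {
  config_dir="$1"
  grep -rnE 'SecRuleRemoveById|SecRuleRemoveTargetById|ctl:ruleRemoveById|ctl:ruleRemoveTargetById' "$config_dir"
}

# Usage (directory is an assumption; adjust to your layout):
# find_exclusions /etc/httpd/conf.d
```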
However, a key feature of the CRS 3 is the reduction of false positives in the default installation, and many of your old exclusion rules may no longer be necessary. Therefore, it is a good option to start fresh without your old exclusion rules.
If you are experienced in writing exclusion rules for CRS 2.x, it may be worthwhile to try running CRS 3 in Paranoia Level 2 (PL2). This is a stricter mode, which blocks additional attack patterns but brings a higher number of false positives; in many situations the false positives will be comparable with CRS 2.x. This paranoia level will, however, give you a higher level of protection than CRS 2.x or a default CRS 3 install, so it can be worth the investment.
Extended Install
All the information needed to properly install CRS is presented on this page. The installation concepts are expanded upon and presented in more detail than the quick start guide.
To contact the CRS project with questions or problems, reach out via the project’s Google group or Slack channel (for Slack channel access, use this link to get an invite).
Prerequisites
Installing the CRS isn’t very difficult but does have one major requirement: a compatible engine. The reference engine used throughout this page is ModSecurity.
Note
To successfully run CRS 3.x using ModSecurity it is recommended to use the latest version available. For Nginx use the 3.x branch of ModSecurity, and for Apache use the latest 2.x branch.
Installing a Compatible WAF Engine
Two different methods to get an engine up and running are presented here:
- using the chosen engine as provided and packaged by the OS distribution
- compiling the chosen engine from source
A ModSecurity installation is presented in the examples below, however the install documentation for the Coraza engine can be found here.
Option 1: Installing Pre-Packaged ModSecurity
ModSecurity is frequently pre-packaged and is available from several major Linux distributions.
- Debian: Friends of the CRS project DigitalWave package and, most importantly, keep ModSecurity updated for Debian and derivatives.
- Fedora: Execute dnf install mod_security for Apache + ModSecurity v2.
- RHEL compatible: Install EPEL and then execute yum install mod_security.
For Windows, get the latest MSI package from https://github.com/owasp-modsecurity/ModSecurity/releases.
Warning
Distributions might not update their ModSecurity releases frequently.
As a result, it is quite likely that a distribution’s version of ModSecurity may be missing important features or may even contain security vulnerabilities. Additionally, depending on the package and package manager used, the ModSecurity configuration will be laid out slightly differently.
As the different engines and distributions have different layouts for their configuration, to simplify the documentation presented here the prefix <web server config>/ will be used from this point on.
Examples of <web server config>/ include:
- /etc/apache2 in Debian and derivatives
- /etc/httpd in RHEL and derivatives
- /usr/local/apache2 if Apache was compiled from source using the default prefix
- C:\Program Files\ModSecurity IIS\ (or Program Files(x86), depending on configuration) on Windows
- /etc/nginx
Option 2: Compiling ModSecurity From Source
Compiling ModSecurity is easy, but slightly outside the scope of this document. For information on how to compile ModSecurity, refer to:
Unsupported Configurations
Note that the following configurations are not supported. They do not work as expected. The CRS project recommendation is to avoid these setups:
- Nginx with ModSecurity v2
- Apache with ModSecurity v3
Testing the Compiled Module
Once ModSecurity has been compiled, there is a simple test to see if the installation is working as expected. After compiling from source, use the appropriate directive to load the newly compiled module into the web server. For example:
- Apache: LoadModule security2_module modules/mod_security2.so
- Nginx: load_module modules/ngx_http_modsecurity_module.so;
Now restart the web server. ModSecurity should output that it’s being used.
Nginx should show something like:
2022/04/21 23:45:52 [notice] 1#1: ModSecurity-nginx v1.0.2 (rules loaded inline/local/remote: 0/6/0)
Apache should show something like:
[Thu Apr 21 23:55:35.142945 2022] [:notice] [pid 2528:tid 140410548673600] ModSecurity for Apache/2.9.3 (http://www.modsecurity.org/) configured.
[Thu Apr 21 23:55:35.142980 2022] [:notice] [pid 2528:tid 140410548673600] ModSecurity: APR compiled version="1.6.5"; loaded version="1.6.5"
[Thu Apr 21 23:55:35.142985 2022] [:notice] [pid 2528:tid 140410548673600] ModSecurity: PCRE compiled version="8.39 "; loaded version="8.39 2016-06-14"
[Thu Apr 21 23:55:35.142988 2022] [:notice] [pid 2528:tid 140410548673600] ModSecurity: LUA compiled version="Lua 5.1"
[Thu Apr 21 23:55:35.142991 2022] [:notice] [pid 2528:tid 140410548673600] ModSecurity: YAJL compiled version="2.1.0"
[Thu Apr 21 23:55:35.142994 2022] [:notice] [pid 2528:tid 140410548673600] ModSecurity: LIBXML compiled version="2.9.4"
[Thu Apr 21 23:55:35.142997 2022] [:notice] [pid 2528:tid 140410548673600] ModSecurity: Status engine is currently disabled, enable it by set SecStatusEngine to On.
[Thu Apr 21 23:55:35.187082 2022] [mpm_event:notice] [pid 2530:tid 140410548673600] AH00489: Apache/2.4.41 (Ubuntu) configured -- resuming normal operations
[Thu Apr 21 23:55:35.187125 2022] [core:notice] [pid 2530:tid 140410548673600] AH00094: Command line: '/usr/sbin/apache2'
Microsoft IIS with ModSecurity 2.x
The initial configuration file is modsecurity_iis.conf. This file will be parsed by ModSecurity for both ModSecurity directives and 'Include' directives.
Additionally, in the Event Viewer, under Windows Logs\Application, it should be possible to see a new log entry showing ModSecurity being successfully loaded.
At this stage, the ModSecurity on IIS setup is working and new directives can be placed in the configuration file as needed.
Downloading OWASP CRS
With a compatible WAF engine installed and working, the next step is typically to download and install the OWASP CRS. The CRS project strongly recommends using a supported version.
Official CRS releases can be found at the following URL: https://github.com/coreruleset/coreruleset/releases.
For production environments, it is recommended to use the latest release, which is v4.0.0. For testing the bleeding edge CRS version, nightly releases are also provided.
Verifying Releases
Note
Releases are signed using the CRS project’s GPG key (fingerprint: 3600 6F0E 0BA1 6783 2158 8211 38EE ACA1 AB8A 6E72). Releases can be verified using GPG/PGP compatible tooling.
To retrieve the CRS project’s public key from public key servers using gpg, execute: gpg --keyserver pgp.mit.edu --recv 0x38EEACA1AB8A6E72 (this ID should be equal to the last sixteen hex characters in the fingerprint).
It is also possible to use gpg --fetch-key https://coreruleset.org/security.asc to retrieve the key directly.
The following steps assume that a *nix operating system is being used. Installation is similar on Windows but likely involves using a zip file from the CRS releases page.
To download the release file and the corresponding signature:
wget https://github.com/coreruleset/coreruleset/archive/refs/tags/v4.0.0.tar.gz
wget https://github.com/coreruleset/coreruleset/releases/download/v4.0.0/coreruleset-4.0.0.tar.gz.asc
To verify the integrity of the release:
gpg --verify coreruleset-4.0.0.tar.gz.asc v4.0.0.tar.gz
gpg: Signature made Wed Jun 30 10:05:48 2021 -03
gpg: using RSA key 36006F0E0BA167832158821138EEACA1AB8A6E72
gpg: Good signature from "OWASP CRS <security@coreruleset.org>" [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: 3600 6F0E 0BA1 6783 2158 8211 38EE ACA1 AB8A 6E72
If the signature was good then the verification succeeds. If a warning is displayed, like the above, it means the CRS project’s public key is known but is not trusted.
To trust the CRS project’s public key:
gpg --edit-key 36006F0E0BA167832158821138EEACA1AB8A6E72
gpg> trust
Your decision: 5 (ultimate trust)
Are you sure: Yes
gpg> quit
The result when verifying a release will then look like so:
gpg --verify coreruleset-4.0.0.tar.gz.asc v4.0.0.tar.gz
gpg: Signature made Wed Jun 30 15:05:48 2021 CEST
gpg: using RSA key 36006F0E0BA167832158821138EEACA1AB8A6E72
gpg: Good signature from "OWASP CRS <security@coreruleset.org>" [ultimate]
With the CRS release downloaded and verified, the rest of the set up can continue.
Setting Up OWASP CRS
OWASP CRS contains a setup file that should be reviewed prior to completing set up. The setup file is the only configuration file within the root ‘coreruleset-4.0.0’ folder and is named crs-setup.conf.example. Examining this configuration file and reading what the different options are is highly recommended.
At a minimum, keep in mind the following:
- CRS does not configure features such as the rule engine, audit engine, logging, etc. This task is part of the initial engine setup and is not a job for the rule set. For ModSecurity, if not already done, see the recommended configuration.
- Decide what ModSecurity should do when it detects malicious activity, e.g., drop the packet, return a 403 Forbidden status code, issue a redirect to a custom page, etc.
- Make sure to configure the anomaly scoring thresholds. For more information see Anomaly.
- By default, the CRS rules will consider many issues with different databases and languages. If running in a specific environment, e.g., without any SQL database services present, it is probably a good idea to limit this behavior for performance reasons.
- Make sure to add any HTTP methods, static resources, content types, or file extensions that are needed, beyond the default ones listed.
Once reviewed and configured, the CRS configuration file should be renamed by removing the .example suffix, so that the file name ends in .conf:
mv crs-setup.conf.example crs-setup.conf
In addition to crs-setup.conf.example, there are two other “.example” files within the CRS repository. These are:
rules/REQUEST-900-EXCLUSION-RULES-BEFORE-CRS.conf.example
rules/RESPONSE-999-EXCLUSION-RULES-AFTER-CRS.conf.example
These files are designed to provide the rule maintainer with the ability to modify rules (see false positives and tuning) without breaking forward compatibility with rule set updates. These two files should be renamed by removing the .example suffix. This will mean that installing updates will not overwrite custom rule exclusions. To rename the files in Linux, use a command similar to the following:
mv rules/REQUEST-900-EXCLUSION-RULES-BEFORE-CRS.conf.example rules/REQUEST-900-EXCLUSION-RULES-BEFORE-CRS.conf
mv rules/RESPONSE-999-EXCLUSION-RULES-AFTER-CRS.conf.example rules/RESPONSE-999-EXCLUSION-RULES-AFTER-CRS.conf
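The two mv commands above can also be written as a loop that refuses to overwrite a file that already exists, which is useful when the step is repeated after an update. This is a sketch only; activate_examples is a hypothetical helper, not a CRS utility:

```shell
# Hypothetical helper: strip the .example suffix from every *.conf.example
# file in a rules directory. Existing .conf files are left alone so that
# custom exclusions are never overwritten.
activate_examples() {
  rules_dir="$1"
  for example in "$rules_dir"/*.conf.example; do
    [ -e "$example" ] || continue          # no match: the glob stays literal
    target="${example%.example}"
    if [ -e "$target" ]; then
      echo "skipping $target (already exists)"
    else
      mv "$example" "$target"
    fi
  done
}

# Usage, from the CRS directory:
# activate_examples rules
```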
Proceeding with the Installation
The engine should support the Include directive out of the box. This directive tells the engine to parse additional files for directives. The question is where to put the CRS rules folder in order for it to be included.
Looking at the CRS files, there are quite a few “.conf” files. While the names attempt to do a good job at describing what each file does, additional information is available in the rules section.
Includes for Apache
It is recommended to create a folder specifically to contain the CRS rules. In the example presented here, a folder named modsecurity.d has been created and placed within the root <web server config>/ directory. When using Apache, wildcard notation can be used to vastly simplify the Include rules. Simply copying the cloned directory into the modsecurity.d folder and specifying the appropriate Include directives will install OWASP CRS. In the example below, the modsecurity.conf file has also been included, which includes recommended configurations for ModSecurity.
<IfModule security2_module>
Include modsecurity.d/modsecurity.conf
Include /etc/crs4/crs-setup.conf
Include /etc/crs4/plugins/*-config.conf
Include /etc/crs4/plugins/*-before.conf
Include /etc/crs4/rules/*.conf
Include /etc/crs4/plugins/*-after.conf
</IfModule>
Includes for Nginx
Nginx will include files from the Nginx configuration directory (/etc/nginx or /usr/local/nginx/conf/, depending on the environment). Because only one ModSecurityConfig directive can be specified within nginx.conf, it is recommended to name that file modsec_includes.conf and include additional files from there. In the example below, the cloned coreruleset folder was copied into the Nginx configuration directory. From there, the appropriate include directives are specified which will include OWASP CRS when the server is restarted. In the example below, the modsecurity.conf file has also been included, which includes recommended configurations for ModSecurity.
Include modsecurity.d/modsecurity.conf
Include /etc/crs4/crs-setup.conf
Include /etc/crs4/plugins/*-config.conf
Include /etc/crs4/plugins/*-before.conf
Include /etc/crs4/rules/*.conf
Include /etc/crs4/plugins/*-after.conf
Note
You will also need to include the plugins you want along with your CRS installation.
Using Containers
The CRS project maintains a set of ‘CRS with ModSecurity’ Docker images.
A full operational guide on how to use and deploy these images will be written in the future. In the meantime, refer to the GitHub README page for more information on how to use these official container images.
ModSecurity CRS Docker Image
https://github.com/coreruleset/modsecurity-crs-docker
A Docker image supporting the latest stable CRS release on:
- the latest stable ModSecurity v2 on Apache
- the latest stable ModSecurity v3 on Nginx
Engine and Integration Options
CRS runs on WAF engines that are compatible with a subset of ModSecurity’s SecLang configuration language. There are several options outside of ModSecurity itself, namely cloud offerings and content delivery network (CDN) services. There is also an open-source alternative to ModSecurity in the form of the new Coraza WAF engine.
Compatible Free and Open-Source WAF Engines
ModSecurity v2
ModSecurity v2, originally a security module for the Apache web server, is the reference implementation for CRS.
ModSecurity 2.9.x passes 100% of the CRS unit tests on the Apache platform.
When running ModSecurity, this is the option that is practically guaranteed to work, and it is the one covered by most of the available documentation and know-how.
ModSecurity is released under the Apache License 2.0, and the project now lives under the OWASP Foundation umbrella.
There is a ModSecurity v2 / Apache Docker container which is maintained by the CRS project.
ModSecurity v3
ModSecurity v3, also known as libModSecurity, is a re-implementation of ModSecurity v2 with an architecture that is less dependent on the web server. The connection between the standalone ModSecurity and the web server is made using a lean connector module.
As of spring 2021, only the Nginx connector module is really usable in production.
ModSecurity v3 fails with 2-4% of the CRS unit tests due to bugs and implementation gaps. Nginx + ModSecurity v3 also suffers from performance problems when compared to the Apache + ModSecurity v2 platform. This may be surprising for people familiar with the high performance of Nginx.
ModSecurity v3 is used in production together with Nginx, but the CRS project recommends using the ModSecurity v2 release line with Apache.
ModSecurity is released under the Apache License 2.0. It is primarily developed by Spiderlabs, an entity within the company Trustwave. In summer 2021, Trustwave announced their plans to end development of ModSecurity in 2024. Attempts to convince Trustwave to hand over the project in the meantime, in the interests of guaranteeing the project’s continuation, have failed. Trustwave have stated that they will not relinquish control of the project before 2024.
To learn more about the situation around ModSecurity, read this CRS blog post discussing the matter.
There is a ModSecurity v3 / Nginx Docker container which is maintained by the CRS project.
Coraza
OWASP Coraza WAF is meant to provide an open-source alternative to the two ModSecurity release lines.
Coraza passes 100% of the CRS v4 test suite and is thus fully compatible with CRS.
Coraza has been developed in Go and currently runs on the Caddy and Traefik platforms. Additional ports are being developed and the developers also seek to bring Coraza to Nginx and, eventually, Apache. In parallel to this expansion, Coraza will be developed further with its own feature set.
To learn more about CRS and Coraza, read this CRS blog post which introduces Coraza.
Commercial WAF Appliances
Dozens of commercial WAFs, both virtual and hardware-based, offer CRS as part of their service. Many of them use ModSecurity underneath, or some alternative implementation (although this is rare among WAF appliances). Most of these commercial WAFs either don’t offer the full feature set of CRS or they don’t make it easily accessible. With some of these companies, there is often also a lack of CRS experience and knowledge.
The CRS project recommends evaluating these commercial appliance-based offerings in a holistic way before buying a license.
In light of the many, many appliance offerings on the market and the CRS project’s relatively limited exposure, only a few offerings are listed here.
HAProxy Technologies
HAProxy Technologies embeds ModSecurity v3 in three of its products via the Libmodsecurity module. ModSecurity is included with: HAProxy Enterprise, HAProxy ALOHA, and HAProxy Enterprise Kubernetes Ingress Controller.
To learn more, visit the HAProxy WAF solution page on haproxy.com.
Kemp/Progressive LoadMaster
The Kemp LoadMaster is a popular load balancer that integrates ModSecurity v2 and CRS in a typical way. It goes further than the competition with the support of most CRS features.
To learn more, read this blog post about CRS on LoadMaster.
Kemp/Progressive is a sponsor of CRS.
Loadbalancer.org
The load balancer appliance from Loadbalancer.org features WAF functionality based on Apache + ModSecurity v2 + CRS, sandwiched by HAProxy for load balancing. It’s available as a hardware, virtual, and cloud appliance.
To learn more, read this blog post about CRS at Loadbalancer.org.
Avi Networks “Vantage”
Avi Vantage from Avi Networks is a modern virtual load balancer and proxy with strong WAF capabilities. It’s based on a fork of ModSecurity v3.
To learn more, read Avi’s WAF documentation.
Existing CRS Integrations: Cloud and CDN Offerings
Most big cloud providers and CDNs provide a CRS offering. While originally these were mostly based on ModSecurity, over time they have all moved to alternative (usually proprietary) implementations of ModSecurity’s SecLang configuration language, or they transpose the CRS rules written in SecLang into their own domain specific language (DSL).
The CRS project has some insight into some of these platforms and is in touch with most of these providers. The exact specifics are not really known. What is known is that almost all of these integrators have compromised and provide only a subset of CRS rules and a subset of features, in the interests of ease of integration and operation.
Info
The CRS Status page project will be testing cloud and CDN offerings. As part of this effort, the CRS project will be documenting the results and even publishing code on how to quickly get started using CRS in CDN/cloud providers. This status page project is in development as of spring 2022.
A selection of these platforms is listed below, along with links to get more info.
AWS WAF
Note
AWS provides a rule set called the “Core rule set (CRS) managed rule group” which “…provides protection against… commonly occurring vulnerabilities described in OWASP publications such as OWASP Top 10.”
The CRS project does not believe that the AWS WAF “core rule set” is based on or related to OWASP CRS.
Cloudflare WAF
Cloudflare WAF supports CRS as one of its WAF rule sets. Documentation on how to use it can be found in Cloudflare’s documentation.
Edgecast
Edgecast offers CRS as a managed rule set as part of their WAF service that runs on a ModSecurity re-implementation called WAFLZ.
To learn more about Edgecast, read their WAF documentation.
Fastly
Fastly has offered CRS as part of their Fastly WAF for several years, but they have started to migrate their existing customers to the recently acquired Signal Sciences WAF. Interestingly, Fastly is transposing CRS rules into their own Varnish-based WAF engine. Unfortunately, documentation on their legacy WAF offering has been removed.
Google Cloud Armor
Google integrates CRS into its Cloud Armor WAF offering. Google runs the CRS rules on their own WAF engine. As of fall 2022, Google offers version 3.3.2 of CRS.
To learn more about CRS on Google’s Cloud Armor, read this document from Google.
Google Cloud Armor is a sponsor of CRS.
Microsoft Azure WAF
Azure Application Gateways can be configured to use the WAFv2 and managed rules with different versions of CRS. Azure provides the 3.2, 3.1, 3.0, and 2.2.9 CRS versions. We recommend using version 3.2 (see our security policy for details on supported CRS versions).
Oracle WAF
The Oracle WAF is a cloud-based offering that includes CRS. To learn more, read Oracle’s WAF documentation.
Alternative Use Cases
Outside of the narrower implementation of a WAF, CRS can also be found in different security-related setups.
Sqreen/Datadog
Sqreen uses a subset of CRS as an innovative part of their RASP offering. A few pieces of information about this offering can be found in this Sqreen blog post.
Subsections of How CRS Works
Anomaly Scoring
CRS 3 is designed as an anomaly scoring rule set. This page explains what anomaly scoring is and how to use it.
Overview of Anomaly Scoring
Anomaly scoring, also known as “collaborative detection”, is a scoring mechanism used in CRS. It assigns a numeric score to HTTP transactions (requests and responses), representing how ‘anomalous’ they appear to be. Anomaly scores can then be used to make blocking decisions. The default CRS blocking policy, for example, is to block any transaction that meets or exceeds a defined anomaly score threshold.
How Anomaly Scoring Mode Works
Anomaly scoring mode combines the concepts of collaborative detection and delayed blocking. The key idea to understand is that the inspection/detection rule logic is decoupled from the blocking functionality.
Individual rules designed to detect specific types of attacks and malicious behavior are executed. If a rule matches, no immediate disruptive action is taken (e.g. the transaction is not blocked). Instead, the matched rule contributes to a transactional anomaly score, which acts as a running total. The rules just handle detection, adding to the anomaly score if they match. In addition, an individual matched rule will typically log a record of the match for later reference, including the ID of the matched rule, the data that caused the match, and the URI that was being requested.
Once all of the rules that inspect request data have been executed, blocking evaluation takes place. If the anomaly score is greater than or equal to the inbound anomaly score threshold then the transaction is denied. Transactions that are not denied continue on their journey.
Continuing on, once all of the rules that inspect response data have been executed, a second round of blocking evaluation takes place. If the outbound anomaly score is greater than or equal to the outbound anomaly score threshold, then the response is not returned to the user. (Note that in this case, the request is fully handled by the backend or application; only the response is stopped.)
Info
Having separate inbound and outbound anomaly scores and thresholds allows for request data and response data to be inspected and scored independently.
Summary of Anomaly Scoring Mode
To summarize, anomaly scoring mode in the CRS works like so:
- Execute all request rules
- Make a blocking decision using the inbound anomaly score threshold
- Execute all response rules
- Make a blocking decision using the outbound anomaly score threshold
The Anomaly Scoring Mechanism In Action
As described, individual rules are only responsible for detection and inspection: they do not block or deny transactions. If a rule matches then it increments the anomaly score. This is done using ModSecurity’s setvar action.
Below is an example of a detection rule which matches when a request has a Content-Length header field containing something other than digits. Notice the final line of the rule: it makes use of the setvar action, which will increment the anomaly score if the rule matches:
SecRule REQUEST_HEADERS:Content-Length "!@rx ^\d+$" \
"id:920160,\
phase:1,\
block,\
t:none,\
msg:'Content-Length HTTP header is not numeric',\
logdata:'%{MATCHED_VAR}',\
tag:'application-multi',\
tag:'language-multi',\
tag:'platform-multi',\
tag:'attack-protocol',\
tag:'paranoia-level/1',\
tag:'OWASP_CRS',\
tag:'capec/1000/210/272',\
ver:'OWASP_CRS/3.4.0-dev',\
severity:'CRITICAL',\
setvar:'tx.anomaly_score_pl1=+%{tx.critical_anomaly_score}'"
Info
Notice that the anomaly score variable name has the suffix pl1. Internally, CRS keeps track of anomaly scores on a per paranoia level basis. The individual paranoia level anomaly scores are added together before each round of blocking evaluation takes place, allowing the total combined inbound or outbound score to be compared to the relevant anomaly score threshold.
Tracking the anomaly score per paranoia level allows for clever scoring mechanisms to be employed, such as the executing paranoia level feature.
The rules files REQUEST-949-BLOCKING-EVALUATION.conf and RESPONSE-959-BLOCKING-EVALUATION.conf are responsible for executing the inbound (request) and outbound (response) rounds of blocking evaluation, respectively. The rules in these files calculate the total inbound or outbound transactional anomaly score and then make a blocking decision, by comparing the result to the defined threshold and taking blocking action if required.
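As a rough illustration of what a round of inbound blocking evaluation involves, the sketch below adds a per paranoia level score to a running total and then compares that total to the threshold. It is a simplified sketch and not the actual content of REQUEST-949-BLOCKING-EVALUATION.conf: the rule IDs 10010 and 10011 are hypothetical, only the PL 1 score is considered, and the variable names are assumed to follow the naming used in the example detection rule above.

```apache
# Hypothetical sketch only: not meant to be deployed alongside the real
# CRS blocking evaluation rules. Assumes the score variables are initialised.

# Step 1: add the PL 1 score to the running total for the transaction.
SecAction \
    "id:10010,\
    phase:2,\
    pass,\
    t:none,\
    nolog,\
    setvar:'tx.anomaly_score=+%{tx.anomaly_score_pl1}'"

# Step 2: deny the request if the total meets or exceeds the inbound threshold.
SecRule TX:ANOMALY_SCORE "@ge %{tx.inbound_anomaly_score_threshold}" \
    "id:10011,\
    phase:2,\
    deny,\
    t:none,\
    log,\
    msg:'Inbound anomaly score threshold reached (total score: %{tx.anomaly_score})'"
```

Because the detection rules only add to the score, the blocking policy can be changed in one place (the threshold) without touching any detection rule.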
Configuring Anomaly Scoring Mode
The following settings can be configured when using anomaly scoring mode:
- Anomaly score thresholds
- Severity levels
- Early blocking
If using a native CRS installation on a web application firewall, these settings are defined in the file crs-setup.conf. If running CRS where it has been integrated into a commercial product or CDN then support varies. Some vendors expose these settings in the GUI while other vendors require custom rules to be written which set the necessary variables. Unfortunately, there are also vendors that don’t allow these settings to be configured at all.
Anomaly Score Thresholds
An anomaly score threshold is the cumulative anomaly score at which an inbound request or an outbound response will be blocked.
Most detected inbound threats carry an anomaly score of 5 (by default), while smaller violations, e.g. protocol and standards violations, carry lower scores. An anomaly score threshold of 7, for example, would require multiple rule matches in order to trigger a block (e.g. one “critical” rule scoring 5 plus a lesser-scoring rule, in order to reach the threshold of 7). An anomaly score threshold of 10 would require at least two “critical” rules to match, or a combination of many lesser-scoring rules. Increasing the anomaly score thresholds makes the CRS less sensitive and hence less likely to block transactions.
Rule coverage should be taken into account when setting anomaly score thresholds. Different CRS rule categories feature different numbers of rules. SQL injection, for example, is covered by more than 50 rules. As a result, a real world SQLi attack can easily gain an anomaly score of 15, 20, or even more. On the other hand, a rare protocol attack might only be covered by a single, specific rule. If such an attack only causes the one specific rule to match then it will only gain an anomaly score of 5. If the inbound anomaly score threshold is set to anything higher than 5 then attacks like the one described will not be stopped. As such, a CRS installation should aim for an inbound anomaly score threshold of 5.
Warning
Increasing the anomaly score thresholds may allow some attacks to bypass the CRS rules.
Info
An outbound anomaly score threshold of 4 (the default) will block a transaction if any single response rule matches.
Tip
A common practice when working with a new CRS deployment is to start in blocking mode from the very beginning with very high anomaly score thresholds (even as high as 10000). The thresholds can be gradually lowered over time as an iterative process.
This tuning method was developed and advocated by Christian Folini, who documented it in detail, along with examples, in a popular tutorial titled Handling False Positives with OWASP CRS.
CRS uses two anomaly score thresholds, which can be defined using the variables listed below:
| Threshold | Variable |
|---|---|
| Inbound anomaly score threshold | tx.inbound_anomaly_score_threshold |
| Outbound anomaly score threshold | tx.outbound_anomaly_score_threshold |
A simple way to set these thresholds is to uncomment and use rule 900110:
SecAction \
"id:900110,\
phase:1,\
nolog,\
pass,\
t:none,\
setvar:tx.inbound_anomaly_score_threshold=5,\
setvar:tx.outbound_anomaly_score_threshold=4"
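For the iterative tuning approach described in the tip above, the same rule might initially be set with very high thresholds, which are then lowered step by step as false positives are tuned away. The values below are illustrative assumptions, not recommendations for any particular site:

```apache
# Starting point for a new deployment: blocking mode with thresholds so high
# that legitimate traffic is very unlikely to be blocked.
SecAction \
    "id:900110,\
    phase:1,\
    nolog,\
    pass,\
    t:none,\
    setvar:tx.inbound_anomaly_score_threshold=10000,\
    setvar:tx.outbound_anomaly_score_threshold=10000"
```

Later iterations would lower both values in stages until the defaults of 5 and 4 are reached.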
Severity Levels
Each CRS rule has an associated severity level. Different severity levels have different anomaly scores associated with them. This means that different rules can increment the anomaly score by different amounts if the rules match.
The four severity levels and their default anomaly scores are:
| Severity Level | Default Anomaly Score |
|---|---|
| CRITICAL | 5 |
| ERROR | 4 |
| WARNING | 3 |
| NOTICE | 2 |
For example, by default, a single matching CRITICAL rule would increase the anomaly score by 5, while a single matching WARNING rule would increase the anomaly score by 3.
The default anomaly scores are rarely changed. It is possible, however, to set custom anomaly scores for severity levels. To do so, uncomment rule 900100 and set the anomaly scores as desired:
SecAction \
"id:900100,\
phase:1,\
nolog,\
pass,\
t:none,\
setvar:tx.critical_anomaly_score=5,\
setvar:tx.error_anomaly_score=4,\
setvar:tx.warning_anomaly_score=3,\
setvar:tx.notice_anomaly_score=2"
Info
The CRS makes use of a ModSecurity feature called macro expansion to propagate the value of the severity level anomaly scores throughout the entire rule set.
Early Blocking
Early blocking is an optional setting which can be enabled to allow blocking decisions to be made earlier than usual.
As summarized previously, anomaly scoring mode works like so:
- Execute all request rules
- Make a blocking decision using the inbound anomaly score threshold
- Execute all response rules
- Make a blocking decision using the outbound anomaly score threshold
The early blocking option takes advantage of the fact that the request and response rules are actually split across different phases. A more detailed overview of anomaly scoring mode looks like so:
- Execute all phase 1 (request header) rules
- Execute all phase 2 (request body) rules
- Make a blocking decision using the inbound anomaly score threshold
- Execute all phase 3 (response header) rules
- Execute all phase 4 (response body) rules
- Make a blocking decision using the outbound anomaly score threshold
More data from a transaction becomes available for inspection in each subsequent processing phase. In phase 1 the request headers are available for inspection. Detection rules that are only concerned with request headers are executed here. In phase 2 the request body also becomes available for inspection. Rules that need to inspect the request body, perhaps in addition to request headers, are executed here.
If a transaction’s anomaly score already meets or exceeds the inbound anomaly score threshold by the end of phase 1 (due to causing phase 1 rules to match) then, in theory, the phase 2 rules don’t need to be executed. This saves the time and resources it would take to process the detection rules in phase 2 and also protects the server from being attacked when handling the body of the request. The majority of CRS rules are executed in phase 2, which is also where the request body inspection rules are located. When dealing with large request bodies, it may be worthwhile to avoid executing the phase 2 rules in this way. The same logic applies to blocking responses that have already met the outbound anomaly score threshold in phase 3, before reaching phase 4. This saves the time and resources required to execute the phase 4 rules, which inspect the response body.
Early blocking makes this possible by inserting two additional rounds of blocking evaluation: one after the phase 1 detection rules have finished executing, and another after the phase 3 detection rules:
- Execute all phase 1 (request header) rules
- Make an early blocking decision using the inbound anomaly score threshold
- Execute all phase 2 (request body) rules
- Make a blocking decision using the inbound anomaly score threshold
- Execute all phase 3 (response header) rules
- Make an early blocking decision using the outbound anomaly score threshold
- Execute all phase 4 (response body) rules
- Make a blocking decision using the outbound anomaly score threshold
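The extra round of evaluation at the end of phase 1 can be pictured as a rule that only acts when early blocking is switched on. The following is a minimal sketch under stated assumptions, not the rule that CRS actually ships: the rule ID 10020 is hypothetical, and it assumes that tx.blocking_early is the switch, that tx.anomaly_score already holds the total score accumulated during phase 1, and that the rule is placed after the phase 1 detection rules.

```apache
# Hypothetical sketch of an early (phase 1) inbound blocking evaluation.
# The chain starter checks whether early blocking is enabled; the chained rule
# compares the score gathered so far to the inbound threshold.
SecRule TX:BLOCKING_EARLY "@eq 1" \
    "id:10020,\
    phase:1,\
    deny,\
    t:none,\
    log,\
    msg:'Inbound anomaly score threshold reached in phase 1 (total score: %{tx.anomaly_score})',\
    chain"
    SecRule TX:ANOMALY_SCORE "@ge %{tx.inbound_anomaly_score_threshold}" \
        "t:none"
```

If the switch is off, the chain never matches and the transaction proceeds to phase 2 as usual.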
Info
More information about processing phases can be found in the processing phases section of the ModSecurity Reference Manual.
Warning
The early blocking option has a major drawback to be aware of: it can cause potential alerts to be hidden.
If a transaction is blocked early then its body is not inspected. For example, if a transaction is blocked early at the end of phase 1 (the request headers phase) then the body of the request is never inspected. If the early blocking option is not enabled, it’s possible that such a transaction would proceed to cause phase 2 rules to match. Early blocking hides these potential alerts. The same applies to responses that trigger an early block: it’s possible that some phase 4 rules would match if early blocking were not enabled.
Using the early blocking option results in having less information to work with, due to fewer rules being executed. This may mean that the full picture is not present in log files when looking back at attacks and malicious traffic. It can also be a problem when dealing with false positives: tuning away a false positive in phase 1 will allow the same request to proceed to the next phase the next time it’s issued (instead of being blocked at the end of phase 1). The problem is that now, with the request making it past phase 1, more, previously “hidden” false positives may appear in phase 2.
Warning
If early blocking is not enabled, there’s a chance that the web server will interfere with the handling of a request between phases 1 and 2. Take the example where the Apache web server issues a redirect to a new location. With a request that violates CRS rules in phase 1, this may mean that the request has a higher anomaly score than the defined threshold but it gets redirected away before blocking evaluation happens.
Enabling the Early Blocking Option
If using a native CRS installation on a web application firewall, the early blocking option can be enabled in the file crs-setup.conf. This is done by uncommenting rule 900120, which sets the variable tx.blocking_early to 1 in order to enable early blocking. CRS otherwise gives this variable a default value of 0, meaning that early blocking is disabled by default.
SecAction \
"id:900120,\
phase:1,\
nolog,\
pass,\
t:none,\
setvar:tx.blocking_early=1"
If running CRS where it has been integrated into a commercial product or CDN then support for the early blocking option varies. Some vendors may allow it to be enabled through the GUI, through a custom rule, or they might not allow it to be enabled at all.
Paranoia Levels
Paranoia levels are an essential concept when working with CRS. This page explains the concept behind paranoia levels and how to work with them on a practical level.
Introduction to Paranoia Levels
The paranoia level (PL) makes it possible to define how aggressive CRS is. Paranoia level 1 (PL 1) provides a set of rules that hardly ever trigger a false alarm (ideally never, but it can happen, depending on the local setup). PL 2 provides additional rules that detect more attacks (these rules operate in addition to the PL 1 rules), but there’s a chance that the additional rules will also trigger new false alarms over perfectly legitimate HTTP requests.
This continues at PL 3, where more rules are added, namely for certain specialized attacks. This leads to even more false alarms. Then at PL 4, the rules are so aggressive that they detect almost every possible attack, yet they also flag a lot of legitimate traffic as malicious.
A higher paranoia level makes it harder for an attacker to go undetected. Yet this comes at the cost of more false positives: more false alarms. That’s the downside to running a rule set that detects almost everything: your business / service / web application is also disrupted.
When false positives occur they need to be tuned away. In ModSecurity parlance: rule exclusions need to be written. A rule exclusion is a rule that disables another rule, either completely or partially (only for certain parameters or for certain URIs). This means the rule set remains intact yet the CRS installation is no longer affected by the false positives.
Note
Depending on the complexity of the service (web application) in question and on the paranoia level, the process of writing rule exclusions can be a substantial amount of work.
This page won’t explore the problem of handling false positives further: for more information on this topic, see the appropriate chapter or refer to the tutorials at netnea.com.
Description of the Four Paranoia Levels
The CRS project views the four paranoia levels as follows:
| Paranoia Level | Description |
|---|---|
| 1 | Baseline security with a minimal need to tune away false positives. This is CRS for everybody running an HTTP server on the internet. Please report any false positives encountered with a PL 1 system via GitHub. |
| 2 | Rules that are adequate when real user data is involved. Perhaps an off-the-shelf online shop. Expect to encounter false positives and learn how to tune them away. |
| 3 | Online banking level security with lots of false positives. From a project perspective, false positives are accepted and expected here, so it’s important to learn how to write rule exclusions. |
| 4 | Rules that are so strong (or paranoid) they’re adequate to protect the “crown jewels”. To be used at one’s own risk: be prepared to face a large number of false positives. |
Choosing an Appropriate Paranoia Level
It’s important to think about a service’s security requirements. Protecting a personal website is very different from protecting the admin gateway controlling access to an enterprise’s Active Directory. The paranoia level needs to be chosen accordingly, while also considering the resources (time) required to tune away false positives at higher paranoia levels.
Running at the highest paranoia level, PL 4, may seem appealing from a security standpoint, but it could take many weeks to tune away the false positives encountered. It is crucial to have enough time to fully deal with all false positives.
Warning
Failure to properly tune an installation runs the risk of exposing users to a vast number of false positives. This can lead to a poor user experience, and might ultimately lead to a decision to completely disable CRS. As such, setting a high PL in blocking mode without adequate tuning to deal with false positives is very risky.
If working in an enterprise environment, consider developing an internal policy to map the risk levels and security needs of different assets to the minimum acceptable paranoia level to be used for them, for example:
- Risk Class 0: No personal data involved → PL 1
- Risk Class 1: Personal data involved, e.g. names and addresses → PL 2
- Risk Class 2: Sensitive data involved, e.g. financial/banking data; highest risk class → PL 3
Setting the Paranoia Level
If using a native CRS installation on a web application firewall, the paranoia level is defined by setting the variable tx.paranoia_level in the file crs-setup.conf. This is done in rule 900000, but technically the variable can be set in the Apache or Nginx configuration instead.
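As a sketch, uncommenting rule 900000 in crs-setup.conf and setting the variable might look like the following. The value 2 is only an example, and the layout assumes the same SecAction form used by the other setup rules shown in this documentation:

```apache
# Run CRS at paranoia level 2 (the default is 1).
SecAction \
    "id:900000,\
    phase:1,\
    nolog,\
    pass,\
    t:none,\
    setvar:tx.paranoia_level=2"
```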
If running CRS where it has been integrated into a commercial product or CDN then support varies. Some vendors expose the PL setting in the GUI while other vendors require a custom rule to be written that sets tx.paranoia_level. Unfortunately, there are also vendors that don’t allow the PL to be set at all. (The CRS project considers this to be an incomplete CRS integration, since paranoia levels are a defining feature of CRS.)
How Paranoia Levels Relate to Anomaly Scoring
It’s important to understand that paranoia levels and CRS anomaly scoring (the CRS anomaly threshold/limit) are two entirely different things with no direct connection. The paranoia level controls the number of rules that are enabled while the anomaly threshold defines how many rules can be triggered before a request is blocked.
At the conceptual level, these two ideas could be mixed if the goal was to create a particularly granular security concept. For example, one might say: “we define the anomaly threshold to be 10, but we compensate for this by running at paranoia level 3, which we acknowledge brings more rule alerts and higher anomaly scores.”
This is technically correct but it overlooks the fact that there are attack categories where CRS scores very low. For example, there is a plan to introduce a new rule to detect POP3 and IMAP injections: this will be a single rule, so, under normal circumstances, an IMAP injection would never score more than 5. Therefore, an installation running at an anomaly threshold of 10 could never block an IMAP injection, even if running at PL 3. In light of this, it’s generally advised to keep things simple and separate: a CRS installation should aim for an anomaly threshold of 5 and a paranoia level as deemed appropriate.
Moving to a Higher Paranoia Level
Introducing the Executing Paranoia Level
Consider an example of a successful CRS installation: it operates at paranoia level 1, a handful of rule exclusions are in place to deal with false positives, and the inbound anomaly score threshold is set to 5, which blocks would-be attackers immediately. Things are running smoothly at paranoia level 1, but imagine that there’s now a requirement to increase the level of security by raising the paranoia level to 2. Moving to PL 2 will almost certainly cause new false positives: given the strict anomaly score threshold of 5, these will likely cause legitimate users to be blocked.
There’s a simple, but risky, way to raise the paranoia level of a working and tuned CRS installation: raise the anomaly score threshold for a period of time, in order to account for the additional false positives that are anticipated. Raising the anomaly score threshold will allow through attacks that would have been blocked previously. The idea of decreasing security in order to improve it is counter-intuitive, as well as being bad practice.
There is a better solution. First, think of the paranoia level as being the “blocking paranoia level”. The rules enabled in the blocking paranoia level count towards the anomaly score threshold, which is used to determine whether or not to block a given request. Now introduce an additional paranoia level: the “executing paranoia level”. By default, the executing paranoia level is automatically set to be equal to the blocking paranoia level. If, however, the executing paranoia level is set to be higher than the blocking paranoia level then the additional rules from the higher paranoia level are executed but will never count towards the anomaly score threshold used to make the blocking decision.
Example: Blocking paranoia level of 1 and executing paranoia level of 2
The executing paranoia level allows rules from a higher paranoia level to be run, and potentially to trigger false positives, without increasing the probability of blocking legitimate users. Any new false positives can then be tuned away using rule exclusions. Once ready and with all the new rule exclusions in place, the blocking paranoia level can then be raised to match the executing paranoia level. This approach is a flexible and secure way to raise the paranoia level on a working production system without the risk of new false positives blocking users in error.
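The example of a blocking paranoia level of 1 with an executing paranoia level of 2 could be configured roughly as follows. This sketch assumes the CRS 3.x variable name tx.executing_paranoia_level and its setup rule ID 900001; check the crs-setup.conf of the installed version for the names it actually uses.

```apache
# Blocking paranoia level: rules up to PL 1 count towards the blocking decision.
SecAction \
    "id:900000,\
    phase:1,\
    nolog,\
    pass,\
    t:none,\
    setvar:tx.paranoia_level=1"

# Executing paranoia level: PL 2 rules also run and log,
# but do not count towards the blocking decision.
SecAction \
    "id:900001,\
    phase:1,\
    nolog,\
    pass,\
    t:none,\
    setvar:tx.executing_paranoia_level=2"
```

Once the PL 2 false positives have been tuned away, tx.paranoia_level can be raised to 2 to match.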
Moving to a Lower Paranoia Level
It is always possible to lower the paranoia level in order to experience fewer false positives, or none at all. The way that the rule set is constructed, lowering the paranoia level always means fewer or no false positives; raising the paranoia level is very likely to introduce more false positives.
Further Reading
For a slightly longer explanation of paranoia levels, please refer to our blog post on the subject. The blog post also discusses the pros and cons of dynamically setting the paranoia level on a per-request basis, firstly by geolocation (i.e. a lower PL for domestic traffic and a higher PL for non-domestic traffic) and secondly based on previous behavior (i.e. a user is dealt with at PL 1, but if they ever trigger a rule then they’re handled at PL 2 for all future requests).
False Positives and Tuning
When a genuine transaction causes a rule from CRS to match in error it is described as a false positive. False positives need to be tuned away by writing rule exclusions, as this page explains.
What are False Positives?
CRS provides generic attack detection capabilities. A fresh CRS deployment has no awareness of the web services that may be running behind it, or the quirks of how those services work. It is possible that genuine transactions may cause some CRS rules to match in error, if the transactions happen to match one of the generic attack behaviors or patterns that are being detected. Such a match is referred to as a false positive, or false alarm.
False positives are particularly likely to happen when operating at higher paranoia levels. While paranoia level 1 is designed to cause few, ideally zero, false positives, higher paranoia levels are increasingly likely to cause false positives. Each successive paranoia level introduces additional rules, with higher paranoia levels adding more aggressive rules. As such, the higher the paranoia level is the more likely it is that false positives will occur. That is the cost of the higher security provided by higher paranoia levels: the additional time it takes to tune away the increasing number of false positives.
Example False Positive
Imagine deploying the CRS in front of a WordPress instance. The WordPress engine features the ability to add HTML to blog posts (as well as JavaScript, if you’re an administrator). Internally, WordPress has rules controlling which HTML tags are allowed to be used. This list of allowed tags has been studied heavily by the security community and it’s considered to be a secure mechanism.
Consider the CRS inspecting a request with a URL like the following:
www.example.com/?wp_post=<h1>Welcome+To+My+Blog</h1>
At paranoia level 2, the wp_post query string parameter would trigger a match against an XSS attack rule due to the presence of HTML tags. CRS is unaware that the problem is properly mitigated on the server side and, as a result, the request causes a false positive and may be blocked. The false positive may generate an error log line like the following:
[Wed Jan 01 00:00:00.123456 2022] [:error] [pid 2357:tid 140543564093184] [client 10.0.0.1:0] [client 10.0.0.1] ModSecurity: Warning. Pattern match "<(?:a|abbr|acronym|address|applet|area|audioscope|b|base|basefront|bdo|bgsound|big|blackface|blink|blockquote|body|bq|br|button|caption|center|cite|code|col|colgroup|comment|dd|del|dfn|dir|div|dl|dt|em|embed|fieldset|fn|font|form|frame|frameset|h1|head ..." at ARGS:wp_post. [file "/etc/crs/rules/REQUEST-941-APPLICATION-ATTACK-XSS.conf"] [line "783"] [id "941320"] [msg "Possible XSS Attack Detected - HTML Tag Handler"] [data "Matched Data: <h1> found within ARGS:wp_post: <h1>welcome to my blog</h1>"] [severity "CRITICAL"] [ver "OWASP_CRS/3.3.2"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-xss"] [tag "OWASP_CRS"] [tag "capec/1000/152/242/63"] [tag "PCI/6.5.1"] [tag "paranoia-level/2"] [hostname "www.example.com"] [uri "/"] [unique_id "Yad-7q03dV56xYsnGhYJlQAAAAA"]
This example log entry provides lots of information about the rule match. Some of the key pieces of information are:
The message from ModSecurity, which explains what happened and where:
ModSecurity: Warning. Pattern match "<(?:a|abbr|acronym ..." at ARGS:wp_post.
The rule ID of the matched rule:
[id "941320"]
The additional matching data from the rule, which explains precisely what caused the rule match:
[data "Matched Data: <h1> found within ARGS:wp_post: <h1>welcome to my blog</h1>"]
Tip
CRS ships with a prebuilt rule exclusion package for WordPress, as well as other popular web applications, to help prevent false positives. See the section on rule exclusion packages for details.
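In CRS 3.x, the bundled exclusion packages are switched on with a setup variable. A minimal sketch, assuming the 3.x setup rule ID 900130 and the variable name tx.crs_exclusions_wordpress (other versions may handle application exclusions differently, so check the crs-setup.conf of the installed release):

```apache
# Enable the prebuilt WordPress rule exclusion package (CRS 3.x naming assumed).
SecAction \
    "id:900130,\
    phase:1,\
    nolog,\
    pass,\
    t:none,\
    setvar:tx.crs_exclusions_wordpress=1"
```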
Why are False Positives a Problem?
Alert Fatigue
If a system is prone to reporting false positives then the alerts it raises may be ignored. This may lead to real attacks being overlooked. For this reason, leaving false positives mixed in with real attacks is dangerous: the false positives should be resolved.
A false positive alert may contain sensitive information, for example usernames, passwords, and payment card data. Imagine a situation where a web application user has set their password to ‘/bin/bash’: without proper tuning, this input would cause a false positive every time the user logged in, writing the user’s password to the error log file in plaintext as part of the alert.
It’s also important to consider issues surrounding regulatory compliance. Data protection and privacy laws, like GDPR and CCPA, place strict duties and limitations on what information can be gathered and how that information is processed and stored. The unnecessary logging data generated by false positives can cause problems in this regard.
Poor User Experience
When working in strict blocking mode, false positives can cause legitimate user transactions to be blocked, leading to poor user experience. This can create pressure to disable the CRS or even to remove the WAF solution entirely, which is an unnecessary sacrifice of security for usability. The correct solution to this problem is to tune away the false positives so that they don’t reoccur in the future.
Tuning Away False Positives
Directly Modifying CRS Rules
Warning
Making direct modifications to CRS rule files is a bad idea and is strongly discouraged.
It may seem logical to prevent false positives by modifying the offending CRS rules. If a detection pattern in a CRS rule is causing matches with genuine transactions then the pattern could be modified. This is a bad idea.
Directly modifying CRS rules essentially creates a fork of the rule set. Any modifications made would be undone by a rule set update, meaning that any changes would need to be continually reapplied by hand. This is a tedious, time consuming, and error-prone solution.
There are alternative ways to deal with false positives, as described below. These methods sometimes require slightly more effort and knowledge but they do not cause problems when performing rule set updates.
Rule Exclusions
Overview
The ModSecurity WAF engine has flexible ways to tune away false positives. It provides several rule exclusion (RE) mechanisms which allow rules to be modified without directly changing the rules themselves. This makes it possible to work with third-party rule sets, like CRS, by adapting rules as needed while leaving the rule set files intact and unmodified. This allows for easy rule set updates.
Two fundamentally different types of rule exclusions are supported:
- Configure-time rule exclusions: Rule exclusions that are applied once, at configure-time (e.g. when (re)starting or reloading ModSecurity, or the server process that holds it). For example: “remove rule X at startup and never execute it.” This type of rule exclusion takes the form of a ModSecurity directive, e.g. SecRuleRemoveById.
- Runtime rule exclusions: Rule exclusions that are applied at runtime on a per-transaction basis (e.g. exclusions that can be conditionally applied to some transactions but not others). For example: “if a transaction is a POST request to the location ‘login.php’, remove rule X.” This type of rule exclusion takes the form of a SecRule.
Info
Runtime rule exclusions, while granular and flexible, have a computational overhead, albeit a small one. A runtime rule exclusion is an extra SecRule which must be evaluated for every transaction.
In addition to the two types of exclusions, rules can be excluded in two different ways:
- Exclude the entire rule/tag: An entire rule, or entire category of rules (by specifying a tag), is removed and will not be executed by the rule engine.
- Exclude a specific variable from the rule/tag: A specific variable will be excluded from a specific rule, or excluded from a category of rules (by specifying a tag).
These two methods can also operate on multiple individual rules, or even entire rule categories (identified either by tag or by using a range of rule IDs).
The combinations of rule exclusion types and methods allow for writing rule exclusions of varying granularity. Very coarse rule exclusions can be written, for example “remove all SQL injection rules” using SecRuleRemoveByTag. Extremely granular rule exclusions can also be written, for example “for transactions to the location ‘web_app_2/function.php’, exclude the query string parameter ‘user_id’ from rule 920280” using a SecRule and the action ctl:ruleRemoveTargetById.
The different rule exclusion types and methods are summarized in the table below, which presents the main ModSecurity directives and actions that can be used for each type and method of rule exclusion:
| | Exclude entire rule/tag | Exclude specific variable from rule/tag |
|---|---|---|
| Configure-time | SecRuleRemoveById*, SecRuleRemoveByTag | SecRuleUpdateTargetById, SecRuleUpdateTargetByTag |
| Runtime | ctl:ruleRemoveById**, ctl:ruleRemoveByTag | ctl:ruleRemoveTargetById, ctl:ruleRemoveTargetByTag |
*Can also exclude ranges of rules or multiple space separated rules.
**Can also exclude ranges of rules (not currently supported in ModSecurity v3).
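As a brief illustration of the first footnote, the sketch below shows one directive removing two individual rules and another removing a whole range. The rule IDs are borrowed from other examples on this page and are chosen purely for illustration.

```apache
# Remove two individual rules with one directive (space separated IDs)
SecRuleRemoveById 920230 933151

# Remove every rule whose ID falls within a range
SecRuleRemoveById "913000-913999"
```

Being configure-time rule exclusions, both directives would need to appear after the CRS rules have been included.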
Tip
This table is available as a well presented, downloadable Rule Exclusion Cheatsheet from Christian Folini.
Note
There’s also a third group of rule exclusion directives and actions, the use of which is discouraged. As well as excluding rules “ById” and “ByTag”, it’s also possible to exclude “ByMsg” (SecRuleRemoveByMsg, SecRuleUpdateTargetByMsg, ctl:ruleRemoveByMsg, and ctl:ruleRemoveTargetByMsg). This excludes rules based on the message they write to the error log. These messages can be dynamic and may contain special characters. As such, trying to exclude rules by message is difficult and error-prone.
Rule Tags
CRS rules typically feature multiple tags, grouping them into different categories. For example, a rule might be tagged by attack type (‘attack-rce’, ‘attack-xss’, etc.), by language (‘language-java’, ‘language-php’, etc.), and by platform (‘platform-apache’, ‘platform-unix’, etc.).
Tags can be used to remove or modify entire categories of rules all at once, but some tags are more useful than others in this regard. Tags for specific attack types, languages, and platforms may be useful for writing rule exclusions. For example, if lots of the SQL injection rules are causing false positives but SQL isn’t in use anywhere in the back end web application then it may be worthwhile to remove all CRS rules tagged with ‘attack-sqli’ (SecRuleRemoveByTag attack-sqli).
Some rule tags are not useful for rule exclusion purposes. For example, there are generic tags like ‘language-multi’ and ‘platform-multi’: these contain hundreds of rules across the entire CRS, and they don’t represent a meaningful rule property to be useful in rule exclusions. There are also tags that categorize rules based on well known security standards, like CAPEC and PCI DSS (e.g. ‘capec/1000/153/267’, ‘PCI/6.5.4’). These tags may be useful for informational and reporting purposes but are not useful in the context of writing rule exclusions.
Excluding rules using tags may be more useful than excluding using rule ranges in situations where a category of rules is spread across multiple files. For example, the ‘language-php’ rules are spread across several different rule files (both inbound and outbound rule files).
Rule Ranges
As well as rules being tagged using different categories, CRS rules are organized into files by general category. In addition, CRS rule IDs follow a consistent numbering convention. This makes it easy to remove unwanted types of rules by removing ranges of rule IDs. For example, the file REQUEST-913-SCANNER-DETECTION.conf contains rules related to detecting well known scanners and crawlers, which all have rule IDs in the range 913000-913999. All of the rules in this file can be easily removed using a configure-time rule exclusion, like so:
SecRuleRemoveById "913000-913999"
Excluding rules using rule ranges may be more useful than excluding using tags in situations where tags are less relevant or where tags vary across the rules in question. For example, a rule range may be the most appropriate solution if the goal is to remove all rules contained in a single file, regardless of how the rules are tagged.
Support for Regular Expressions
Most of the configure-time rule exclusion directives feature some level of support for using regular expressions. This makes it possible, for example, to exclude a dynamically named variable from a rule. The directives with support for regular expressions are:
SecRuleRemoveByTag
A regular expression is used for the tag match. For example, SecRuleRemoveByTag "injection" would match both “attack-injection-generic” and “attack-injection-php”.
SecRuleRemoveByMsg
A regular expression is used for the message match. For example, SecRuleRemoveByMsg "File Access" would match both “OS File Access Attempt” and “Restricted File Access Attempt”.
SecRuleUpdateTargetById, SecRuleUpdateTargetByTag, SecRuleUpdateTargetByMsg
A regular expression can optionally be used in the target specification by enclosing the regular expression in forward slashes. This is useful for dealing with dynamically named variables, like so:
SecRuleUpdateTargetById 942440 "!REQUEST_COOKIES:/^uid_.*/"
This example would exclude request cookies named “uid_0123456”, “uid_6543210”, etc. from rule 942440.
Note
The ‘ctl’ action for writing runtime rule exclusions does not support any use of regular expressions. This is a known limitation of the ModSecurity rule engine.
Placement of Rule Exclusions
It is crucial to put rule exclusions in the correct place, otherwise they may not work.
Configure-time rule exclusions: These must be placed after the CRS has been included in a configuration. For example:
# Include CRS
Include crs/rules/*.conf
# Configure-time rule exclusions
...
Configure-time rule exclusions remove rules. A rule must already be defined before it can be removed (something cannot be removed if it doesn’t yet exist). As such, this type of rule exclusion must appear after the CRS and all its rules have been included.
Runtime rule exclusions: These must be placed before the CRS has been included in a configuration. For example:
# Runtime rule exclusions
...
# Include CRS
Include crs/rules/*.conf
Runtime rule exclusions modify rules in some way. If a rule is to be modified then this should occur before the rule is executed (modifying a rule after it has been executed has no effect). As such, this type of rule exclusion must appear before the CRS and all its rules have been included.
Tip
CRS ships with the files REQUEST-900-EXCLUSION-RULES-BEFORE-CRS.conf.example and RESPONSE-999-EXCLUSION-RULES-AFTER-CRS.conf.example. After dropping the “.example” suffix, these files can be used to house “BEFORE-CRS” (i.e. runtime) and “AFTER-CRS” (i.e. configure-time) rule exclusions in their correct places relative to the CRS rules. These files also contain example rule exclusions to copy and learn from.
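Putting both placement rules together, a complete configuration might be laid out roughly as in the following sketch. The paths follow the Include examples used on this page, and the two exclusions are the ones from Examples 1 and 5; the exact layout is an assumption and will differ between installations.

```apache
# CRS configuration
Include crs/crs-setup.conf

# Runtime rule exclusions: defined BEFORE the CRS rules are included
SecRule REQUEST_URI "@beginsWith /webapp/function.php" \
    "id:1000,\
    phase:1,\
    pass,\
    nolog,\
    ctl:ruleRemoveById=920230"

# Include CRS
Include crs/rules/*.conf

# Configure-time rule exclusions: defined AFTER the CRS rules are included
SecRuleRemoveById 933151
```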
Example 1 (SecRuleRemoveById)
(Configure-time RE. Exclude entire rule.)
Scenario: Rule 933151, “PHP Injection Attack: Medium-Risk PHP Function Name Found”, is causing false positives. The web application behind the WAF makes no use of PHP. As such, it is deemed safe to tune away this false positive by completely removing rule 933151.
Rule Exclusion:
# CRS Rule Exclusion: 933151 - PHP Injection Attack: Medium-Risk PHP Function Name Found
SecRuleRemoveById 933151
Example 2 (SecRuleRemoveByTag)
(Configure-time RE. Exclude entire tag.)
Scenario: Several different parts of a web application are causing false positives with various SQL injection rules. None of the web services behind the WAF make use of SQL, so it is deemed safe to tune away these false positives by removing all the SQLi detection rules.
Rule Exclusion:
# CRS Rule Exclusion: Remove all SQLi detection rules
SecRuleRemoveByTag attack-sqli
Example 3 (SecRuleUpdateTargetById)
(Configure-time RE. Exclude specific variable from rule.)
Scenario: The content of a POST body parameter named ‘wp_post’ is causing false positives with rule 941320, “Possible XSS Attack Detected - HTML Tag Handler”. Removing this rule entirely is deemed to be unacceptable: the rule is not causing any other issues, and the protection it provides should be retained for everything apart from ‘wp_post’. It is decided to tune away this false positive by excluding ‘wp_post’ from rule 941320.
Rule Exclusion:
# CRS Rule Exclusion: 941320 - Possible XSS Attack Detected - HTML Tag Handler
SecRuleUpdateTargetById 941320 "!ARGS:wp_post"
Example 4 (SecRuleUpdateTargetByTag)
(Configure-time RE. Exclude specific variable from tag.)
Scenario: The values of request cookies with random names of the form ‘uid_<STRING>’ are causing false positives with various SQL injection rules. It is decided that it is not a risk to allow SQL-like content in cookie values, however it is deemed unacceptable to disable the SQLi detection rules for anything apart from the request cookies in question. It is decided to tune away these false positives by excluding only the problematic request cookies from the SQLi detection rules. A regular expression is to be used to handle the random string portion of the cookie names.
Rule Exclusion:
# CRS Rule Exclusion: Exclude the request cookies 'uid_<STRING>' from the SQLi detection rules
SecRuleUpdateTargetByTag attack-sqli "!REQUEST_COOKIES:/^uid_.*/"
Example 5 (ctl:ruleRemoveById)
(Runtime RE. Exclude entire rule.)
Scenario: Rule 920230, “Multiple URL Encoding Detected”, is causing false positives at the specific location ‘/webapp/function.php’. This is being caused by a known quirk in how the web application has been written, and it cannot be fixed in the application. It is deemed safe to tune away this false positive by removing rule 920230 for that specific location only.
Rule Exclusion:
# CRS Rule Exclusion: 920230 - Multiple URL Encoding Detected
SecRule REQUEST_URI "@beginsWith /webapp/function.php" \
"id:1000,\
phase:1,\
pass,\
nolog,\
ctl:ruleRemoveById=920230"
Example 6 (ctl:ruleRemoveByTag)
(Runtime RE. Exclude entire tag.)
Scenario: Several different locations under ‘/web_app_1/content’ are causing false positives with various SQL injection rules. Nothing under that location makes any use of SQL, so it is deemed safe to remove all the SQLi detection rules for that location. Other locations may make use of SQL, however, so the SQLi detection rules must remain in place everywhere else. It has been decided to tune away the false positives by removing all the SQLi detection rules for locations under ‘/web_app_1/content’ only.
Rule Exclusion:
# CRS Rule Exclusion: Remove all SQLi detection rules
SecRule REQUEST_URI "@beginsWith /web_app_1/content" \
"id:1010,\
phase:1,\
pass,\
nolog,\
ctl:ruleRemoveByTag=attack-sqli"
Example 7 (ctl:ruleRemoveTargetById)
(Runtime RE. Exclude specific variable from rule.)
Scenario: The content of a POST body parameter named ‘text_input’ is causing false positives with rule 941150, “XSS Filter - Category 5: Disallowed HTML Attributes”, at the specific location ‘/dynamic/new_post’. Removing this rule entirely is deemed to be unacceptable: the rule is not causing any other issues, and the protection it provides should be retained for everything apart from ‘text_input’ at the specific problematic location. It is decided to tune away this false positive by excluding ‘text_input’ from rule 941150 for location ‘/dynamic/new_post’ only.
Rule Exclusion:
# CRS Rule Exclusion: 941150 - XSS Filter - Category 5: Disallowed HTML Attributes
SecRule REQUEST_URI "@beginsWith /dynamic/new_post" \
"id:1020,\
phase:1,\
pass,\
nolog,\
ctl:ruleRemoveTargetById=941150;ARGS:text_input"
Example 8 (ctl:ruleRemoveTargetByTag)
(Runtime RE. Exclude specific variable from tag.)
Scenario: The values of request cookie ‘uid’ are causing false positives with various SQL injection rules when trying to log in to a web service at location ‘/webapp/login.html’. It is decided that it is not a risk to allow SQL-like content in this specific cookie’s values for the login page, however it is deemed unacceptable to disable the SQLi detection rules for anything apart from the specific request cookie in question at the login page only. It is decided to tune away these false positives by excluding only the problematic request cookie from the SQLi detection rules, and only when accessing ‘/webapp/login.html’.
Rule Exclusion:
# CRS Rule Exclusion: Exclude the request cookie 'uid' from the SQLi detection rules
SecRule REQUEST_URI "@beginsWith /webapp/login.html" \
"id:1030,\
phase:1,\
pass,\
nolog,\
ctl:ruleRemoveTargetByTag=attack-sqli;REQUEST_COOKIES:uid"
Tip
It’s possible to write a conditional rule exclusion that tests something other than just the request URI. Conditions can be built which test, for example, the source IP address, HTTP request method, HTTP headers, and even the day of the week.
Multiple conditions can also be chained together to create a logical AND by using ModSecurity’s chain action. This allows for creating powerful rule logic like “for transactions that are from source IP address 10.0.0.1 AND that are for location ‘/login.html’, exclude the query string parameter ‘user_id’ from rule 920280”. Extremely granular and specific rule exclusions can be written in this way.
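The chained condition quoted above could be sketched as follows. The rule ID 1040 is a placeholder, and ARGS:user_id is used as the target in the same way as in Example 7; both are assumptions made for illustration.

```apache
# CRS Rule Exclusion: exclude 'user_id' from rule 920280,
# but only for requests from 10.0.0.1 to /login.html
SecRule REMOTE_ADDR "@ipMatch 10.0.0.1" \
    "id:1040,\
    phase:1,\
    pass,\
    nolog,\
    chain"
    SecRule REQUEST_URI "@beginsWith /login.html" \
        "ctl:ruleRemoveTargetById=920280;ARGS:user_id"
```

The ctl action sits in the last rule of the chain so that it only runs once every condition in the chain has matched.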
Rule Exclusion Packages
CRS ships with prebuilt rule exclusion packages for a selection of popular web applications. These packages contain application-specific rule exclusions designed to prevent false positives from occurring when CRS is put in front of one of these web applications.
The packages should be viewed as a good starting point to build upon. Some false positives may still occur, for example if working at a high paranoia level, if using a very new or old version of the application, or if using plug-ins, add-ons, or user customizations.
If using a native CRS installation, rule exclusion packages can be enabled in the file crs-setup.conf. Modify rule 900130 to select the web applications in question, e.g. to enable the DokuWiki rule exclusion package use setvar:tx.crs_exclusions_dokuwiki=1, and then uncomment the rule to enable it.
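After editing, rule 900130 might look something like the sketch below. The action list is an assumption based on the usual layout of crs-setup.conf in CRS 3.x; the shipped file may list additional setvar actions for other applications, which should be removed or left out as appropriate.

```apache
# Enable only the DokuWiki rule exclusion package
SecAction \
    "id:900130,\
    phase:1,\
    nolog,\
    pass,\
    t:none,\
    setvar:tx.crs_exclusions_dokuwiki=1"
```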
If running CRS where it has been integrated into a commercial product or CDN then support varies. Some vendors expose rule exclusion packages in the GUI while other vendors require custom rules to be written which set the necessary variables. Unfortunately, there are also vendors that don’t allow rule exclusion packages to be used at all.
Tip
If running multiple web applications, it is highly recommended to enable a rule exclusion package only for the location where the corresponding web application resides. For example, to enable the WordPress rule exclusion package only for locations under ‘/wordpress’, a rule like the following could be used:
SecRule REQUEST_URI "@beginsWith /wordpress/" setvar:tx.crs_exclusions_wordpress=1...
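Written out as a complete rule, the abbreviated example above could take a form like the following sketch. The rule ID 1060 and the remaining actions are assumptions, not taken from the CRS distribution.

```apache
# Enable the WordPress rule exclusion package for /wordpress/ only
SecRule REQUEST_URI "@beginsWith /wordpress/" \
    "id:1060,\
    phase:1,\
    pass,\
    nolog,\
    setvar:tx.crs_exclusions_wordpress=1"
```

A rule like this would need to be placed before the CRS rules are included, and the same variable should not also be set unconditionally in rule 900130.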
Rule exclusion packages are currently available for the following web applications:
The CRS project is always looking to work with other communities and individuals to add support for additional web applications. Please get in touch via GitHub to discuss writing a rule exclusion package for a specific web application.
Further Reading
A popular tutorial titled Handling False Positives with OWASP CRS by Christian Folini walks through a full CRS tuning process, with examples.
Detailed reference of each of the rule exclusion mechanisms outlined above can be found in the ModSecurity Reference Manual:
- Configure-time rule exclusion mechanisms:
- Runtime rule exclusion mechanisms:
Sampling Mode
Sampling mode makes it possible to apply CRS to a limited percentage of traffic only. This may be useful in certain scenarios when enabling CRS for the first time, as this page explains.
Introduction to Sampling Mode
The CRS’s sampling mode mechanism was first introduced in version 3.0.0 in 2016. Although the feature has been available since then, it’s rarely used in practice, partly due to it being one of the lesser-known features of CRS.
When deploying ModSecurity and CRS in front of an existing web service for the first time, it’s difficult to predict what’s going to happen when CRS is turned on. A well-developed test environment can help, but it’s rare to find an installation where real world traffic can be reproduced 1:1 on a test setup. As such, fully enabling ModSecurity and CRS can be something of a leap into the unknown, and potentially very disruptive. This scenario prompted the introduction of CRS 3’s sampling mode.
Sampling mode makes it possible to run CRS on a limited percentage of traffic. The remaining traffic will bypass the rule set. If it turns out that ModSecurity is extremely disruptive or if the rules are too resource heavy for the server, only a limited percentage of the total traffic can be negatively affected (for example, only 1%). This significantly reduces the potential impact and risk of enabling CRS, especially when the logs are being monitored and the deployment can be rolled back if the alerts start to pile up.
Using sampling mode means that CRS offers relatively little security if the sampling percentage is set to be low. The idea, however, is to increase the percentage over time, from 1% to 2%, to 5%, 10%, 20%, 50%, and ultimately to 100%, where the rules are applied to all traffic.
The default sampling percentage of CRS is 100%. As such, if sampling is not of interest then the option can be safely ignored.
Applying Sampling Mode
Sampling mode is controlled by setting the sampling percentage. This is defined in the crs-setup.conf configuration file and can be found in the rule with ID 900400. To use sampling mode, uncomment this rule and set the variable tx.sampling_percentage to the desired value:
SecAction "id:900400,\
phase:1,\
pass,\
nolog,\
setvar:tx.sampling_percentage=50"
To test sampling mode, set the sampling percentage to 50 (which represents 50%), reload the server, and issue a few requests featuring a payload resembling an exploit. For example:
$ curl -v 'http://localhost/index.html?test=/etc/passwd'
- If CRS is applied to the transaction (and the inbound anomaly threshold is set to 10 or lower) then a 403 Forbidden status code will be returned, since the request causes two critical rules to match, by default.
- If sampling mode is triggered for the transaction (with a 50% probability) then the rule set will be bypassed and an ordinary response will be received, e.g. a 200 OK status code.
In the latter case, where sampling mode is triggered and CRS is bypassed, an alert like the following can be found in the error log:
[Wed Jan 01 00:00:00.123456 2022] [:error] [pid 3728:tid 139664291870464] [client 10.0.0.1:0] [client 10.0.0.1] ModSecurity: Warning. Match of "lt %{tx.sampling_percentage}" against "TX:sampling_rnd100" required. [file "/etc/crs/rules/REQUEST-901-INITIALIZATION.conf"] [line "434"] [id "901450"] [msg "Sampling: Disable the rule engine based on sampling_percentage 50 and random number 81"] [ver "OWASP_CRS/3.3.2"] [hostname "www.example.com"] [uri "/index.html"] [unique_id "YgBKx4BqNoKe-XoGhPCPtAAAAIQ"]
Here, CRS reports that it disabled the rule engine because the random number was above the sampling limit. The sampling percentage is set at the desired level, the rule set generates a random integer in the range 0-99 per transaction, and if that number is not lower than the sampling percentage then the WAF is disabled for the remainder of the transaction.
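The decision described here can be pictured with a simplified rule like the one below. This is a hypothetical stand-in and not the shipped rule 901450: the rule ID 1070 and the message text are placeholders, and the sketch assumes that tx.sampling_rnd100 already holds the per-transaction random number.

```apache
# If the random number is NOT lower than the sampling percentage,
# switch the rule engine off for the rest of this transaction
SecRule TX:sampling_rnd100 "!@lt %{tx.sampling_percentage}" \
    "id:1070,\
    phase:1,\
    pass,\
    log,\
    msg:'Sampling: rule engine disabled for this transaction',\
    ctl:ruleEngine=Off"
```

With a sampling percentage of 50, a random number of 81 matches the rule and the transaction bypasses CRS, while a random number of 49 does not match and the rule set is applied.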
Warning
As sampling mode works by selectively disabling the ModSecurity WAF engine, if other rule sets are installed then they will be bypassed too.
- For requests where the rule set is bypassed, a log entry is emitted by rule 901450.
- For the other requests, those without a corresponding 901450 entry, the rule set is applied normally.
Disabling Sampling Mode Log Entries
If the log entries generated by rule 901450 seem excessive in volume then the rule can be silenced by applying the following directive, either after the CRS include statement or in the file RESPONSE-999-EXCLUSION-RULES-AFTER-CRS.conf:
SecRuleUpdateActionById 901450 "nolog"
Rollback
If CRS is deployed in front of a large service with sampling mode in use, and the logs still start piling up despite a low sampling rate, it may be desirable to completely disable CRS. Rather than carrying out a full rollback of the deployment, the quickest solution is to set tx.sampling_percentage to 0, which means that every request will bypass the WAF (a sampling percentage of 0%: “sample 0% of traffic”). This will take effect once the web server has been reloaded so that it picks up the modified configuration. This leaves ModSecurity and CRS installed and ready for use, but completely disabled.
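In terms of configuration, this rollback amounts to changing a single value in rule 900400 of crs-setup.conf, as in this sketch:

```apache
# Sample 0% of traffic: every request bypasses the rule set
SecAction "id:900400,\
    phase:1,\
    pass,\
    nolog,\
    setvar:tx.sampling_percentage=0"
```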
Random Number Generation
ModSecurity has no built-in functionality to return random numbers, forcing CRS to find entropy for itself. It does this by taking advantage of the fact that the UNIQUE_ID variable, which identifies each request with a token that’s guaranteed to be unique, has a random element to it. This is the entropy that’s used for sampling.
Rule 901410 hashes the unique ID and encodes the result as a string of hexadecimal characters. The first two digits of the string are then extracted to get a random number from 0 to 99. In the extremely rare case where the hex encoded hash doesn’t contain a digit, there’s a fallback routine in place which takes the last digits of the DURATION variable.
The random numbers generated using this method are not cryptographically secure, but they are sufficient for the purposes of sampling.
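The hashing and digit extraction steps might be approximated as in the sketch below. These are not the shipped CRS rules: the rule IDs 1080 and 1081 are placeholders, the DURATION fallback is omitted, and the sketch assumes that MATCHED_VAR holds the value after the transformations have been applied.

```apache
# Step 1: hash the unique ID and hex encode the result
SecRule UNIQUE_ID "@rx ^." \
    "id:1080,\
    phase:1,\
    pass,\
    nolog,\
    t:sha1,t:hexEncode,\
    setvar:'tx.sampling_rnd100=%{MATCHED_VAR}'"

# Step 2: keep only the first two decimal digits found in the hex string
SecRule TX:sampling_rnd100 "@rx ^[a-f]*([0-9])[a-f]*([0-9])" \
    "id:1081,\
    phase:1,\
    pass,\
    nolog,\
    capture,\
    setvar:'tx.sampling_rnd100=%{TX.1}%{TX.2}'"
```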
Subsections of About Plugins
Plugins
The CRS plugin mechanism allows the rule set to be extended in specific, experimental, or unusual ways, as this page explains.
Note
Plugins are not part of the CRS 3.3.x release line. They are released officially with CRS 4.0. In the meantime, plugins can be used with one of the stable releases by following the instructions presented below.
What are Plugins?
Plugins are sets of additional rules that can be plugged in to a web application firewall in order to expand CRS with complementary functionality or to interact with CRS. Rule exclusion plugins are a special case: these are plugins that disable certain rules to integrate CRS into a context that is otherwise likely to trigger certain false alarms.
Why are Plugins Needed?
Installing only a minimal set of rules is desirable from a security perspective. A term often used is “minimizing the attack surface”. For CRS, this means that having fewer rules makes it less likely that a bug is deployed. In the past, CRS had a major bug in one of the rule exclusion packages which affected every standard CRS installation (see CVE-2021-35368). By moving all rule exclusion packages into optional plugins, the risk is reduced in this regard. As such, security is a prime driver for the use of plugins.
A second driver is the need for certain functionality that does not belong in mainline CRS releases. Typical candidates include the following:
- ModSecurity features deemed too exotic for mainline, like the use of Lua scripting
- New rules that are not yet trusted enough to integrate into the mainline
- Specialized functionality with a very limited audience
A plugin might also evolve quicker than the slow release cycle of stable CRS releases. That way, a new and perhaps experimental plugin can be updated quickly.
Finally, there is a need to allow third parties to write plugins that interact with CRS. This was previously very difficult to manage, but with plugins everybody has the opportunity to write anomaly scoring rules.
How do Plugins Work Conceptually?
Plugins are a set of rules. These rules can run in any phase, but in practice it is expected that most of them run in phase 1 and, especially, in phase 2, just like the rules in CRS. The rules of a plugin are separated into a rule file that is loaded before the CRS rules are loaded and a rule file with rules to be executed after the CRS rules are executed.
Optionally, a plugin can also have a separate configuration file with rules that configure the plugin, just like the crs-setup.conf configuration file.
The order of execution is as follows:
- CRS configuration
- Plugin configuration
- Plugin rules before CRS rules
- CRS rules
- Plugin rules after CRS rules
This can be mapped almost 1:1 to the Includes involved:
Include crs/crs-setup.conf
Include crs/plugins/*-config.conf
Include crs/plugins/*-before.conf
Include crs/rules/*.conf
Include crs/plugins/*-after.conf
The two existing CRS Include statements are complemented with three additional generic plugin Includes. This means CRS is configured first, then the plugins are configured (if any), then the first batch of plugin rules are executed, followed by the main CRS rules, and finally the second batch of plugin rules run, after CRS.
How to Install a Plugin
The first step is to prepare the plugin folder.
CRS 4.x will come with a plugins folder next to the rules folder. When using an older CRS release without a plugins folder, create one and place three empty config files in it (e.g. by using the shell command touch):
crs/plugins/empty-config.conf
crs/plugins/empty-before.conf
crs/plugins/empty-after.conf
These empty rule files ensure that the web server does not fail when Include-ing *.conf if there are no plugin files present.
Info
Apache supports the IncludeOptional directive, but that is not available on all web servers, so Include is used here in the interests of having consistent and simple documentation.
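For Apache httpd specifically, a variant using IncludeOptional might look like the sketch below. It reuses the paths from the Include example on this page; with this variant the empty placeholder files should not be needed, since a wildcard that matches no files is silently ignored.

```apache
# Apache httpd only: plugin wildcards that match nothing do not cause an error
Include crs/crs-setup.conf
IncludeOptional crs/plugins/*-config.conf
IncludeOptional crs/plugins/*-before.conf
Include crs/rules/*.conf
IncludeOptional crs/plugins/*-after.conf
```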
For the installation, there are two methods:
Method 1: Copying the plugin files
This is the simple way. Download or copy the plugin files, which are likely rules and data files, and put them in the plugins folder of the CRS installation, as prepared above.
There is a chance that a plugin configuration file comes with a .example suffix in the filename, like the crs-setup.conf.example configuration file in the CRS release. If that’s the case then rename the plugin configuration file by removing the suffix.
Be sure to look at the configuration file and see if there is anything that needs to be configured.
Finally, reload the WAF and the plugin should be active.
Method 2: Placing symbolic links to separate plugin files downloaded elsewhere
This is the more advanced setup and the one that’s in sync with many Linux distributions.
With this approach, download the plugin to a separate location and put a symlink to each individual file in the plugins folder. If the plugin’s configuration file comes with a .example suffix then that file needs to be renamed first.
With this approach it’s easier to upgrade and downgrade a plugin by simply changing the symlink to point to a different version of the plugin. It’s also possible to git checkout the plugin and pull the latest version when there’s an update. It’s not possible to do this in the plugins folder itself, namely when multiple plugins need to be installed side by side.
This symlink setup also makes it possible to git clone the latest version of a plugin and update it in the future without further ado. Be sure to pay attention to any updates in the config file, however.
If updating plugins this way, there’s a chance of missing a new variable that’s defined in the latest version of the plugin’s config file. Plugin authors should guard against this by adding a rule to the Before-File that checks for the existence of all config variables. Examples of this can be found in CRS file REQUEST-901-INITIALIZATION.conf.
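Such an existence check could be sketched as follows. The plugin name, variable name, rule ID, and default value are all hypothetical and only illustrate the pattern of testing whether a variable has been set and supplying a default if it has not.

```apache
# If the config file did not define the variable, fall back to a default
SecRule &TX:example-plugin_max_size "@eq 0" \
    "id:9599010,\
    phase:1,\
    pass,\
    nolog,\
    ver:'example-plugin/1.0.0',\
    setvar:'tx.example-plugin_max_size=1024'"
```

A plugin author might prefer to log a warning or block instead of silently applying a default, depending on how critical the missing setting is.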
How to Disable a Plugin
Disabling a plugin is simple. Either remove the plugin files in the plugins folder or, if installed using the symlink method, remove the symlinks to the real files. Working with symlinks is considered to be a ‘cleaner’ approach, since the plugin files remain available to re-enable in the future.
Alternatively, it is also valid to disable a plugin by renaming a plugin file from plugin-before.conf to plugin-before.conf.disabled.
Conditionally enable plugins for multi-application environments
If CRS is installed on a reverse-proxy or a web server with multiple web applications, then you may wish to only enable certain plugins (such as rule exclusion plugins) for certain virtual hosts (VirtualHost for Apache httpd, Server context for Nginx). This ensures that rules designed for a specific web application are only enabled for the intended web application, reducing the scope of any possible bypasses within a plugin.
Most plugins provide an example to disable the plugin in the file plugin-config.conf. You can define the WebAppID variable for each virtual host and then disable the plugin when the WebAppID variable doesn’t match.
See: https://github.com/owasp-modsecurity/ModSecurity/wiki/Reference-Manual-(v2.x)#secwebappid
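As a rough sketch for Apache httpd, the application ID could be assigned per virtual host using the SecWebAppId directive. The host names and IDs shown here are hypothetical.

```apache
# Each virtual host gets its own application ID
<VirtualHost *:80>
    ServerName wordpress.example.com
    SecWebAppId "wordpress"
</VirtualHost>

<VirtualHost *:80>
    ServerName wiki.example.com
    SecWebAppId "wiki"
</VirtualHost>
```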
Below is an example for enabling only the WordPress plugin for WordPress virtual hosts:
SecRule &TX:wordpress-rule-exclusions-plugin_enabled "@eq 0" \
"id:9507010,\
phase:1,\
pass,\
nolog,\
ver:'wordpress-rule-exclusions-plugin/1.0.0',\
chain"
SecRule WebAppID "!@streq wordpress" \
"t:none,\
setvar:'tx.wordpress-rule-exclusions-plugin_enabled=0'"
⚠️ Warning: As of 05/06/2024, Coraza doesn’t support the use of WebAppID. You can use the Host header instead of the WebAppID variable:
SecRule &TX:wordpress-rule-exclusions-plugin_enabled "@eq 0" \
"id:9507010,\
phase:1,\
pass,\
nolog,\
ver:'wordpress-rule-exclusions-plugin/1.0.0',\
chain"
SecRule REQUEST_HEADERS:Host "!@streq wordpress.example.com" \
"t:none,\
setvar:'tx.wordpress-rule-exclusions-plugin_enabled=0'"
See: https://coraza.io/docs/seclang/variables/#webappid
What Plugins are Available?
All official plugins are listed on GitHub in the CRS plugin registry repository: https://github.com/coreruleset/plugin-registry.
Available plugins include:
- Template Plugin: This is the example plugin for getting started.
- Auto-Decoding Plugin: This uses ModSecurity transformations to decode encoded payloads before applying CRS rules at PL 3 and double-decoding payloads at PL 4.
- Antivirus Plugin: This helps to integrate an antivirus scanner into CRS.
- Body-Decompress Plugin: This decompresses/unzips the response body for inspection by CRS.
- Fake-Bot Plugin: This performs a reverse DNS lookup on IP addresses pretending to be a search engine.
- Incubator Plugin: This plugin allows non-scoring rules to be tested in production before pushing them into the mainline.
How to Write a Plugin
For information on writing a new plugin, refer to the development documentation on writing plugins.
Collection Timeout
If plugins need to work with collections and set a custom SecCollectionTimeout outside of the default 3600 seconds defined by the ModSecurity engine, the plugin should either set it in its configuration or indicate the desired value in the plugin documentation. CRS used to define SecCollectionTimeout in crs-setup.conf but removed this setting with the introduction of plugins for CRS v4. That’s because CRS itself does not work with collections anymore.
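A plugin that needs a shorter timeout might set it in its configuration file along these lines. The value of 600 seconds is purely illustrative.

```apache
# Expire collection entries after 10 minutes instead of the default 3600 seconds
SecCollectionTimeout 600
```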
Writing Plugins
The CRS plugin mechanism allows the rule set to be extended in specific, experimental, or unusual ways. This page explains how to write a new plugin to extend CRS.
How to Write a Plugin
Is a Plugin the Right Approach for a Given Rule Problem?
This is the first and most important question to ask.
CRS is a generic rule set. The rule set has no awareness of the particular setup it finds itself deployed in. As such, the rules are written with caution and administrators are given the ability to steer the behavior of CRS by setting the anomaly threshold accordingly. An administrator writing their own rules knows a lot more about their specific setup, so there’s probably no need to be as cautious. It’s also probably futile to write anomaly scoring rules in this situation. Anomaly scoring adds little value if an administrator knows that everybody issuing a request to /no-access
, for example, is an attacker.
In such a situation, it’s better to write a simple deny-rule that blocks said requests. There’s no need for a plugin in most situations.
Plugin Writing Guidance
When there really is a good use case for a plugin, it’s recommended to start with a clone of the template plugin. It’s well documented and a good place to start from.
Plugins are a new idea for CRS. As such, there aren’t currently any strict rules about what a plugin is and isn’t allowed to do. There are definitely fewer rules and restrictions for writing plugin rules than for writing a mainline CRS rule, which is becoming increasingly strict as the project evolves. This means that it’s basically possible to do anything in a plugin, especially when there’s no plan to contribute the plugin to the CRS project.
When it is planned to contribute a plugin back to the CRS project, the following guidance will help:
- Try to keep plugins separate. Try not to interfere with other plugins and make sure that any other plugin can run next to yours.
- Be careful when interfering with CRS. It’s easy to disrupt CRS by excluding essential rules or by messing with variables.
- Keep an eye on performance and think of use cases.
Anomaly Scoring: Getting the Phases Right
The anomaly scores are only initialized in the CRS rules file REQUEST-901-INITIALIZATION.conf
. This happens in phase 1, but it still happens after a plugin’s *-before.conf
file has been executed for phase 1. As a consequence, if anomaly scores are set there then they’ll be overwritten in CRS phase 1.
The effect for phase 2 anomaly scoring in a plugin’s *-after.conf
file is similar. It happens after the CRS request blocking happens in phase 2. This can mean a plugin raises the anomaly score after the blocking decision. This might result in a higher anomaly score in the log file and confusion as to why the request was not blocked.
The recommended approach is as follows:
- Scoring in phase 1: Put in the plugin’s After-File (and be aware that early blocking won’t work).
- Scoring in phase 2: Put in the plugin’s Before-File.
Plugin Use of Persistent Collections: ModSecurity SecCollectionTimeout
If a plugin uses persistent collections (stores stateful information across multiple requests, e.g., to implement DoS protection functionality), it is important to note that CRS does not change the default value (3600
) for the ModSecurity SecCollectionTimeout
directive. Plugin authors must instruct users to set the directive to an appropriate value if the plugin requires a value that differs from the default. A plugin should never actively set SecCollectionTimeout
, as other plugins may specify different values for the directive and the choice for the effective value must be made by the user.
Quality Guarantee
The official CRS plugins are separated from third party plugins. The rationale is to keep the quality of official plugins on par with the quality of the CRS project itself. It’s not possible to guarantee the quality of third party plugins as their code is not under the control of the CRS project. Third party plugins should be examined and considered separately to decide whether their quality is sufficient for use in production.
How to Integrate a Plugin into the Official Registry
Plugins should be developed and refined until they are production-ready. The next step is to open a pull request at the plugin registry. Any free rule ID range can be used for a new plugin. The plugin will then be reviewed and assigned a block of rule IDs. Afterwards, the plugin will be listed as a new third party plugin.
Subsections of Development
Contribution Guidelines
The CRS project values third party contributions. To make the contribution process as easy as possible, a helpful set of contribution guidelines is in place, which all contributors and developers are asked to adhere to.
Getting Started with a New Contribution
- Sign in to GitHub.
- Open a new issue for the contribution, assuming a similar issue doesn’t already exist.
- Clearly describe the issue, including steps to reproduce if reporting a bug.
- Specify the CRS version in question if reporting a bug.
- Bonus points for submitting tests along with the issue.
- Fork the repository on GitHub and begin making changes there.
- Signed commits are preferred. (For more information and help with this, refer to the GitHub documentation).
Making Changes
- Base any changes on the latest dev branch (e.g., main).
- Create a topic branch for each new contribution.
- Fix only one problem at a time. This helps to quickly test and merge submitted changes. If intending to fix multiple unrelated problems then use a separate branch for each problem.
- Make commits of logical units.
- Make sure commits adhere to the contribution guidelines presented in this document.
- Make sure commit messages follow the standard Git format.
- Make sure changes are submitted as a pull request (PR) on GitHub.
- PR titles should follow the Conventional Commits format, for example: fix(rce): Fix a FP in rule 912345 with keyword 'time'.
- If a PR only affects a single rule then the rule ID should be included in the title.
- If a PR title does not follow the correct format then a CRS developer will fix it.
- American English should be used throughout.
- 4 spaces should be used for indentation (no tabs).
- Files must end with a single newline character.
- No trailing whitespace at EOL.
- No trailing blank lines at EOF (only the required single EOF newline character is allowed).
- Adhere to an 80 character line length limit where possible.
- Add comments where possible and clearly explain any new rules.
- Comments must not appear between chained rules and should instead be placed before the start of a rule chain.
- All chained rules should be indented like so, for readability:
SecRule .. .. \
    "..."
    SecRule .. .. \
        "..."
        SecRule .. .. \
            "..."
- Action lists in rules must always be enclosed in double quotes for readability, even if there is only one action (e.g., use "chain" instead of chain, and "ctl:requestBodyAccess=Off" instead of ctl:requestBodyAccess=Off).
- Always use numbers for phases instead of names.
- Format all use of SecMarker using double quotes, using UPPERCASE, and separating words with hyphens. For example:
SecMarker "END-RESPONSE-959-BLOCKING-EVALUATION"
SecMarker "END-REQUEST-910-IP-REPUTATION"
- Rule actions should appear in the following order, for consistency:
id
phase
allow | block | deny | drop | pass | proxy | redirect
status
capture
t:xxx
log
nolog
auditlog
noauditlog
msg
logdata
tag
sanitiseArg
sanitiseRequestHeader
sanitiseMatched
sanitiseMatchedBytes
ctl
ver
severity
multiMatch
initcol
setenv
setvar
expirevar
chain
skip
skipAfter
- Rule operators must always be explicitly specified. Although ModSecurity defaults to using the @rx operator, for clarity @rx should always be explicitly specified when used. For example, write:
SecRule ARGS "@rx foo" "id:1,phase:1,pass,t:none"
instead of
SecRule ARGS "foo" "id:1,phase:1,pass,t:none"
For CRS Documentation Contributions
- Directory naming: Use lowercase letters and hyphens to separate words. For example:
.
├── 1-getting-started
│ ├── 1-1-crs-installation.md
│ ├── 1-2-extended_install.md
│ ├── 1-3-using-containers.md
│ ├── 1-4-engine_integration_options.md
│ └── _index.md
...
- Inner-link referencing should be used with the following format:
# general markdown format
{{< ref "path/to/file.md" >}}
# this will point to the directory 2-how-crs-works
{{< ref "2-how-crs-works" >}}
# this will point to the .md
{{< ref "2-3-false-positives-and-tuning.md" >}}
Variable Naming Conventions
- Variable names should be lowercase and should use the characters a-z, 0-9, and underscores only.
- To reflect the different syntax between defining a variable (using setvar) and using a variable, the following visual distinction should be applied:
  - Variable definition: Lowercase letters for collection name, dot as the separator, variable name. E.g.: setvar:tx.foo_bar_variable
  - Variable use: Capital letters for collection name, colon as the separator, variable name. E.g.: SecRule TX:foo_bar_variable
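The naming convention above can be checked mechanically. The following is a minimal Python sketch; the helper names `is_valid_variable_name`, `definition_form`, and `use_form` are hypothetical and not part of any CRS tooling.

```python
import re

# Hypothetical helpers illustrating the naming convention; not CRS tooling.
VALID_NAME = re.compile(r"[a-z0-9_]+")


def is_valid_variable_name(name: str) -> bool:
    """Return True if the name uses only a-z, 0-9, and underscores."""
    return VALID_NAME.fullmatch(name) is not None


def definition_form(name: str) -> str:
    """Lowercase collection name with a dot separator, as used with setvar."""
    return f"setvar:tx.{name}"


def use_form(name: str) -> str:
    """Uppercase collection name with a colon separator, as used in a rule."""
    return f"SecRule TX:{name}"


print(is_valid_variable_name("foo_bar_variable"))  # True
print(is_valid_variable_name("Foo-Bar"))           # False
print(definition_form("foo_bar_variable"))         # setvar:tx.foo_bar_variable
print(use_form("foo_bar_variable"))                # SecRule TX:foo_bar_variable
```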
Writing Regular Expressions
- Use the following character class, in the stated order, to cover alphanumeric characters plus underscores and hyphens:
[a-zA-Z0-9_-]
Portable Backslash Representation
CRS uses \x5c
to represent the backslash \
character in regular expressions. Some of the reasons for this are:
- It’s portable across web servers and WAF engines: it works with Apache, Nginx, and Coraza.
- It works with the crs-toolchain for building optimized regular expressions.
The older style of representing a backslash using the character class [\\\\]
must not be used. This was previously used in CRS to get consistent results between Apache and Nginx, owing to a quirk with how Apache would “double un-escape” character escapes. For future reference, the decision was made to stop using this older method because:
- It can be confusing and difficult to understand how it works.
- It doesn’t work with crs-toolchain.
- It doesn’t work with Coraza.
- It isn’t obvious how to use it in a character class, e.g., [a-zA-Z<portable-backslash>].
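As an illustration of the \x5c form, the following Python sketch shows the hexadecimal escape matching a literal backslash, both on its own and inside a character class. Python's `re` module is used here only as a convenient stand-in; the sample strings are made up, and the behavior of the engines named above is not exercised by this snippet.

```python
import re

sample = "foo\\bar"  # seven characters: f o o \ b a r

# \x5c on its own matches a single backslash character.
single = re.search(r"\x5c", sample)
print(single.group() == "\\")  # True

# Inside a character class, \x5c sits next to other ranges without
# any extra escaping.
word_or_backslash = re.compile(r"[a-zA-Z\x5c]+")
print(word_or_backslash.fullmatch(sample) is not None)  # True
```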
Forward Slash Representation
CRS uses literal, unescaped forward slash /
characters in regular expressions.
Regular expression engines and libraries based on PCRE use the forward slash /
character as the default delimiter. As such, forward slashes are often escaped in regular expression patterns. In the interests of readability, CRS does not escape forward slashes in regular expression patterns, which may seem unusual at first to new contributors.
If testing a CRS regular expression using a third party tool, it may be useful to change the delimiter to something other than /
if a testing tool raises errors because a CRS pattern features unescaped forward slashes.
Vertical Tab Representation
CRS uses \x0b
to represent the vertical tab in character classes. This is necessary because most regular expressions are generated and simplified using external libraries. These libraries produce output that the generator must convert into a form that is compatible with both PCRE and RE2 compatible engines. In RE2, \s does not include \v (whereas it does in PCRE), so the vertical tab needs to be added explicitly. However, \v in PCRE expands to a list of vertical characters and is, therefore, not allowed to start range expressions. Since only the vertical tab is of interest, CRS uses \x0b.
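The effect can be sketched in Python. The snippet below assumes that RE2's \s is equivalent to the explicit class [\t\n\f\r ], as documented in the RE2 syntax reference, and emulates it with that class because Python's own \s is broader. The constant names are illustrative only.

```python
import re

# Assumption: this explicit class mirrors RE2's definition of \s.
RE2_LIKE_SPACE = re.compile(r"[\t\n\f\r ]")
# The same class with the vertical tab added via its hex escape.
WITH_VERTICAL_TAB = re.compile(r"[\t\n\f\r \x0b]")

vertical_tab = "\x0b"

print(RE2_LIKE_SPACE.fullmatch(vertical_tab) is not None)     # False
print(WITH_VERTICAL_TAB.fullmatch(vertical_tab) is not None)  # True
```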
When and Why to Anchor Regular Expressions
Engines running OWASP CRS will use regular expressions to search the input string, i.e., the regular expression engine is asked to find the first match in the input string. If an expression needs to match the entire input then the expression must be anchored appropriately.
Beginning of String Anchor (^)
It is often necessary to match something at the start of the input to prevent false positives that match the same string in the middle of another argument, for example. Consider a scenario where the goal is to match the value of REQUEST_HEADERS:Content-Type
to multipart/form-data
. The following regular expression could be used:
"@rx multipart/form-data"
HTTP headers can contain multiple values, and it may be necessary to guarantee that the value being searched for is the first value of the header. There are different ways to do this but the simplest one is to use the ^
caret anchor to match the beginning of the string:
"@rx ^multipart/form-data"
It will also be useful to ignore case sensitivity in this scenario:
"@rx (?i)^multipart/form-data"
End of String Anchor ($)
Consider, for example, needing to find the string /admin/content/assets/add/evil
in the REQUEST_FILENAME
. This could be achieved with the following regular expression:
"@rx /admin/content/assets/add/evil"
If the input is changed, it can be seen that this expression can easily produce a false positive: /admin/content/assets/add/evilbutactuallynot/nonevilfile
. If it is known that the file being searched for can’t be in a subdirectory of add
then the $
anchor can be used to match the end of the input:
"@rx /admin/content/assets/add/evil$"
This could be made a bit more general:
"@rx /admin/content/assets/add/[a-z]+$"
It is sometimes necessary to match the entire input string to ensure that it exactly matches what is expected. It might be necessary to find the “edit” action transmitted by WordPress, for example. To avoid false positives on variations (e.g., “myedit”, “the edit”, “editable”, etc.), the ^
caret and $
dollar anchors can be used to indicate that an exact string is expected. For example, to only match the exact strings edit
or editpost
:
"@rx ^(?:edit|editpost)$"
Other Anchors
Other anchors apart from ^
caret and $
dollar exist, such as \A
, \G
, and \Z
in PCRE. CRS strongly discourages the use of other anchors for the following reasons:
- Not all regular expression engines support all anchors and OWASP CRS should be compatible with as many regular expression engines as possible.
- Their function is sometimes not trivial.
- They aren’t well known and would require additional documentation.
- In most cases that would justify their use the regular expression can be transformed into a form that doesn’t require them, or the rule can be transformed (e.g., with an additional chain rule).
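The anchoring behavior described in this section can be tried out with a short Python sketch. The patterns are the ones shown above; the header values and the word list are illustrative inputs, and Python's `re.search` stands in for the "find the first match" behavior of a WAF engine.

```python
import re

# Beginning-of-string anchor: header values are illustrative inputs.
unanchored = re.compile(r"multipart/form-data")
at_start = re.compile(r"(?i)^multipart/form-data")

first_value = "Multipart/Form-Data; boundary=abc"
later_value = "text/plain, multipart/form-data"

print(bool(unanchored.search(later_value)))  # True: found mid-string
print(bool(at_start.search(later_value)))    # False: not at the start
print(bool(at_start.search(first_value)))    # True: at start, any case

# End-of-string anchor: rejects the longer path from the text above.
at_end = re.compile(r"/admin/content/assets/add/evil$")
good_path = "/admin/content/assets/add/evil"
longer_path = "/admin/content/assets/add/evilbutactuallynot/nonevilfile"

print(bool(at_end.search(good_path)))    # True
print(bool(at_end.search(longer_path)))  # False

# Both anchors: only the exact strings are accepted.
exact = re.compile(r"^(?:edit|editpost)$")
words = ["edit", "editpost", "myedit", "the edit", "editable"]
matches = [word for word in words if exact.search(word)]
print(matches)  # ['edit', 'editpost']
```

Note that in Python, $ also matches just before a single trailing newline, so results for inputs ending in a newline may differ between engines.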
Use Capture Groups Sparingly
Capture groups, i.e., parts of the regular expression surrounded by parentheses ((
and )
), are used to store the matched information from a string in memory for later use. Capturing input uses both additional CPU cycles and additional memory. In many cases, parentheses are mistakenly used for grouping and ensuring precedence.
To group parts of a regular expression, or to ensure that the expression uses the precedence required, surround the concerning parts with (?:
and )
. Such a group is referred to as being “non-capturing”. The following will create a capture group:
On the other hand, this will create a non-capturing group, guaranteeing the precedence of the alternative without capturing the input:
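The difference between the two kinds of group can be observed in Python. This is a minimal sketch with hypothetical patterns chosen for illustration: the compiled pattern's `groups` attribute reports how many capture groups an expression defines.

```python
import re

# Hypothetical patterns: an alternative followed by a literal "c".
capturing = re.compile(r"(a|b)c")
non_capturing = re.compile(r"(?:a|b)c")

print(capturing.groups)      # 1: the engine stores what (a|b) matched
print(non_capturing.groups)  # 0: grouping only, nothing is stored

print(capturing.search("xbc").group(1))      # b
print(non_capturing.search("xbc").group(0))  # bc
```

Both expressions match the same inputs; only the capturing variant keeps the matched alternative in memory.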
Lazy Matching
The question mark ?
can be used to turn “greedy” quantifiers into “lazy” quantifiers, i.e., .+
and .*
are greedy while .+?
and .*?
are lazy. Using lazy quantifiers can help with writing certain expressions that wouldn’t otherwise be possible. However, in backtracking regular expression engines, like PCRE, lazy quantifiers can also be a source of performance issues. The following is an example of an expression that uses a lazy quantifier:
"@rx (?i)\.cookie\b.*?;\W*?(?:expires|domain)\W*?="
This expression matches cookie values in HTML to detect session fixation attacks. The input string could be document.cookie = "name=evil; domain=https://example.com";
.
The lazy quantifiers in this expression are used to reduce the amount of backtracking that engines such as PCRE have to perform (others, such as RE2, are not affected by this). Since the asterisk *
is greedy, .*
would match every character in the input up to the end, at which point the regular expression engine would realize that the next character, ;
, can’t be matched and it will backtrack to the previous position (;
). A few iterations later, the engine will realize that the character d
from domain
can’t be matched and it will backtrack again. This will happen again and again, until the ;
at evil;
is found. Only then can the engine proceed with the next part of the expression.
Using lazy quantifiers, the regular expression engine will instead match as few characters as possible. The engine will match
(a space), then look for ;
and will not find it. The match will then be expanded to =
and, again, a match of ;
is attempted. This continues until the match is = "name=evil
and the engine finds ;
. While lazy matching still includes some work, in this case, backtracking would require many more steps.
Lazy matching can have the inverse effect, though. Consider the following expression:
"@rx (?i)\b(?:s(?:tyle|rc)|href)\b[\s\S]*?="
It matches some HTML attributes and then expects to see =
. Using a somewhat contrived input, the lazy quantifier will require more steps to match than the greedy version would: style =
. With the lazy quantifier, the regular expression engine will expand the match by one character for each of the space characters in the input, which means 21 steps in this case. With the greedy quantifier, the engine would match up to the end in a single step, backtrack one character and then match =
(note that =
is included in [\s\S]
), which makes 3 steps.
To summarize: be very mindful about when and why you use lazy quantifiers in your regular expressions.
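The following Python sketch illustrates that lazy and greedy quantifiers change how much is matched and in which order positions are tried, not whether a match exists. The greedy variant of the cookie expression is a hypothetical rewrite made for comparison and is not a CRS rule. The snippet does not measure the number of engine steps.

```python
import re

text = 'document.cookie = "name=evil; domain=https://example.com";'

lazy = re.compile(r"(?i)\.cookie\b.*?;\W*?(?:expires|domain)\W*?=")
# Hypothetical greedy rewrite of the same expression, for comparison only.
greedy = re.compile(r"(?i)\.cookie\b.*;\W*(?:expires|domain)\W*=")

lazy_match = lazy.search(text).group()
greedy_match = greedy.search(text).group()
print(lazy_match == greedy_match)  # True: same match, different search order

# On other inputs the two forms can produce different match extents.
short_lazy = re.search(r"a.*?b", "a1b2b").group()
short_greedy = re.search(r"a.*b", "a1b2b").group()
print(short_lazy)    # a1b
print(short_greedy)  # a1b2b
```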
Possessive Quantifiers and Atomic Groups
Lazy and greedy matching change the order in which a regular expression engine processes a regular expression. However, the order of execution does not influence the backtracking behavior of backtracking engines.
Possessive quantifiers (e.g., x++
) and atomic groups (e.g., (?>x)
) are tools that can be used to prevent a backtracking engine from backtracking. They can be used for performance optimization but are only supported by backtracking engines and, therefore, are not permitted in CRS rules.
Writing Regular Expressions for Non-Backtracking Compatibility
Traditional regular expression engines use backtracking to solve some additional problems, such as finding a string that is preceded or followed by another string. While this functionality can certainly come in handy and has its place in certain applications, it can also lead to performance issues and, in uncontrolled environments, open up possibilities for attacks (the term “ReDoS” is often used to describe an attack that exhausts process or system resources due to excessive backtracking).
OWASP CRS tries to be compatible with non-backtracking regular expression engines, such as RE2, because:
- Non-backtracking engines are less vulnerable to ReDoS attacks.
- Non-backtracking engines can often outperform backtracking engines.
- CRS aims to leave the choice of the engine to the user/system.
To ensure compatibility with non-backtracking regular expression engines, the following operations are not permitted in regular expressions:
- positive lookahead (e.g., (?=regex))
- negative lookahead (e.g., (?!regex))
- positive lookbehind (e.g., (?<=regex))
- negative lookbehind (e.g., (?<!regex))
- named capture groups (e.g., (?P<name>regex))
- backreferences (e.g., \1)
- named backreferences (e.g., (?P=name))
- conditionals (e.g., (?(regex)then|else))
- recursive calls to capture groups (e.g., (?1))
- possessive quantifiers (e.g., (?:regex)++)
- atomic (or possessive) groups (e.g., (?>regex))
This list is not exhaustive but covers the most important points. The RE2 documentation includes a complete list of supported and unsupported features that various engines offer.
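A rough pre-submission check for some of these constructs could look like the following Python sketch. The helper `find_unsupported` is hypothetical and purely heuristic: it scans for tell-tale character sequences rather than parsing the expression, covers only part of the list above, and will misreport patterns in which those sequences appear escaped.

```python
import re

# Hypothetical heuristic probes; this is not a regular expression parser.
FORBIDDEN = {
    "positive lookahead": r"\(\?=",
    "negative lookahead": r"\(\?!",
    "positive lookbehind": r"\(\?<=",
    "negative lookbehind": r"\(\?<!",
    "named capture group": r"\(\?P<",
    "named backreference": r"\(\?P=",
    "atomic group": r"\(\?>",
    "backreference": r"\\[1-9]",
    "possessive quantifier": r"[*+?}]\+",
}


def find_unsupported(pattern: str) -> list[str]:
    """Return the names of suspicious constructs found in the pattern."""
    return [
        name for name, probe in FORBIDDEN.items()
        if re.search(probe, pattern)
    ]


print(find_unsupported(r"^(?:edit|editpost)$"))  # []
print(find_unsupported(r"foo(?=bar)"))           # ['positive lookahead']
print(find_unsupported(r"(a)\1"))                # ['backreference']
```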
When and How to Optimize Regular Expressions
Optimizing regular expressions is hard. Often, a change intended to improve the performance of a regular expression will change the original semantics by accident. In addition, optimizations usually make expressions harder to read. Consider the following example of URL schemes:
An optimized version (produced by the crs-toolchain) could look like this:
m(?:a(?:ilto|ven)|umble|ms)
The above expression is an optimization because it reduces the number of backtracking steps when a branch fails. The regular expressions in the CRS are often comprised of lists of tens or even hundreds of words. Reading such an expression in an optimized form is difficult: even the simple optimized example above is difficult to read.
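One way to guard against accidental changes in semantics is to compare an optimized expression with a plain alternation over sample inputs. In the Python sketch below, the word list is an assumption derived by expanding the optimized expression shown above, and the extra candidates are made-up near misses.

```python
import re

# Assumption: the words obtained by expanding the optimized expression.
schemes = ["mailto", "maven", "mumble", "mms"]

plain = re.compile("|".join(schemes))
optimized = re.compile(r"m(?:a(?:ilto|ven)|umble|ms)")

# Made-up near misses that neither form should accept as a whole.
candidates = schemes + ["ma", "mail", "mumbles", "ms", "http"]

agreement = all(
    bool(plain.fullmatch(word)) == bool(optimized.fullmatch(word))
    for word in candidates
)
print(agreement)  # True
```

A check of this kind only covers the inputs it is given; it does not prove that two expressions are equivalent.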
In general, contributors should not try to optimize contributed regular expressions and should instead strive for clarity. New regular expressions will usually be required to be submitted as a .ra
file for the crs-toolchain to process. In such a file, the regular expression is decomposed into individual parts, making manual optimizations much harder or even impossible (and unnecessary with the crs-toolchain
). The crs-toolchain
performs some common optimizations automatically, such as the one shown above.
Whether optimizations make sense in a contribution is assessed for each case individually.
Rules Compliance with Paranoia Levels
The rules in CRS are organized into paranoia levels (PLs) which makes it possible to define how aggressive CRS is. See the documentation on paranoia levels for an introduction and more detailed explanation.
Each rule that is placed into a paranoia level must contain the tag paranoia-level/N, where N is the PL value. However, this tag can only be added if the rule does not use the nolog action.
The types of rules that are allowed at each paranoia level are as follows:
PL 0:
- ModSecurity / WAF engine installed, but almost no rules
PL 1:
- Default level: keep in mind that most installations will normally use this level
- Any complex, memory consuming evaluation rules will surely belong to a higher level, not this one
- CRS will normally use atomic checks in single rules at this level
- Confirmed matches only; all scores are allowed
- No false positives / low false positives: try to avoid adding rules with potential false positives!
- False negatives could happen
PL 2:
- Chain usage is allowed
- Confirmed matches use score critical
- Matches that cause false positives are limited to using scores notice or warning
- Low false positive rates
- False negatives are not desirable
PL 3:
- Chain usage with complex regular expression look arounds and macro expansions are allowed
- Confirmed matches use scores warning or critical
- Matches that cause false positives are limited to using score notice
- False positive rates are higher but limited to multiple matches (not single strings)
- False negatives should be a very unlikely accident
PL 4:
- Every item is inspected
- Variable creations are allowed to avoid engine limitations
- Confirmed matches use scores notice, warning, or critical
- Matches that cause false positives are limited to using scores notice or warning
- False positive rates are higher (even on single strings)
- False negatives should not happen at this level
- Check everything against RFCs and allow listed values for the most popular elements
ID Numbering Scheme
The CRS project uses the numerical ID rule namespace from 900,000 to 999,999 for CRS rules, as well as 9,000,000 to 9,999,999 for default CRS rule exclusion packages and plugins.
- Rules applying to the incoming request use the ID range 900,000 to 949,999.
- Rules applying to the outgoing response use the ID range 950,000 to 999,999.
The rules are grouped by the vulnerability class they address (SQLi, RCE, etc.) or the functionality they provide (e.g., initialization). These groups occupy blocks of thousands (e.g., SQLi: 942,000 - 942,999). These grouped rules are defined in files dedicated to a single group or functionality. The filename takes up the first three digits of the rule IDs defined within the file (e.g., SQLi: REQUEST-942-APPLICATION-ATTACK-SQLI.conf
).
The individual rules within each file for a vulnerability class are organized by the paranoia level of the rules. PL 1 is first, then PL 2, etc.
The ID block 9xx000 - 9xx099 is reserved for use by CRS helper functionality. There are no blocking or filtering rules in this block.
Among the rules providing CRS helper functionality are rules that skip other rules depending on the paranoia level. These rules always use the following reserved rule IDs: 9xx011 - 9xx018, with very few exceptions.
The blocking and filter rules start at 9xx100 with a step width of 10, e.g., 9xx100, 9xx110, 9xx120, etc.
The ID of a rule does not correspond directly with its paranoia level. Given the size of rule groups and how they’re organized by paranoia level (starting with the lower PL rules first), PL 2 and above tend to be composed of rules with higher ID numbers.
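The numbering scheme can be summarized as a small classifier. This is a minimal Python sketch based only on the ranges stated above; the function `describe_rule_id` and its dictionary keys are hypothetical, and the "very few exceptions" mentioned for the paranoia level skip rules are not modeled.

```python
def describe_rule_id(rule_id: int) -> dict:
    """Classify a rule ID according to the ranges described above."""
    if 900_000 <= rule_id <= 999_999:
        within_block = rule_id % 1000
        return {
            "namespace": "crs",
            "direction": "request" if rule_id <= 949_999 else "response",
            # First three digits, matching the rule file name prefix.
            "file_prefix": rule_id // 1000,
            # 9xx000 - 9xx099 is reserved for helper functionality.
            "helper": within_block <= 99,
            # 9xx011 - 9xx018 are used for paranoia level skip rules.
            "paranoia_skip": 11 <= within_block <= 18,
        }
    if 9_000_000 <= rule_id <= 9_999_999:
        return {"namespace": "exclusion-packages-and-plugins"}
    return {"namespace": "outside-crs"}


print(describe_rule_id(942100)["direction"])  # request
print(describe_rule_id(942013)["helper"])     # True
print(describe_rule_id(9507010)["namespace"]) # exclusion-packages-and-plugins
```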
Stricter Siblings
Within a rule file / block, there are sometimes smaller groups of rules that belong together. They’re closely linked and very often represent copies of the original rules with a stricter limit (alternatively, they can represent the same rule addressing a different target in a second rule, where this is necessary). These are stricter siblings of the base rule. Stricter siblings usually share the first five digits of the rule ID and raise the rule ID by one, e.g., a base rule at 9xx160 and a stricter sibling at 9xx161.
Stricter siblings often have different paranoia levels. This means that the base rule and the stricter siblings don’t usually reside next to each other in the rule file. Instead, they’re ordered by paranoia level and are linked by the first digits of their rule IDs. It’s good practice to introduce all stricter siblings together as part of the definition of the base rule: this can be done in the comments of the base rule. It’s also good practice to refer back to the base rule with the keywords “stricter sibling” in the comments of the stricter siblings themselves. For example: “…This is performed in two separate stricter siblings of this rule: 9xxxx1 and 9xxxx2”, and “This is a stricter sibling of rule 9xxxx0.”
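Because stricter siblings usually share the first five digits of the rule ID, the relationship can be derived arithmetically. The Python sketch below assumes that usual convention holds; `base_rule_of` and `are_siblings` are hypothetical helper names, and the rule IDs are arbitrary examples.

```python
def base_rule_of(rule_id: int) -> int:
    """Return the ID ending in 0 that shares the first five digits."""
    return rule_id - rule_id % 10


def are_siblings(first: int, second: int) -> bool:
    """Return True if two distinct IDs share their first five digits."""
    return first != second and first // 10 == second // 10


print(base_rule_of(942161))          # 942160
print(are_siblings(942160, 942161))  # True
print(are_siblings(942160, 942170))  # False
```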
Writing Tests
Each rule should be accompanied by tests. Rule tests are an invaluable way to check that a rule behaves as expected:
- Does the rule correctly match against the payloads and behaviors that the rule is designed to detect? (Positive tests)
- Does the rule correctly not match against legitimate requests, i.e., the rule doesn’t cause obvious false positives? (Negative tests)
Rule tests also provide an excellent way to test WAF engines and implementations to ensure they behave and execute CRS rules as expected.
The rule tests are located under tests/regression/tests
. Each CRS rule file has a corresponding directory and each individual rule has a corresponding YAML file containing all the tests for that rule. For example, the tests for rule 911100 (Method is not allowed by policy) are in the file REQUEST-911-METHOD-ENFORCEMENT/911100.yaml
.
Full documentation of the required formatting and available options of the YAML tests can be found at https://github.com/coreruleset/ftw-tests-schema/blob/main/spec.
Positive Tests
Example of a simple positive test:
- test_id: 26
  desc: "Unix command injection"
  stages:
    - input:
        dest_addr: 127.0.0.1
        headers:
          Host: localhost
          User-Agent: "OWASP CRS test agent"
          Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
        method: POST
        port: 80
        uri: "/"
        data: "var=` /bin/cat /etc/passwd`"
        version: HTTP/1.1
      output:
        log:
          expect_ids: [932230]
This test will succeed if the log contains an entry for rule 932230, which would indicate that the rule in question matched and generated an alert.
It’s important that tests consistently include the HTTP header fields Host
, User-Agent
, and Accept
. CRS includes rules that detect if these headers are missing or empty, so these headers should be included in each test to avoid unnecessarily causing those rules to match. Ideally, each positive test should cause only the rule in question to match.
The test’s description field, desc, is important. It should describe what is being tested: what should match, what should not match, etc. It is often a good idea to use the YAML literal string scalar form, as it makes it easy to write descriptions spanning multiple lines:
- test_id: 1
  desc: |
    This is the first line of the description.
    On the second line, we might see the payload.
    NOTE: this is something important to take into account.
  stages:
  #...
Negative Tests
Example of a simple negative test:
- test_id: 4
  stages:
    - input:
        dest_addr: "127.0.0.1"
        method: "POST"
        port: 80
        headers:
          User-Agent: "OWASP CRS test agent"
          Host: "localhost"
          Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
        data: 'foo=ping pong tables'
        uri: '/'
      output:
        log:
          no_expect_ids: [932260]
This test will succeed if the log does not contain an entry for rule 932260, which would indicate that the rule in question did not match and so did not generate an alert.
Encoded Request
It is possible to encode an entire test request. This encapsulates the request and means that the request headers and payload don’t need to be explicitly declared. This is useful when a test request needs to use unusual bytes which might break YAML parsers, or when a test request must be intentionally malformed in a way that is impossible to describe otherwise. An encoded request is sent exactly as intended.
The encoded_request
field expects a complete request in base64 encoding:
encoded_request: <Base64 string>
For example:
encoded_request: "R0VUIFwgSFRUUA0KDQoK"
where R0VUIFwgSFRUUA0KDQoK is the base64-encoded equivalent of GET \ HTTP\r\n\r\n\n (the encoded bytes end with an additional line feed).
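A value for encoded_request can be produced, or inspected, with any base64 tool. The following Python sketch uses the standard library's `base64` module to decode the example value and to encode raw request bytes; `encode_request` is a hypothetical helper name.

```python
import base64

encoded = "R0VUIFwgSFRUUA0KDQoK"

# Decode to see exactly which bytes will be sent.
raw = base64.b64decode(encoded)
print(repr(raw))  # b'GET \\ HTTP\r\n\r\n\n'


def encode_request(raw_request: bytes) -> str:
    """Return the base64 text for a complete raw HTTP request."""
    return base64.b64encode(raw_request).decode("ascii")


print(encode_request(raw) == encoded)  # True
```

Decoding a value before committing it is a quick way to confirm that line endings and unusual bytes are exactly as intended.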
Using The Correct HTTP Endpoint
The CRS project uses Albedo as the backend server for tests. This backend provides one dedicated endpoint for each HTTP method. Tests should target these endpoints to:
- improve test throughput (prevent HTML from being returned by the backend)
- specify responses when testing response rules
In general, test URIs can be arbitrary. Albedo will respond with 200 OK
and an empty body to any requests that don’t match any of the endpoints explicitly defined by Albedo. These are:
- /capabilities: describes capabilities of the used Albedo version
- /reflect: used to specify the response that Albedo should return
See the Albedo documentation for further information on how to use /reflect
.
Further Guidance on Rule Writing
Leaving Audit Log Configuration Unchanged
Former versions of CRS dynamically included the HTTP response body in the audit log via special ctl
statements on certain individual response rules. This was never applied in a systematic way and, regardless, CRS should not change the format of the audit log by itself, namely because this can lead to information leakages. Therefore, the use of ctl:auditLogParts=+E
or any other form of ctl:auditLogParts
is not allowed in CRS rules.
Non-Rules General Guidelines
- Remove trailing spaces from files (if they’re not needed). This will make linters happy.
- EOF should have an EOL.
The pre-commit
framework can be used to check for and fix these issues automatically. First, go to the pre-commit website and download the framework. Then, after installing, use the command pre-commit install
so that the tools are installed and run each time a commit is made. CRS provides a config file that will keep the repository clean.
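For illustration, the two whitespace guidelines above amount to checks like the ones in this Python sketch. It is not the CRS pre-commit configuration; `check_text` is a hypothetical helper that operates on file content already read into a string.

```python
def check_text(text: str) -> list[str]:
    """Report trailing whitespace and a missing final newline."""
    problems = []
    for number, line in enumerate(text.split("\n"), start=1):
        # rstrip() removes spaces, tabs, and carriage returns at line end.
        if line != line.rstrip():
            problems.append(f"line {number}: trailing whitespace")
    if not text.endswith("\n"):
        problems.append("missing newline at end of file")
    return problems


print(check_text("SecRuleEngine On\n"))  # []
print(check_text("SecRuleEngine On \nSecAction"))
# ['line 1: trailing whitespace', 'missing newline at end of file']
```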
The crs-toolchain is the utility belt of CRS developers. It provides a single point of entry and a consistent interface for a range of different tools. Its core functionality (owed to the great rassemble-go, which is itself based on the brain-melting Regexp::Assemble Perl module) is to assemble individual parts of a regular expression into a single expression (with some optimizations).
Setup
With the Binary
The best way to get the tool is using one of the pre-built binaries from GitHub. Navigate to the latest release and download the package of choice along with the crs-toolchain-checksums.txt
file. To verify the integrity of the binary/archive, navigate to the directory where the two files are stored and verify that the checksum matches:
cd ~/Downloads
shasum -a 256 -c crs-toolchain-checksums.txt 2>&1 | grep OK
The output should look like the following (depending on the binary/archive downloaded):
crs-toolchain-1.0.0_amd64.deb: OK
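Where shasum is not available, the same check can be sketched in Python. This is only an illustration: the function names are hypothetical, and it assumes that each line of the checksum file has the form "<hex digest>  <file name>" and that the downloaded files sit next to the checksum file.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(checksum_file: Path) -> dict[str, bool]:
    """Check every listed file that exists next to the checksum file.

    Assumed line format: "<hex digest>  <file name>".
    Files that were not downloaded are skipped, much like the grep filter.
    """
    results = {}
    for line in checksum_file.read_text().splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            continue
        expected, name = parts
        name = name.strip().lstrip("*")  # "*" marks binary mode in some tools
        candidate = checksum_file.parent / name
        if candidate.is_file():
            results[name] = sha256_of(candidate) == expected.lower()
    return results
```

Calling verify_checksums(Path.home() / "Downloads" / "crs-toolchain-checksums.txt") would return a mapping from file name to True or False for each downloaded file.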
With Existing Go Environment
⚠️ This might require an updated version of golang in your system.
If a current Go environment is present, simply run
go install github.com/coreruleset/crs-toolchain@latest
Provided that the Go binaries are on the PATH
, the toolchain can now be run from anywhere with
crs-toolchain
It should now be possible to use the crs-toolchain. Test this by running the following in a shell:
printf "(?:homer)? simpson\n(?:lisa)? simpson" | crs-toolchain regex generate -
The output should be:
(?:homer|(?:lisa)?) simpson
Adjusting the Logging Level
The level of logging can be adjusted with the --log-level
option. Accepted values are trace
, debug
, info
, warn
, error
, fatal
, panic
, and disabled
. The default level is info
.
Full Documentation
Read the built-in help text for the full documentation:
crs-toolchain --help
The regex
Command
The regex
command provides sub-commands for everything surrounding regular expressions, especially the “assembly” of regular expressions from a specification of its components (see Assembling Regular Expressions for more details).
Example Use
To generate a reduced expression from a list of expressions, simply pass the corresponding CRS rule ID to the script or pipe the contents to it:
crs-toolchain regex generate 942170
# or
cat regex-assembly/942170.ra | crs-toolchain regex generate -
It is also possible to compare generated expressions to the current expressions in the rule files, like so:
crs-toolchain regex compare 942170
Even better, rule files can be updated directly:
crs-toolchain regex update 942170
# or update all
crs-toolchain regex update --all
The format
sub-command reports formatting violations and actively formats assembly files:
crs-toolchain regex format --all
The util
Command
The util
command includes sub-commands that are used from time to time and do not fit nicely into any of the other groups. Currently, the only sub-command is renumber-tests
. renumber-tests
is used to simplify maintenance of the regression tests. Since every test has a consecutive number within its file, adding or removing tests can disrupt numbering. renumber-tests
will renumber all tests within each test file consecutively.
The completion
Command
The completion
command can be used to generate a shell script for shell completion. For example:
mkdir -p ~/.oh-my-zsh/completions && crs-toolchain completion zsh > ~/.oh-my-zsh/completions/_crs-toolchain
How completion is enabled and where completion scripts are sourced from depends on the environment. Please consult the documentation of the shell in use.
Assembling Regular Expressions
The CRS team uses a custom specification format to specify how a regular expression is to be generated from its components. This format enables reuse across different files, explanation of choices and techniques with comments, and specialized processing.
The files containing regular expression specifications (.ra
suffix, under regex-assembly
) contain one regular expression per line. These files are meant to be processed by the crs-toolchain.
Example
The following is an example of what an assembly file might contain:
##! This line is a comment and will be ignored. The next line is empty and will also be ignored.
##! The next line sets the *ignore case* flag on the resulting expression:
##!+ i
##! The next line is the prefix comment. The assembled expression will be prefixed with its contents:
##!^ \b
##! The next line is the suffix comment. The assembled expression will be suffixed with its contents:
##!$ \W*\(
##! The following two lines are regular expressions that will be assembled:
--a--
__b__
##! Another comment, followed by another regular expression:
^#!/bin/bash
This assembly file would produce the following assembled expression: (?i)\b(?:--a--|__b__|^#!/bin/bash)[^0-9A-Z_a-z]*\(
Lines starting with ##!
are considered comments and will be skipped. Use comments to explain the purpose of a particular regular expression, its use cases, origin, shortcomings, etc. Having more information recorded about individual expressions will allow developers to better understand changes or change requirements, such as when reviewing pull requests.
Empty Lines
Empty lines, i.e., lines containing only white space, will be skipped. Empty lines can be used to improve readability, especially when adding comments.
Flag Marker
A line starting with ##!+
can be used to specify global flags for the regular expression engine. The flags from all lines starting with the flag marker will be combined. The resulting expression will be prefixed with the flags. For example, the two lines
##!+ i
a+b|c
will produce the regular expression (?i)a+b|c
.
The following flags are currently supported:
i
: ignore case; matches will be case-insensitive
s
: make .
match newline (\n
); this is set by ModSecurity anyway and is included here for backward compatibility
Prefix Marker
A line starting with ##!^
can be used to pass a global prefix to the script. The resulting expression will be prefixed with the literal contents of the line. Multiple prefix lines will be concatenated in order. For example, the lines
##!^ \d*\(
##!^ simpson
marge|homer
will produce the regular expression [0-9]*\(simpson(?:marge|homer)
.
The prefix marker exists for convenience and improved readability. The same can be achieved with the assemble processor.
Suffix Marker
A line starting with ##!$
can be used to pass a suffix to the script. The resulting expression will be suffixed with the literal contents of the line. Multiple suffix lines will be concatenated in order. For example, the lines
##!$ \d*\(
##!$ simpson
marge|homer
will produce the regular expression (?:marge|homer)[0-9]*\(simpson
.
The suffix marker exists for convenience and improved readability. The same can be achieved with the assemble processor.
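As a rough illustration of how comments, empty lines, and the flag, prefix, and suffix markers interact, the following Python sketch combines the lines of an assembly file naively. It is not how the crs-toolchain works internally. The function name assemble_naive is hypothetical, processor markers are rejected, and no reduction or rewriting is performed, so the real output will differ in detail (for example, the toolchain emits [0-9] where the input contains \d). Wrapping the alternation in a group only when a prefix or suffix is present is an assumption made to mirror the examples above.

```python
def assemble_naive(lines: list[str]) -> str:
    """Naively combine assembly lines into one expression (sketch only)."""
    flags: set[str] = set()
    prefixes: list[str] = []
    suffixes: list[str] = []
    expressions: list[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # empty lines are skipped
        if line.startswith("##!+"):
            flags.update(line[4:].replace(" ", ""))  # flags are combined
        elif line.startswith("##!^"):
            prefixes.append(line[4:].strip())  # prefixes concatenate in order
        elif line.startswith("##!$"):
            suffixes.append(line[4:].strip())  # suffixes concatenate in order
        elif line.startswith(("##!>", "##!<", "##!=")):
            raise ValueError("processor markers are not supported here")
        elif line.startswith("##!"):
            continue  # plain comment
        else:
            expressions.append(line)
    body = "|".join(expressions)
    if prefixes or suffixes:
        # Group the alternation so prefix and suffix apply to all branches.
        body = "(?:" + body + ")"
    flag_group = "(?" + "".join(sorted(flags)) + ")" if flags else ""
    return flag_group + "".join(prefixes) + body + "".join(suffixes)
```

For instance, assemble_naive([r"##!^ \d*\(", "##!^ simpson", "marge|homer"]) yields \d*\(simpson(?:marge|homer), which is equivalent to the toolchain output shown for the prefix marker.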
Processor Marker
A line starting with ##!>
is a processor directive. The processor marker can be used to preprocess a block of lines.
A line starting with ##!<
marks the end of the most recent processor block.
Processor markers have the following general format: <marker> <processor name> [<processor arguments>]
. For example: ##!> cmdline unix
. The arguments depend on the processor and may be empty.
The following example is intentionally simple (and meaningless) to illustrate the use of the markers without adding further confusing pieces. Please refer to the following sections for more concrete and useful examples.
##!> cmdline unix
command1
command2
##!> assemble
nested1
nested2
##!<
##!<
Processors are defined in the crs-toolchain.
Nesting
Processors may be nested. This enables complex scenarios, such as assembling a smaller expression to concatenate it with another line or block of lines. For example, the following will produce the regular expression line1(?:ab|cd)
:
##!> assemble
line1
##!=>
##!> assemble
ab
cd
##!<
##!<
There is no practical limit to the nesting depth.
Each processor block must be terminated with the end marker ##!<
, except for the outermost (default) block, where the end marker is optional.
Command Line Evasion processor
Processor name: cmdline
Arguments
unix|windows
(required): The processor argument determines the escaping strategy used for the regular expression. Currently, the two supported strategies are Windows cmd (windows
) and “unix like” terminal (unix
).
Output
One line per line of input, escaped for the specified environment.
Description
The command line evasion processor treats each line as a word (e.g., a shell command) that needs to be escaped.
Lines starting with a single quote '
are treated as literals and will not be escaped.
The special token @
will be replaced with an optional “word ending” regular expression. This can be used in the context of a shell to reduce the number of false positives for a word by requiring a subsequent token to be present. For example: python@
.
@
will match:
python<<<'print("hello")'
python <<< 'print("hello")'
@
will not match:
python3<<<'print("hello")'
python3 <<< 'print("hello")'
The special token ~
acts like @
but does not allow any white space tokens to immediately follow the preceding word. This is useful for adding common English words to word lists. For example, there are multiple executable names for “python”, such as python3
or python3.8
. These could not be added with python@
, as python
would be a valid match and create many false positives.
~
will match:
python<<<'print("hello")'
python3 <<< 'print("hello")'
~
will not match:
python <<< 'print("hello")'
The patterns that are used by the command line evasion processor are configurable. The default configuration for the CRS can be found in the toolchain.yaml
in the regex-assembly
directory of the CRS project.
The following is an example of how the command line evasion processor can be used:
##!> cmdline unix
w@
gcc~
'python[23]
aptitude@
pacman@
##!<
Assemble processor
Processor name: assemble
Arguments
This processor does not accept any arguments.
Output
Single line regular expression, where each line of the input is treated as an alternation of the regular expression. Input can also be stored or concatenated by using the two marker comments for input (##!=<
) and output (##!=>
).
Description
Each line of the input is treated as an alternation of a regular expression, processed into a single line. The resulting regular expression is not optimized (in the strict sense) but is reduced (i.e., common elements may be put into character classes or groups). The ordering of alternations in the output can differ from the order in the file (ordering alternations by length is a simple performance optimization).
This processor can also store the output of a block delimited with the input marker ##!=<
, or produce the concatenation of blocks delimited with the output marker ##!=>
.
Lines within blocks delimited by input or output markers are treated as alternations, as usual. The input and output markers enable more complex scenarios, such as separating parts of the regular expression in the assembly file for improved readability. Rule 930100, for example, uses separate expressions for periods and slashes, since it’s easier to reason about the differences when they are physically separated. The following example is based on rules from 930100:
##!> assemble
##! slash patterns
\x5c
##! URI encoded
%2f
%5c
##!=>
##! dot patterns
\.
\.%00
\.%01
##!<
The above would produce the following, concatenated regular expression:
(?:\x5c|%(?:2f|5c))\.(?:%0[0-1])?
The input marker ##!=<
takes an identifier as a parameter and associates the output of the preceding block with the identifier. No output is produced when using the input ##!=<
marker. To concatenate the output of a previously stored block, the appropriate identifier must be passed to the output marker ##!=>
as an argument. Stored blocks remain in storage until the end of the program and are available globally. Any input stored previously can be retrieved at any nesting level. Both of the following examples produce the output ab
:
##!> assemble
ab
##!=< myinput
##!> assemble
##!=> myinput
##!<
##!<
##!> assemble
ab
##!=< myinput
##!<
##!> assemble
##!=> myinput
##!<
Rule 930100 requires the following concatenation of rules: <slash rules><dot rules><slash rules>
, where slash rules
is concatenated twice. The following example produces this sequence by storing the expression for slashes with the identifier slashes
, thus avoiding duplication:
##!> assemble
##! slash patterns
\x5c
##! URI encoded
%2f
%5c
##!=< slashes
##!=> slashes
##! dot patterns
\.
\.%00
\.%01
##!=>
##!=> slashes
##!<
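The store-and-concatenate idea behind the input and output markers can be modelled in a few lines of Python. This is a minimal sketch under stated assumptions: the helper alternation and the dictionary stored are hypothetical stand-ins, and the alternations are joined without the reduction that the assemble processor performs, so the resulting pattern is longer than the toolchain's output but is intended to match the same strings.

```python
import re


def alternation(parts: list[str]) -> str:
    """Join expressions into one non-capturing alternation (no reduction)."""
    return "(?:" + "|".join(parts) + ")"


# Mirrors "##!=< slashes": keep the output of the block under an identifier.
stored = {"slashes": alternation([r"\x5c", "%2f", "%5c"])}

# The dot patterns form the block that precedes the bare "##!=>" marker.
dots = alternation([r"\.", r"\.%00", r"\.%01"])

# Mirrors "##!=> slashes", the dot block, and "##!=> slashes" again,
# giving <slash rules><dot rules><slash rules>.
pattern = stored["slashes"] + dots + stored["slashes"]

# Example check: an encoded slash, a dot with %00, then a backslash.
matches = re.fullmatch(pattern, "%2f.%00\\") is not None
```

Storing the slash expression once and reusing it keeps both occurrences identical, which is the point of the identifier in the assembly file.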
Definition processor
Processor name: define
Arguments
- Identifier (required): The name of the definition that will be processed by this processor
- Replacement (required): The string that replaces the definition identified by
identifier
Output
One line per line of input, with all definition strings replaced with the specified replacement.
Description
The definition processor makes it easy to add recurring strings to expressions. This helps reduce maintenance when a definition needs to be updated. It also improves readability as definition strings provide readable and bounded information, where otherwise a regular expression must be read and boundaries must be identified.
The format of definition strings is as follows:
{{identifier}}
The definition string starts with two opening braces, is followed by an identifier, and ends with two closing braces. The identifier format must satisfy the following regular expression:
[a-zA-Z0-9_-]+
An identifier must have at least one character and consist only of upper and lowercase letters a through z, digits 0 through 9, and underscore or dash.
The following example shows how to use the definition processor:
##!> define slashes [/\x5c]
regex with {{slashes}}
This would result in the output regex with [/\x5c]
.
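The replacement step can be pictured with a short Python sketch. It is an illustration only, not the toolchain's implementation: the function name expand_definitions is hypothetical, the identifier pattern is derived from the prose rule above (letters, digits, underscore, dash, at least one character), and leaving unknown identifiers untouched is an assumption.

```python
import re

# Two opening braces, an identifier, two closing braces.
DEFINITION = re.compile(r"\{\{([a-zA-Z0-9_-]+)\}\}")


def expand_definitions(line: str, definitions: dict[str, str]) -> str:
    """Replace every known {{identifier}} in a line with its replacement."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        # Assumption: unknown identifiers are left as they are.
        return definitions.get(name, match.group(0))

    # A callable replacement is used so that backslashes in the
    # replacement text are inserted literally.
    return DEFINITION.sub(substitute, line)
```

With the definition slashes set to [/\x5c], expand_definitions("regex with {{slashes}}", {"slashes": r"[/\x5c]"}) returns regex with [/\x5c].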
Include processor
Processor name: include
Arguments
- Include file name (required): The name of the file to include, without suffix
- Suffix replacements (optional): Any number of two-tuples, where the first entry is the suffix to match and the second entry is the replacement. To use, write
--
after the include file name. Tuples are space separated
Output
The exact contents of the included file, including processor directives, with suffixes replaced where appropriate. The prefix and suffix markers are not allowed in included files.
Description
The include processor reduces repetition across assembly files. Repeated blocks can be put into a file in the include
directory and then be included with the include
processor comment. Include files are normal assembly files, hence include files can also contain further include directives. The only restriction is that included files must not contain the prefix or suffix markers. This is a technical limitation in the crs-toolchain.
The contents of an include file could, for example, be the alternation of accepted HTTP methods:
POST
GET
HEAD
This could be included into an assembly file for a rule that adds an additional method:
##!> include http-headers
OPTIONS
The resulting regular expression would be (?:POS|GE)T|HEAD|OPTIONS
.
Additionally, definition directives of include files are available to the including file. This means that include files can be used as libraries of expressions. For example, an include file called lib.ra
could contain the following definitions:
##!> define quotes ['"`]
##!> define opt-lazy-wspace \s*?
These definitions could then be used in an including file as follows:
##!> include lib
it{{quotes}}s{{opt-lazy-wspace}}possible
Note that the include processor does not have a body, thus the end marker is optional.
Please see Include-Except processor for how suffix replacements work.
Include-Except processor
Processor name: include-except
Arguments
- Include file name (required): The name of the file to include, without suffix
- Exclude file names (required): One or more names of files to consult for exclusions, without suffix, space separated
- Suffix replacements (optional): Any number of two-tuples, where the first entry is the suffix to match and the second entry is the replacement. To use, end the list of exclude file names with
--
. Tuples are space separated
Output
The contents of the included file as per the include processor, but with all matching lines from the exclude file removed. Suffixes will have been replaced as appropriate.
Description
The include-except processor further improves reusability of include files by removing exact line matches found in any of the listed exclude files from the result. A use case for this scenario is remote command execution where it is desirable to have a single list of commands but where certain commands should be excluded from some rules to avoid false positives. Consider the following list of command words:
This list may be usable at paranoia level 2 or 3 but the words cat
and who
would produce too many false positives at paranoia level 1. To work around this issue, the following exclude file can be used:
The regular expression for a rule at paranoia level 1 would then be generated by the following:
##!> include-except command-list pl1-exclude-list
The processor accepts more than one exclude file, each file name separated by a space.
Additionally, the processor can be instructed to replace suffixes of entries in the include file. The use case for this is primarily that we have word lists used together with the cmdline
processor, where entries can be suffixed with @
or ~
. The same lists can be used in other contexts but then the cmdline
suffixes need to be replaced with a regular expression. The following is an example, where @
will be replaced with [\s<>]
and ~
with [^\s]
:
##!> include-except command-list pl1-exclude-list -- @ [\s<>] ~ [^\s]
""
is the special literal used to represent the empty string in suffix replacements. In order to replace a suffix with the empty string one would write, for example:
##!> include-except command-list pl1-exclude-list -- @ "" ~ ""
Suffix replacement is performed after all exclusions have been removed, which means that entries in exclude files must target the verbatim contents of the include file, i.e., some entry@
, not some entry[\s<>]
.
Note that the include-except processor does not have a body, thus the end marker is optional.
Development
We have a syntax highlight extension for Visual Studio Code that helps with writing assembly files. Instructions on how to install the extension can be found in the readme of the repository: https://github.com/coreruleset/regexp-assemble-syntax
Using the CRS Sandbox
Introducing the CRS Sandbox
We have set up a public CRS Sandbox which you can use to send attacks at the CRS. You can choose between various WAF engines and CRS versions. The sandbox parses audit logs and returns our detections in an easy and useful format.
The sandbox is useful for:
- integrators and administrators: you can test out our response in case of an urgent security event, such as the Log4j vulnerability;
- exploit developers/researchers: if you have devised a payload, you can test beforehand if it will be blocked by the CRS and by which versions;
- CRS developers/rule writers: you can quickly check if the CRS catches a (variant of an) exploit without the hassle of setting up your own CRS instance.
Basic usage
The sandbox is located at https://sandbox.coreruleset.org/.
An easy way to use the sandbox is to send requests to it with curl
, although you can use any HTTPS client.
The sandbox has many options, which you can change by adding HTTP headers to your request. One is very important so we will explain it first; this is the X-Format-Output: txt-matched-rules
header. If you add this header to your request, the sandbox will parse the WAF’s output, and return to you the matched CRS rule IDs with descriptions, and the score for your request.
Example
curl -H "x-format-output: txt-matched-rules" https://sandbox.coreruleset.org/?file=/etc/passwd
930120 PL1 OS File Access Attempt
932160 PL1 Remote Command Execution: Unix Shell Code Found
949110 PL1 Inbound Anomaly Score Exceeded (Total Score: 10)
980130 PL1 Inbound Anomaly Score Exceeded (Total Inbound Score: 10 - SQLI=0,XSS=0,RFI=0,LFI=5,RCE=5,PHPI=0,HTTP=0,SESS=0): individual paranoia level scores: 10, 0, 0, 0
In this example, we sent ?file=/etc/passwd
as a GET payload. The CRS should catch the string /etc/passwd
which is on our blocklist. Try out the command in a terminal now if you like!
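For scripted use, the txt-matched-rules output can be split into its parts. The sketch below assumes that every rule line has the shape "<rule ID> PL<level> <message>", which is inferred from the example output only; the function name parse_matched_rules is hypothetical.

```python
import re

# Assumed line shape, inferred from the example output:
# "<rule ID> PL<paranoia level> <message>"
RULE_LINE = re.compile(r"^(\d+) PL(\d) (.*)$")


def parse_matched_rules(text: str) -> list[dict]:
    """Turn txt-matched-rules output into a list of rule dictionaries."""
    rules = []
    for line in text.splitlines():
        match = RULE_LINE.match(line.strip())
        if match:  # lines that do not look like rule matches are ignored
            rules.append(
                {
                    "id": int(match.group(1)),
                    "paranoia_level": int(match.group(2)),
                    "message": match.group(3),
                }
            )
    return rules
```

The result can then be filtered, for example to list only the IDs of the rules that matched.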
You can send anything you want at the sandbox. For instance, you can send HTTP headers and POST data, use various HTTP methods, et cetera.
The sandbox will return a 200 response code, no matter if an attack was detected or not.
The sandbox also adds an X-Unique-Id
header to the response. It contains a unique value that you can use to refer to your request when communicating with us. With curl -i
you can see the returned headers.
curl -i -H 'x-format-output: txt-matched-rules' 'https://sandbox.coreruleset.org/?test=posix_uname()'
HTTP/1.1 200 OK
Date: Tue, 25 Jan 2022 13:53:07 GMT
Content-Type: text/plain
Transfer-Encoding: chunked
Connection: keep-alive
X-Unique-ID: YfAAw3Gq8uf24wZCMjHTcAAAANE
x-backend: apache-3.3.2
933150 PL1 PHP Injection Attack: High-Risk PHP Function Name Found
949110 PL1 Inbound Anomaly Score Exceeded (Total Score: 5)
980130 PL1 Inbound Anomaly Score Exceeded (Total Inbound Score: 5 - SQLI=0,XSS=0,RFI=0,LFI=0,RCE=0,PHPI=5,HTTP=0,SESS=0): individual paranoia level scores: 5, 0, 0, 0
Default options
It’s useful to know that you can tweak the sandbox in various ways. If you don’t send any X-
headers, the sandbox will use the following defaults.
- The default backend is Apache 2 with ModSecurity 2.9.
- The default CRS version is the latest release version, currently 3.3.2.
- The default Paranoia Level is 1, which is the least strict setting.
- By default, the response is the full audit log from the WAF, which is verbose and includes unnecessary information, which is why
X-Format-Output: txt-matched-rules
is useful.
Changing options
Let’s say you want to try your payload on different WAF engines or CRS versions, or like the output in a different format for automated usage. You can do this by adding the following HTTP headers to your request:
- x-crs-version: will pick another CRS version. Available values are 3.3.2 (default), 3.2.1, nightly (which has the latest changes which are not released) and 3.4.0-dev-log4j (which contains an experimental rule against Log4j attacks).
- x-crs-paranoia-level: will run CRS in a given paranoia level. Available values are 1 (default), 2, 3, 4.
- x-crs-mode: can be changed to return the HTTP status code from the backend WAF. Default value is blocking (On), and can be changed using detection (will set engine to DetectionOnly). Values are case insensitive.
- x-crs-inbound-anomaly-score-threshold: defines the inbound anomaly score threshold. Valid values are any integer > 0, with 5 being the CRS default. ⚠️ Anything different than a positive integer will be taken as 0, so it will be ignored. This only makes sense if blocking mode is enabled (the default now).
- x-crs-outbound-anomaly-score-threshold: defines the outbound anomaly score threshold. Valid values are any integer > 0, with 4 being the CRS default. ⚠️ Anything different than a positive integer will be taken as 0, so it will be ignored. This only makes sense if blocking mode is enabled (the default now).
- x-backend: allows you to select the specific backend web server:
  - apache (default) will send the request to Apache 2 + ModSecurity 2.9.
  - nginx will send the request to Nginx + ModSecurity 3.
  - coraza-caddy will send the request to Caddy + Coraza WAF.
- x-format-output: formats the response to your use-case (human or automation). Available values are:
  - omitted/default: the WAF’s audit log is returned unmodified as JSON
  - txt-matched-rules: human-readable list of CRS rule matches, one rule per line
  - txt-matched-rules-extended: same but with explanation for easy inclusion in publications
  - json-matched-rules: JSON formatted CRS rule matches
  - csv-matched-rules: CSV formatted
The header names are case-insensitive.
Tip: if you work with JSON output (either unmodified or matched rules), jq
is a useful tool to work with the output, for example you can add | jq .
to get a pretty-printed JSON, or use jq
to filter and modify the output.
Advanced examples
Let’s say you want to send a payload to an old CRS version 3.2.1 and choose Nginx + ModSecurity 3 as a backend, because this is what you are interested in. You want to get the output in JSON because you want to process the results with a script. (For now, we use jq
to pretty-print it.)
The command would look like:
curl -H "x-backend: nginx" \
-H "x-crs-version: 3.2.1" \
-H "x-format-output: json-matched-rules" \
https://sandbox.coreruleset.org/?file=/etc/passwd | jq .
[
{
"message": "OS File Access Attempt",
"id": "930120",
"paranoia_level": "1"
},
{
"message": "Remote Command Execution: Unix Shell Code Found",
"id": "932160",
"paranoia_level": "1"
},
{
"message": "Inbound Anomaly Score Exceeded (Total Score: 10)",
"id": "949110",
"paranoia_level": "1"
}
]
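A script consuming the json-matched-rules output might group the matches by paranoia level. The sketch below assumes the response is a JSON array of objects with the string fields id, paranoia_level, and message, as in the example above; the function name group_by_paranoia_level is hypothetical.

```python
import json
from collections import defaultdict


def group_by_paranoia_level(payload: str) -> dict[int, list[int]]:
    """Map each paranoia level to the IDs of the rules matched at it."""
    grouped: dict[int, list[int]] = defaultdict(list)
    for rule in json.loads(payload):
        # Both fields arrive as strings in the example output.
        grouped[int(rule["paranoia_level"])].append(int(rule["id"]))
    return dict(grouped)
```

For the example response above, this returns a single entry for paranoia level 1 containing the three rule IDs.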
Let’s say you are working on a vulnerability publication and want to add a paragraph to explain how CRS protects (or doesn’t!) against your exploit. Then the txt-matched-rules-extended
can be a useful format for you.
curl -H 'x-format-output: txt-matched-rules-extended' \
https://sandbox.coreruleset.org/?file=/etc/passwd
This payload has been tested against OWASP CRS
web application firewall. The test was executed using the apache engine and CRS version 3.3.2.
The payload is being detected by triggering the following rules:
930120 PL1 OS File Access Attempt
932160 PL1 Remote Command Execution: Unix Shell Code Found
949110 PL1 Inbound Anomaly Score Exceeded (Total Score: 10)
980130 PL1 Inbound Anomaly Score Exceeded (Total Inbound Score: 10 - SQLI=0,XSS=0,RFI=0,LFI=5,RCE=5,PHPI=0,HTTP=0,SESS=0): individual paranoia level scores: 10, 0, 0, 0
CRS therefore detects this payload starting with paranoia level 1.
Capacity
- Please do not send more than 10 requests per second.
- We will try to scale in response to demand.
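Scripts that send many payloads can stay below the request limit with a simple client-side throttle. The class below is a minimal sketch, not part of any CRS tooling: the name Throttle is hypothetical, and the clock and sleep functions are injectable only to make the behaviour easy to test.

```python
import time


class Throttle:
    """Space out calls so that at most max_per_second happen per second."""

    def __init__(self, max_per_second: float = 10,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self.last = None  # time of the previous call, if any

    def wait(self) -> None:
        """Block until enough time has passed since the previous call."""
        now = self.clock()
        if self.last is not None:
            remaining = self.min_interval - (now - self.last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last = now
```

Calling wait() on a shared Throttle instance immediately before each request keeps consecutive requests at least a tenth of a second apart with the default setting.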
Architecture
The sandbox consists of various parts. The frontend that receives the requests runs on Openresty. It handles the incoming request, chooses and configures the backend running CRS, proxies the request to the backend, and waits for the response. Then it parses the WAF audit log and sends the matched rules back in the format chosen by the user.
There is a backend container for every engine and version. For instance, one Apache with CRS 3.2.2, one with CRS 3.2.1, et cetera… These are normal webserver installations with a WAF and the CRS.
The backends write their JSON logs to a volume, where a collector script reads them and sends them to an S3 bucket and Elasticsearch.
The logs are parsed, and values like User-Agent and geolocation are extracted. We use Kibana to keep an overview of how the sandbox is used, and hopefully gain new insights about attacks.
Known issues
In some cases, the sandbox will not properly handle and finish your request.
- Malformed HTTP requests: The frontend, Openresty, is itself an HTTP server which performs parsing of the incoming request. The backend servers running CRS are regular webservers such as Apache and Nginx. Either one of these may reject a malformed HTTP request with an error 400 before it is even processed by CRS. This happens for instance when you try to send an Apache 2.4.50 attack that depended on a URL encoding violation. If you receive an error 400, your request was rejected by the frontend or a backend, and it was not scanned by CRS.
- ReDoS: If your request leads to a ReDoS and makes the backend spend too much time to process a regular expression, this leads to a timeout from the backend server. The frontend will cancel the request with an error 502. If you have to wait a long time and then receive an error 502, there was likely a ReDoS situation.
Questions and suggestions
If you have any issues with the CRS sandbox, please open a GitHub issue at https://github.com/coreruleset/coreruleset/issues and we will help you as soon as possible.
If you have suggestions for extra functionality, a GitHub issue is appreciated.
Working on the sandbox: adding new backends
The following notes are handy for our team maintaining the sandbox.
To add a new backend:
- Each backend has its own IP address.
- docker-compose: copy-paste a back-end container. Give it a new unused IP address in the 10.5.0.* virtual network.
- The frontend needs to know how to reach the desired backend. There is a hardcoded list in openresty/conf/access.lua with the target IP address.
- httpd-vhosts.conf needs to be changed.
Testing the Rule Set
Well, you managed to write your rule, but now want to see if it can be added to the CRS? This document should help you to test it using the same tooling the project uses for its tests.
CRS uses go-ftw to run test cases. go-ftw is the successor to the previously used test runner ftw. The CRS project no longer uses ftw but it is still useful for running tests of older CRS versions.
Environments
Before you start to run tests, you should set up your environment. You can use Docker to run a web server with CRS integration or use your existing environment.
Setting up Docker containers
For testing, we use the container images from our project. We “bind mount” the rules in the CRS Git repository to the web server container and then instruct go-ftw to send requests to it.
To test we need two containers: the WAF itself, and a backend, provided in this case by Albedo. The docker-compose.yml
in the CRS Git repository is a ready-to-run configuration for testing, to be used with the docker compose
command.
Important
The supported platform is ModSecurity 2 with Apache httpd
Let’s start the containers by executing the following command:
docker compose -f tests/docker-compose.yml up -d modsec2-apache
[+] Running 2/2
✔ backend Pulled 2.1s
✔ ff7dc8bdd3d5 Pull complete 1.0s
[+] Running 3/3
✔ Network tests_default Created 0.0s
✔ Container tests-backend-1 Started 0.2s
✔ Container modsec2-apache Started 0.2s
Now let’s see which containers are running, using docker ps
:
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
0570b291c386 owasp/modsecurity-crs:apache "/bin/sh -c '/bin/cp…" 7 seconds ago Up 7 seconds (health: starting) 80/tcp, 0.0.0.0:80->8080/tcp modsec2-apache
50704d5c5762 ghcr.io/coreruleset/albedo:0.0.13 "/usr/bin/albedo --p…" 7 seconds ago Up 7 seconds tests-backend-1
Excellent, our containers are running, now we can start our tests.
Using your own environment for testing
If you have your own environment set up, you can configure that for testing. Please follow these instructions to install the WAF server locally.
Note
Remember: The supported platform is ModSecurity 2 with Apache httpd. If you want to run the tests against nginx, you can do that too, but nginx uses libmodsecurity3, which is not fully compatible with Apache httpd + ModSecurity 2.
If you want to run the complete test suite of CRS 4.x with go-ftw, you need to make some modifications to your setup. This is because the test cases for 4.x contain some extra data for responses, letting us test the RESPONSE-*
rules too. Without the following steps these tests will fail.
To enable response handling for tests you will need to download an additional tool, albedo.
Start albedo
Albedo is a simple HTTP server used as a reverse-proxy backend in testing web application firewalls (WAFs). go-ftw relies on Albedo to test WAF response rules.
You can start albedo
with this command:
As you can see, the HTTP server listens on *:8085
. You can check it using:
curl -H "Content-Type: application/json" -d '{"body":"Hello, World from albedo"}' "http://localhost:8085/reflect"
Hello, World from albedo
Check for other features using the url /capabilities
on albedo. The reflection feature is mandatory for testing response rules.
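The same reflection check can be scripted. The following Python sketch builds the request that the curl command above sends; the function name build_reflect_request is hypothetical, the port is the one quoted above, and only the body field shown in the curl example is used.

```python
import json
import urllib.request


def build_reflect_request(
    body: str, base_url: str = "http://localhost:8085"
) -> urllib.request.Request:
    """Build (but do not send) a POST to /reflect asking for the given body."""
    payload = json.dumps({"body": body}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/reflect",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the request to urllib.request.urlopen while albedo is running should return the requested body, which confirms that the reflection feature is available.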
Modify webserver’s config
For the response tests you need to set up your web server as a proxy, forwarding the requests to the backend. The following is an example of such a proxy setup.
Before you start to change your configurations, please make a backup!
Apache httpd
Put this snippet into your httpd’s default config (e.g., /etc/apache2/sites-enabled/000-default.conf
):
ProxyPreserveHost On
ProxyPass / http://127.0.0.1:8000/
ProxyPassReverse / http://127.0.0.1:8000/
ServerName localhost
nginx
Put this snippet into the nginx default config (e.g., /etc/nginx/conf.d/default.conf
) or replace the existing one:
location / {
proxy_pass http://127.0.0.1:8000/;
proxy_set_header Host $host;
proxy_set_header Proxy "";
proxy_set_header Upgrade $http_upgrade;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Port $server_port;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_http_version 1.1;
proxy_buffering off;
proxy_connect_timeout 60s;
proxy_read_timeout 36000s;
proxy_redirect off;
proxy_pass_header Authorization;
}
In both cases (Apache httpd, nginx) you have to change your modsecurity.conf settings. Open that file and find the directive SecResponseBodyMimeType. Modify the arguments:
SecResponseBodyMimeType text/plain text/html text/xml application/json
Note that the default value does not include the MIME type application/json.
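Whether the change is in place can be checked from the shell. The following is a minimal sketch, assuming the directive sits on a single line. The function name has_json_mime is hypothetical, and the location of modsecurity.conf depends on your installation.

```shell
#!/bin/sh
# has_json_mime FILE
# Succeeds if an uncommented SecResponseBodyMimeType line in FILE
# lists application/json. Hypothetical helper for illustration.
has_json_mime() {
    grep -E '^[[:space:]]*SecResponseBodyMimeType[[:space:]]' "$1" \
        | grep -q 'application/json'
}
```

Call it with the path to your own modsecurity.conf, for example: has_json_mime modsecurity.conf || echo "application/json is missing".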
In your crs-setup.conf
you need to add these extra rules (after the rule 900990
):
SecAction \
    "id:900005,\
    phase:1,\
    nolog,\
    pass,\
    ctl:ruleEngine=DetectionOnly,\
    ctl:ruleRemoveById=910000,\
    setvar:tx.blocking_paranoia_level=4,\
    setvar:tx.crs_validate_utf8_encoding=1,\
    setvar:tx.arg_name_length=100,\
    setvar:tx.arg_length=400,\
    setvar:tx.total_arg_length=64000,\
    setvar:tx.max_num_args=255,\
    setvar:tx.max_file_size=64100,\
    setvar:tx.combined_file_sizes=65535"

SecRule REQUEST_HEADERS:X-CRS-Test "@rx ^.*$" \
    "id:999999,\
    phase:1,\
    pass,\
    t:none,\
    log,\
    msg:'%{MATCHED_VAR}'"
Now, after restarting the web server, all requests will be sent to the backend. Let’s start testing.
Go-ftw
Tests are performed using go-ftw. We run our test suite automatically using go-ftw as part of a GitHub workflow. You can easily reproduce that locally, on your workstation.
For that you will need:
Installing Go-FTW
We strongly suggest installing a pre-compiled binary of go-ftw, available on GitHub.
The binary is ready to run and does not require installation. On the releases page you will also find .deb and .rpm packages that can be used for installation on some GNU/Linux systems.
Modern versions of go-ftw also have a self-update command that simplifies updating to newer releases.
You can also build and install go-ftw using go install, if you have a Go environment:
go install github.com/coreruleset/go-ftw@latest
This will install the binary into your $HOME/go/bin directory (the default location, which differs if GOBIN or GOPATH is set). To compile go-ftw from source, run the following commands:
git clone https://github.com/coreruleset/go-ftw.git
cd go-ftw
go build
This will build the binary in the go-ftw repository.
Now create a configuration file. Because Apache httpd and nginx use different log file paths, and, perhaps, different ports, you may want to create two different configuration files for go-ftw. For details please read go-ftw’s documentation.
Example .ftw.nginx.yaml
file for nginx:
logfile: /var/log/nginx/error.log
logmarkerheadername: X-CRS-TEST
testoverride:
  input:
    dest_addr: "127.0.0.1"
    port: 8080
Example file .ftw.apache.yaml
for Apache httpd:
logfile: /var/log/apache2/error.log
logmarkerheadername: X-CRS-TEST
testoverride:
  input:
    dest_addr: "127.0.0.1"
    port: 80
Please verify that these settings are correct for your setup, especially the port
values.
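Since the two example files differ only in the log file path and the port, they can be generated from one template. The following is a minimal sketch. The function name write_ftw_config is hypothetical, and it uses only the configuration keys shown above.

```shell
#!/bin/sh
# write_ftw_config OUTFILE LOGFILE PORT
# Write a go-ftw configuration using only the keys shown above.
# Hypothetical helper for illustration.
write_ftw_config() {
    cat > "$1" <<EOF
logfile: $2
logmarkerheadername: X-CRS-TEST
testoverride:
  input:
    dest_addr: "127.0.0.1"
    port: $3
EOF
}
```

With the example values from above, the two files would be produced by write_ftw_config .ftw.nginx.yaml /var/log/nginx/error.log 8080 and write_ftw_config .ftw.apache.yaml /var/log/apache2/error.log 80.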
Running the test suite
Execute the following command to run the CRS test suite with go-ftw against Apache httpd:
Warning
⚠️ If go-ftw is installed from a pre-compiled binary, then you might have to use ftw
instead of the go-ftw
command.
./go-ftw run --config .ftw.apache.yaml -d ../coreruleset/tests/regression/tests/
🛠️ Starting tests!
🚀 Running go-ftw!
👉 executing tests in file 911100.yaml
running 911100-1: ✔ passed in 239.699575ms (RTT 126.721984ms)
running 911100-2: ✔ passed in 63.339213ms (RTT 69.998361ms)
running 911100-3: ✔ passed in 64.87875ms (RTT 71.368241ms)
running 911100-4: ✔ passed in 77.823772ms (RTT 81.059904ms)
running 911100-5: ✔ passed in 64.451749ms (RTT 70.403898ms)
running 911100-6: ✔ passed in 67.774327ms (RTT 73.803885ms)
running 911100-7: ✔ passed in 65.528094ms (RTT 72.64316ms)
running 911100-8: ✔ passed in 66.129563ms (RTT 73.198992ms)
👉 executing tests in file 913100.yaml
running 913100-1: ✔ passed in 71.242549ms (RTT 76.803619ms)
running 913100-2: ✔ passed in 69.999667ms (RTT 76.617714ms)
running 913100-3: ✔ passed in 70.200211ms (RTT 76.92281ms)
running 913100-4: ✔ passed in 65.856005ms (RTT 73.328341ms)
running 913100-5: ✔ passed in 66.986859ms (RTT 73.494356ms)
...
To run the test suite against nginx, execute the following:
./go-ftw run --config .ftw.nginx.yaml -d ../coreruleset/tests/regression/tests/
🛠️ Starting tests!
🚀 Running go-ftw!
👉 executing tests in file 911100.yaml
running 911100-1: ✔ passed in 851.460335ms (RTT 292.802335ms)
running 911100-2: ✔ passed in 53.748811ms (RTT 66.798867ms)
running 911100-3: ✔ passed in 49.237535ms (RTT 67.964411ms)
running 911100-4: ✔ passed in 194.935023ms (RTT 202.414171ms)
running 911100-5: ✔ passed in 52.905305ms (RTT 66.254034ms)
running 911100-6: ✔ passed in 52.597784ms (RTT 68.58854ms)
running 911100-7: ✔ passed in 51.996881ms (RTT 67.496534ms)
running 911100-8: ✔ passed in 50.804143ms (RTT 67.589557ms)
👉 executing tests in file 913100.yaml
running 913100-1: ✔ passed in 276.383507ms (RTT 85.436758ms)
running 913100-2: ✔ passed in 86.682684ms (RTT 69.89541ms)
...
If you want to run only one test, or a group of tests, you can specify that using the “include” option -i
(or --include
). This option takes a regular expression:
./go-ftw run --config .ftw.apache.yaml -d ../coreruleset/tests/regression/tests/ -i "955100-1$"
In the above case only the test case 955100-1
will be run.
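The trailing $ matters because the option takes a regular expression, not a literal test name. The following sketch shows the effect, using grep -E as a stand-in for go-ftw's matching. This is an assumption for illustration: go-ftw uses Go regular expressions, which behave the same for a pattern this simple. The test IDs other than 955100-1 are made up.

```shell
#!/bin/sh
# select_tests PATTERN
# Read test IDs from stdin and print those matching PATTERN.
# grep -E stands in for go-ftw's include filter (illustration only).
select_tests() {
    grep -E -- "$1"
}

# Hypothetical list of test IDs.
ids='955100-1
955100-10
955100-11
955110-1'

# Anchored pattern: prints only 955100-1.
printf '%s\n' "$ids" | select_tests '955100-1$'

# Unanchored pattern: also prints 955100-10 and 955100-11.
printf '%s\n' "$ids" | select_tests '955100-1'
```

Dropping the anchor is useful when a whole group is wanted, for example a pattern of 955100 to select every test of that rule.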
If you need to see more verbose output (e.g., to look at the requests and responses sent and received by go-ftw) you can use the --debug
or --trace
options:
./go-ftw run --config .ftw.apache.yaml -d ../coreruleset/tests/regression/tests/ -i "955100-1$" --trace
./go-ftw run --config .ftw.apache.yaml -d ../coreruleset/tests/regression/tests/ -i "955100-1$" --debug
Please note again that libmodsecurity3 is not fully compatible with ModSecurity 2, so some tests can fail. If you want to ignore them, you can list the tests under ignore in your config:
testoverride:
  input:
    dest_addr: "127.0.0.1"
    port: 8080
  ignore:
    # text comes from our friends at https://github.com/digitalwave/ftwrunner
    '941190-3$': 'known MSC bug - PR #2023 (Cookie without value)'
    '941330-1$': 'known MSC bug - #2148 (double escape)'
    ...
For more information and examples, please check the go-ftw documentation.
Also please don’t forget to roll back the modifications from this guide to your WAF configuration after you’re done testing!
Additional tips
- ⚠️ If your test is not matching, you can take a peek at the modsec_audit.log file, using: sudo tail -200 tests/logs/modsec2-apache/modsec_audit.log
- 🔧 If you need to write a test that cannot be written using text (e.g. binary content), we prefer using encoded_request in the test, with the content base64-encoded
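The base64 text for such a test can be produced from the shell. The following is a minimal sketch, assuming a base64 utility that supports -d for decoding (GNU coreutils does). The function name encode_request and the sample request are hypothetical. printf is used so that the CRLF line endings of the request are emitted exactly.

```shell
#!/bin/sh
# encode_request
# Read a raw request from stdin and print it as a single base64 line.
# Hypothetical helper for illustration.
encode_request() {
    # Some base64 implementations wrap long output; strip the newlines.
    base64 | tr -d '\n'
}

# Hypothetical raw request; printf turns \r\n into real CRLF bytes.
encoded=$(printf 'GET / HTTP/1.1\r\nHost: localhost\r\n\r\n' | encode_request)
printf '%s\n' "$encoded"

# Decode again to check that the bytes survived the round trip.
printf '%s' "$encoded" | base64 -d
```

The printed single-line string is what would go into the encoded_request field of the test.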
Summary
Tests are a core functionality in our ruleset. So whenever you write a rule, try to add some positive and negative tests so we won’t have surprises in the future.
Happy testing! 🎉
There are many first- and third-party tools that help with ModSecurity and CRS development. The most useful ones are listed here. Get in touch if you think something is missing.
albedo
https://github.com/coreruleset/albedo
The backend server used by the CRS test suite. It is especially useful for testing response rules, as desired responses can be freely specified.
coraza-httpbin
https://github.com/jcchavezs/coraza-httpbin
A Coraza plus reverse proxy container for testing. Makes it possible to easily test CRS with Coraza in a similar way to testing CRS using the Apache and Nginx Docker containers.
A local CRS installation can be included using directives in a directives.conf
file like so:
Include ../coreruleset/crs-setup.conf.example
Include ../coreruleset/rules/*.conf
crs-toolchain
https://github.com/coreruleset/crs-toolchain
The CRS developer’s toolbelt. Documentation lives at crs-toolchain.
Go-FTW
https://github.com/coreruleset/go-ftw
Framework for Testing WAFs in Go. A Go-based rewrite of the original Python FTW project.
Official CRS Maintained Docker Images
ModSecurity CRS Docker Image
https://github.com/coreruleset/modsecurity-crs-docker
A Docker image supporting the latest stable CRS release on:
- the latest stable ModSecurity v2 on Apache
- the latest stable ModSecurity v3 on Nginx
msc_pyparser
https://github.com/digitalwave/msc_pyparser
A ModSecurity config parser. Makes it possible to modify SecRules en masse, for example adding a tag to every rule in a rule set simultaneously.
msc_retest (RE test)
https://github.com/digitalwave/msc_retest
An invaluable tool for testing how regular expressions behave and perform in both mod_security2 (the Apache module) and libModSecurity (ModSecurity v3).
Regexploit
https://github.com/doyensec/regexploit
A tool for testing and finding regular expressions that are vulnerable to regular expression denial of service attacks (ReDoS).