A dry run (or practice run) is a software testing process used to make sure that a system works correctly and will not result in severe failure.
Istio has an experimental annotation, istio.io/dry-run, that lets you dry-run an authorization policy without actually enforcing it.
The dry-run annotation allows you to better understand the effect of an authorization policy before you apply it to production traffic. This reduces the risk of an incorrect authorization policy breaking production traffic.
Before you begin
Before you begin this task, do the following:
- Deploy Zipkin for checking dry-run tracing results. Follow the Zipkin task to install Zipkin in the cluster.
- Deploy Prometheus for checking dry-run metric results. Follow the Prometheus task to install Prometheus in the cluster.
- Deploy test workloads:
  This task uses two workloads, httpbin and sleep, both deployed in namespace foo. Both workloads run with an Envoy proxy sidecar. Create the foo namespace and deploy the workloads with the following command:
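A minimal sketch of this step. It assumes that you run it from the root of an Istio release directory, that the sample manifests live at samples/httpbin/httpbin.yaml and samples/sleep/sleep.yaml, and that sidecar injection is enabled through the istio-injection=enabled namespace label; adjust these to your installation:

```shell
# Create the namespace and enable automatic sidecar injection for it
# (assumption: injection is controlled by the istio-injection label).
kubectl create ns foo
kubectl label ns foo istio-injection=enabled

# Deploy the two sample workloads (assumed paths inside the Istio release).
kubectl apply -f samples/httpbin/httpbin.yaml -n foo
kubectl apply -f samples/sleep/sleep.yaml -n foo
```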
- Enable debug-level proxy logging for checking dry-run logging results:
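One way to do this, sketched under the assumption that the workload is a deployment named httpbin and that raising only the rbac logger to debug level is enough to surface the dry-run messages:

```shell
# Raise the Envoy rbac logger of the httpbin proxy to debug level.
# "deploy/httpbin.foo" assumes a deployment named httpbin in namespace foo.
istioctl proxy-config log deploy/httpbin.foo --level "rbac:debug"
```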
- Verify that sleep can access httpbin with the following command:
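A possible form of this check. The app=sleep label, the sleep container name, and the httpbin service port 8000 are assumptions taken from the usual sample manifests, not from this page:

```shell
# Look up the sleep pod (assumed label: app=sleep).
SLEEP_POD="$(kubectl get pod -l app=sleep -n foo -o jsonpath='{.items[0].metadata.name}')"

# Send one request from sleep to httpbin and print only the HTTP status code.
kubectl exec "$SLEEP_POD" -c sleep -n foo -- \
  curl http://httpbin.foo:8000/ip -s -o /dev/null -w "%{http_code}\n"
```

A 200 status code indicates that sleep can reach httpbin.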
If you don’t see the expected output as you follow the task, retry after a few seconds. Caching and propagation overhead can cause some delay.
Create dry-run policy
- Create an authorization policy with the dry-run annotation "istio.io/dry-run": "true" with the following command:
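A sketch of such a policy. The name deny-path-headers, the namespace foo, and the /headers path come from this task; the app: httpbin selector and the security.istio.io/v1 API version are assumptions (older Istio releases may need v1beta1):

```shell
kubectl create -n foo -f - <<EOF
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: deny-path-headers
  annotations:
    # Evaluate the policy and report the result, but do not enforce it.
    "istio.io/dry-run": "true"
spec:
  selector:
    matchLabels:
      app: httpbin   # assumed workload label
  action: DENY
  rules:
  - to:
    - operation:
        paths: ["/headers"]
EOF
```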
You can also use the following command to quickly change an existing authorization policy to dry-run mode:
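For example, assuming the existing policy is the deny-path-headers policy in namespace foo used in this task:

```shell
# Add (or overwrite) the dry-run annotation on an existing policy.
kubectl annotate --overwrite authorizationpolicies deny-path-headers -n foo istio.io/dry-run='true'
```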
- Verify that a request to path /headers is allowed because the policy was created in dry-run mode. Run the following command to send 20 requests from sleep to httpbin; each request includes the header X-B3-Sampled: 1 so that Zipkin tracing is always triggered:
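A sketch of the request loop, again assuming the app=sleep label, the sleep container name, and the httpbin service port 8000 from the usual sample manifests:

```shell
# Look up the sleep pod once (assumed label: app=sleep).
SLEEP_POD="$(kubectl get pod -l app=sleep -n foo -o jsonpath='{.items[0].metadata.name}')"

# Send 20 requests to /headers, forcing trace sampling with X-B3-Sampled: 1,
# and print the HTTP status code of each response.
for i in $(seq 1 20); do
  kubectl exec "$SLEEP_POD" -c sleep -n foo -- \
    curl http://httpbin.foo:8000/headers -H "X-B3-Sampled: 1" \
    -s -o /dev/null -w "%{http_code}\n"
done
```

Because the policy is only evaluated in dry-run mode, every request should still succeed with a 200 status code.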
Check dry-run result in proxy log
The dry-run results can be found in the proxy debug log in the format of shadow denied, matched policy ns[foo]-policy[deny-path-headers]-rule[0]. Run the following command to check the log:
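One way to filter the log, assuming the workload is a deployment named httpbin and its sidecar container is named istio-proxy:

```shell
# Print only the dry-run (shadow) deny decisions from the httpbin proxy log.
kubectl logs deploy/httpbin -c istio-proxy -n foo | grep "shadow denied"
```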
See also the troubleshooting guide for more details on the logging.
Check dry-run result in metric using Prometheus
- Open the Prometheus dashboard with the following command:
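Assuming Prometheus was installed as an Istio addon, the dashboard can typically be opened with:

```shell
# Port-forward to Prometheus and open its dashboard in a browser.
istioctl dashboard prometheus
```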
- In the Prometheus dashboard, search for the following metric:
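The metric expression is sketched here under assumptions: the Envoy RBAC stat for inbound port 80 is taken to be exposed as envoy_http_inbound_0_0_0_0_80_rbac with the labels authz_dry_run_action and authz_dry_run_result, and the dashboard is taken to be port-forwarded to localhost:9090. The exact name may differ between Istio versions. The same expression can be pasted into the dashboard search box or queried through the Prometheus HTTP API:

```shell
# Assumed metric name and labels; verify them against your Istio version.
QUERY='envoy_http_inbound_0_0_0_0_80_rbac{authz_dry_run_action="deny",authz_dry_run_result="denied"}'

# Query the Prometheus HTTP API (assumes the dashboard port-forward on localhost:9090).
curl -s -G http://localhost:9090/api/v1/query --data-urlencode "query=${QUERY}"
```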
- Verify the queried metric result as follows:
- The queried metric has value 20 (you might find a different value depending on how many requests you have sent; this is expected as long as the value is greater than 0). This means the dry-run policy applied to the httpbin workload on port 80 matched 20 requests. The policy would have rejected each of these requests if it were not in dry-run mode.
- The following is a screenshot of the Prometheus dashboard:
Check dry-run result in tracing using Zipkin
- Open the Zipkin dashboard with the following command:
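Assuming Zipkin was installed as an Istio addon, the dashboard can typically be opened with:

```shell
# Port-forward to Zipkin and open its dashboard in a browser.
istioctl dashboard zipkin
```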
- Find the trace result for the request from sleep to httpbin. Try sending some more requests if you do not see the trace result, because Zipkin can take some time to show it.
- In the trace result, you should find the following custom tags indicating the request is rejected by the dry-run policy deny-path-headers in the namespace foo:
- The following is a screenshot of the Zipkin dashboard:
Summary
The proxy debug log, Prometheus metric, and Zipkin trace results indicate that the dry-run policy would reject the request. You can further change the policy if the dry-run result is not what you expect.
It’s recommended to keep the dry-run policy for some additional time so that it can be tested with more production traffic.
When you are confident about the dry-run result, you can disable the dry-run mode so that the policy will start to actually reject requests. This can be achieved by either of the following approaches:
- Remove the dry-run annotation completely; or
- Change the value of the dry-run annotation to false.
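Either approach can be sketched with kubectl annotate, assuming the policy is the deny-path-headers policy in namespace foo used in this task. Run only one of the two commands:

```shell
# Option 1: remove the dry-run annotation completely
# (a trailing "-" after the key tells kubectl to delete the annotation).
kubectl annotate authorizationpolicies deny-path-headers -n foo istio.io/dry-run-

# Option 2: keep the annotation but set its value to false.
kubectl annotate --overwrite authorizationpolicies deny-path-headers -n foo istio.io/dry-run='false'
```

After either change, the policy is enforced and requests to /headers are rejected.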
Limitations
The dry-run annotation is currently experimental and has the following limitations:
- The dry-run annotation currently only supports ALLOW and DENY policies;
- There will be two separate dry-run results (i.e. log, metric and tracing tag) for ALLOW and DENY policies, because the ALLOW and DENY policies are enforced separately in the proxy. You should take both dry-run results into consideration because a request could be allowed by an ALLOW policy but still rejected by another DENY policy;
- The dry-run results in the proxy log, metric and tracing are for manual troubleshooting purposes and should not be used as an API because they may change at any time without prior notice.
Clean up
- Remove the namespace foo from your configuration:
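Deleting the namespace also removes the workloads and the authorization policy created in it:

```shell
# Remove the foo namespace and everything deployed in it.
kubectl delete namespace foo
```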
- Remove Prometheus and Zipkin if no longer needed.