[ZBX-10204] Headers are not used in web page content parsing. "Retrieve only headers" still allows filling in step variables. Created: 2015 Dec 26  Updated: 2018 Dec 10  Resolved: 2018 Dec 10

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Frontend (F), Server (S)
Affects Version/s: 2.2.11, 2.4.7, 3.0.0alpha5
Fix Version/s: None

Type: Problem report Priority: Major
Reporter: Oleksii Zagorskyi Assignee: Unassigned
Resolution: Duplicate Votes: 2
Labels: csrf, debugging, headers, patch, trace, variables, webmonitoring
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

Yes, the issue summary covers two different aspects (the 1st is closer to a ZBXNEXT feature request, the 2nd is a pure ZBX bug).
I would not be happy if only the 2nd were fixed as a ZBX, because that would not resolve the use case in which this problem was discovered.

I discovered it when I had to configure a web scenario to check login to a Pootle instance http://www.zabbix.org/pootle/accounts/login/
The problem is that in the 1st step I have to do a GET to obtain the "csrftoken" cookie value, and then in the 2nd step send that value in a POST as the "csrfmiddlewaretoken" variable (in addition to the "csrftoken" cookie itself, which will be preserved).
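
To illustrate the use case, here is a minimal sketch of pulling the "csrftoken" value out of raw response headers with a capture group. It uses POSIX regex as a stand-in for the server's own matcher; extract_first_group() and the pattern are hypothetical and are not part of Zabbix.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a Zabbix function): copies the first capture group
 * of `pattern` found in `text` into `out`.
 * Returns 0 on success, -1 if there is no match or `out` is too small. */
int	extract_first_group(const char *text, const char *pattern, char *out, size_t out_len)
{
	regex_t		re;
	regmatch_t	m[2];
	size_t		len;
	int		ret = -1;

	assert(NULL != text && NULL != pattern && NULL != out && 0 < out_len);

	/* header names are case-insensitive, hence REG_ICASE */
	if (0 != regcomp(&re, pattern, REG_EXTENDED | REG_ICASE))
		return -1;

	if (0 == regexec(&re, text, 2, m, 0) && -1 != m[1].rm_so)
	{
		len = (size_t)(m[1].rm_eo - m[1].rm_so);

		if (len < out_len)
		{
			memcpy(out, text + m[1].rm_so, len);
			out[len] = '\0';
			ret = 0;
		}
	}

	regfree(&re);

	return ret;
}
```

Called with the headers of the 1st step and a pattern such as "Set-Cookie: csrftoken=([^;\r\n]+)", this would yield the value to be sent as "csrfmiddlewaretoken" in the 2nd step. It only works if the headers are part of the text being searched, which is the point of this report.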

History:
In 2.2 we implemented step-level variables and regexp match extraction in ZBXNEXT-1597
In 2.4 we implemented retrieval of headers only in ZBXNEXT-282

The ZBXNEXT-1597 spec (after it was recently fixed) explicitly says:

In case of 'regex:' prefix Zabbix will try to match this regular expression with returned HTML content (without headers)

but documentation says only:

If the value part starts with regex: then the part after it will be treated as a regular expression that will search the web page and, if found, store the match in the variable.

Note how the specification is much more technically detailed and correct, so the documentation must be updated after the changes!

This is confirmed by the current code:

            if (ZBX_RETRIEVE_MODE_CONTENT == httpstep.retrieve_mode)
            {
                char	*var_err_str = NULL;

                /* required pattern */
                if (NULL == err_str && '\0' != *httpstep.required && NULL == zbx_regexp_match(page.data,
                        httpstep.required, NULL))
                {
                    err_str = zbx_dsprintf(err_str, "required pattern \"%s\" was not found on %s",
                                           httpstep.required, httpstep.url);
                }

                /* variables defined in scenario */
                if (NULL == err_str && FAIL == http_process_variables(httptest,
                        httptest->httptest.variables, page.data, &var_err_str))
                {
                    char	*variables;

                    variables = string_replace(httptest->httptest.variables, "\r\n", " ");
                    err_str = zbx_dsprintf(err_str, "error in scenario variables \"%s\": %s",
                                           variables, var_err_str);

                    zbx_free(variables);
                }

                /* variables defined in a step */
                if (NULL == err_str && FAIL == http_process_variables(httptest, httpstep.variables,
                        page.data, &var_err_str))
                {
                    char	*variables;

                    variables = string_replace(httpstep.variables, "\r\n", " ");
                    err_str = zbx_dsprintf(err_str, "error in step variables \"%s\": %s",
                                           variables, var_err_str);

                    zbx_free(variables);
                }

                zbx_free(var_err_str);
            }

As we can see, scenario and step variables are not processed if the step has the "Retrieve only headers" option enabled.
But in the frontend, while the "Post" and "Required string" fields become disabled, the "Variables" field is still available for editing!
Zabbix users are completely unaware that even regular variables (not the =regex: kind) will simply be ignored for this step.
So this is the pure frontend bug I mentioned as the 2nd aspect.
To fix it, the "Variables" field should be disabled too.
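
To make the distinction between the two kinds of variables concrete, here is a rough sketch of classifying a single "{name}=value" definition. classify_variable() and var_kind_t are hypothetical names used for illustration only; the real parsing lives in http_process_variables() and may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { VAR_INVALID, VAR_PLAIN, VAR_REGEX } var_kind_t;

/* Hypothetical classifier for one variable definition line.
 * On success *value points at the literal value (VAR_PLAIN) or at the
 * regular expression that follows the "regex:" prefix (VAR_REGEX). */
var_kind_t	classify_variable(const char *line, const char **value)
{
	const char	*close, *val;

	assert(NULL != line && NULL != value);

	*value = NULL;

	/* expect "{name}=" with a non-empty name */
	if ('{' != *line || NULL == (close = strchr(line, '}')) || close == line + 1 || '=' != close[1])
		return VAR_INVALID;

	val = close + 2;

	if (0 == strncmp(val, "regex:", 6))
	{
		*value = val + 6;	/* needs response text to match against */
		return VAR_REGEX;
	}

	*value = val;			/* literal value, no response text needed */

	return VAR_PLAIN;
}
```

A VAR_PLAIN definition does not depend on page.data at all, which is why silently skipping it for a headers-only step is surprising.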

But returning to the initial use case, I'd ask to resolve the current issue differently: include the HEADERS along with the BODY in web page parsing, and allow "Required string" to be specified in the frontend too.

I don't think this can break existing Zabbix installations (thinking about the extended scope of "Required string"), as the web page content that is already being searched for is most likely not present in the headers.
It will also be useful for cases when checking specific headers is enough, without full page retrieval.
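
The proposal above could be sketched roughly as follows: build one buffer from the headers plus the body (or the headers alone for a headers-only step) and run the required pattern against it. build_match_buffer() and required_pattern_found() are hypothetical names, and POSIX regex again stands in for zbx_regexp_match().

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical: returns a newly allocated string holding the headers followed
 * by the body. Pass body == NULL for a headers-only step. Caller must free(). */
char	*build_match_buffer(const char *headers, const char *body)
{
	size_t	hlen, blen;
	char	*buf;

	assert(NULL != headers);

	hlen = strlen(headers);
	blen = (NULL != body ? strlen(body) : 0);

	if (NULL == (buf = malloc(hlen + blen + 1)))
		return NULL;

	memcpy(buf, headers, hlen);

	if (0 != blen)
		memcpy(buf + hlen, body, blen);

	buf[hlen + blen] = '\0';

	return buf;
}

/* Returns 1 if the extended regular expression matches anywhere in text, else 0. */
int	required_pattern_found(const char *text, const char *pattern)
{
	regex_t	re;
	int	found;

	assert(NULL != text && NULL != pattern);

	if (0 != regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB))
		return 0;

	found = (0 == regexec(&re, text, 0, NULL, 0));
	regfree(&re);

	return found;
}
```

With such a buffer, a "Required string" like "Location: /dashboard/" could be checked even when the body is never downloaded, while patterns written against the body keep matching in full retrieval mode.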



 Comments   
Comment by Oleksii Zagorskyi [ 2015 Dec 26 ]

Also, reading the documentation, it is strange and not very obvious that scenario variables of the =regex: kind are processed for every step too.
I'd explicitly mention that in the documentation.

Comment by richlv [ 2015 Dec 26 ]

isn't the header parsing a duplicate of ZBXNEXT-2315?

Comment by Oleksii Zagorskyi [ 2015 Dec 26 ]

heh, it is, rich! But I'd close it in favor of the current report, as this one contains more details and also describes a related bug

Comment by richlv [ 2015 Dec 26 ]

but that zbxnext asks for more control over where matching should occur

Comment by Oleksii Zagorskyi [ 2015 Dec 26 ]

More control is not always required; in the current case I don't see much sense in such flexibility.
The existing "Retrieve only headers" option is enough in terms of control; we just need to make it possible to use the headers.

Comment by Oleksii Zagorskyi [ 2016 May 19 ]

Also, we need to make sure that HTTP headers are included in the log at trace level, because currently they are not.
Even when I've set the "Retrieve only headers" option in a web scenario, I cannot see them in the trace log, which looks silly and makes troubleshooting impossible.

Thinking also about headers in the trace log when the server follows redirects ... not sure it's possible, though.
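
As a rough sketch of how headers could be collected for trace logging: the callback below has the signature libcurl expects for CURLOPT_HEADERFUNCTION and appends every header line to a growing buffer passed via CURLOPT_HEADERDATA. header_buf_t and collect_header_cb() are hypothetical names, and how the server would actually wire this in is an assumption. According to the libcurl documentation, the header callback is invoked for the headers of all responses in a transfer, so headers of intermediate redirect responses should be visible this way too.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
	char	*data;	/* NUL-terminated accumulated headers, NULL while empty */
	size_t	len;	/* number of bytes in data, excluding the terminator */
}
header_buf_t;

/* Hypothetical callback, shaped like a libcurl header callback:
 * appends size * nitems bytes from buffer to the header_buf_t in userdata. */
size_t	collect_header_cb(char *buffer, size_t size, size_t nitems, void *userdata)
{
	header_buf_t	*hb = (header_buf_t *)userdata;
	size_t		chunk = size * nitems;
	char		*tmp;

	assert(NULL != hb && NULL != buffer);

	if (NULL == (tmp = realloc(hb->data, hb->len + chunk + 1)))
		return 0;	/* returning fewer bytes than received signals an error to libcurl */

	memcpy(tmp + hb->len, buffer, chunk);
	hb->data = tmp;
	hb->len += chunk;
	hb->data[hb->len] = '\0';

	return chunk;
}
```

After the transfer, the accumulated buffer could be written to the log at trace level and then released with free().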

Comment by Vitaly Zhuravlev [ 2017 Oct 18 ]

It looks like this issue prevents creating simple login web scenarios in Zabbix for web servers such as Django-based ones where CSRF protection is enabled (like OpenStack Horizon; FYI, palivoda):
Not 100% sure, but it looks that way


<!DOCTYPE html>
<html lang="en">
    <head>
        <meta http-equiv="content-type" content="text/html; charset=utf-8">
        <meta name="robots" content="NONE,NOARCHIVE">
        <title>403 Forbidden</title>
        <style type="text/css">
    html * { padding:0; margin:0; }
    body * { padding:10px 20px; }
    body * * { padding:0; }
    body { font:small sans-serif; background:#eee; }
    body>div { border-bottom:1px solid #ddd; }
    h1 { font-weight:normal; margin-bottom:.4em; }
    h1 span { font-size:60%; color:#666; font-weight:normal; }
    #info { background:#f6f6f6; }
    #info ul { margin: 0.5em 4em; }
    #info p, #summary p { padding-top:10px; }
    #summary { background: #ffc; }
    #explanation { background:#eee; border-bottom: 0px none; }
  </style>
    </head>
    <body>
        <div id="summary">
            <h1>Forbidden 
                <span>(403)</span>
            </h1>
            <p>CSRF verification failed. Request aborted.</p>
            <p>You are seeing this message because this site requires a CSRF cookie when submitting forms. This cookie is required for security reasons, to ensure that your browser is not being hijacked by third parties.</p>
            <p>If you have configured your browser to disable cookies, please re-enable them, at least for this site, or for requests whose domain and port match those of the current page.</p>
        </div>
        <div id="info">
            <h2>Help</h2>
            <p>Reason given for failure:</p>
            <pre>
    CSRF cookie not set.
    </pre>
            <p>In general, this can occur when there is a genuine Cross Site Request Forgery, or when
  
                <a
  href="https://docs.djangoproject.com/en/1.8/ref/csrf/">Django's
  CSRF mechanism</a> has not been used correctly.  For POST forms, you need to
  ensure:
            </p>
            <ul>
                <li>Your browser is accepting cookies.</li>
                <li>The view function passes a 
                    <code>request</code> to the template's
                    <a
    href="https://docs.djangoproject.com/en/dev/topics/templates/#django.template.backends.base.Template.render">
                        <code>render</code>
                    </a>
    method.
                </li>
                <li>In the template, there is a 
                    <code>{% csrf_token
    %}</code> template tag inside each POST form that
    targets an internal URL.
                </li>
                <li>If you are not using 
                    <code>CsrfViewMiddleware</code>, then you must use
                    <code>csrf_protect</code> on any views that use the
                    <code>csrf_token</code>
    template tag, as well as those that accept the POST data.
                </li>
            </ul>
            <p>You're seeing the help section of this page because you have 
                <code>DEBUG =
  True</code> in your Django settings file. Change that to
                <code>False</code>,
  and only the initial error message will be displayed.
            </p>
            <p>You can customize this page using the CSRF_FAILURE_VIEW setting.</p>
        </div>
    </body>
</html>