
Google Summer of Code: Timeseries query language extension for PCP

PR links (introduced below; both have been merged)

This blog post is a brief record of my work on the Timeseries query language extension project over the summer.

My proposal link

A brief introduction to this project (by @mgoodwin and @nathans):

Performance Co-Pilot timeseries are series of time-stamped values gathered centrally from hosts making performance data available. This data could be gathered for many metrics, at high frequency, and from many hosts. It is potentially high volume data, and searching it efficiently (querying) at speed is a non-trivial problem.
The Performance Co-Pilot timeseries query language is designed to allow fast querying based on metric names and labels. A command line utility and a REST API are available from pmseries and the pmproxy daemon.
Internally, the PCP query language makes use of the Redis distributed data store and its native timeseries features. The pmseries command line utility provides low-level access to the language.

This project will extend the existing query language with:

  1. statistical functions (sum, mean, average, standard deviation, histogram binning, top-N, N-th percentile)
  2. rate conversion function for counter metrics
  3. scale and unit conversion functions
  4. mathematical functions (abs, floor, log, sqrt, round)
  5. binary operators for numeric metrics (addition, subtraction, division, multiplication, exponentiation)

Grammar

The grammar description file is located at src/libpcp_web/src/query_parser.y; it is a yacc file.

$query\to vector$

Extended grammar

Notation: $non\text{-}terminal\ symbol$, ${\bf terminal\ symbol}$

  • $vector\to func\ {\bf EOS}$
  • $val\_vec \to {\bf name}\{exprlist\}[timelist]$
  • $val\_vec \to {\bf name}[timelist]$
  • $func\to {\bf rate}(val\_vec)$
  • $func\to {\bf rate}(func)$
  • $func\to {\bf noop}(val\_vec)$
  • $func\to {\bf noop}(func)$
  • $func\to {\bf max}(val\_vec)$
  • $func\to {\bf max}(func)$
  • $func\to {\bf min}(val\_vec)$
  • $func\to {\bf min}(func)$
  • $func\to {\bf rescale}(val\_vec, {\bf string})$, where string is a units string
  • $func\to {\bf rescale}(func, {\bf string})$, where string is a units string
  • $func\to {\bf abs}(val\_vec)$
  • $func\to {\bf abs}(func)$
  • $func\to {\bf floor}(val\_vec)$
  • $func\to {\bf floor}(func)$
  • $func\to {\bf log}(val\_vec)$
  • $func\to {\bf log}(func)$
  • $func\to {\bf log}(val\_vec, {\bf number})$
  • $func\to {\bf log}(func, {\bf number})$
  • $func\to {\bf sqrt}(val\_vec)$
  • $func\to {\bf sqrt}(func)$
  • $func\to {\bf round}(val\_vec)$
  • $func\to {\bf round}(func)$
  • $func\to arithmetic\_expression$
  • $arithmetic\_expression\to val\_vec+val\_vec$
  • $arithmetic\_expression\to val\_vec+func$
  • $arithmetic\_expression\to func+val\_vec$
  • $arithmetic\_expression\to val\_vec-val\_vec$
  • $arithmetic\_expression\to val\_vec-func$
  • $arithmetic\_expression\to func-val\_vec$
  • $arithmetic\_expression\to val\_vec*val\_vec$
  • $arithmetic\_expression\to val\_vec*func$
  • $arithmetic\_expression\to func*val\_vec$
  • $arithmetic\_expression\to val\_vec/val\_vec$
  • $arithmetic\_expression\to val\_vec/func$
  • $arithmetic\_expression\to func/val\_vec$

Status

Storing time series values and pmDesc into the parser tree’s nodes

Store the series values returned by Redis into the corresponding node of the node_t tree, for use in the subsequent function computations.

Changes in data structures

In src/libpcp_web/src/query.c

  • Change the constant QUERY_PHASES from 6 to 7.
    The new phase series_query_funcs() stores timeseries values into the corresponding nodes, and the 7th phase series_query_funcs_report_values() performs the functions’ computation and reporting.

In src/libpcp_web/src/query.h

  • Add a new structure struct series_value_set to store time series values, and embed it in struct node. In series_value_set, a value_set has a series_values array; each element of the series_values array is a timeseries with its samples and SID. A timeseries may have more than one sample, stored in the series_sample array, and each element of series_sample is a set of instances’ values.

Writing functions

Git branch link
This branch has been merged, but still needs a large amount of testing!
A large part of the statistical functions is still unfinished (sum, mean, average, standard deviation, histogram binning, top-N, N-th percentile).

Changes in data structures

In src/libpcp_web/src/query.h

  • Add a new typedef struct timing field to struct node for storing the timing information of arithmetic expressions.

Metadata update part

Git branch link
This branch has been merged.

Changes in data structures

In src/libpcp_web/src/query.c

  • Change the constant QUERY_PHASES from 7 to 8.
    The newly added phase queries metric names by SID; this situation arises when the metric.name in the query statement is a glob, e.g. kernel.all.*

Timeline

Early June


  • In function series_solve, add a new phase series_query_funcs before the phase series_query_report_values. Querying Redis and storing timeseries values are both done in series_query_funcs; series_query_report_values only reports the result value from the root node.
  • static int series_process_func: Called by series_query_funcs. Traverses the parser tree; whenever a node is a values-type node that needs to query Redis to gather actual series values, performs the query and stores the values into that node.
  • static void series_node_prepare_time: Called by series_process_func when a child of a function-type node needs to query Redis. For each series identifier stored in the child node, first store the SID into the node (np->value_set.series_values[i].sid = sid), then query Redis with every SID together with a time window.
  • static void series_node_prepare_time_reply: Called by series_node_prepare_time. After Redis gives back a reply for a specific SID and time window, counts how many samples are in the reply, checks whether the reply is correct, and calls series_values_store_to_node to store these samples into the corresponding node. Then calls series_node_get_desc to fetch the pmSeriesDesc for this SID. Finally, updates np->value_set.num_series, which records how many series’ values have been stored for this node.
  • static void series_node_get_desc: Called by series_node_prepare_time_reply; prepares the query for the pmSeriesDesc of a specific SID.
  • static void series_node_get_desc_reply: Called by series_node_get_desc. After Redis gives back the pmSeriesDesc information for a specific SID, stores it into the corresponding parser tree node.
  • static void series_values_store_to_node: Called by series_node_prepare_time_reply. Walks through a specific series’ samples replied by Redis, extracts the instances’ values, and calls series_instance_store_to_node to store them into the node. This function is roughly the same as series_values_reply.
  • static int series_instance_store_to_node: Called by series_values_store_to_node. The instance’s value is held in the variable value; stores it into np->value_set.series_values[idx_series].series_sample[idx_sample].series_instance[idx_instance].
  • static void freeSeriesQueryNode: Called by freeSeriesGetQuery; frees the allocated resources.
  • static int series_calculate: Called by series_query_report_values. By this point all time series values have been stored into nodes, so we can directly calculate a node’s values according to its semantics.

Late June


  • series_calculate_rate: Please check the rate() in the Functions part below.
  • series_calculate_max: Please check the min/max() in the Functions part below.
  • series_calculate_min: Please check the min/max() in the Functions part below.

Early July


  • Mathematical functions’ grammars.
  • series_calculate_rescale(node_t *np): Called for each N_RESCALE-type node. The left child of an N_RESCALE-type node should contain a set of time series values, and the right child should be L_SCALE, containing the target units information. rescale() should only accept metrics with instant semantics. Compares the consistency of the three time/space/count dimensions between the input pmUnits and the target units.
  • series_pmAtomValue_conv_str(): Called by series_calculate_rescale; extracts series values according to the series’ type.
  • compare_pmUnits(pmUnits *a, pmUnits *b): Compares whether the three dimensions of two units are the same.
  • series_extract_type(char *typeStr): Converts a type string to the corresponding constant, e.g. “u32” to PM_TYPE_U32.
  • series_extract_value(int type, sds str, pmAtomValue *oval): Converts a string to a value according to the type, e.g. int, unsigned int, double, etc.
  • series_abs_pmAtomValue(int type, pmAtomValue *val): Applies abs() to *val according to the type.
  • series_calculate_abs(node_t *np): Applies abs() to the series values stored in *np.
  • series_floor_pmAtomValue(int type, pmAtomValue *val): Applies floor() to *val according to the type.
  • series_calculate_floor(node_t *np): Applies floor() to the series values stored in *np.
  • series_log_pmAtomValue(int itype, int *otype, pmAtomValue *val, int is_natural_log, double base): Applies log() to *val according to itype and sets *otype to PM_TYPE_DOUBLE. If is_natural_log is 0 the base is set by the client; otherwise base = e.
  • series_calculate_log(node_t *np): Applies log() to the series values stored in *np.
  • series_sqrt_pmAtomValue(int itype, int *otype, pmAtomValue *val): Applies sqrt() to *val according to itype and sets *otype to PM_TYPE_DOUBLE.
  • series_calculate_sqrt(node_t *np): Applies sqrt() to the series values stored in *np.
  • series_round_pmAtomValue(int type, pmAtomValue *val): Applies round() to *val according to the type.
  • series_calculate_round(node_t *np): Applies round() to the series values stored in *np.

Late July


  • Update tests in qa/1886
  • series_expr_canonical(node_t *np): Recovers the query statement from the root node *np, generating the canonical expression.
  • series_set_function_expr_callback
  • series_set_function_expr
  • series_function_hash(unsigned char *hash, node_t *np): Generates JSON-format expression information and hashes it.
  • Fix some previous bugs

Early August


  • Merge branch demo_add_rate into master
  • Arithmetic expressions’ grammars.
  • Wrote descriptor re-evaluation when querying descriptors for some SIDs; check this branch. (This re-evaluation method has since been dropped.)
  • compare_pmUnits_dim(pmUnits *a, pmUnits *b): Checks the consistency of the three dimensions between *a and *b.
  • pmStrSem(sds sem_str): Parses the semantics string sem_str into an int.
  • series_calculate_binary_check(): Performs descriptor checking for arithmetic expressions.
  • calculate_plus(int *type, pmAtomValue *l_val, pmAtomValue *r_val, pmAtomValue *res): Computes the addition of *l_val and *r_val.
  • calculate_minus(int *type, pmAtomValue *l_val, pmAtomValue *r_val, pmAtomValue *res): Computes the subtraction of *l_val and *r_val.
  • calculate_star(int *type, pmAtomValue *l_val, pmAtomValue *r_val, pmAtomValue *res): Computes the multiplication of *l_val and *r_val.
  • calculate_slash(int *type, pmAtomValue *l_val, pmAtomValue *r_val, pmAtomValue *res): Computes the division of *l_val and *r_val.
  • series_calculate_order_binary(): Applies a binary operation to two input operands with their type, semantics and units.
  • series_binary_meta_update(): Updates the series’ descriptors after an arithmetic expression operation.
  • series_calculate_plus(node_t *np): Performs addition over the series values stored in *np.
  • series_calculate_minus(node_t *np): Performs subtraction over the series values stored in *np.
  • series_calculate_star(node_t *np): Performs multiplication over the series values stored in *np.
  • series_calculate_slash(node_t *np): Performs division over the series values stored in *np.

Late August


  • Write blogs.
  • series_node_get_metric_name(): Builds a Redis command to query for a metric name by SID.
  • series_node_get_metric_name_reply(): Checks that the Redis reply contains the metric name.
  • series_store_metric_name(): Called by series_node_get_metric_name_reply; stores the metric name corresponding to a specific SID into node_t.
  • series_redis_hash_expression
  • series_query_mapping(): Prepares the metric names mapping.
  • series_query_mapping_callback(): Same as above.
  • check_compatibility(pmUnits *units_a, pmUnits *units_b): Checks the compatibility between two units from two series with the same metric name.
  • series_compatibility_convert(): Uses the larger scale and converts the values with the smaller scale.
  • extract_series_node_desc(): Extracts the descriptor of a series from a Redis reply. Only works for function expression queries. Sets the fields “indom” and “pmid” directly to “511.0” and “511.0.0” respectively.
  • series_hmset_function_desc(): HMSETs the descriptor of a fabricated SID. This descriptor will later be fetched (HMGET) by pmSeriesDescs().
  • series_hmset_function_desc_callback(): Checks whether the HMSET command above succeeded.

Functions

noop():

Used to check that function expressions have been parsed correctly.

rate()

Computes the rate between samples for each metric. The number of samples in the result is one less than in the original. Only metrics with counter semantics can have rate() applied.

  • After rate(), the dimension of Time is reduced by one. The type remains unchanged and the semantics become PM_SEM_INSTANT.

Sample:

pmseries 'rate(kernel.all.pswitch[count:5])'

TODO: type-checking error reporting; metadata updates.

min/max()

Compares and picks the maximal/minimal instance value(s) among the samples for each metric.

TODO: type checking and error reporting; metadata updates.

Note that the results of min/max should ${\bf not}$ be fed into rate(). How to achieve this? Perhaps after min/max the semantics in the metadata of this metric should be updated to none. On the other hand, the result has only one sample, so applying rate() would yield zero samples; even though rate(max()) is meaningless, it would not cause problems. Still, it is better to forbid rate(max()).

rescale()

Rescales the units of the input metric values. Only metrics with instant semantics are supported.

  • After rescale(), the units of each metric are updated and the type is promoted to PM_TYPE_DOUBLE.

Sample:

pmseries 'rescale(kernel.all.uptime[count:2], "min")'
pmseries 'rescale(kernel.all.uptime[count:2], "sec")'
pmseries 'rescale(rescale(kernel.all.uptime[count:2], "min"), "hour")'

mathematical functions

TODO: for each node, update the series_desc after applying math functions!

  • The units of the results remain the same.
  • The result type of log() and sqrt() becomes PM_TYPE_DOUBLE; abs(), floor() and round() keep the original type.

Are there any series values that are negative, for testing abs()?

Sample:

pmseries 'abs(kernel.all.load[count:2])'
pmseries 'log(log(kernel.all.pswitch[count:2],3),2)'
pmseries 'log(kernel.all.load[count:2], 77)'
pmseries 'floor(kernel.all.load[count:2])'
pmseries 'round(kernel.all.load[count:2])'

arithmetic expression

The semantic checks and rules for arithmetic expressions follow the description in pmRegisterDerived(3).

  • Only operands that each contain a single metric series can take part in an arithmetic expression.
  • The numbers of series, samples and instances of the two operands must be identical.
  • If both operands have counter semantics, only addition or subtraction is allowed.
  • If the left operand is a counter and the right operand is not, only multiplication or division is allowed; if the left operand is not a counter and the right operand is, only multiplication is allowed.
  • If neither operand is a counter, the result has semantics PM_SEM_INSTANT, unless both operands are PM_SEM_DISCRETE, in which case the result is also PM_SEM_DISCRETE. Otherwise, the result has semantics PM_SEM_COUNTER.
  • If both operands have a dimension of Count/Time/Space and the scales are not the same, the larger scale is used and the values of the operand with the smaller scale are converted; the result is promoted to type PM_TYPE_DOUBLE.
  • For addition and subtraction, all dimensions of the operands and of the result are identical.
  • For multiplication, the dimensions of the result are the sum of the dimensions of the operands.
  • For division, the dimensions of the result are the difference of the dimensions of the operands.
  • If the operands have not been rescaled, the type of the result is determined by the types of the operands, per the following table, which is evaluated from top to bottom until a match is found.
    Operand Types                     Operator    Result Type
    either is PM_TYPE_DOUBLE          any         PM_TYPE_DOUBLE
    any                               division    PM_TYPE_DOUBLE
    either is PM_TYPE_FLOAT           any         PM_TYPE_FLOAT
    either is PM_TYPE_U64             any         PM_TYPE_U64
    either is PM_TYPE_64              any         PM_TYPE_64
    either is PM_TYPE_U32             any         PM_TYPE_U32
    otherwise (both are PM_TYPE_32)   any         PM_TYPE_32
  • Specifically, for subtraction, if both operands are unsigned types and the result would be negative, the result becomes ‘no value’.

Sample (needs more testing):

'kernel.all.uptime[count:2] + kernel.all.uptime[count:2]'

Fabricated SID

Canonical expression generation

For a query statement we generate a ‘canonical’ expression: whitespace, metadata qualifiers and the time specification are discarded. The expression is used for re-evaluating the metadata of a function expression’s result, and since the only factor that affects the metadata is which metric is involved, the metric name is sufficient to identify it.

Sample:

    query statement                                                                    canonical expression
    rate(kernel.all.load{hostname!~"prefix*"&&instance.name=="1 minute"}[count:10])    rate(metric.name=="kernel.all.load")
    rate(disk.dev.*[count:10]) (multiple metrics case)                                 rate(metric.name=="disk.dev.write_bytes")
                                                                                       rate(metric.name=="disk.dev.write_merge")
                                                                                       rate(metric.name=="disk.dev.write")

Metadata updating

A descriptor has the following fields:

typedef struct pmSeriesDesc {
    sds indom;     /* dotted-pair instance domain identifier */
    sds pmid;      /* dotted-triple metric identifier */
    sds semantics; /* pmSemStr(3) metric semantics */
    sds source;    /* source identifier, from whence we came */
    sds type;      /* pmTypeStr(3) metric type */
    sds units;     /* pmUnitsStr(3) metric units */
} pmSeriesDesc;

We can see that the fields “semantics”, “type” and “units” are recomputed every time a function is applied. At the end of a query, we generate the “canonical expression” for every SID, then generate:

{
"series":"expr",
"expr":"[canonical expression]"
}

And hash it as a “fabricated” SID for this expression. The “source” field is left unchanged (in the arithmetic expression case the leftmost operand’s source always becomes the result’s source), while “pmid” and “indom” are set to 511.0.0 and 511.0 respectively. Finally, HMSET “pcp:desc:series:[hash value]” with the descriptor into Redis.

Some notes

  • series_solve in query.c: An overview of the phases of processing a time series query statement.
  • SHA-1 computation: see the related function pmwebapi_metric_hash.

Issues

  1. Run gdb --args pmseries 'rate(kernel.all.load)[count:2]'; an error occurs during the call to void freeReplyObject(void *r). It seems a Redis reply is freed twice. (But where?)

  2. No support for {metadata qualifiers} on my remote host. See also point 3 in Issues.

  3. Reported by mgoodwin:

    pcp-kyoma[demo_add_rate]$ pmseries -a c3d11a94d2475f002c9279a50e3a884de8b96f3e
    c3d11a94d2475f002c9279a50e3a884de8b96f3e
    pmseries: [Error] no descriptor for series identifier c3d11a94d2475f002c9279a50e3a884de8b96f3e
    Segmentation fault (core dumped)
  4. Sometimes the pmlogger daemon gets killed for an unknown reason. Or does each pmlogger have a fixed life period? [May 28th] Answered by Nathans: “yes, we do start pmlogger with a fixed life-span by default (~24hrs)”

  5. What does series_query_end_phase do? [June 8th] Answered by mgoodwin: “It decrements the ref count and if zero, moves onto the next phase of the query, see seriesPassBaton.”

  6. Query statements of the form pmseries -Dquery 'rate(kernel.all.load{hostname=="VM-0-4-ubuntu"}[count:2])' (with the {metadata qualifiers} added) do not work on my remote host. It only returns

    pmseries: [Error] no descriptor for series identifier 7d94835a09744d4cea95f06c7feb45c33ada9168
    pmseries: [Error] no descriptor for series identifier 5a835a09744d4cea95f06c7feb45c33ada91685a

The code tries to query two SIDs which do not exist, but this works on mgoodwin’s host. (Why?) I guess this error may occur in the series_query_eval phase? Not sure.

[July 22nd] Solved by mgoodwin. To fix, flush Redis:

sudo systemctl stop pmlogger pmproxy pmcd

sudo redis-cli flushall

sudo systemctl start pmcd pmproxy pmlogger

  7. The following warnings appear when applying float-type variables to double floor(double x) and double sqrt(double x). I keep these warnings here to remind me to consider whether they will cause problems.
query.c:2331:15: warning: implicit declaration of function ‘floor’ [-Wimplicit-function-declaration]
val->f = floor(val->f);
^
query.c:2331:15: warning: incompatible implicit declaration of built-in function ‘floor’
query.c:2331:15: note: include ‘<math.h>’ or provide a declaration of ‘floor’
query.c: In function ‘series_sqrt_pmAtomValue’:
query.c:2397:15: warning: implicit declaration of function ‘sqrt’ [-Wimplicit-function-declaration]
val->d = sqrt(res);
^
query.c:2397:15: warning: incompatible implicit declaration of built-in function ‘sqrt’
query.c:2397:15: note: include ‘<math.h>’ or provide a declaration of ‘sqrt’
  8. The function log() cannot handle certain whitespace insertions; for example:
    pmseries 'log(log(kernel.all.pswitch[count:2],3))'

is acceptable but

pmseries 'log(log(kernel.all.pswitch[count:2], 3))'

raises an error, because the parser treats the part log(log(kernel.all.pswitch[count:2] of the query statement as a series identifier. This may be fixed by rewriting the grammar.

Discussion

  1. About sum(): should we take the sum across samples or across instances? The mentors suggest that we could have a sum_label() function, e.g. to sum values across different hosts we could apply sum_label(average(kernel.all.pswitch[count:30]), "hostname"). It is not yet clear how this will go.
  2. About the metadata update after a function has been executed: we cannot update the metric’s original SID (since the metadata of the metric has been changed, e.g. by rate(kernel.all.pswitch)), but maybe we need to fabricate a SID for this new metric with new metadata. For example, rate() should cause a metric with counter semantics to become instant, and the units should go from count to count/second, i.e. the time dimension needs to be decremented.

Thanks

I would like to express my sincere gratitude to my mentors @mgoodwin and @nathans for their helpful suggestions, their patient answers to my questions, and for teaching me the git-based workflow and every detail of PCP. Thanks also to @Miroslav Foltýn for helping me debug. I am deeply grateful for their assistance in completing this project. While there were some unanticipated areas of difficulty, I brainstormed solutions with my mentors and am now nearing completion on those additional items as well.
I intend to make further commits on this project and keep working with them.
Thanks also to the vibrant and inclusive PCP community; new developers are always welcome to join.