Executable Scenario Management


ExSce Management is an approach for storing, querying, and generating test scenarios for ROS-based multi-robot systems. Provenance data about test scenarios and their executions is modeled using PROV and stored in a property graph. Runtime information is obtained from recorded bag files. Metamorphic testing is used to generate new scenarios and validate the system's requirements.

View the Project on GitHub hbrs-sesame/exsce_management

Queryable Scenario Execution

By storing the provenance data in a property graph, we can easily query scenarios and their executions. In this section, we describe the relevant queries for the observed outputs, getting and generating oracles, and clustering. Other auxiliary queries are not included here. The queries are written in Cypher, the query language used by Neo4j. Note that the serialization of the PROV data into Neo4j is determined by prov-db-connector, the library we use in our implementation. Therefore, the syntax shown in the queries in this section reflects some of the design choices of the library (e.g. the property meta:identifier_original). In future work, we plan to write our own Neo4j adapter to simplify the syntax for our intended use case.

Observed outputs

The first step is to get the execution data ($runs) of a particular scenario. Listing 1 shows the Cypher query used to do this.

Listing 1: Given a $scenario_id, return all the runs associated with this scenario.

MATCH (s)
WHERE s.`prov:type` = "exsce:ConcreteScenario" AND s.`meta:identifier_original`= $scenario_id
MATCH (a:Activity)
WHERE a.`prov:type` = "exsce:run"
MATCH (s)<-[:used]-(a)
RETURN a.`meta:identifier_original` AS run_id 

Remember that we store execution data both for the overall run and for each path segment. Listing 2 retrieves the result for a particular robot over the overall run, while Listing 3 does so for a particular path segment. The former queries a single $run_id, while the latter can also query metrics from a list of runs (for aggregation).

Listing 2: Get the output metrics of type $metric_type for a $run_id and a particular $robot.

MATCH (n)
WHERE n.`prov:type` = "exsce:output_metric" AND n.`run:metricType` = $metric_type
AND $run_id CONTAINS n.`exsce:run` AND n.`meta:identifier_original` CONTAINS $robot
MATCH (n)-[g:wasGeneratedBy]->(a:Activity)-[:used]->(s) 
WHERE s.`prov:type` = "exsce:ConcreteScenario"
RETURN n.`prov:value` AS value

Listing 3 retrieves the observed output of a path segment: given a list of $runs, it gets all the output metrics of type $metric_type and, for each run, finds the start and end pose of the exsce:action that generated the output metric. It returns the average and standard deviation of all the output metrics for each pair of poses.

Listing 3: Query a $metric_type for a path segment in a list of $runs

UNWIND $runs AS run_id
MATCH (n WHERE n.`prov:type` = "exsce:output_metric" AND n.`run:metricType` = $metric_type)
WHERE run_id CONTAINS n.`exsce:run` 
MATCH (n)-[g:wasGeneratedBy]->(a:Activity)
WHERE a.`prov:type` = "exsce:action"
MATCH (a) -[s:wasStartedBy]-> (p1)
WHERE p1.`prov:type` = "geom:Pose" AND p1.`meta:identifier_original` = $pose_id_a
MATCH (a) -[e:wasEndedBy]-> (p2)
WHERE p2.`prov:type` = "geom:Pose" AND p2.`meta:identifier_original` = $pose_id_b
WITH n.`prov:value` AS value
RETURN avg(value) AS average, stDevP(value) AS std_dev
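Note that Cypher's stDevP computes the population standard deviation (dividing by N), unlike stDev, which divides by N - 1. The aggregation can be reproduced in Python on values fetched from the graph; this is a minimal sketch, with the function name and input layout being our own:

```python
import statistics

def aggregate_metric(values):
    """Reproduce the Cypher aggregation avg(value) / stDevP(value).

    stDevP is the population standard deviation (division by N),
    which matches statistics.pstdev; statistics.stdev would divide
    by N - 1 instead and give a slightly larger value.
    """
    return {
        "average": statistics.fmean(values),
        "std_dev": statistics.pstdev(values),
    }
```

This is convenient for cross-checking the Cypher results against the raw metric values extracted from the bag files.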

A special case for aggregation is the waypoints_visited metric: we want to query it over a list of runs, but it is not stored per path segment. Following a similar pattern to the queries above, the query in Listing 4 finds the waypoints_visited output metric that matches a specific $scenario_id. The $metric_id parameter encodes the robot ID, e.g. tiago1:waypoints_visited. The query returns the average number of waypoints visited by a single robot.

Listing 4: Query waypoints_visited by a $robot for a list of $runs

UNWIND $runs AS run_id
MATCH (n:Entity)
WHERE n.`prov:type` = "exsce:output_metric" AND n.`run:metricType` = "waypoints_visited"
AND run_id CONTAINS n.`exsce:run` 
MATCH (n)-[:wasGeneratedBy*]->(a:Activity)-[:used]->(s)
WHERE s.`prov:type` = "exsce:ConcreteScenario"
MATCH (s WHERE s.`meta:identifier_original` = $scenario_id)
MATCH (n WHERE n.`meta:identifier_original`= $metric_id)
WITH n.`prov:value` AS value
RETURN avg(value) AS average

Finally, Listing 5 shows how to query the mission_duration metric. Given a list of run IDs $runs, get all the output metrics of type mission_duration that match each run ID, and return the average value and standard deviation of all the matching output metrics.

Listing 5: Query mission_duration for a list of $runs

UNWIND $runs AS run_id
MATCH (n WHERE n.`prov:type` = "exsce:output_metric" AND n.`run:metricType` = "mission_duration")
WHERE run_id CONTAINS n.`exsce:run` 
MATCH (n)-[g:wasGeneratedBy]->(a:Activity)-[:used]->(s)
WHERE s.`prov:type` = "exsce:ConcreteScenario" AND s.`meta:identifier_original` CONTAINS $scenario_id
WITH n.`prov:value` AS value
RETURN avg(value) AS average, stDevP(value) AS std_dev

Oracles

Get oracle config

Listing 6 shows how to get the oracles for a $scenario_id. An oracle is stored as a usage relationship containing the baseline, the type of output relation, the metric it applies to, and the ros:Topics used as data sources.

Listing 6: Cypher query to get the baseline oracle

MATCH (n WHERE n.`prov:type` = "exsce:ConcreteScenario")
MATCH (n)<-[u:used]-(o) 
WHERE o.`prov:type`="exsce:oracle" AND o.`meta:identifier_original` CONTAINS $scenario_id
MATCH (o)-[v:used]->(t)
WHERE t.`prov:type`="ros:Topics"
WITH o.`oracle:robot` AS robot, u.`oracle:limit` AS oracle_limit, u.`oracle:value` AS value, u.`oracle:package` AS package, u.`oracle:relationship` AS rel_name, u.`oracle:tolerance` AS tolerance, o.`meta:identifier_original` AS oracle_id, t.`meta:identifier_original` AS topic, o.`oracle:metric` AS metric, n.`meta:identifier_original` AS scenario, u.`oracle:delta` AS delta
RETURN scenario, oracle_id, oracle_limit, value, package, rel_name, tolerance, topic, robot, metric, delta

Update baseline

For a baseline oracle of $scenario_id for a specific $metric and $robot, set the baseline to $value and $tolerance.

Listing 7: Cypher query to update the base values of the baseline oracle

MATCH (n WHERE n.`prov:type` = "exsce:ConcreteScenario")
MATCH (n)<-[u:used]-(o) 
WHERE o.`prov:type`="exsce:oracle" AND o.`meta:identifier_original` CONTAINS $scenario_id
AND u.`oracle:type` = "baseline" AND o.`oracle:metric` = $metric AND o.`oracle:robot` = $robot
SET u += $baseline

Clustering

The queries described here enable us to create datasets to apply clustering algorithms, e.g., using SciPy’s hierarchical clustering or the scikit-learn library.

In addition to the observed outputs, we consider the following structural properties of the scenarios to compute the distances between scenarios required for linkage. The metrics discussed here are not an exhaustive list, but rather exemplify how to use existing data in the PROV database. The evaluation of these similarity metrics for the scenario distance in the clusters will be studied as part of WP8.

Robot similarity

To measure how similar the robots in two scenarios are, we look at the properties of the robots in the scenario. The similarity metric takes into account the difference in the number of robots and, for this particular case study, two hardware differences relevant to the navigation tasks, namely the base type and whether a robot has a torso or is just a mobile base. We normalize these metrics by the maximum number of robots across all the scenarios stored in the PROV database.

Let us assume $r_1$ and $r_2$ are the sets of robots for the two scenarios we want to compare, $s_1$ and $s_2$.

First, we compute the difference between the number of robots of the two scenarios as the absolute value of the difference of their cardinality:

\[d_1 = |\#(r_1) - \#(r_2)|\]

Next, we compute the sum of the difference of the number of robots with the same mobile base. Using set notation, this is the cardinality of the symmetric difference for each subset of robots with each base type $b$:

\[d_2 = \sum_{b} \#(r_{1_b} \triangle r_{2_b})\]

where

\[r_{i_b} = \{r|r \text{ is a robot with base type } b \text{ in scenario } s_i \}\]

Finally, we compute the sum of the difference of the number of robots with and without a torso $t$. Similar to the equation above, this is the sum of the cardinalities of the symmetric difference of the subsets with and without torso:

\[d_3 = \sum_{j=0}^{1} \#(r_{1_{t=j}} \triangle r_{2_{t=j}})\]

where $r_{i_{t=j}} \subseteq r_i$ is the subset of robots in scenario $s_i$ with torso value $j$: $r_{i_{t=0}}$ is the subset of robots without a torso, and $r_{i_{t=1}}$ is the subset with one.

Listing 8 shows how to query the robots of a scenario, together with the necessary information about their hardware, to compute the similarity based on the base type and the presence of a torso in the PAL robots.

Listing 8: Query the PAL robots used in a $scenario_id, the types of mobile base they use and whether they have a torso

MATCH (n WHERE n.`prov:type` = "exsce:ConcreteScenario" AND n.`meta:identifier_original`= $scenario_id)
MATCH (n) -[:hadMember]-> (r:Entity)-[:hadMember]->(hw:Entity)
WHERE r.`prov:type` = "exsce:robot" AND hw.`prov:type` = "hardware-config"
RETURN r.`meta:identifier_original` AS robot_id, r.`robot:type` AS robot_type, hw.`pal:base_type` AS base_type
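Given the rows returned by Listing 8 for two scenarios, the distances $d_1$, $d_2$ and $d_3$ can be sketched in Python. This is our own illustration, not part of the ExSce implementation: robots are represented as (robot_id, base_type, has_torso) tuples, the symmetric differences are approximated by per-category count differences (robot identities are not shared between scenarios), and the sum is normalized by the maximum robot count, which is one plausible reading of the normalization described above:

```python
from collections import Counter

def robot_distance(robots_1, robots_2, max_robots):
    """Compute (d1 + d2 + d3) / max_robots for two scenarios.

    Each robot is a (robot_id, base_type, has_torso) tuple, as could
    be assembled from the output of Listing 8; max_robots is the
    largest robot count over all scenarios stored in the database.
    """
    # d1: difference in the number of robots
    d1 = abs(len(robots_1) - len(robots_2))

    # d2: per-base-type count differences, approximating the
    # symmetric difference of the per-base-type robot subsets
    bases_1 = Counter(base for _, base, _ in robots_1)
    bases_2 = Counter(base for _, base, _ in robots_2)
    d2 = sum(abs(bases_1[b] - bases_2[b]) for b in bases_1.keys() | bases_2.keys())

    # d3: same idea, split by presence of a torso (t = 0 or t = 1)
    torsos_1 = Counter(t for _, _, t in robots_1)
    torsos_2 = Counter(t for _, _, t in robots_2)
    d3 = sum(abs(torsos_1[t] - torsos_2[t]) for t in (False, True))

    return (d1 + d2 + d3) / max_robots
```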

Path segments

Next, we determine how similar the navigation tasks in a scenario are by looking at the path segments. We divide the path segments into three groups: those that are present in both scenarios, those that begin and end in the same areas, and the rest. We normalize this metric by dividing the result by two times the largest number of segments in a scenario, which represents the worst case, where no path segments are common to both scenarios.

\[d_4 = \sum f(x)\]

where

\[f(x)= \begin{cases} 0, & \text{if path segment } x \text{ is present in } s_1 \text{ and } s_2\\ 0.5, & \text{if path segment } x \text{ starts and ends in the same area as a path segment in the other scenario} \\ 1, & \text{otherwise} \end{cases}\]

For each scenario, we use Listing 9 to get the path segments.

Listing 9: Querying the pair of points for each path segment in a scenario

MATCH (n WHERE n.`prov:type` = "exsce:ConcreteScenario" AND n.`meta:identifier_original`= $scenario_id)
MATCH (n)<-[:used]-(r:Activity)
WHERE r.`prov:type` = "exsce:run"
MATCH (a) -[s:wasStartedBy]-> (p1 WHERE p1.`prov:type` = "geom:Pose")
MATCH (a) -[e:wasEndedBy]-> (p2 WHERE p2.`prov:type` = "geom:Pose")
WHERE a.`exsce:run` = r.`exsce:run`
RETURN p1.`meta:identifier_original` AS pose_1, p2.`meta:identifier_original` AS pose_2
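Given the pose pairs returned by Listing 9 for each scenario, $d_4$ can be sketched as follows, under our own assumptions about the data layout: each segment is a (start_pose, end_pose) pair of pose IDs, area_of maps a pose ID to the area it lies in, and f is summed over the segments of both scenarios so that the worst case matches the 2 × largest-segment-count normalization described above:

```python
def segment_distance(segments_1, segments_2, area_of):
    """Compute the normalized d4 between two scenarios.

    Each segment is a (start_pose, end_pose) pair of pose IDs, as
    returned by Listing 9; area_of maps a pose ID to its area.
    """
    def f(segment, others):
        if segment in others:
            return 0.0   # present in both scenarios
        other_areas = {(area_of[a], area_of[b]) for a, b in others}
        if (area_of[segment[0]], area_of[segment[1]]) in other_areas:
            return 0.5   # same start and end areas as some segment
        return 1.0       # no counterpart at all

    raw = sum(f(s, segments_2) for s in segments_1) \
        + sum(f(s, segments_1) for s in segments_2)
    # worst case: no shared segments, i.e. 2 * largest segment count
    return raw / (2 * max(len(segments_1), len(segments_2)))
```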

Metrics

For the metric similarity distances, we can use the queries in Observed Outputs to obtain the results for each run. Except for the waypoints_visited metric, we normalize each metric by dividing it by its largest measurement across all runs in the PROV database. The number of visited waypoints is expressed as a percentage for each robot and averaged for each run.
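The normalization step can be sketched as follows (the dict-based layout mapping run IDs to raw values is an assumption on our part):

```python
def normalize_metric(per_run_values):
    """Divide each run's measurement by the largest measurement
    over all runs, so every normalized value lies in [0, 1]."""
    largest = max(per_run_values.values())
    return {run: value / largest for run, value in per_run_values.items()}
```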

For the safety metric violations, we consider two additional metrics: the duration of the safety violations (e.g. how much time the robot exceeded the max_velocity limit) and the number of “hotspots”, or places where these violations occur in the environment.

Listing 10 shows the query required to obtain the total time a metric $metric_id was violated over a list of $runs.

Listing 10: Querying the total amount of time a safety metric was violated

UNWIND $runs AS run_id
MATCH (n WHERE n.`prov:type` = "exsce:output_metric" AND n.`run:metricType` = "violation_duration" AND n.`exsce:metric` = $metric_id)
WHERE run_id CONTAINS n.`exsce:run`
MATCH (n)-[g:wasGeneratedBy]->(a:Activity)-[:used]->(s WHERE s.`prov:type` = "exsce:ConcreteScenario" AND s.`meta:identifier_original` CONTAINS $scenario_id)
WITH n.`prov:value` AS value
RETURN avg(value) AS average, stDevP(value) AS std_dev

The similarity metric is the difference of the total duration of the safety violation for metric $m$, where $t_{m_1}$ and $t_{m_2}$ are the durations for $s_1$ and $s_2$, respectively:

\[d_5 = | t_{m_1} - t_{m_2}|\]

We can also use spatial clustering of the violations to identify “hotspots” where violations occur in the environment. As a similarity metric, we use the difference in the number of violation hotspots between scenarios, assuming $h_1$ and $h_2$ are the sets of hotspots for $s_1$ and $s_2$, respectively:

\[d_6 = |\#(h_1) - \#(h_2)|\]
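As an illustration of the hotspot counting, here is a deliberately simple grid-based stand-in for the spatial clustering step (a real implementation might use e.g. DBSCAN from scikit-learn); the cell size, threshold, and function names are our assumptions:

```python
from collections import Counter

def count_hotspots(points, cell_size=1.0, min_points=3):
    """Count cells of a regular grid that hold at least min_points
    violation positions; each such cell stands in for one hotspot."""
    cells = Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)
    return sum(1 for hits in cells.values() if hits >= min_points)

def hotspot_distance(points_1, points_2):
    """d6 = |#(h1) - #(h2)| for the violation positions of two scenarios."""
    return abs(count_hotspots(points_1) - count_hotspots(points_2))
```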

The query to obtain the number of clusters is shown in Listing 11, where $hotspot_type matches the metric we are interested in, e.g., max_velocity.

Listing 11: Querying the number of hotspots where metrics are violated

UNWIND $runs AS run_id
MATCH (n:Entity WHERE n.`prov:type` = "exsce:hotspot" AND n.`hotspot:type` = $hotspot_type)
WHERE run_id CONTAINS n.`exsce:run`
MATCH (n)-[:wasGeneratedBy*]->(a:Activity)-[:used]->(s WHERE s.`prov:type` = "exsce:ConcreteScenario")
WHERE s.`meta:identifier_original` = $scenario_id
RETURN run_id, count(n) AS hotspot_qty