-
-
Notifications
You must be signed in to change notification settings - Fork 377
Analysis: Graph input ordering
- The PostgreSQL rows are a set, so the output of any algorithm should not depend on the ordering of the input rows.
- Moreover, a function is a ONE to ONE mapping and not ONE to MANY, which is a relation.
- However, on executing sample queries on the already-implemented function, it has been found that most of the pgRouting functions fail to satisfy this requirement.
- The expected behavior is that the output of every algorithm should be the same, irrespective of the ordering of the input rows -
edges_sql.
The problem arises probably because the pgRouting graph is created from the edges_sql rows "sequentially", therefore the set-like nature of the rows is lost when the graph gets created. So, for two different ordering of rows, the different graph gets created internally, and boost algorithm gives a different result for the different graph.
The solution to this problem is as follows:
- Add a pgTAP test for every function that tests whether the same set of rows are returned irrespective of initial ordering. Add a todo wherever the tests fail.
- Fix the code of the functions which fail the test, by creating a function insert_edges_sorted which first sorts the edges of the graph, in ascending order of id, source, target, and then inserts the edges. Therefore, the same graph will get created for any ordering of input rows.
Order the rows in a particular order before creating the graph, preferably in ascending order of the id column, and then in ascending order of the source column (if two or more rows have the same id) and then in ascending order of the target column (if two or more rows have the same id and source values).
The equivalent SQL statement for this ordering is:
ORDER BY id, source, targetSo, the following function can be created in the (include/cpp_common/pgr_base_graph.hpp)[https://github.com/pgRouting/pgrouting/blob/master/include/cpp_common/pgr_base_graph.hpp] file, and it can be called before inserting the edges in the graph, so that the edges always have a fixed ordering before the graph gets created:
void sort_edges(std::vector<T> &edges) {
std::sort(edges.begin(), edges.end(),
[](const T &lhs, const T &rhs) -> bool {
return lhs.target < rhs.target;
});
std::stable_sort(edges.begin(), edges.end(),
[](const T &lhs, const T &rhs) -> bool {
return lhs.source < rhs.source;
});
std::stable_sort(edges.begin(), edges.end(),
[](const T &lhs, const T &rhs) -> bool {
return lhs.id < rhs.id;
});
}This C++ code does the sorting in the way mentioned above, equivalent to the SQL statement.
Stable sorts preserve the physical order of semantically equivalent values. Therefore, first, normal sorting is done on the target column, and then the stable sort is done on the source column (preserving the relative order of two equal values in the source column). Then, the stable sort is done on the id column (preserving the relative order of two equal values in the id column) to get the final result.