Plan Testing Helper
This is a small utility that I developed for testing my type checking logic. I think it might be useful for other people as well so I have refactored a bit to make it more generic.
Use cases
Here are steps that I do for type checking:-
- Construct a plan
- Run type-checking logic against the plan
- Construct the expected plan
- Compare structures of the actual plan and the expected plan.
Here are steps that one might do for query parser:-
- Given a query string, construct the plan.
- Construct the expected plan
- Compare two plans
Here for testing plan optimizer:-
- Construct a plan
- Run optimizer
- Construct the expected plan
- Compare structures of the actual plan and the expected plan.
What can be facilitated?
So there are two common bits from above use cases:-
- Construct the expected plan
- Compare two plans
Construct a plan
What is Dot Language?
Dot language is a text graph description language. There are three main object types: node, edge, and graph. All of them can have custom attributes.
( See http://en.wikipedia.org/wiki/DOT_language )
Sample Dot graph
digraph plan1 {
load [color="black"]
load -> distinct -> split -> splitOut1 [style=dotted] ;
split -> splitOut2 ;
splitOut1 -> cross ;
splitOut2 -> cross ;
}Note: "digraph" dictates that this is a description of directed graph which is the domain we're interested in.
Note: "load [color="black"]" is attaching an attribute to the node. This is optional.
By extending Dot a bit, we can encode our logical plan in the following format:-
digraph graph1 {
node [schema="field1: int, field2: float"] /* "node" means the following nodes will inherit this attribute */
load [key="1", type="LOLoad"] ;
distinct [key="2", type="LODistinct"] ;
split [key="3", type="LOSplit"] ;
splitout1 [key="4", type="LOForEach"] ;
splitout2 [key="5", type="LOForEach"]
cross [key="6", type="LOCross", schema="field1: int, field2: float, field3: int, field4: float"] ;
load -> distinct -> split -> splitOut1 ;
split -> splitOut2 ;
splitOut1 -> cross ;
splitOut2 -> cross ;
}And this can be translated to a plan using a loader class (API will be provided)
Compare two plans
I will provide API like this:-
/***
* This abstract class is a base for plan comparer
*/
public abstract class PlanStructuralComparer<E extends Operator,
P extends OperatorPlan<E>> {
/***
* This method does structural comparison of two plans based on:-
* - Graph connectivity
*
* The current implementation is based on simple key-based
* vertex matching.
*
* @param plan1 the first plan
* @param plan2 the second plan
* @param messages where the error messages go
* @return
*/
public boolean structurallyEquals(P plan1, P plan2, StringBuilder messages) ;
/***
* Same as above in case just want to compare but
* don't want to know the error messages
* @param plan1
* @param plan2
* @return
*/
public boolean structurallyEquals(P plan1, P plan2) ;
}A subtype which is interested in type information would look like this:-
/***
* This class is used for LogicalPlan comparison
*/
public class LogicalPlanComparer
extends PlanStructuralComparer<LogicalOperator, LogicalPlan> {
/***
* This method does naive structural comparison of two plans.
*
* Things we compare :-
* - Things compared in the super class
* - Types of matching nodes
* - Schema associated with each operator
*
* @param plan1
* @param plan2
* @param messages
* @return
*/
@Override
public boolean structurallyEquals(LogicalPlan plan1,
LogicalPlan plan2,
StringBuilder messages) {
// Stage 1: Compare connectivity
if (!super.structurallyEquals(plan1, plan2, messages)) return false ;
// Stage 2: Compare node types
if (isMismatchNodeType(plan1, plan2, messages)) return false ;
// Stage 3: Compare schemas
if (isMismatchSchemas(plan1, plan2, messages)) return false ;
// else
return true ;
}
Dot Trick
One can plot a graph written in Dot language by just doing like:-
dot -Tpng dot1.dot > dot1.png
Or alternatively,
dotty dot1.dot
NOTE: You need graphviz installed on your machine to do these things.
Here is a sample graph generated from the given sample.
Current Status & Issues
- Working code will be available in 1-2 days (Today = 26th May)
- Doesn't work with inner plans yet. Inner plans may have to be constructed separately.
Appendix
The API:-
OperatorPlanLoader - This class is an abstract base class for loading a plan from Dot
public abstract class OperatorPlanLoader<E extends Operator,
P extends OperatorPlan<E>> {
/***
* This method is used for loading an operator plan encoded in Dot format
* @param dotContent the dot content
* @param clazz the plan type to be created
* @return
*/
public P load(String dotContent) {
/***
* This method has be overridden to instantiate the correct node type
*
* @param node
* @param plan
* @return
*/
protected abstract E createOperator(Node node, P plan) ;
}Structures captured from Dot (Before being converted to plan):-
/***
* This represents graph structure in DOT format
*/
public class DotGraph {
public String name;
public List<Edge> edges = new ArrayList<Edge>() ;
public List<Node> nodes = new ArrayList<Node>() ;
public Map<String, String> attributes = new HashMap<String,String>() ;
public DotGraph(String name) {
this.name = name ;
}
}/***
* This represents a node in DOT format
*/
public class Node {
public String name ;
public Map<String, String> attributes = new HashMap<String,String>() ;
}/**
* This represents an edge in DOT format.
* An edge in DOT can have attributes but we're not interested
*/
public class Edge {
public String fromNode ;
public String toNode ;
}