Implementing SQL Authorization Support for Derby dblook
- Project: Implementing SQL Authorization Support for Derby dblook
Description: This project is carried out as a part of the Google Summer of Code 2009 program. The detailed project proposal can be found at http://wiki.apache.org/general/HiranyaJayathilaka/gsoc2009/derby-dblook-proposal. Project is carried out by Hiranya Jayathiala (firstname.lastname@example.org) and is mentored by Dag Wanvik of the Apache Derby team. Please feel free to edit this wiki page and contribute to the project in any means you wish.
We intend to use a directed graph as the means of tracking and representing the dependencies among various persistent objects in a database. Let's call it the dblook dependency graph. When dblook is fired off against a database, it would create the dependency graph in memory and 'walk' the graph steb-by-step while producing DDL statements required to reconstruct those objects.
In addition to the ddblook dependency graph we'll use a secong graph to capture the interrelationships among roles. Let's call it the role dependency graph. The role dependency graph would represent roles associated with a database by a set of vertices. If a role Q is dependent on role P (ie role Q requires granting of role P) then there would be a directed edge from P to Q (P --> Q) in the graph. The data necessary to construct this graph will be mainly fetched from the SYSROLES table.
All persistent objects of a database would be vertices in the dependency graph. If A and B are two persistent objects in a database and if B is a dependent object of A then there will be a directed edge from A to B in the dependency graph (A --> B). Information required to construct the dependency graph will be fetched from the system tables (specially from the SYSDEPENDS table which effectively captures all such dependencies among persistent objects). The graph construction algorithm should also associate each vertex with a database user as the owner. (See Database Object Owner Discovery Algorithm) In addition it should capture all the permissions granted on each object and the users/roles involved with those permissions.
Once we have the full dblook dependency graph in memory we can create all the roles required (should be done as the dbo). Then by looking at the role dependency graph we need to grant roles to other roles, users and PUBLIC. Currently only the database owner can properly perform this action so it should be done here.
When dblook finally walks the dependency graph it will need to produce an authentication statement prior to producing the actual DDL statement required to create each object. For an example if there is an object O associated with the user U in the graph, dblook will first output a statement necessary to create the required database connection for user U before producing the DDL statement for O. After producing the creation statement necessary grant statements should be created related to the object. Basically all the permissions related to the object should be granted to the associated users. Walking the graph should start from a node which does not have any in bound edges (ie it does not depend on any other object). As the graph walk continues objects are removed from the graph along with all edges incident on them. The walk continues until the graph's set of vertices is empty.
Dependency Graph Design
The datastructures used to store the elements of the dependency graph should be lightweight. This graph will be held in memory and its size will be proportional to the number of persistent objects in the database. Therefore larger the database, larger the dependency graph. We need to be conservative in memory usage to efficiently deal with large databases.
Each vertex in the graph should store a set of in bound edges, set of out bound edges, an associated user ID (might also need to store authentication credentials of the user) and a set of permissions.
Database Object Owner Discovery Algorithm
- Look up the relevant system table for the object and find the schema ID of the schema (say schemaID) to which the object belongs (eg: if the object is a view lookup SYSVIEWS)
- Look up the SYSSCHEMAS table and find the authorization ID associated with schemaID
Dependency Graph Construction Algorithm
- for each persistent object O in the database
V <-- find vertex for O from graph
- if (V == null)
V <-- create new vertex for O
- Add V to the graph
- end if
S <-- find set of providers for O
- for each persistent object T in S
W <-- find vertex for T from graph
- if (W == null)
W <-- create new vertex for T
- Add W to the graph
- end if
- Create edge from W to V
Role Dependency Graph Construction Algorithm
Dependency Graph Walk Algorithm