Things to Know

The first thing to know about CloudStack's Data Access Layer is that it is represented by the VOs in the cloud-engine-schema project.  The underlying database is just one particular implementation of that schema.  What this means is that a VO may be part of a table or a VO may be a join of multiple tables.  The software above the VO layer should not care.  To it, the schema is the VO.

The second thing to know about CloudStack's Data Access Layer is that it utilizes Java Persistence annotations.  Thus, any type of ORM, such as Hibernate should work with some tweaks.  In fact, we recommend using ORM like Hibernate for any type of API access to the schema.  However, there are some caveats so please read through this whole section before rushing to convert everything.

The third thing to know about CloudStack's Data Access Layer is that the current set of VOs in cloud-engine-schema should not be used for API access.  This set of VOs is a normalized view of the relations in CloudStack.  It is meant to be used by CloudStack's job processing engine.  When access through the API, the CloudStack schema should be much more descriptive.  For example, the VM VO contains the account database id but the VM Resource/Response should contain the account name and account UUID but not the account database id.  Thus, VO != Resource/Response.  There has been an effort in cloud-server to create database views that define the CloudStack Resource/Response but it is not complete and does not cover all of CloudStack's Resources/Responses yet.

Where should you look for examples?

Given the above, there's a natural direction on where to look for examples of how to add database access to CloudStack.

  • If you are writing core cloud-engine (orchestration) code or subsystem plugin code (such as Nicira) and you need access to the database, look at the VOs and DAOs in cloud-engine-schema.
  • If you are writing code for API access and your code is not involved in the orchestration and provisioning of hardware resources, then you should look at the VOs and DAOs in cloud-server, particularly the ones that ends with JoinVO.  If you can convert these JoinVOs to use a generic ORM implementation so that JPQL can be used to form complicated searches, that's even better.  I look forward to someone contributing in this area. 

DAO

CloudStack chose to implement its database access layer using DAO because CloudStack at it's core is a clustered job processing engine and direct access to the schema is desired. CloudStack also wanted to enforce the following database usage patterns:

  • When a first class entity is deleted in CloudStack, it should not be deleted from the database until an admin runs an expunge process on the database.  This allows mistakes to be recovered easily.  It also can be used to comply with data retention policies.  However, when a entity is deleted, the business logic should no longer see this entity in its operations.  This pattern is enforced when a field in the VO is annotated with @Column(name=GenericDao.REMOVED_COLUMN).  When this happens, the DAO layer automatically changes delete operations to instead update the removed column with a time stamp but all search operations will not return rows with that column set.
  • When a first class entity is created in CloudStack, it should record when it was created for auditing purposes.  This is a fail-safe to more robust auditing mechanisms implemented in the business logic.  With this information, an admin, as a last resort, can look at CloudStack logs around the creation time to figure out what might have happened.  This pattern is enforced when a field in the VO is annotated with @Column(name=GenericDao.CREATE_COLUMN).  When this happens, the DAO layer automatically inserts a timestamp into the column.  We don't need to rely on business logic to remember to fill in this information.
  • Perform object mapping but not relation mapping.  The problem with the relation mapping part of most ORM implementation is that it is very easy for a developer to bring the entire database into memory by specifying a few @OneToMany annotation and eager fetches.  CloudStack wants to work directly on the database but yet provide the ease of use and mistake free benefits of object mapping.  Therefore, DAO does not support annotations such as @OneToMany and @FetchType.
  • Separate SQL generation from database connection.  DAO does not manage database connections nor the databases that's being connected to.  That's directed by the business logic.  This eases support for database replication and partitioning strategies.
  • No support for nested transactions.
  • Auto release of database connection, PreparedStatements and Results.
  • Auto rollback of transactions if not committed.
  • Auto release of db locks.
  • Protection against SQL injection attacks.

CloudStack DAO layer is implemented by four different components

  • GenericDao and GenericDaoBase which automatically provides a set of methods to use.
  • @DB annotation and TransactionContextBuilder to do auto release of Connection, PreparedStatement, and Results.
  • Transaction to manage database connection and database transactions.
  • SearchBuilder to create searches based on the VO.

Writing a new VO and DAO

// Interface for the VO if you need it.
// Notice that in the interface, there are no
// set methods.  When you give an VO to other
// code to consume, you should only give the
// interface and it should be a read-only
// object.
public interface Test {
    String getText();
    String getCreated();
}


// VO....@Entity
@Table(name="test")
public class TestVO implements Test { // Assuming an interface was set
    @Column(name="id")
    private long id;

    @Column(name="text")
    String text;

    @Column(name=GenericDao.CREATED_COLUMN)
    @Temporal(value=TemporalType.TIMESTAMP)
    private Date created;


    // Noticed that the removed field doesn't have a getter and setter
    // It is set and used by the DAO code.  You can add a getter
    // if you have some use for it but there's no point for a setter.
    @Column(name=GenericDao.REMOVED_COLUMN)
    @Temporal(value=TemporalType.TIMESTAMP)
    private Date removed;


    public TestVO(String text) { // This is the normal constructor
        this.text = text;
    }


    protected TestVO() {  // protected constructor needed for reflection
    }

    // getters and setters must follow the Java
    // convention of putting get/set in front of
    // the field name.
    public String getText() {
        return text;
    }


    public void setText(String text) {
        this.text = text;
    }

    // In case you need to have a getter
    // that does not use the standard prefixes (getXXX or isXXX),
    // then you should annotate that method with @Column(name="<columnName>")
    // to indicate which column this method refers to

    @Column(name="text")
    public String retrieveText() {
        return text;
    }


    // Noticed there's no need for a setter for created as it is
    // set by the DAO code automatically.
    public Date getCreated() {
        return created;
    }
}


// DAO... (can be skipped if there's no need for interface)
public interface TestDao extends GenericDao<TestVO, Long> {
}


// Impl...
@Local(value={HostDao.class}) // If the interface has been specified.
public class TestDaoImpl extends GenericDaoBase<TestVO, Long> implements GenericDao<TestVO, Long> {
    protected TestDaoImpl() {
    }
}

By creating files similar to the above, you will be able to use the methods inherited from GenericDao to find a row by database id, perform searches, delete rows, etc.  You don't really have to write anything else if that's sufficient to access your new table.  Please note the comments in the classes, summarized below:

  • If your VO is used by other modules, you should declare an interface for it and use the interface to communicate with other modules.  
  • The interface should not carry any setters as it should be only consumed and not modified by other modules.
  • The getter and setter methods must follow Java convention by putting "get"/"set" before the field name.  The DAO code uses this to find the field that's been modified when you update your VO object.
  • There is no need to declare an interface for your DAO implementation if you're not planning for it to be used by other modules.  There's a specific use case for this that I will talk about below.
  • There is no need to have setter methods for the removed field and created field as they are filled in by the DAO code automatically.
  • For the removed and created field, it must use the constants declared in GenericDao.  If you put in @Column(name="removed"), then the DAO code considers that a field your code will update.
  • The protected default constructors are needed for reflection to instantiate the class.

Annotations supported by the DAO implementations

Annotation Name

Notes

Table

Used as specified by the Java Persistence Spec

Column

Used as specified by the Java Persistence Spec

Enumerated

Used as specified by the Java Persistence Spec

Transient

Used as specified by the Java Persistence Spec

Id

Used as specified by the Java Persistence Spec

Entity

Used as specified by the Java Persistence Spec

TableGenerator

Used as specified by the Java Persistence Spec

Temporal

Used as specified by the Java Persistence Spec

Inheritance

Used as specified by the Java Persistence Spec

DiscriminatorColumn

Used as specified by the Java Persistence Spec

PrimaryKeyJoinColumn

Used as specified by the Java Persistence Spec

Encrypt

This is a CloudStack specific annotation.  DAO encrypts that particular field before saving it.

StateMachine

This is a CloudStack annotation.  It uses database atomic operations to update the state of the entity.  It works with the state machine defined in the utils package.

Usage

Using Dao
public class Foo {
    @Inject TestDao _testDao;  // Use injection to inject the DAO.  Don't new it.

    public String getText(long id) {
        TestVO vo = _testDao.findById(id);  // findById is provided by GenericDaoBase
        return vo.getText();
    }
}
Updating VOs
public class Foo {
    @Inject TestDao _testDao;

    public void updateText(long id, String text) {
        TestVO vo = _testDao.findById(id);
        vo.setText(text);     // Just call set method and DAO layer understands which fields were modified.
        _testDao.update(vo);
    }
}
Searching

Two types of search constructs are provided: SearchBuilder and SearchCriteria.  SearchBuilder is a compile time and load time construct of the search query.  It is equivalent to writing a search query like "SELECT * FROM test WHERE text=?", where the '?' is intended to be filled out during runtime.  Writing it with SearchBuilder allows the entire construct to be checked during compile time and load time to find errors early.  At runtime, SearchBuilder generates a SearchCriteria and SearchCriteria can be used to fill in the '?'.  SearchBuilder is preferred over SQL query because any refactoring via context-sensitive editors will automatically change your SearchBuilder but will not change your SQL query.

public class Foo {
    @Inject TestDao _testDao;
    protected SearchBuilder<TestVO> TestSearch;
    public Foo() { 
        // SELECT * FROM test WHERE text=? AND created<? 
        // The entity() method in SearchBuilder actually returns the VO itself 
        // which you can then use the getter methods 
        TestSearch = _testDao.createSearchBuilder(); 
        TestVO entity = TestSearch.entity(); 
        TestSearch.and("bar", entity.getTest(), SearchCriteria.Op.EQ) 
                  .and("time", entity.getCreated(), searchCriteria.Op.LT).done(); 
    }

    public List<? extends Test> findTest(String text, Date date) { 
        SearchCriteria<TestVO> sc = TestSearch.create(); 
        sc.setParameters("bar", text);  // Note "bar" is the parameter name set in the SearchBuilder 
        sc.setParameters("time", date); // Note "time" is the parameter name set in the SearchBuilder 
        return _testDao.listAll(sc); 
   }
}

In the above usage, SearchBuilder is instantiated during construction time and SearchCriteria is used during runtime.  SearchCriteria can also be retrieved from the DAO during runtime and write more ad-hoc queries.

The following code fragment shows how to construct a join using SearchBuilder:

// This performs
// SELECT * FROM HOST INNER JOIN host_tag ON host_tag.host_id=host.id WHERE host.type=? AND host.pod_id=?
// AND host.data_center_id=? AND host.cluster_id=? AND host.status=? AND resource_state=? AND host_tag.tag=?

SearchBuilder<HostTagVO> hostTagSearch = _hostTagsDao.createSearchBuilder();
HostTagVO tagEntity = hostTagSearch.entity();
hostTagSearch.and("tag", tagEntity.getTag(), SearchCriteria.Op.EQ);

SearchBuilder<HostVO> hostSearch = createSearchBuilder();
HostVO entity = hostSearch.entity();
hostSearch.and("type", entity.getType(), SearchCriteria.Op.EQ);
hostSearch.and("pod", entity.getPodId(), SearchCriteria.Op.EQ);
hostSearch.and("dc", entity.getDataCenterId(), SearchCriteria.Op.EQ);
hostSearch.and("cluster", entity.getClusterId(), SearchCriteria.Op.EQ);
hostSearch.and("status", entity.getStatus(), SearchCriteria.Op.EQ);
hostSearch.and("resourceState", entity.getResourceState(), SearchCriteria.Op.EQ);
hostSearch.join("hostTagSearch", hostTagSearch, entity.getId(), tagEntity.getHostId(), JoinBuilder.JoinType.INNER);
hostSearch.done();

  The following code fragment shows how to construct queries that asks for specific fields or special SQL functions using the GenericSearchBuilder:

// This is equivalent to
// SELECT COUNT(*) FROM host WHERE data_center_id=? AND type=? AND status=?
// In this case, note that it is using GenericSearchBuilder which allows you
// to specify the return type (long in this case).


GenericSearchBuilder<HostVO, Long> CountRoutingByDc;
CountRoutingByDc = createSearchBuilder(Long.class); 
CountRoutingByDc.select(null, Func.COUNT, null); 
CountRoutingByDc.and("dc", CountRoutingByDc.entity().getDataCenterId(), SearchCriteria.Op.EQ); 
CountRoutingByDc.and("type", CountRoutingByDc.entity().getType(), SearchCriteria.Op.EQ); 
CountRoutingByDc.and("status", CountRoutingByDc.entity().getStatus(), SearchCriteria.Op.EQ); 
CountRoutingByDc.done();
Using Transactions

The CloudStack Data Access Layer also forces a certain semantics on how to use transactions and database connections.  In the above examples, the code never has to ask for a database Connection, PreparedStatement, or Result.  It is handled already so that you only have to concentrate on constructing the SQL for your business logic.   However, what if you need a database transaction or really need to write your own handling of SQL.  To use that you have to make use of the CloudStack Transaction.  The CloudStack Transaction is more than a database transaction; it's really a database usage context.  The following example illustrates the usage.

public class Foo {
    @Inject TestDao _testDao;

    @DB // This is important
    protected void updateText(long id, String text) { // The method can not be private
        Transaction txn = Transaction.currentTxn(); // This retrieves the Transaction context but does not start a DB connection.
        // ... You can add any code here
        txn.start();    // DB transaction actually starts here
         TestVO vo = _testDao.lockRow(id, true);
         if (vo.getText() == null) {
            vo.setText(text);
        }
        _testDao.update(vo);
        txn.commit();   // DB transaction committed here.
    }


    @DB
    protected void updateTextComplicated(long id, String text) {  // Look at this example if you want to control your own rollback.
        Transaction txn = Transaction.currentTxn();
        // ... You can add any code here
        boolean committed = true;
        TestVO vo = null;
        txn.start();
        try {
            vo = _testDao.lockRow(id, true);
            if (vo.getText() == null) {
                vo.setText(text);
            }
            _testDao.update(vo);
            txn.commit();
        } catch(Exception e) {
            // Log the exception
            committed = false;
            txn.rollback();
        }


        if (!committed) {
             // Handle not committed.
        }
    }
}

The above example made the retrieval and update of the TestVO object an atomic operation.  In order to do that, you must label your method with the @DB annotation.  For those of you familiar with writing DB code, you'll noticed that there is no rollback in case of error.  Those type of things are taken care of by the @DB annotation.  Once the code leaves the scope of this method, an intercepter that understands the @DB annotation will check for and rollback any transactions.  The same intercepter also releases any PreparedStatement and Result objects and database Connection objects.  

Handling Nested Transactions

So what happens if your code starts a database transaction and calls another method that also starts a database transaction before it commits the database transaction.  CloudStack does not want to support nested transaction concepts.  In this case, the two transactions are actually merged into one big transaction and the results are not committed until the outer most transaction is committed.  If there's a rollback, the entire set is rolled back. 

Handling Locking

CloudStack Data Access Layer supports two types of locking.  The first is a database row lock.  That is shown in the above example with the lockRow() calls.  You can also lock rows returned in a search.  See GenericDao for these types of method support.  Database row lock should only be used when you are locking only for database operations.  For example, in the above example, the code wants to do an atomic test and set on the text field so therefore database lock should be used.

The second type of locking is a lock table lock.  These operations are also listed as methods that ends in "InLockTable" in the GenericDao interface.  These locks are for operations that require external resources that may take a long time.  In these situations, database row locks are not sufficient as database row locks can time out and it also decreases scalability by locking up the rows and not allowing concurrent database operations while the external resource operation completes.  In these situations, you should acquire locks in the lock table.  Please note that the @DB annotation does not automatically release locks in the lock table.  This is by design.  Sinc lock table locks are used when performing external resource operations, it is likely that the thread that releases the lock is not the same as the thread that acquires the lock (think async operations).  Therefore, @DB does not auto release lock table locks when the method goes out of scope.

  • No labels