Proposed Design for Pig Metadata Interface

With the introduction of SQL, Pig needs to be able to communicate with external metadata services. These communications will include operations such as creating, altering, and dropping databases and tables, as well as metadata queries such as requests to show available tables. DDL operations of this sort are beyond the scope of the proposed metadata interfaces for load and storage functions. However, Pig should not be tightly tied to a single metadata implementation. It should be able to work with Owl, Hive's metastore, or any other metadata source that is added to Hadoop. To this end, this document proposes an interface for operating with metadata systems. Different metadata connectors can then be implemented, one for each metadata system.

Interface

This interface will allow users to find information about tables, databases, etc. in the metadata store. For each call, Pig will pass the portion of the syntax tree relevant to the operation to the metadata connector. These syntax tree structures will be versioned.

    /**
     * An interface to encapsulate DDL operations.
     */
    interface MetadataDDL {
        void createTable(CreateTable ct) throws IOException;
        void alterTable(AlterTable at) throws IOException;  // includes add and drop partition
        void dropTable(DropTable dt) throws IOException;
        SQLTable[] showTables(Database db) throws IOException;  // info returned in SQLTable includes info on partitions

        void createDatabase(CreateDatabase cd) throws IOException;
        void alterDatabase(AlterDatabase ad) throws IOException;
        void dropDatabase(DropDatabase dd) throws IOException;
        SQLDatabase[] showDatabases() throws IOException;

        /**
         * Get the default load function for this metadata service.  This 
         * will be called by SQL to determine the right load function for
         * the metadata service it is connected to.
         * @return class name of the default load function for this interface.
         */
        String getLoaderClass();

        /**
         * Get the default storage function for this metadata service.  This 
         * will be called by SQL to determine the right storage function for
         * the metadata service it is connected to.
         * @return class name of the default storage function for this interface.
         */
        String getStorageClass();
    }
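
To make the connector idea concrete, the following is a minimal sketch of one possible implementation, kept entirely in memory. It is not part of the proposal: the fields of CreateTable, DropTable, Database, and SQLTable are hypothetical stand-ins (the proposal does not define them), the interface is trimmed to the calls the sketch exercises, and the class names returned by getLoaderClass and getStorageClass are invented placeholders.

```java
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins for the syntax-tree structures; the proposal
// does not define their fields, so a single name field is assumed here.
class CreateTable { final String name; CreateTable(String name) { this.name = name; } }
class DropTable { final String name; DropTable(String name) { this.name = name; } }
class Database { final String name; Database(String name) { this.name = name; } }
class SQLTable { final String name; SQLTable(String name) { this.name = name; } }

// Trimmed copy of the proposed interface: only the calls used below.
interface MetadataDDL {
    void createTable(CreateTable ct) throws IOException;
    void dropTable(DropTable dt) throws IOException;
    SQLTable[] showTables(Database db) throws IOException;
    String getLoaderClass();
    String getStorageClass();
}

// A toy connector that keeps all table metadata in a map.
class InMemoryMetadata implements MetadataDDL {
    private final Map<String, SQLTable> tables = new LinkedHashMap<>();

    @Override
    public void createTable(CreateTable ct) throws IOException {
        if (tables.containsKey(ct.name)) {
            throw new IOException("table already exists: " + ct.name);
        }
        tables.put(ct.name, new SQLTable(ct.name));
    }

    @Override
    public void dropTable(DropTable dt) throws IOException {
        if (tables.remove(dt.name) == null) {
            throw new IOException("no such table: " + dt.name);
        }
    }

    @Override
    public SQLTable[] showTables(Database db) {
        // This sketch has a single implicit database, so db is ignored.
        return tables.values().toArray(new SQLTable[0]);
    }

    // Placeholder class names; a real connector would name its own functions.
    @Override
    public String getLoaderClass() { return "example.InMemoryLoader"; }

    @Override
    public String getStorageClass() { return "example.InMemoryStorage"; }
}
```

A caller such as the SQL layer would hold only a MetadataDDL reference and ask it for getLoaderClass() or getStorageClass() when it needs a load or storage function, so swapping connectors would not require changes on the caller's side.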

Accessing Global Metadata From SQL

Pig will be configured to work with one global metadata source for a given set of SQL operations. This will be set in Pig's configuration file, which will specify the URI of the server to use and the implementation of MetadataDDL to use with this server.
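
One way this configuration could be consumed is sketched below, assuming the two settings arrive as Java properties and the implementation is loaded by reflection. The property keys pig.metadata.uri and pig.metadata.impl, the ExampleConnector class, and the convention that a connector has a constructor taking the URI string are all assumptions made for illustration; the proposal does not specify any of them.

```java
import java.util.Properties;

// Trimmed copy of the proposed interface, enough for this sketch.
interface MetadataDDL {
    String getLoaderClass();
    String getStorageClass();
}

// Hypothetical connector; assumed to take the server URI in its constructor.
class ExampleConnector implements MetadataDDL {
    final String uri;

    public ExampleConnector(String uri) { this.uri = uri; }

    @Override
    public String getLoaderClass() { return "example.ExampleLoader"; }

    @Override
    public String getStorageClass() { return "example.ExampleStorage"; }
}

class MetadataConfig {
    // Hypothetical property names, not defined by the proposal.
    static final String URI_KEY = "pig.metadata.uri";
    static final String IMPL_KEY = "pig.metadata.impl";

    // Instantiate the configured MetadataDDL implementation for the configured URI.
    static MetadataDDL connect(Properties props) throws ReflectiveOperationException {
        String uri = props.getProperty(URI_KEY);
        String impl = props.getProperty(IMPL_KEY);
        if (uri == null || impl == null) {
            throw new IllegalArgumentException(
                "both " + URI_KEY + " and " + IMPL_KEY + " must be set");
        }
        // asSubclass rejects classes that do not implement MetadataDDL.
        Class<? extends MetadataDDL> cls =
            Class.forName(impl).asSubclass(MetadataDDL.class);
        return cls.getDeclaredConstructor(String.class).newInstance(uri);
    }
}
```

Resolving the class by name keeps Pig itself free of compile-time dependencies on any particular metadata system, which matches the goal of not tying Pig to a single implementation.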

Accessing Global Metadata from Pig Latin

Pig Latin will not support a call to metadata within the language itself. Instead, it will support the ability to invoke a Pig SQL DDL command. This SQL will then be sent to the SQL parser and dispatched through the metadata service as before.

    A = load ...
    ...
    SQL {"create table myTable ..."};
    store Z into 'myTable' using OwlStorage();
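
The dispatch step described above might look roughly like the following sketch. It starts after parsing: it assumes the SQL parser has already produced a syntax-tree node, and it routes that node to the matching MetadataDDL call. The node fields, the DdlDispatcher class, and the RecordingMetadata test double are hypothetical, and the interface is trimmed to two operations.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical syntax-tree nodes; the proposal does not define their fields.
class CreateTable { final String name; CreateTable(String name) { this.name = name; } }
class DropTable { final String name; DropTable(String name) { this.name = name; } }

// Trimmed copy of the proposed interface.
interface MetadataDDL {
    void createTable(CreateTable ct) throws IOException;
    void dropTable(DropTable dt) throws IOException;
}

// Routes an already-parsed DDL node to the configured metadata service.
class DdlDispatcher {
    private final MetadataDDL metadata;

    DdlDispatcher(MetadataDDL metadata) { this.metadata = metadata; }

    void dispatch(Object node) throws IOException {
        if (node instanceof CreateTable) {
            metadata.createTable((CreateTable) node);
        } else if (node instanceof DropTable) {
            metadata.dropTable((DropTable) node);
        } else {
            throw new IllegalArgumentException(
                "unsupported DDL node: " + node.getClass().getName());
        }
    }
}

// Test double that records which calls reached the metadata service.
class RecordingMetadata implements MetadataDDL {
    final List<String> calls = new ArrayList<>();

    @Override
    public void createTable(CreateTable ct) { calls.add("create " + ct.name); }

    @Override
    public void dropTable(DropTable dt) { calls.add("drop " + dt.name); }
}
```

Under these assumptions, a DDL command embedded in a Pig Latin script and one issued directly through SQL would both end up in the same dispatch call, so the metadata connector would not need to know where the command came from.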

Changes

September 25 2009

  • Added getLoaderClass and getStorageClass to interface, Gates.

MetadataInterfaceProposal (last edited 2009-09-25 23:56:39 by AlanGates)