This page contains details about the Hive design and architecture. A brief technical report about Hive is available at hive.pdf.

Figure 1 system_architecture.png

Hive Architecture

Figure 1 shows the major components of Hive and its interactions with Hadoop. As shown in that figure, the main components of Hive are:

Figure 1 also shows how a typical query flows through the system. The UI calls the execute interface to the Driver (step 1 in Figure 1). The Driver creates a session handle for the query and sends the query to the compiler to generate an execution plan (step 2). The compiler gets the necessary metadata from the metastore (steps 3 and 4). This metadata is used to typecheck the expressions in the query tree as well as to prune partitions based on query predicates. The plan generated by the compiler (step 5) is a DAG of stages, with each stage being either a map/reduce job, a metadata operation or an operation on HDFS. For map/reduce stages, the plan contains map operator trees (operator trees that are executed on the mappers) and a reduce operator tree (for operations that need reducers). The execution engine submits these stages to the appropriate components (steps 6, 6.1, 6.2 and 6.3). In each task (mapper/reducer), the deserializer associated with the table or intermediate outputs is used to read the rows from HDFS files, and these rows are passed through the associated operator tree. Once the output is generated, it is written to a temporary HDFS file through the serializer (this happens in the mapper if the operation does not need a reduce). The temporary files are used to provide data to subsequent map/reduce stages of the plan. For DML operations, the final temporary file is moved to the table's location. This scheme ensures that dirty data is not read (file rename being an atomic operation in HDFS). For queries, the contents of the temporary file are read by the execution engine directly from HDFS as part of the fetch call from the Driver (steps 7, 8 and 9).
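The idea of a plan as a DAG of stages can be illustrated with a small sketch. This is not Hive's execution engine: the stage names and the `run_plan` helper are invented for illustration, and the sketch assumes Python 3.9 or later for the standard `graphlib` module. It only shows that stages are run in an order that respects their dependencies.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: each stage maps to the set of stages it depends on.
# The names are made up; a real plan is produced by the compiler.
plan = {
    "mr_scan_left": set(),                          # map/reduce job
    "mr_scan_right": set(),                         # map/reduce job
    "mr_join": {"mr_scan_left", "mr_scan_right"},   # consumes the temporary files
    "move_to_table": {"mr_join"},                   # operation on HDFS
}

def run_plan(plan):
    """Return the stage names in an order that respects dependencies."""
    executed = []
    for stage in TopologicalSorter(plan).static_order():
        # A real engine would submit the stage to the appropriate component here.
        executed.append(stage)
    return executed

order = run_plan(plan)
```

The two scan stages have no dependency on each other, so their relative order is not fixed; only the constraint that a stage runs after the stages it reads from is guaranteed.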

Hive Data Model

Data in Hive is organized into:

Apart from primitive column types (integers, floating point numbers, generic strings, dates and booleans), Hive also supports arrays and maps. Additionally, users can compose their own types programmatically from any of the primitives, collections or other user-defined types. The typing system is closely tied to the SerDe (Serialization/Deserialization) and object inspector interfaces. Users can create their own types by implementing their own object inspectors, and using these object inspectors they can create their own SerDes to serialize and deserialize their data into HDFS files. These two interfaces provide the necessary hooks to extend the capabilities of Hive when it comes to understanding other data formats and richer types. Built-in object inspectors like ListObjectInspector, StructObjectInspector and MapObjectInspector provide the necessary primitives to compose richer types in an extensible manner. For maps (associative arrays) and arrays, useful built-in functions like size and index operators are provided. The dotted notation is used to navigate nested types, e.g. a.b.c = 1 looks at field c of field b of a and compares it with 1.
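The dotted navigation and the collection operators described above can be pictured with plain Python data structures. This is only an analogy: it does not use Hive's object inspector interfaces, and the `get_path` helper and the sample row are hypothetical.

```python
def get_path(record, path):
    """Follow a dotted path such as "a.b.c" through nested dicts."""
    value = record
    for field in path.split("."):
        value = value[field]
    return value

# Hypothetical row: `a` is a nested struct, `tags` an array, `props` a map.
row = {"a": {"b": {"c": 1}}, "tags": ["x", "y"], "props": {"k": "v"}}

matches = get_path(row, "a.b.c") == 1   # analogous to the predicate a.b.c = 1
num_tags = len(row["tags"])             # analogous to size on an array
first_tag = row["tags"][0]              # analogous to the array index operator
prop = row["props"]["k"]                # analogous to a map lookup by key
```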

Metastore

Motivation

The metastore provides two important but often overlooked features of a data warehouse: data abstraction and data discovery. Without the data abstractions provided in Hive, a user has to provide information about data formats, extractors and loaders along with the query. In Hive, this information is given during table creation and reused every time the table is referenced. This is very similar to traditional warehousing systems. The second feature, data discovery, enables users to discover and explore relevant and specific data in the warehouse. Other tools can be built using this metadata to expose and possibly enhance the information about the data and its availability. Hive accomplishes both of these features by providing a metadata repository that is tightly integrated with the Hive query processing system so that data and metadata are in sync.

Metadata Objects

Metastore Architecture

The metastore is an object store with a database-backed or file-backed store. The database-backed store is implemented using an ORM solution (JPOX). The prime motivation for storing this in a relational database is the queryability of metadata. Some disadvantages of using a separate data store for metadata instead of using HDFS are synchronization and scalability issues. Additionally, there is no clear way to implement an object store on top of HDFS due to the lack of random updates to files. This limitation, together with the queryability of a relational store, made our approach a sensible one. The metastore can be configured to be used in two ways: remote and embedded. In remote mode, the metastore is a Thrift service. This mode is useful for non-Java clients. In embedded mode, the Hive client connects directly to the underlying metastore using JDBC. This mode is useful because it avoids another system that needs to be maintained and monitored. Both of these modes can co-exist.

Metastore Interface

The metastore provides a Thrift interface to manipulate and query Hive metadata. Thrift provides bindings in many popular languages. Third-party tools can use this interface to integrate Hive metadata into other business metadata repositories.

Hive Query Language

HiveQL is an SQL-like query language for Hive. It mostly mimics SQL syntax for creation of tables, loading data into tables and querying the tables. HiveQL also allows users to embed their custom map-reduce scripts. These scripts can be written in any language using a simple row-based streaming interface: read rows from standard input and write out rows to standard output. This flexibility comes at the cost of a performance hit caused by converting rows to and from strings. However, we have seen that users do not mind this given that they can implement their scripts in the language of their choice. Another feature unique to HiveQL is multi-table insert. In this construct, users can perform multiple queries on the same input data using a single HiveQL query. Hive optimizes these queries to share the scan of the input data, thus increasing the throughput of these queries by several orders of magnitude. Further details are omitted here; for a more complete description of HiveQL, see the language manual.
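A script for the row-based streaming interface might look like the following sketch. It assumes that each input row arrives as one line with two tab-separated columns, and the transformation itself (emitting the key and the length of the value) is made up for illustration. The in-memory streams at the end stand in for standard input and standard output so the sketch can be run on its own.

```python
import io
import sys

def transform(lines):
    """Yield one output row per input row: the key and the length of the value."""
    for line in lines:
        # Assumption: exactly two tab-separated columns per row.
        key, value = line.rstrip("\n").split("\t", 1)
        yield f"{key}\t{len(value)}"

def main(stdin, stdout):
    """Read rows from stdin and write transformed rows to stdout."""
    for row in transform(stdin):
        stdout.write(row + "\n")

# Demonstration with in-memory streams; a script used with Hive
# would instead call main(sys.stdin, sys.stdout).
out = io.StringIO()
main(io.StringIO("a\thello\nb\thi\n"), out)
```

Every row crosses the process boundary as a string, which is where the conversion cost mentioned above comes from.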

Compiler

Optimizer

More plan transformations are performed by the optimizer. The optimizer is an evolving component. Currently, it is rule-based and performs column pruning and predicate pushdown. However, the infrastructure is in place, and there is work in progress to include other optimizations like map-side join. The optimizer can be enhanced to be cost-based. The sorted nature of output tables can also be preserved and used later on to generate better plans. The query can be performed on a small sample of data to guess the data distribution, which can be used to generate a better plan. The plan is a generic operator tree and can be easily manipulated.
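The two rules named above can be sketched on a toy operator tree. None of this mirrors Hive's optimizer classes: the `Scan`, `Filter` and `Project` nodes, the single-column predicate, and both rule functions are simplifying assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import Tuple, Union

@dataclass(frozen=True)
class Scan:
    table: str
    columns: Tuple[str, ...]

@dataclass(frozen=True)
class Filter:
    column: str          # toy predicate: refers to a single column
    child: "Node"

@dataclass(frozen=True)
class Project:
    columns: Tuple[str, ...]
    child: "Node"

Node = Union[Scan, Filter, Project]

def push_down_filter(node):
    """Rule: Filter(Project(x)) becomes Project(Filter(x))."""
    if isinstance(node, Scan):
        return node
    child = push_down_filter(node.child)
    if (isinstance(node, Filter) and isinstance(child, Project)
            and node.column in child.columns):
        # The projection only selects columns, so the filter column
        # is also available below it; keep pushing the filter down.
        pushed = push_down_filter(Filter(node.column, child.child))
        return Project(child.columns, pushed)
    return replace(node, child=child)

def prune_columns(node, required=None):
    """Rule: make each Scan read only the columns used above it."""
    if isinstance(node, Project):
        return Project(node.columns, prune_columns(node.child, set(node.columns)))
    if isinstance(node, Filter):
        needed = None if required is None else required | {node.column}
        return Filter(node.column, prune_columns(node.child, needed))
    if required is None:
        return node  # nothing above restricts the columns
    return Scan(node.table, tuple(c for c in node.columns if c in required))

# Hypothetical query: select user, ds from logs, then filter on ds.
plan = Filter("ds", Project(("user", "ds"),
                            Scan("logs", ("user", "ds", "url", "ip"))))
optimized = prune_columns(push_down_filter(plan))
```

After both rules run, the filter sits directly above the scan and the scan reads only `user` and `ds`. Because the plan is a plain tree of operators, each rule is a small recursive rewrite, which is the property the paragraph above refers to.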

Execution

Conclusion

Hive/Design (last edited 2009-09-20 23:53:56 by localhost)