Blur Data Model

Blur is a table-based query system. A single shard cluster can hold many tables, each with its own schema, shard size, analyzers, and so on. Each table contains Rows. A Row contains a row id (a Lucene StringField internally) and many Records. A Record has a record id (a Lucene StringField internally), a family (a Lucene StringField internally), and many Columns. A Column contains a name and a value; both are Strings in the API, but the value can be interpreted as different types. All base Lucene Field types are supported: Text, String, Long, Int, Double, and Float.

The sections below start with the most basic structure and build on it.

Columns

Columns contain a name and a value. Both are strings in the API, but the value can be interpreted as an Integer, Float, Long, Double, String, or Text. Every Column defaults to the Text type and is analyzed during the indexing process.

  Column {"name" => "value"}
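
The idea that a value travels as a string but is interpreted by type can be sketched in Python. This is only an illustration under stated assumptions: the `Column` class, the `CONVERTERS` table, and the `interpret` helper are hypothetical names, not Blur's client API, and the mapping of type names to Python converters is a rough stand-in for the Lucene field types listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for a Blur Column: name and value are both strings."""
    name: str
    value: str


# Assumed mapping from a field type name to a Python converter.
# Text and String both stay as str here; analysis is not modeled.
CONVERTERS = {
    "text": str,
    "string": str,
    "int": int,
    "long": int,
    "float": float,
    "double": float,
}


def interpret(column, field_type="text"):
    """Interpret the string value as the given type (defaults to text)."""
    return CONVERTERS[field_type](column.value)


age = Column("age", "42")
score = Column("score", "1.5")
```

With these definitions, `interpret(age, "int")` yields an integer, while `interpret(age)` leaves the value as the original string because the default type is text.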

Records

A Record contains a record id, a family, and one or more Columns.

  Record {
    "recordId" => "1234",
    "family" => "family1",
    "columns" => [
      Column {"column1" => "value1"}
      Column {"column2" => "value2"}
      Column {"column2" => "value3"}
      Column {"column3" => "value4"}
    ]
  }

NOTE: Column names do not have to be unique within a Record, so multiple Columns with the same name can be treated as an array of values. The order of the values is maintained.
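
A minimal Python sketch of the repeated-name behavior follows. The `Column` and `Record` classes and the `values_for` helper are hypothetical and exist only for this illustration; they are not Blur's client classes. Columns are kept in a list rather than a dictionary so that duplicate names and their order survive.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Column:
    """Hypothetical stand-in for a Blur Column."""
    name: str
    value: str


@dataclass
class Record:
    """Hypothetical stand-in for a Blur Record."""
    record_id: str
    family: str
    # A list, not a dict: names may repeat and order must be kept.
    columns: List[Column] = field(default_factory=list)


def values_for(record, name):
    """Return every value stored under `name`, in insertion order."""
    return [c.value for c in record.columns if c.name == name]


record = Record(
    record_id="1234",
    family="family1",
    columns=[
        Column("column1", "value1"),
        Column("column2", "value2"),
        Column("column2", "value3"),
        Column("column3", "value4"),
    ],
)
```

Here `values_for(record, "column2")` collects both values stored under the repeated name, in the order they were added.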

Rows

Rows contain a row id and a list of Records.

  Row {
    "id" => "r-5678",
    "records" => [
      Record {
        "recordId" => "1234",
        "family" => "family1",
        "columns" => [
          Column {"column1" => "value1"}
          Column {"column2" => "value2"}
          Column {"column2" => "value3"}
          Column {"column3" => "value4"}
        ]
      },
      Record {
        "recordId" => "9012",
        "family" => "family1",
        "columns" => [
          Column {"column1" => "value1"}
        ]
      },
      Record {
        "recordId" => "4321",
        "family" => "family2",
        "columns" => [
          Column {"column16" => "value1"}
        ]
      }
    ]
  }
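
The Row above can be mirrored in Python to show how Records sharing a family sit side by side under one row id. This is a sketch under assumptions: `Record`, `Row`, and `record_ids_by_family` are hypothetical names for this illustration, and columns are reduced to plain (name, value) tuples for brevity.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Record:
    """Hypothetical stand-in for a Blur Record; columns are (name, value) pairs."""
    record_id: str
    family: str
    columns: List[Tuple[str, str]] = field(default_factory=list)


@dataclass
class Row:
    """Hypothetical stand-in for a Blur Row: a row id plus a list of Records."""
    id: str
    records: List[Record] = field(default_factory=list)


def record_ids_by_family(row):
    """Group the record ids of a Row by family, keeping record order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for rec in row.records:
        grouped[rec.family].append(rec.record_id)
    return dict(grouped)


row = Row(
    id="r-5678",
    records=[
        Record("1234", "family1", [("column1", "value1"), ("column2", "value2"),
                                   ("column2", "value3"), ("column3", "value4")]),
        Record("9012", "family1", [("column1", "value1")]),
        Record("4321", "family2", [("column16", "value1")]),
    ],
)
```

Calling `record_ids_by_family(row)` groups the two `family1` Records together and leaves the single `family2` Record on its own.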

RowsAndRecords (last edited 2013-06-10 15:17:30 by AaronMcCurry)