Index a DB table directly into Solr

Step 1 : Edit your solrconfig.xml to add the request handler

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
  <str name="config">data-config.xml</str>
</lst>
</requestHandler>

Step 2 : Create a data-config.xml file as follows and save it to the conf dir

<dataConfig>
  <dataSource type="JdbcDataSource" 
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/dbname" 
              user="user-name" 
              password="password"/>
  <document>
    <entity name="id" 
            query="select id,name,desc from mytable">
    </entity>
  </document>
</dataConfig>

Step 3 : Ensure that your solr schema (schema.xml) has the fields 'id', 'name', 'desc'. Change the appropriate details in the data-config.xml

Step 4: Drop your JDBC driver jar file into the <solr-home>/lib directory .

Step 5 : Run the command http://solr-host:port/solr/dataimport?command=full-import. Keep in mind that every time a full-import is executed the index is cleaned up. If you do not wish that to happen add clean=false. For example: http://solr-host:port/solr/dataimport?command=full-import&clean=false

Index the fields in different names

Step: 1 Change the data-config as follows :

<dataConfig>
  <dataSource type="JdbcDataSource" 
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/dbname" 
              user="user-name" 
              password="password"/>
  <document>
    <entity name="id" 
            query="select id,name,desc from mytable">
       <field column="id" name="solr_id"/>
       <field column="name" name="solr_name"/>
       <field column="desc" name="solr_desc"/>
    </entity>
  </document>
</dataConfig>

Step 2 : This time the fields will be written to the solr fields 'solr_id', 'solr_name', solr_desc'. You must have these fields in the schema.xml. Step 3 : Run the command http://solr-host:port/dataimpor?command=full-import

Index data from multiple tables into Solr

Step: 1 Change the data-config as follows :

<dataConfig>
  <dataSource type="JdbcDataSource" 
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/dbname" 
              user="user-name" 
              password="password"/>
  <document>
    <entity name="outer" 
            query="select id,name,desc from mytable">
       <field column="id" name="solr_id"/>
       <field column="name" name="solr_name"/>
       <field column="desc" name="solr_desc"/>
       <entity name="inner"
               query="select details from another_table where id ='${outer.id}'">
              <field column="details" name="solr_details"/> 
       </entity>
    </entity>
  </document>
</dataConfig>

Step 2: The schema.xml should have the solr_details field

Step 3: Run the full-import command

DIHQuickStart (last edited 2010-03-04 20:24:26 by DuncanIrvine)