The official documentation has moved to http://docs.couchdb.org — The transition is not 100% complete, but http://docs.couchdb.org should be seen as having the latest info. In some cases, the wiki still has some more or older info on certain topics inside CouchDB.

You need to be added to the ContributorsGroup to edit the wiki. But don't worry! Just email any Mailing List or grab us on IRC and let us know your user name.

Dump an Oracle Table to CouchDB using Python

Since CouchDB requires unique keys to access documents, importing data from an SQL database is an incitement to deep thought.

If you went to the trouble of creating a single unique key in your SQL database, you may be inclined to use it as your couchdb _id. Hard to go wrong there. However, for certain groups of documents (say a collection of blog posts), simply using an IsoFormattedDateAsDocId is pretty handy and can greatly simplify writing views in both Javascript and Python.

The first two examples expect a unique key to be extracted from the SQL database, the third example lets CouchDB create the unique key. Of the three, the first and last are the most useful. The second example is for static or "throwaway tables" that have very few values, but might be a pain to just hand-copy (ex. config data).

You'll need the python couchdb module from http://code.google.com/p/couchdb-python/ , plus whatever DBAPI2.0 compliant module you use to connect to your favorite SQL database. These examples use Oracle, but it should be trivial to convert.

Unique Values in SQL Table Converted to Individual Documents

Cro-Magnon simple python script which will take the contents of an SQL table and place it into CouchDB.

This particular example depends on the SQL table having one unique field. The values in the unique field become individual documents in the CouchDB database.

This is probably not what you want. :-)

   1 #!/usr/bin/env python
   2 import couchdb, cx_Oracle, ConfigParser, os, datetime
   3 
   4 # Convert a list of (sql tables, unique identifiers) into couchDB documents
   5 # Simply figure out which db you wish to import from, and a series of tables,
   6 # with unique fields you'd like to have them
   7 # indexed by, put them into the list of tuples below and let fly.
   8 # I store all my config stuff in an ini file locally.  you'll have to handle
   9 # your own connection to your db.
  10 # This is pretty quick-and-dirty and I don't recommend it for any sort of
  11 # production anything at all! :-)
  12 # Should work for any db api 2 compliant sql database.
  13 # Script guaranteed 100% slower than christmas.
  14 
  15 db_name = 'mydatabase'
  16 table_names = [ ('TABLE_NAME0', 'UNIQUE_FIELD'),
  17                 ('TABLE_NAME1', 'UNIQUE_FIELD'),
  18                 ('TABLE_NAME2', 'UNIQUE_FIELD') ]
  19 
  20 class GrabbyMitts(object):
  21     def __init__( self, db_name ):
  22         config = ConfigParser.ConfigParser()
  23         config.optionxform = str
  24         config.read( [ "sqlgen.ini", os.path.expanduser("~/.sqlgen.ini" ) ] )
  25 
  26         # oracle connection
  27         self.connection = cx_Oracle.Connection( config.get("Oracle", "login") )
  28 
  29         # couchdb location
  30         self.couch = couchdb.Server( "http://localhost:5984/" )
  31         try:
  32             self.db = self.couch.create( db_name )
  33         except:
  34             self.db = self.couch[ db_name ]
  35 
  36     def description( self ):
  37         # get a description of a given table
  38         # returns the "header" information in list
  39         query = "select * from %s where 1=0"%self.table_name
  40         cursor = cx_Oracle.Cursor( self.connection )
  41         cursor.execute( query )
  42         description = [ i[0] for i in cursor.description ]
  43         cursor.close()
  44         return description
  45 
  46     def uniques( self ):
  47         # unique value in sql table to create couchdb document ID
  48         cursor = cx_Oracle.Cursor( self.connection )
  49         query = "select %s from %s"%( self.mykey, self.table_name )
  50         cursor.execute( query )
  51         myuniques = [ i[0] for i in cursor.fetchall() if i[0] ]
  52         cursor.close()
  53         return myuniques
  54 
  55     def updateCouch( self, table_and_key ):
  56         # populate or update couchdb documents using sql table and unique
  57         # identifier
  58         self.table_name, self.mykey = table_and_key
  59         cursor = cx_Oracle.Cursor( self.connection )
  60         documents = []
  61         header = self.description()
  62 
  63         query = """
  64             select %s
  65             from %s
  66             where %s=:myunique"""""%( ", ".join( header ),
  67                                     self.table_name,
  68                                     self.mykey )
  69         cursor.prepare( query )
  70 
  71         for myunique in [ { "myunique": i } for i in self.uniques() ]:
  72             cursor.execute( None, myunique )
  73             entry = dict( zip( header, cursor.fetchone() ) )
  74             # mop up datetime objects as they occur, since json cries foul.
  75             # will probably need to convert to unix epochal time
  76             for k, v in entry.items():
  77                 if isinstance( v, datetime.datetime ):
  78                     entry[ k ] = str( v )
  79 
  80             if str( myunique[ 'myunique' ] ) not in self.db:
  81                 self.db[ str( myunique[ 'myunique' ] ) ] = entry
  82             else:
  83                 doc = self.db[ str( myunique[ 'myunique' ] ) ]
  84                 doc.update( entry )
  85                 self.db[ str( myunique[ 'myunique' ] ) ] = doc
  86 
  87         cursor.close()
  88 
  89 if __name__ == "__main__":
  90     gm = GrabbyMitts( db_name )
  91     for table_name in table_names:
  92         gm.updateCouch( table_name )

Unique Values in SQL Table Converted to a Single Document

Differs from above in that it places all table data into a single document.

select username, shoe_size, nostril_count, owns_weather_ballon from humans;

username | shoe_size | nostril_count | owns_weather_balloon

cletus        | 10            | 3                    | y

becomes:

   1 db[ 'humans' ]{'username': 'cletus','shoe_size': 10,'nostril_count': 3,'owns_weather_balloon': 'y' }

This probably isn't what you want either! Script does no checking to make sure that your particular value is unique.

   1 # HEY! THIS VERSION CAN MAKE A MOTHER-HUGE DOCUMENT! KNOW WHAT YOU ARE DOING!!!
   2 # Convert a list of (sql tables, unique identifiers) into a single couchDB document
   3 # Switch out with updateCouch() above
   4 
   5     def updateCouch( self, table_and_key ):
   6         # populate or update couchdb documents using sql table and unique
   7         # identifier
   8         # HEY! THIS CAN MAKE A MOTHER-HUGE DOCUMENT! KNOW WHAT YOU ARE DOING!!!
   9         self.table_name, self.mykey = table_and_key
  10 
  11         cursor = cx_Oracle.Cursor( self.connection )
  12         documents = []
  13         header = self.description()
  14 
  15         query = """
  16             select %s
  17             from %s
  18             order by %s"""""%( ", ".join( header ),
  19                                     self.table_name,
  20                                     self.mykey )
  21 
  22         cursor.execute( query )
  23         results = dict(
  24                 str( row[0] ), dict( zip( header, row ) )
  25                   for row in cursor.fetchall()
  26                 )
  27 
  28         # clean up any datetime fields
  29         for row in results:
  30             for field, value in results[ row ].items():
  31                 if isinstance( value, datetime.datetime ):
  32                     results[ row ][ field ] = str( value )
  33         self.db[ self.table_name ] = results
  34 
  35         cursor.close()

All Values from SQL Table Imported, Let CouchDB Assign Keys

This is most likely what you want!

   1 #!/usr/bin/env python
   2 # -*- coding: utf-8 -*-
   3 # You can get this from https://github.com/lysdexia/Oracle2CouchDB
   4 #################################################################
   5 # USAGE: Oracle2CouchDB
   6 # Edit the o2c.ini file, in the [oracle] section, supply your oracle
   7 # authentication values, the oracle table to be cloned and an optional
   8 # larger cursor_arraysize value. (A larger arraysize will often be 
   9 # advantageous when using fetchmany())
  10 # supply corresponding data for couchdb in the [couch] section
  11 #################################################################
  12 
  13 import cx_Oracle, datetime, sys, types, urllib2, ConfigParser, types
  14 from subprocess import Popen, PIPE
  15 import couchdb, couchdb.design
  16 couchdb.json.use('cjson')
  17 
  18 class OraSuck(object):
  19     def get_connection(self, db):
  20         self.connection = cx_Oracle.Connection(tns)
  21         self.cursor = cx_Oracle.Cursor(self.connection)
  22 
  23     def header(self, table_name):
  24         self.cursor.execute("select * from %s where 1=0"%table_name)
  25         return [i[0].lower() for i in self.cursor.description]
  26 
  27     def sample(self):
  28         rows = []
  29         hdr = self.header(oratable)
  30         cmd = """select %s from %s"""%(", ".join(hdr), oratable)
  31 
  32         # if we have no specified arraysize, we'll use the fetchmany() default
  33         # of fifty records.
  34         if cursor_arraysize:
  35             self.cursor.arraysize = int(cursor_arraysize)
  36         self.cursor.prepare(cmd)
  37         self.cursor.execute(cmd)
  38         while True:
  39             rows = [dict(zip(hdr, r))
  40                     for r in self.cursor.fetchmany()]
  41             for row in rows:
  42                 for key in row:
  43                     # try to convert DATE to standard ISO format
  44                     if hasattr(row[key], "isoformat"):
  45                         row[key] = row[key].isoformat()
  46             yield rows
  47 
  48 class CouchBlow(object):
  49 
  50     def connect(self):
  51         couch = couchdb.Server(couch_server)
  52 
  53         if not all([username, password]):
  54             sys.exit("You'll need a username and password")
  55 
  56         couch.resource.credentials = (username, password)
  57         try:
  58             db = couch[oratable]
  59         except:
  60             db = couch.create(oratable)
  61         return db
  62 
  63     # use ora.sample() generator to load db
  64     def load(self):
  65         get_rows = ora.sample()
  66         while get_rows:
  67             rows = get_rows.next()
  68             for row in rows:
  69                 try:
  70                     self.db.save(row)
  71                     print (row["searchtime"])
  72                 except couchdb.http.ResourceConflict, error:
  73                     err = "%s: %s"%(error.message)
  74                     print ("%s: %s"%(row["searchtime"], err))
  75 
  76 if __name__ == "__main__":
  77     config = ConfigParser.ConfigParser()
  78     config.optionxform = str
  79     config.read("o2c.ini")
  80 
  81     couch_server = config.get("couch", "couch_server")
  82     username = config.get("couch", "username")
  83     password = config.get("couch", "password")
  84 
  85     oratable = config.get("oracle", "oratable")
  86     oradb = config.get("oracle", "oradb")
  87     tns = "@".join([config.get("oracle", "tns"), oradb])
  88     cursor_arraysize = config.get("oracle", "cursor_arraysize")
  89 
  90     cb = CouchBlow()
  91     cb.db = cb.connect()
  92 
  93     ora = OraSuck()
  94     ora.get_connection(oradb)
  95 
  96     cb.load()

Example .ini file

   1 [oracle]
   2 tns = scott/tiger
   3 oradb = mydb
   4 oratable = mytable
   5 cursor_arraysize = 256
   6 
   7 [couch]
   8 couch_server = http://my.couchdb:5984
   9 username = dragon_lady
  10 password = what part of get thee gone do you not understand

DumpOracleDbToCouchDbPython (last edited 2011-11-23 15:15:28 by JanLehnardt)