Tuesday, August 12, 2014

Using pymssql's executemany to bulk insert

pymssql allows multi inserts using the executemany method.

executemany expects 2 params the insert statement and a list of tuples with the values to be inserted.

The insert statement should take the following form:

insertStr = "INSERT INTO table_name (col1,col2,col3) VALUES ($d, $s, $s) 

the string after the VALUES keyword represents the types of the respective columns

The list of tuples can take the following form:

valuesArray = [ (1,'a1','b1'),(2,'a2','b2'),(3,'a3','b3'),(4,'a4','b4'),(5,'a5','b5')]

So now these two parameters may be used as follows:

cursor.executemany(insertStr,valuesArray)

The example below shows a realistic example of this method may be used. Pay special attention to the dynamic generation of the list of tuples.

import psycopg2

valuesArray = []
                
for row in rows:
 values = []
 for i in range(len(row)):
   try:
    values.append(row[i].split(":")[1][0:])                          
                            
   except Exception as e:
    #return error message
    print "Error:" + str(e)

 #convert the list with the rows values to a tuple and add it to the list of lists
 valuesArray.append(tuple(values))

insertStr = "INSERT INTO table_name (col1,col2,col3) VALUES ($d, $s, $s) 

try:

 #establish connection with source DB
 con = pymssql.connect('sqlhost', 'sqluid', 'sqlpwd',  'sqldb')

 with con:
  #Execute insert statement
  cur = con.cursor()
  cur.executemany(insertStr,valuesArray)
  con.commit()

except Exception as e:
  #return error message
  print "Error:" + str(e)

finally:
  #Gracefully end connection        
  if con:
   con.close()



No comments:

Post a Comment