Wednesday, August 13, 2014

Enabling connection to PostgreSql 9.3 from remote machines

To enable connections from a remote machine in PostgreSql 9.3 you only need to update the pg_hba.conf file.

Add a new line to the file as following:

host      all       all         XXX   255.255.255.255  trust

XXX = the ip address of the computer that your application is running on.  If your ip address is say 192.168.123.250 it is ok to enter it in pg_hba.conf as 192.168.123.0.  The .0 acts a wildcard in case your
computer is assigned a new ip address at a later time (hopefully it will be in the same range as the original...)

As a side note a message posted on postgres.org had mentioned about the need to update tcpip_socket = false to true in postgresql.conf.
http://www.postgresql.org/message-id/1087704118.23312.35.camel@localhost.localdomain

However I did not find such as entry in the postgresql.conf for PostgreSql 9.3; further more did not have the need for this to enable remote access in PostgreSql 9.3

Tuesday, August 12, 2014

Using pymssql's executemany to bulk insert

pymssql allows multi inserts using the executemany method.

executemany expects 2 params the insert statement and a list of tuples with the values to be inserted.

The insert statement should take the following form:

insertStr = "INSERT INTO table_name (col1,col2,col3) VALUES ($d, $s, $s) 

the string after the VALUES keyword represents the types of the respective columns

The list of tuples can take the following form:

valuesArray = [ (1,'a1','b1'),(2,'a2','b2'),(3,'a3','b3'),(4,'a4','b4'),(5,'a5','b5')]

So now these two parameters may be used as follows:

cursor.executemany(insertStr,valuesArray)

The example below shows a realistic example of this method may be used. Pay special attention to the dynamic generation of the list of tuples.

import psycopg2

valuesArray = []
                
for row in rows:
 values = []
 for i in range(len(row)):
   try:
    values.append(row[i].split(":")[1][0:])                          
                            
   except Exception as e:
    #return error message
    print "Error:" + str(e)

 #convert the list with the rows values to a tuple and add it to the list of lists
 valuesArray.append(tuple(values))

insertStr = "INSERT INTO table_name (col1,col2,col3) VALUES ($d, $s, $s) 

try:

 #establish connection with source DB
 con = pymssql.connect('sqlhost', 'sqluid', 'sqlpwd',  'sqldb')

 with con:
  #Execute insert statement
  cur = con.cursor()
  cur.executemany(insertStr,valuesArray)
  con.commit()

except Exception as e:
  #return error message
  print "Error:" + str(e)

finally:
  #Gracefully end connection        
  if con:
   con.close()



Thursday, August 7, 2014

How to copy dict object by value and not reference with Python

In Python when a copy of a dict object is made, the copy is made via reference.

For example

a = {"key1":"value1"}
b = a
a["key1"] = "value2"
print a
>> {"key1":"value2"}
print b
>> {"key1":"value2"}

However if we need to copy by reference we can use the:
copy.copy(x)
Return a shallow copy of x.
copy.deepcopy(x)
Return a deep copy of x.

 Methods from the library "copy".

The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or class instances):
  • shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
  • deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.
The example above could now be rewritten like below:
 
import copy

a = {"key1":"value1"}
b = copy.deepcopy(a)
a["key1"] = "value2"
print a
>> {"key1":"value2"}
print b
>> {"key1":"value1"}

Monday, August 4, 2014

Using Pymongo Update to perform an Insert and Update

Was working on a module that would iterate through a row list that came back from a query and attempt to insert or update the specific columns of the returned rows to a collection on Mongo. Originally I was planning to use count({"id":row[id]}) > 0 to check for the existence of the given record in Mongo, and then execute a insert of push as needed. However I stumbled the update method which is part of pymongos collection level operators that allows me to execute all three actions, check for existence, insert or push, with one call.

So my original code to insert or update was as shown below:
 #check if row exist
 if coll.count({"MemberId": row["idclient"]}) > 0:
 #Insert row into the Collection
 coll.insert(details) else:
 #update the existing row
 coll.push({"MemberId": str(row["idclient"])}, { $push: { "field1": str(row["field1"]), "field2": str(row["field2"])} })
 As observable from above the push call requires us to decompose the details dictionary object into its elements to update.
However with the push I can skip these three lines and execute the same operation with one call: coll.update(key, details, True)
 I have to set up what key to use and also have to set the upsert flag to be true, then an update() (upsert) operation is performed and any existing document with that "_id" is overwritten. Otherwise an insert() operation is performed.
 See below for the link on stack overflow that provided the motivation for this and the documentation on MongoDB.org that explain the workings of the update method.

http://stackoverflow.com/questions/2801008/mongodb-insert-if-not-exists

http://api.mongodb.org/python/current/api/pymongo/collection.html