Tuesday, April 02, 2013

Solved: MongoDB - How to add Binary data to a document using pymongo

I want to append data to a MongoDB document. My data consists of pairs of doubles, and I push each pair onto an array field. This makes a good example of how to add binary data to a MongoDB document using pymongo. My document looks like this (assume I have already created it):

$ mongo
mongos> use kevintest1
switched to db kevintest1
mongos> show collections
metricValue
system.indexes
mongos> db.metricValue.find()
{ "_id" : ObjectId("5159f82f64524c06f5cb1208"), "mtid" : 2, "seqno" : 0, "vals" : [ ] }
mongos> mtid2 = db.metricValue.findOne({ "mtid" : 2})
{
    "_id" : ObjectId("5159f82f64524c06f5cb1208"),
    "mtid" : 2,
    "seqno" : 0,
    "vals" : [ ]
}
mongos> mtid2
{
    "_id" : ObjectId("5159f82f64524c06f5cb1208"),
    "mtid" : 2,
    "seqno" : 0,
    "vals" : [ ]
}
mongos> mtid2.bsonsize()
Tue Apr  2 17:23:57 TypeError: mtid2.bsonsize is not a function (shell):1
mongos> Object.bsonsize(mtid2)
54
mongos> Object.bsonsize(db.metricValue.findOne({ "mtid" : 2}))
54

NOW, I take a break, and in a different window, I run the following program:
#!/usr/bin/python
# Written for Python 2 and pymongo 2.x (pymongo.Connection, Collection.update).
import struct
from pprint import pformat

import pymongo
from bson import Binary

conn = pymongo.Connection('myboxname', 7333)
db = conn['kevintest1']
mvc = db.metricValue
print "have mvc.find: %s" % ([x for x in mvc.find()])

# Pack two doubles into 16 raw bytes (native byte order).
mybuffer = struct.pack("dd", 65535.7, 65535.8)
print "binary info to add:  %s" % (pformat(struct.unpack("dd", mybuffer)))

# Append the packed pair to the 'vals' array; w=1 waits for acknowledgement.
retval = mvc.update({ 'mtid': 2, 'seqno': 0}, { '$push': { 'vals' : Binary(mybuffer) }}, w=1)
print "retval of update: %s" % (retval)
print "After update, have mvc.find: %s" % ([x for x in mvc.find()])
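
For readers on Python 3 and a recent pymongo, where `Connection` and `Collection.update` are no longer available, the same push could be sketched roughly as follows. This is only a sketch under assumptions: `pack_pair` and `push_pair` are hypothetical helper names, not part of pymongo, and the explicit `"<dd"` format assumes little-endian storage is wanted (the original `"dd"` uses the machine's native order, which gives the same bytes on x86).

```python
import struct


def pack_pair(a, b):
    """Pack two floats into 16 bytes as little-endian IEEE 754 doubles."""
    return struct.pack("<dd", a, b)


def push_pair(collection, mtid, seqno, a, b):
    """Append one packed pair to the 'vals' array of the matching document.

    `collection` is expected to be a pymongo Collection object.
    """
    # Imported here so that pack_pair() stays usable without pymongo installed.
    from bson import Binary

    result = collection.update_one(
        {"mtid": mtid, "seqno": seqno},
        {"$push": {"vals": Binary(pack_pair(a, b))}},
    )
    # Number of documents the server reports as modified (0 or 1 here).
    return result.modified_count
```

With a collection obtained as `pymongo.MongoClient('myboxname', 7333)['kevintest1']['metricValue']`, calling `push_pair(mvc, 2, 0, 65535.7, 65535.8)` would be the counterpart of the `update` call in the script.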
 
So I run the program. Then I go back to the mongos window and, after each run, check the BSON size:

mongos> db.metricValue.findOne({ "mtid" : 2})
{
    "_id" : ObjectId("5159f82f64524c06f5cb1208"),
    "mtid" : 2,
    "seqno" : 0,
    "vals" : [
        BinData(0,"ZmZmZvb/70CamZmZ+f/vQA==")
    ]
} 
mongos> Object.bsonsize(db.metricValue.findOne({ "mtid" : 2}))
78
mongos> Object.bsonsize(db.metricValue.findOne({ "mtid" : 2}))
102
mongos> Object.bsonsize(db.metricValue.findOne({ "mtid" : 2}))
126
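
As a sanity check, the base64 text inside `BinData(0, ...)` can be decoded back into the two doubles. The following is a minimal Python 3 sketch; `decode_pair` is a hypothetical helper name, and the `"<dd"` format assumes the script that wrote the data ran on a little-endian machine.

```python
import base64
import struct

# Base64 payload copied from the BinData(0, ...) value in the shell output.
ENCODED = "ZmZmZvb/70CamZmZ+f/vQA=="


def decode_pair(b64_text):
    """Return the two doubles stored in a base64-encoded 16-byte payload."""
    raw = base64.b64decode(b64_text)
    # "<dd" = two little-endian doubles, 8 bytes each.
    return struct.unpack("<dd", raw)


pair = decode_pair(ENCODED)
print(pair)
```

The decoded tuple should be the original pair, 65535.7 and 65535.8.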
 
There you have it: the document grows by 24 bytes for each inserted pair of doubles (54, 78, 102, 126). Without packing, a pair takes 30 bytes, so the packed form needs 24/30 = 4/5 of the space, a saving of 20%.
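
The 24 and 30 byte figures can be reproduced by hand from the BSON element layout (one type byte, a NUL-terminated key, then the value). The sketch below does that arithmetic in Python 3. It rests on two assumptions: the unpacked pair is stored as a nested array of two doubles, and `mtid` and `seqno` are stored as 32-bit integers (which is what makes the empty document come out at 54 bytes). All function names are illustrative only.

```python
# Payload sizes in bytes for the BSON value types used here.
DOUBLE = 8
INT32 = 4
OBJECT_ID = 12


def element_size(key, payload_size):
    # One type byte + key as UTF-8 with a NUL terminator + the value payload.
    return 1 + len(key.encode("utf-8")) + 1 + payload_size


def document_size(element_sizes):
    # int32 total length + all elements + one trailing NUL byte.
    # Arrays use the same layout, with "0", "1", ... as keys.
    return 4 + sum(element_sizes) + 1


def binary_payload(n_bytes):
    # int32 data length + one subtype byte + the data itself.
    return 4 + 1 + n_bytes


def packed_entry(index):
    # One array element holding 16 bytes of packed doubles.
    return element_size(str(index), binary_payload(16))


def unpacked_entry(index):
    # One array element holding a nested array of two doubles (assumed layout).
    inner = document_size([element_size("0", DOUBLE), element_size("1", DOUBLE)])
    return element_size(str(index), inner)


def doc_size(n_entries):
    # Whole-document size after n_entries packed pairs have been pushed.
    vals = document_size([packed_entry(i) for i in range(n_entries)])
    return document_size([
        element_size("_id", OBJECT_ID),
        element_size("mtid", INT32),
        element_size("seqno", INT32),
        element_size("vals", vals),
    ])


print(packed_entry(0), unpacked_entry(0))
print([doc_size(n) for n in range(4)])
```

Under these assumptions the per-entry sizes come out as 24 and 30, and the document sizes match the 54, 78, 102, 126 sequence from the shell. Because array keys are the decimal index as a string, entries from index 10 onward cost one extra byte each in both layouts.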
