Friday, December 13, 2013

Python version of iostat.c

Ok, so I have a problem. I'm trying to create metrics for a Linux system, and insert them into a local database. Constraints:
  • Runs on lots of machines;
  • machines may be heavily loaded;
  • Sometimes the kick-off time is delayed beyond a 1-minute standard (typically due to load);
  • I don't like having long-running subprocesses: they might stop/fail and I'd have to restart, they use memory, etc.;
  • I'm doing this in Python;
  • I can store state between runs in a pickle file;
  • I want to replicate the existing fields coming out of vmstat -s and iostat -x -D n (where n is number of seconds of sample size);
  • I want the values of these fields to match likewise.
So, I need to replicate iostat.c in python. I can get absolute numbers from iostat and vmstat and stat, and do the math myself, storing state from the last time I ran it and subtracting to get a diff.

Problem 1: Where is the source code for iostat.c? In Ubuntu at least (hoping RHEL / CentOS is similar) it turns out it's in the sysstat package. I found the sysstat source at: http://freecode.com/projects/sysstat.

Inside this package, there's source code for iostat as a file named iostat.c. I have yet to find it online, so here it is:
General plan:
  1. Read current values from /proc/stat, /proc/diskstats, and vmstat -s
  2. read values from previous run from disk
  3. find diffs
  4. use diffs to compute values needed
  5. write current values to disk for next time.
This will involve significant coding, don't know if I'll have space to post it all here...
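
As a rough illustration of the plan above, here is a minimal sketch of the read / diff / save cycle for /proc/diskstats only. The function names, the choice of counters, and the pickle layout are my own assumptions for illustration, not anything taken from iostat.c, and counter wrap-around is not handled. Dividing by the real elapsed time (rather than assuming 60 seconds) is what keeps the rates sane when the kick-off is delayed.

```python
import pickle

# Field positions in a /proc/diskstats line (after splitting on whitespace).
# Only a few of the counters are used in this sketch.
FIELDS = {"reads": 3, "sectors_read": 5, "writes": 7, "sectors_written": 9}


def parse_diskstats(text):
    """Return {device: {counter_name: value}} from /proc/diskstats text."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 14:
            continue  # skip short or malformed lines
        stats[parts[2]] = {name: int(parts[idx]) for name, idx in FIELDS.items()}
    return stats


def rates(prev, curr, elapsed):
    """Per-second rates from two samples taken `elapsed` seconds apart."""
    out = {}
    for dev, counters in curr.items():
        if dev not in prev:
            continue  # device appeared since the last run
        out[dev] = {name: (value - prev[dev][name]) / float(elapsed)
                    for name, value in counters.items()}
    return out


def load_state(path):
    """Read the previous (timestamp, stats) sample, or None on the first run."""
    try:
        with open(path, "rb") as fh:
            return pickle.load(fh)
    except (IOError, OSError, EOFError):
        return None


def save_state(path, timestamp, stats):
    """Write the current sample to disk for the next run."""
    with open(path, "wb") as fh:
        pickle.dump((timestamp, stats), fh)
```

A run would read /proc/diskstats, call load_state(), compute rates() against the previous sample using the difference of the two timestamps, and finish with save_state().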


Sunday, October 27, 2013

SOLVED: Ubuntu installation of Canon MX452 Inkjet Printer

Unlike many printers and Linux, this install went simply.
  • Unbox printer.
  • remove orange packing tape.
  • unbox power cords and usb cable, install.
  • open front pull-down cover, and one under it - push down on grey loops gently and put in inkjet cartridges.
  • Put paper in at the bottom; it only holds 50 sheets or so.
  • turn on, wait.
  • open terminal window, type sudo ls and enter pw.
  • open browser, get download from http://support-sg.canon-asia.com/contents/SG/EN/0100515301.html
  • in terminal, cd ~/Downloads
  • tar -xzvf cnijfilter*
  • cd cnijfilter*
  • On printer, turn off and on again just in case.
  • sudo ./install.sh
  • follow prompts accepting defaults.
  • in browser, open google.com and print open page as test. Should hear printer working.
Done.

Tuesday, October 22, 2013

Optimizing Python - getting data out of memcache with struct.unpack

So, I have this Memcache data store that holds timestamps and values from a monitoring application. Since each memcache key corresponds to an hour's data, I only need to store 2 bytes for the number of seconds past the hour. I don't care about duplicate data being stored, but on retrieval I'd like to eliminate it if it exists.

Input data is: (ts, val), (ts, val), ... encoded using Python's struct.pack function. The ts (timestamp) is (as noted) packed with format h (a 2-byte signed short; seconds past the hour never exceed 3599, so it fits). The val (value) is a 4-byte floating point number, packed with format f.

The original version of the decoder was:

    def OLD_rawDataToTsVals(self, timeOffset, raw):
        tsVals = []
        while raw:
            ts, val, raw = raw[:2], raw[2:6], raw[6:]
            ts = timeOffset + struct.unpack('h', ts)[0]
            val = struct.unpack('f', val)[0]
            tsVals.append((ts, val))
        return tsVals
I found this version:
  • ran really slowly;
  • didn't eliminate duplicate values;
  • would really choke the longer the input data (as in 33,000 datapoints in an hour).

For the second version, I knew I had to stop with the copying of the data string over and over again, which I knew was eating major cycles.

Doing some math, I figured out I could iterate over the string, extracting each element and converting the two parts to python numbers.

    def rawDataToTsVals(self, timeOffset, raw):
        tsVals = []
        for i in range(0, len(raw), 6):
            rawtime = raw[i:i+2]
            ts = timeOffset + struct.unpack('h', rawtime)[0]
            val = struct.unpack('f', raw[i+2:i+6])[0]
            tsVals.append((ts, val))
        return tsVals

This was better timewise, but didn't remove duplicate data. I made the 'seen it yet' test occur even before the conversion to (int, float), which saved a bit of time doing useless conversions.

    def rawDataToTsVals(self, timeOffset, raw):
        tsVals = []
        seenTimes = set()
        for i in range(0, len(raw), 6):
            rawtime = raw[i:i+2]
            if rawtime in seenTimes:
                continue
            seenTimes.add(rawtime)
            ts = timeOffset + struct.unpack('h', rawtime)[0]
            val = struct.unpack('f', raw[i+2:i+6])[0]
            tsVals.append((ts, val))
        return tsVals

Yet, it was STILL TOO SLOW. Where was the time going? I timed the various parts and found the slow bit was the conversion to int/float. That unpack was happening a lot and the time added up.

I tried the following but FAILED.

        # BAD DON'T USE ** BAD DON'T USE **
        elems  = rawLen / 6.0  # 6 bytes per - 2=time + 4=data.
        intElems = int(elems)
        if (elems != intElems):
            self.log.warning("elems non-integer: len: %s" % (rawLen))
            return []
        unp = struct.unpack("hf"*intElems, raw)
        # BAD DON'T USE ** BAD DON'T USE **

The above fails because struct's default (native) mode pads each h so that the following f starts on a 4-byte boundary. The format "hf" therefore describes 8 bytes per element, not the 6 bytes that were actually stored, and unpack rejects the string. The data would have to be laid out as (int, zeroes, float) to match.

But, I couldn't give up, this had to work better. So, I extract all the ints, string those together and unpack them, then do the same thing with the floats.

HERE IS THE FINAL VERSION:

    def rawDataToTsVals(self, timeOffset, raw):
        tsVals = []
        seenTimes = set()
        rawLen = len(raw)   # computed outside the try so the except clause can log it
        try:
            times = ""
            vals  = ""
            for i in range(0, rawLen, 6):
                rawtime = raw[i:i+2]
                if rawtime in seenTimes:
                    continue
                seenTimes.add(rawtime)   # remember this timestamp so duplicates are skipped
                times += rawtime
                vals  += raw[i+2:i+6]
            # one unpack call per type; // keeps the repeat count an integer
            timesList = struct.unpack('h'*(len(times)//2), times)
            valsList  = struct.unpack('f'*(len(vals)//4),  vals)
            assert len(timesList) == len(valsList), "Lens of times and vals unequal, t=%s, v=%s" % (len(timesList), len(valsList))
            for i in range(0, len(timesList)):
                tsVals.append((timeOffset+timesList[i], valsList[i]))
            #self.log.debug("unpacked %d vals, len ts %s." % (len(timesList), len(tsVals)))
        except Exception:
            tb = traceback.format_exc()
            self.log.debug("tb in rawDataToTsVals(): rawSize: %s, %s" % (rawLen, tb))
        
        if 0:  # debugging
            self.log.debug("tsvals: %s" % ( tsVals))
        return tsVals

Unpacking all the h's at the same time, and likewise the floats, makes everything align, and since it's one function call to struct, is very fast.
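
For comparison, here is a sketch of another way around the alignment problem that I have not timed against the version above: prefix the format with "=", which tells struct to use native byte order with standard sizes and no padding, so "hf" really is 6 bytes. The function name is hypothetical, it is written to run on Python 3 (bytes input), and it assumes the records were packed one field at a time with 'h' and 'f' as described earlier.

```python
import struct


def raw_to_ts_vals(time_offset, raw):
    """Decode 6-byte (short, float) records, keeping the first value per timestamp."""
    count = len(raw) // 6
    # '=' means native byte order, standard sizes, and no alignment padding,
    # so each "hf" pair consumes exactly 6 bytes.
    flat = struct.unpack("=" + "hf" * count, raw[:count * 6])
    seen = set()
    ts_vals = []
    # flat is (ts0, val0, ts1, val1, ...); walk it two items at a time
    for i in range(0, len(flat), 2):
        ts = flat[i]
        if ts in seen:
            continue  # duplicate timestamp, skip it
        seen.add(ts)
        ts_vals.append((time_offset + ts, flat[i + 1]))
    return ts_vals
```

The trade-off is that the duplicate check now happens after conversion rather than before it, so whether this beats the two-pass version would need measuring on real data.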

Enjoy!

Monday, September 09, 2013

Alternate Fortune Cookie Sayings

I had 'Panda Express' Chinese take-out for lunch, and found the fortune cookies to be too boring. Thus, I've created some of my own, in the hope they liven things up a bit. Feel free to reply with some, if you're inspired to do so.
  • Avoid doing new things, you might get hurt.
  • Your opinions are frequently wrong.
  • Now is not a good time to invest.
  • Your face betrays you.
  • Other people work harder than you do.
  • Your life has no meaning.
  • Your lucky number is Zero.
  • Avoid having opinions, you might be wrong.
  • Do not finish any projects tomorrow.
  • Avoid doing things that require too much thought.
  • Your efforts are doomed to failure.
  • Ask people for help, pleading stupidity.
  • Insult the nationality of all new people.
  • Your smile looks pathetic, only use it when begging.
  • Your kids will always ignore you.
  • Things are always as they seem.
  • Bad news is coming, in large amounts.
  • Fear your loved ones.
  • Sunrises bode not well for your financial future.
  • Steal everything before your friends stop liking you.
  • Wear red on Tuesdays to avoid a dishonorable death.
  • Cancel all your credit cards before it's too late.
  • Your wishes of last Wednesday will never come true.
  • Buy new underwear before they find out.
  • Several former friends are conspiring against you.
  • Never order fries with that.
  • People's clothes indicate if they like you.
  • Someone knows what you did that time.
  • Invest only in companies starting with the letter F.
  • Learn how to cook roadkill safely. Soon.
  • To see far, sleep on your roof.
  • Learn how to chop the head off an attacker.
  • As of yesterday, praying became useless.
  • The Antichrist was born last week.
  • To kill the bugs, wash all your clothes on hot.
  • Your spouse and your best friend have a secret.
  • When you see the flash, duck, it might help.
  • Horseback riding will be useless in 3 years.
  • Farm animals like you until they smell you.
  • Unplug your lamps and toaster before it's too late.
  • A phone call will soon make you cry loudly.
  • Fear everything, it's safer that way.
  • Put your affairs in order before tomorrow night.
  • Give away your possessions, but don't leave a note.
  • Contemplate everything, but don't commit, ever.
  • Physical pain can help you make hard decisions.
  • To change the inside, change the inside.
  • Lock up your women. 
Have any others?

Tuesday, August 13, 2013

Example of how to set JournalCommitInterval in MongoDB

I have found no way yet to find the current value of journalCommitInterval.

But, I have found how to set the journalCommitInterval.

We've got our mongodb set up to use a local disk for our journal, and data resides on a SAN-mounted mount point.

We'd like to set our journalCommitInterval up to a larger value to cut down on the IOPS to our local disk. This might have the added benefit of helping us run faster.

I've run the following command against a mongos, and it complains:

mongos> db.adminCommand({ "setParameter" : 1, "journalCommitInterval": 499});
{ "ok" : 0, "errmsg" : "journaling is off" }
When I run this against a specific daemon, it works up to a value of 499 (setting it to 500 generates a complaint/error):
shard-000:PRIMARY> db.adminCommand({ "setParameter" : 1, "journalCommitInterval": 499});
{ "ok" : 1 }
I have posed the following questions to 10Gen:
  • Question 1: do I need to run this against each mongod individually?
  • Answer 1: Yes.
  • Question 2: does this persist across restarts of the daemon?
  • Answer 2: No. That daemon's journalCommitInterval returns to its default when the daemon is restarted. Thus, it's better to set it in the config file you have defined for the shard. Or, if you define it via command line, do something like:
    /path/to/mongod --journalCommitInterval=499 ...
    
I have just created a minor request in core-server to expose the current value so we can verify it has been set correctly on all shards. If you agree this would be a handy feature to have, please watch and VOTE on the case, https://jira.mongodb.org/browse/SERVER-10508

Thursday, August 08, 2013

Solved: Installing Linux on Acer Aspire V5-122P-0643 Quadcore AMD laptop

So, my wife wanted an Acer Aspire laptop because of its size, primarily. The form factor was just right for her. But, it came with Windows 8, which she really hates (she loves Ubuntu Linux and Libre Office Writer). So, I downloaded Ubuntu Linux 13.04 Raring Ringtail and installed it on this Aspire V5 122P 0643 box. Complications?

1. Windows 8 doesn't like to give up control of the BIOS. I had to go into system settings and tell it to boot from the USB key (which I'd already put Ubuntu's install image on via another box).

2. Once installed, it wanted to reboot. I did so.

3. VAST multi-day hassles due to black screen after boot, then prompting for user login: instead of going to the graphical user login page familiar to all Ubuntu users. I messed around for a long time trying various things. Running startx failed with a message about no screens. Trying to reboot with some setting for recovery mode got me success once, but I couldn't reproduce it.

The command to show all hw installed revealed the video chipset is AMD Radeon 8280. This appears to be the only laptop, or in fact any device whatsoever, that uses this chipset, though Ubuntu seems to think it's on some desktops somewhere (in their compatibility pages).

4. FINALLY, solved the problem. On my other laptop, navigated to find the download for AMD's proprietary driver, and found the download destination.

Logged into Ubuntu using the user created during installation. Did 'sudo ls' and reentered my password so I had sudo privs without prompting. Then, downloaded the AMD driver from the link on this page: http://support.amd.com/us/gpudownload/linux/Pages/radeon_linux.aspx

The download link was http://www2.ati.com/drivers/linux/amd-driver-installer-catalyst-13-4-linux-x86.x86_64.zip, so I retyped it at the command prompt as:

$ wget http://www2.ati.com/drivers/linux/amd-driver-installer-catalyst-13-4-linux-x86.x86_64.zip

and it downloaded. I then did: unzip amd* and then did: chmod 755 amd* then agreed to everything and installed it, rebooted per instructions, and voila! IT WORKED!

So, the answer is that the fglrx drivers (automatically installed as part of the amd drivers download, methinks) are the correct solution here.

The touch screen does work, but is of limited usefulness since the icons are so small in a normal Linux desktop screen, as they should be. After all, Windows 8 sucks at trying to unite the phone with the desktop environment, because they're disparate platforms and everyone except Microsoft seems to know this.

Enjoy!

Friday, July 19, 2013

How to Disable password prompt during ssh login with key failure

I wanted to login to a bunch of servers and test whether I could get in, or if I got error messages.

After messing around a while, I worked it into:

for n in `cat allhosts.txt`; do echo "host: ${n}"; ssh -oConnectTimeout=2 -oKbdInteractiveAuthentication=no -oPasswordAuthentication=no -oStrictHostKeyChecking=no -oChallengeResponseAuthentication=no myuser@${n} echo '----------'; done
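
If you'd rather drive this from Python and keep the exit codes, something like the sketch below might do. The function names are made up for illustration. The options are the ones from the loop above plus BatchMode=yes, which is my addition: it is a standard ssh option that disables password and passphrase prompting. ssh exits with 255 when the connection or authentication fails, otherwise with the remote command's status.

```python
import subprocess

# Same options as the shell loop, plus BatchMode (an addition, see above).
SSH_OPTS = [
    "-oBatchMode=yes",
    "-oConnectTimeout=2",
    "-oKbdInteractiveAuthentication=no",
    "-oPasswordAuthentication=no",
    "-oStrictHostKeyChecking=no",
    "-oChallengeResponseAuthentication=no",
]


def ssh_probe_command(user, host):
    """Build the argv list for one non-interactive login test."""
    return ["ssh"] + SSH_OPTS + ["%s@%s" % (user, host), "echo", "----------"]


def probe_hosts(path, user):
    """Try every host named in the file; return {host: ssh exit code}."""
    results = {}
    with open(path) as fh:
        for host in fh.read().split():
            results[host] = subprocess.call(ssh_probe_command(user, host))
    return results
```

Calling probe_hosts("allhosts.txt", "myuser") would then give a dict where 0 means the key login worked.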

Thursday, June 20, 2013

Question on StackOverflow: Best Unittest for Python programs using OptionParser() ?

Just posted this on stackoverflow. http://stackoverflow.com/questions/17223530/python-how-do-i-usefully-unittest-a-method-invoking-optionparser

MagicMock Returns different values each call

Recent discoveries about mock objects!

I'm trying to create a Python unittest that exercises a class method with a time() call in it.

That is I have this:

class DumObj(object):
    def methodOne(self, intime):
        while time.time() < intime:
            ... do stuff ...
so, I refactored this to:

class DumObj(object):
    def nowTime(self):
        return time.time()
    def methodOne(self, intime):
        while self.nowTime() < intime:
            ... do stuff ...
and have to test methodOne. How to get it to go through the loop once (or twice)?

I tried setting the return value to the output of a function:

import unittest
from mock import Mock, MagicMock
class TestDumObj(unittest.TestCase):
    def test_methodOne(self):
        d = DumObj()
        retvals=[1,2,3,4,5]
        def mf():
            ret = retvals.pop()
            return ret
        # mf() is called once, right here, so return_value is fixed at that one result
        d.nowTime = MagicMock(return_value=mf())
        d.methodOne(3)

Doesn't work. Here's the interactive version:

>>> from mock import MagicMock
>>> retvals = [ 1, 2, 3, 4, 5 ]
>>> def mse(*args, **kwargs):
...     ret = retvals.pop()
...     return ret
...
>>> mm = MagicMock(return_value=mse())
>>> mm()
5
>>> mm()
5

Damn! I want MagicMock to return different values each time it's called. I want to have an array of return values for MagicMock, and each call it returns the next one. How to do this? Turns out it's using the side_effect param. Continuing from above:

>>> mm = MagicMock(side_effect=mse)
>>> mm()
4
>>> mm()
3
Yay! Solved: MagicMock returns different values each call. Now, tie it into the overall test:

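import unittest
from mock import Mock, MagicMock
class TestDumObj(unittest.TestCase):
    def test_methodOne(self):
        d = DumObj()
        retvals=[1,2,3,4,5]
        def mse():
            # pop from the front so the fake clock moves forward: 1, 2, 3, ...
            ret = retvals.pop(0)
            return ret
        d.nowTime = MagicMock(side_effect=mse)
        d.methodOne(3)   # loop body runs twice (times 1 and 2), then exits at 3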

Works!
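
A shorter variant, as a sketch: side_effect also accepts an iterable, and each call to the mock then returns the next item, so the helper function isn't needed at all. This version is written for Python 3, where mock lives in unittest.mock. The passes counter is a hypothetical stand-in for "do stuff" so there is something to check.

```python
import time
from unittest.mock import MagicMock


class DumObj(object):
    def __init__(self):
        self.passes = 0   # hypothetical counter standing in for "do stuff"

    def nowTime(self):
        return time.time()

    def methodOne(self, intime):
        while self.nowTime() < intime:
            self.passes += 1


d = DumObj()
# Each call to d.nowTime() returns the next value from the list.
d.nowTime = MagicMock(side_effect=[1, 2, 3])
d.methodOne(3)   # 1 < 3 and 2 < 3 run the body; 3 < 3 ends the loop
```

If the mock is called more times than there are values, it raises StopIteration, which is a handy way to catch a loop that runs longer than expected.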

Tuesday, May 14, 2013

Drop-Thru Code Considered Harmful

Recently, I was describing my frustrations with some Python code that had been written by a now-departed co-worker.  I describe this code as "Drop-Thru Code".  The main characteristic is that it uses module scope for a non-trivial number of variables.

Module scope is functionally almost-global.  That is, it's so close to global you'll want to use it, but it's far enough away that you'll end up shooting yourself in the foot.

Consider this code snippet:

#!/bin/env python2.6
import os
import sys
varname = 33
vardict = { 'a' : 44 }
# other imports here
print "Starting!"
def something():
    print "thing", varname
print "next"
something()
print "end"

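This is what I call drop-thru code.  All the variables are global.  Reading varname inside something() happens to work, but the moment a function assigns to it without a global declaration, Python treats the name as a local, and the code either fails or quietly changes only a local copy.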

Contrast this with what I prefer, well-encapsulated code:

#!/bin/env python2.6
import os
import sys

class SomeThing(object):
    def __init__(self):
        self.varname = 33 
    def something(self):
        print "thing", self.varname
    def main(self):
        self.vardict = { 'a' : 44 }
        # other imports here
        print "Starting!"
        print "next"
        self.something()
        print "end" 

st = SomeThing()
st.main()

Nothing is global, it's obvious what the scope for things is.  Clean, beautiful, easy to understand.

There are lots of examples of scope problems.  I created this one for my team so they understood my frustration:

outside = 1
def function1():
    try:
        print "f1 outside: %s" % (outside)
        outside += 1
    except:
        print "no outside in function1."
print "a outside: %s" % (outside)
function1()                                     
print "b outside: %s" % (outside)
def function2():
    global outside
    try:
        print "f2 outside: %s" % (outside)
        outside += 1
    except:
        print "no outside in function2"
function2()
function2()
class Dum(object):
    def __init__(self):
        print "dum instantiated."
    def main(self):
        print "Dum main outside: %s" % (outside)
    def changeOutside(self, inval):
        outside = inval
        print "Dum changed: outside: %s" % (outside)
    def changeGlobalOutside(self, inval):
        global outside
        outside = inval
        print "Dum changed: outside: %s" % (outside)
d = Dum()
print "c outside: %s" % (outside)
d.main()
print "d outside: %s" % (outside)
d.changeOutside(33)
print "e outside: %s" % (outside)
d.main()
d.changeGlobalOutside(44)
d.main()
print "f outside: %s" % (outside)
______________________________________________________
output:
krice4@zaphod:~/checkouts/userSandboxes/krice4$ python scopeTest.py 
a outside: 1
no outside in function1.
b outside: 1
f2 outside: 1
f2 outside: 2
dum instantiated.
c outside: 3
Dum main outside: 3
d outside: 3
Dum changed: outside: 33
e outside: 3
Dum main outside: 3
Dum changed: outside: 44
Dum main outside: 44
f outside: 44

In short, Drop-Thru Code is considered harmful.  It allows for lots of scope problems that show up as bugs and frustrations later.  So, avoid it.  Put all the vars you can in a class and invoke that class; you'll be glad you did.  IMHO.
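
Here is the core of that scope trap boiled down to a few lines, as a sketch written for Python 3 (the names are made up for illustration). Reading a module-level name from a function works; assigning to it without a global declaration makes it local for the whole function, so the read half of += blows up.

```python
counter = 1


def read_only():
    return counter          # reading a module-level name works


def broken_increment():
    counter += 1            # the assignment makes 'counter' local to this function
    return counter


def global_increment():
    global counter          # explicitly rebinding the module-level name
    counter += 1
    return counter


try:
    broken_increment()
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"
```

This is the same failure that function1 hides behind its bare except in the longer example above.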

Friday, May 10, 2013

Subversion pre-commit hook script - Python files: prevent tabs, verify svn properties

Here's a precommit hook script I've modified from one I've used before.  I hope it comes in handy.  I'm going to try to submit it to the dev group of subversion itself for inclusion in the contrib/hook-scripts directory.

#!/bin/env python

"""
    pre-commit hook script that does several things:
    * prevents committing any python file containing a tab character.
    * checks if there are tabs in the source file and warns if so;
    * aborts if incorrect properties of eol-style and keywords 'id'.
"""
import sys
import os
import traceback
from optparse import OptionParser

#sys.stderr.write("NOTE:  pre-commit hook script enabled - checks for tabs, svn eol-style and id properties...\n")

def command_output(cmd):
    " Capture a command's standard output. "
    import subprocess
    return subprocess.Popen(
        cmd.split(), stdout=subprocess.PIPE).communicate()[0]

def files_changed(look_cmd):
    """ List the files added or updated by this transaction.

        "svnlook changed" gives output like:
          U   trunk/file1.cpp
          A   trunk/file2.py
    """
    def filename(line):
        return line[4:]

    def added_or_updated(line):
        return line and line[0] in ("A", "U")

    retval = []
    for line in command_output(look_cmd % "changed").split("\n"):
        if added_or_updated(line):
            retval.append(filename(line))
    #sys.stderr.write("files changed: %s" % (retval))
    return retval

def file_contents(filename, look_cmd):
    " Return a file's contents for this transaction. "
    return command_output("%s %s" % (look_cmd % "cat", filename))

def file_get_properties(filename, look_cmd):
    propslines = command_output("%s %s" % (look_cmd % "proplist -v", filename))
    res = {}
    for line in propslines.split('\n'):
        line = line.strip()
        if not line:
            continue
        k, v = line.split(' : ')
        res[k] = v
    return res

def contains_tabs(filename, look_cmd):
    " Return True if this version of the file contains tabs. "
    return "\t" in file_contents(filename, look_cmd)

def check_py_files(look_cmd):
    " Check Python files in this transaction are tab-free. "
   
    def is_py_file(fname):
        return os.path.splitext(fname)[1] == ".py"
   
    py_files_with_tabs    = set()
    py_files_bad_eolstyle = set()
    py_files_bad_exec     = set()
    py_files_bad_keywords = set()
    for ff in files_changed(look_cmd):
        if not is_py_file(ff):
            continue
        if contains_tabs(ff, look_cmd):
            py_files_with_tabs.add(ff)
        props = file_get_properties(ff, look_cmd)
        if props.get('svn:special'):
            sys.stderr.write("file %s has svn:special flag, probably a symlink. don't check other props.\n" % ff)
            continue
        eolstyle = props.get('svn:eol-style')
        #sys.stderr.write("props: %s\neolstyle: '%s'\n" % (props, eolstyle))
        if eolstyle in ('native', 'LFFFFF'):
            py_files_bad_eolstyle.add(ff)
        execut   = props.get('svn:executable')
        if execut not in ['ON', '*']:
            py_files_bad_exec.add(ff)
        keywords = props.get('svn:keywords')
        if (not keywords) or ('Id' not in keywords.split()):
            py_files_bad_keywords.add(ff)

    preventCommit = False
    if len(py_files_with_tabs) > 0:
        sys.stderr.write("The following files contain tabs:\n%s\n"                                                                              % "\n".join(py_files_with_tabs))
        preventCommit = True
    if len(py_files_bad_exec) > 0:
        sys.stderr.write("The following py files are missing 'executable' property, committing anyway, but please fix this:\n%s\n"              % "\n".join(py_files_bad_exec))
        # note, do not prevent commit over this, just warn.
    if len(py_files_bad_keywords) > 0:
        sys.stderr.write("The following files don't have keywords property set to 'Id' at least.  Please fix this before committing:\n%s\n"     % "\n".join(py_files_bad_keywords))
        preventCommit = True
    if len(py_files_bad_eolstyle) > 0:
        sys.stderr.write("The following files don't have svn propset svn:eol-style 'LF', please do so before committing:\n%s\n"                 % "\n".join(py_files_bad_eolstyle))
        preventCommit = True

    return preventCommit

def main():
    usage = """usage: %prog REPOS TXN
        Run pre-commit options on a repository transaction."""

    parser = OptionParser(usage=usage)
    parser.add_option("-r", "--revision",
                      help="Test mode. TXN actually refers to a revision.",
                      action="store_true", default=False)
    errors = 0
    try:
        (opts, (repos, txn_or_rvn)) = parser.parse_args()
        look_opt = ("--transaction", "--revision")[opts.revision]
        look_cmd = "svnlook %s %s %s %s" % (
            "%s", repos, look_opt, txn_or_rvn)
        errors += check_py_files(look_cmd)
    except:
        parser.print_help()
        errors += 1
        sys.stderr.write("Pre-commit hook traceback: %s" % (traceback.format_exc()))
    return errors

if __name__ == "__main__":
    sys.exit(main())

Wednesday, May 08, 2013

Setting up Graphite / Django - MemoryError in Glyph

I'm setting up Graphite (under Django, of course) on a new box, and just saw the following error:

Exception Type: MemoryError
/opt/graphite/webapp/graphite/render/glyph.py in getExtents, line 221

specifically on the line: 
 F = self.ctx.font_extents()
 
turns out the problem was that I didn't have fonts installed.  New box, right?  Yah.
So, did apt-get install:
 
apt-get install cairo
apt-get install pycairo
apt-get install bitmap-fonts
 
The last one took a while.  But, now it's up and going!
 
 
BTW, the total list of apt-get I ran to get Graphite running on Centos-6.1 with python 2.6 was as follows:
 
apt-get install subversion
apt-get install httpd
apt-get install Django 
apt-get install mod_wsgi
apt-get install memcached
apt-get install python-devel
apt-get install  python-tools python-setuptools python-memcached python-pycurl
easy_install django-tagging
easy_install -U distribute
apt-get install libmysqlclient-dev
apt-get install mysql-libs
easy_install python-mysqldb
apt-get install MySQL-python
apt-get install cairo
apt-get install pycairo
apt-get install bitmap-fonts
 
Enjoy!

Oh, and if you are not satisfied with the quality of this posting, Complaints should be addressed to: 
 
Complaints Dept.
British Airways
Ingrams Drive 
Redditch, B79 5UT  
UNITED KINGDOM 
 
 

Wednesday, April 17, 2013

Too Smart for their Britches

I just ran into a bit of bash script that did this:

conflist=`cat www-conf/* |sort -u |sed ':a;N;$!ba;s/\n/ /g'`

On first glance, it's cat'ing all the files in a dir, sorting, and returning the result.  BUT WAIT, there's a funky sed script in there.

Sure, I get the 's/\n/ /g' -> get rid of intervening newlines.  But what of the other stuff?

It turns out, some smart-ass decided to code up some fancy schmantzy "I'm Smarter Than You" thumb-to-nose action.   How it works, courtesy of http://www.grymoire.com/Unix/Sed.html :

  • The semicolons separate commands.
  • The :a is a tag, so we can 'goto a' later.
  • The N command says append lines together including their newline character;
    •  The "n" command will print out the current pattern space (unless the "-n" flag is used), empty the current pattern space, and read in the next line of input. The "N" command does not print out the current pattern space and does not empty the pattern space. It reads in the next line, but appends a new line character along with the input line itself to the pattern space.
  • The $!ba command says, 'unless you're on the last line of the file, branch to a (goto a)'.
    •   An easier way is to use the special character "$", which means the last line in the file.

So, the goal was to collapse all this data into a sorted list without newlines.  Note that if I echo $conflist, I see "a b c d e..." just like if I had left out those fancy branches.
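
For what it's worth, the same "sorted, unique, on one line" result is easy to read when spelled out. A minimal sketch in Python, with a made-up function name; note that sort -u orders lines according to the current locale while Python's sorted() compares by code point, so the order could differ on non-ASCII input.

```python
def collapse_sorted_unique(texts):
    """Join the unique lines of several file contents into one sorted, space-separated string."""
    lines = set()
    for text in texts:
        lines.update(text.splitlines())
    return " ".join(sorted(lines))
```

Feeding it the contents of each file under www-conf/ would produce the "a b c d e..." string that ends up in conflist.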

What an utter waste of 10 minutes to figure this out, and another 10 to write this up for future reference and helping anyone else.

Questions:
  • Will I ever use this construct?  NO.  I'll find some other way to do it.  If I even need to, given Bash collapses things onto one line with an echo anyway.
  • Do I actively dislike the person who wrote this?  YES.  I don't know who it was, specifically.  But, it doesn't matter, I know what I need to know:  He wasted my time, and was what I would consider the professional equivalent of a braggart.  Code should be easy to read as a primary goal.
Note also, thanks for help:  http://www.catonmat.net/blog/sed-one-liners-explained-part-one/


Tuesday, April 02, 2013

Solved: MongoDB - How to add Binary data to a document using pymongo

I'm trying to add data to a MongoDB document. My data is tuples of doubles, and I'm adding to an array of them. This is a good example of how to add binary data to a mongodb document using pymongo. My doc looks like this (presume I have created it already):

$ mongo
mongos> use kevintest1
switched to db kevintest1
mongos> show collections
metricValue
system.indexes
mongos> db.metricValue.find()
{ "_id" : ObjectId("5159f82f64524c06f5cb1208"), "mtid" : 2, "seqno" : 0, "vals" : [ ] }
mongos> mtid2 = db.metricValue.findOne({ "mtid" : 2})
{
    "_id" : ObjectId("5159f82f64524c06f5cb1208"),
    "mtid" : 2,
    "seqno" : 0,
    "vals" : [ ]
}
mongos> mtid2
{
    "_id" : ObjectId("5159f82f64524c06f5cb1208"),
    "mtid" : 2,
    "seqno" : 0,
    "vals" : [ ]
}
mongos> mtid2.bsonsize()
Tue Apr  2 17:23:57 TypeError: mtid2.bsonsize is not a function (shell):1
mongos> Object.bsonsize(mtid2)
54
mongos> Object.bsonsize(db.metricValue.findOne({ "mtid" : 2}))
54

NOW, I take a break, and in a different window, I run the following program:
#!/usr/bin/python
import struct
import bson
import pymongo
from pymongo import Connection
from bson import Binary
from pprint import pprint, pformat
conn = pymongo.Connection('myboxname', 7333)
db = conn['kevintest1']
mvc = db.metricValue
print "have mvc.find: %s" % ([x for x in mvc.find()])
mybuffer = struct.pack("dd", 65535.7, 65535.8)
print "binary info to add:  %s" % (pformat(struct.unpack("dd", mybuffer)))
retval = mvc.update({ 'mtid': 2, 'seqno': 0}, { '$push': { 'vals' : Binary(mybuffer) }}, w=1)
print "retval of update: %s" % (retval)
print "After update, have mvc.find: %s" % ([x for x in mvc.find()]) 
 
So, I run it.  Then, I go back to the mongos window, and each time I run it, I check the bson size:

mongos> db.metricValue.findOne({ "mtid" : 2})
{
    "_id" : ObjectId("5159f82f64524c06f5cb1208"),
    "mtid" : 2,
    "seqno" : 0,
    "vals" : [
        BinData(0,"ZmZmZvb/70CamZmZ+f/vQA==")
    ]
} 
mongos> Object.bsonsize(db.metricValue.findOne({ "mtid" : 2}))
78
mongos> Object.bsonsize(db.metricValue.findOne({ "mtid" : 2}))
102
mongos> Object.bsonsize(db.metricValue.findOne({ "mtid" : 2}))
126
 
There you have it: 24 bytes per inserted pair of doubles. Without packing, it is 30 bytes, so the packed form takes 24/30 = 4/5 of the space, a saving of 20%.
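
Those two numbers can be checked by hand from the BSON layout. The sketch below just does the arithmetic (it does not use the bson module, and the function names are mine). It assumes the element sits in the first ten slots of the array, where the index key ("0" to "9" plus a terminating null) is 2 bytes; later indexes have longer keys and so cost a byte or two more.

```python
import struct


def binary_element_size(index, payload_len):
    """Bytes used by one BinData element inside a BSON array."""
    key_len = len(str(index)) + 1          # array index as a C string
    # type byte + key + int32 length + subtype byte + payload
    return 1 + key_len + 4 + 1 + payload_len


def double_array_element_size(index, count):
    """Bytes used by one nested array of `count` doubles inside a BSON array."""
    key_len = len(str(index)) + 1
    # each inner double: type byte + its own index key + 8 bytes of data
    inner = sum(1 + len(str(i)) + 1 + 8 for i in range(count))
    # type byte + key + embedded document (int32 length + elements + trailing null)
    return 1 + key_len + (4 + inner + 1)


packed = binary_element_size(0, struct.calcsize("=dd"))   # 16 bytes of payload
unpacked = double_array_element_size(0, 2)
saving = 1 - packed / float(unpacked)
```

So of the 24 bytes, 16 are data and 8 are per-element overhead; the unpacked pair pays 14 bytes of overhead for the same 16 bytes of data.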

Tuesday, March 26, 2013

Solved: Pymongo findAndModify example collection object not callable


I just blew an hour with a silly mistake (or two!)

I have a MongoDB collection with a bunch of documents I'm accessing via pymongo.  The docs have data in an array.  In the latest doc, I want to append an element to the array.  So, I use $push, this is normal.  Finding the right doc to modify, this was harder.

Below is the terminal session where I did this.  I made several mistakes, I hope you find it useful to see them.

The biggest one is that THERE IS NO findAndModify in pymongo.  It's called 'find_and_modify()'.   I kept getting the dreaded error: 

TypeError: 'Collection' object is not callable. If you meant to call the 'findAndUpdate' method on a 'Database' object it is failing because no such method exists.

This was reasonable - there is no such function. 

Other mistakes below might be useful, too, in that I think they're probably common.

[esm@myboxname:scripts]$ python
Python 2.6.6 (r266:84292, Jul 20 2011, 10:22:43) 
[GCC 4.4.5 20110214 (Red Hat 4.4.5-6)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from pymongo import Connection
>>> c = Connection('myboxname', 40000)
>>> mm = c.mymassivecollection
>>> ret = mm.delme2.insert({'metrictypeid': 20, 'start':1000, 'vals':[3,4]});
>>> ret
ObjectId('5150c9bd64524c04169fd612')
>>> mm.delme2.insert({'metrictypeid': 20, 'start':1010, 'vals':[3,4]});
ObjectId('5150c9cd64524c04169fd613')
>>> mm.delme2.insert({'metrictypeid': 20, 'start':1020, 'vals':[3,4]});
ObjectId('5150c9d264524c04169fd614')
>>> mm.delme2.insert({'metrictypeid': 20, 'start':1030, 'vals':[3,4]});
ObjectId('5150c9d764524c04169fd615')
>>> mm.delme2.insert({'metrictypeid': 10, 'start':1030, 'vals':[3,4]});
ObjectId('5150c9dd64524c04169fd616')
>>> mm.delme2.find()

>>> [x for x in mm.delme2.find()]
[{u'vals': [3, 4], u'start': 1000, u'_id': ObjectId('5150c9bd64524c04169fd612'), u'metrictypeid': 20}, {u'vals': [3, 4], u'start': 1010, u'_id': ObjectId('5150c9cd64524c04169fd613'), u'metrictypeid': 20}, {u'vals': [3, 4], u'start': 1020, u'_id': ObjectId('5150c9d264524c04169fd614'), u'metrictypeid': 20}, {u'vals': [3, 4], u'start': 1030, u'_id': ObjectId('5150c9d764524c04169fd615'), u'metrictypeid': 20}, {u'vals': [3, 4], u'start': 1030, u'_id': ObjectId('5150c9dd64524c04169fd616'), u'metrictypeid': 10}]
>>> import pprint
>>> pprint.pprint([x for x in mm.delme2.find()])
[{u'_id': ObjectId('5150c9bd64524c04169fd612'),
  u'metrictypeid': 20,
  u'start': 1000,
  u'vals': [3, 4]},
 {u'_id': ObjectId('5150c9cd64524c04169fd613'),
  u'metrictypeid': 20,
  u'start': 1010,
  u'vals': [3, 4]},
 {u'_id': ObjectId('5150c9d264524c04169fd614'),
  u'metrictypeid': 20,
  u'start': 1020,
  u'vals': [3, 4]},
 {u'_id': ObjectId('5150c9d764524c04169fd615'),
  u'metrictypeid': 20,
  u'start': 1030,
  u'vals': [3, 4]},
 {u'_id': ObjectId('5150c9dd64524c04169fd616'),
  u'metrictypeid': 10,
  u'start': 1030,
  u'vals': [3, 4]}]
>>> mm.runCommand( { 'findAndModify':'delme2', 'query': {'metrictypeid':20}, 'sort': { 'start' : 1}, 'update' : { '$push' : { 'vals' : 5 } }, 'new': True});
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/path/not/important/here/pylibs/lib/python2.6/site-packages/pymongo-2.4.1-py2.6-linux-x86_64.egg/pymongo/collection.py", line 1401, in __call__
    self.__name)
TypeError: 'Collection' object is not callable. If you meant to call the 'runCommand' method on a 'Database' object it is failing because no such method exists.
>>> mm.delme2.runCommand( { 'findAndModify':'delme2', 'query': {'metrictypeid':20}, 'sort': { 'start' : 1}, 'update' : { '$push' : { 'vals' : 5 } }, 'new': True});
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/path/not/important/here/pylibs/lib/python2.6/site-packages/pymongo-2.4.1-py2.6-linux-x86_64.egg/pymongo/collection.py", line 1405, in __call__
    self.__name.split(".")[-1])
TypeError: 'Collection' object is not callable. If you meant to call the 'runCommand' method on a 'Collection' object it is failing because no such method exists.
>>> mm.command( { 'findAndModify':'delme2', 'query': {'metrictypeid':20}, 'sort': { 'start' : 1}, 'update' : { '$push' : { 'vals' : 5 } }, 'new': True});
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/path/not/important/here/pylibs/lib/python2.6/site-packages/pymongo-2.4.1-py2.6-linux-x86_64.egg/pymongo/database.py", line 395, in command
    msg, allowable_errors)
  File "/path/not/important/here/pylibs/lib/python2.6/site-packages/pymongo-2.4.1-py2.6-linux-x86_64.egg/pymongo/helpers.py", line 144, in _check_command_response
    raise OperationFailure(msg % details["errmsg"])
pymongo.errors.OperationFailure: command {'sort': {'start': 1}, 'query': {'metrictypeid': 20}, 'findAndModify': 'delme2', 'update': {'$push': {'vals': 5}}, 'new': True} failed: no such cmd: sort
>>> mm.command( { 'findAndModify':'delme2', 'query': {'metrictypeid':20}, 'sort': { 'start' : 1}, 'update' : { '$push' : { 'vals' : 5 } }, 'new': True});
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/path/not/important/here/pylibs/lib/python2.6/site-packages/pymongo-2.4.1-py2.6-linux-x86_64.egg/pymongo/database.py", line 395, in command
    msg, allowable_errors)
  File "/path/not/important/here/pylibs/lib/python2.6/site-packages/pymongo-2.4.1-py2.6-linux-x86_64.egg/pymongo/helpers.py", line 144, in _check_command_response
    raise OperationFailure(msg % details["errmsg"])
pymongo.errors.OperationFailure: command {'sort': {'start': 1}, 'query': {'metrictypeid': 20}, 'findAndModify': 'delme2', 'update': {'$push': {'vals': 5}}, 'new': True} failed: no such cmd: sort
>>> mm.delme2.findAndModify({'query': 'metrictypeid':20}, 'sort': { 'start' : 1}, 'update' : { '$push' : { 'vals' : 5 } }, 'new': True});
  File "<stdin>", line 1
    mm.delme2.findAndModify({'query': 'metrictypeid':20}, 'sort': { 'start' : 1}, 'update' : { '$push' : { 'vals' : 5 } }, 'new': True});
                                                    ^
SyntaxError: invalid syntax
>>> mm.delme2.findAndModify({'query': {'metrictypeid':20}, 'sort': { 'start' : 1}, 'update' : { '$push' : { 'vals' : 5 } }, 'new': True});
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/path/not/important/here/pylibs/lib/python2.6/site-packages/pymongo-2.4.1-py2.6-linux-x86_64.egg/pymongo/collection.py", line 1405, in __call__
    self.__name.split(".")[-1])
TypeError: 'Collection' object is not callable. If you meant to call the 'findAndModify' method on a 'Collection' object it is failing because no such method exists.
>>> mm.delme2.findAndModify({query: {metrictypeid:20}, 'sort': { 'start' : 1}, 'update' : { '$push' : { 'vals' : 5 } }, 'new': True});
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'metrictypeid' is not defined
>>> mm.delme2.findAndModify({'query': {'metrictypeid':20}, 'sort': { 'start' : 1}, 'update' : { '$push' : { 'vals' : 5 } }, 'new': True});
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/path/not/important/here/pylibs/lib/python2.6/site-packages/pymongo-2.4.1-py2.6-linux-x86_64.egg/pymongo/collection.py", line 1405, in __call__
    self.__name.split(".")[-1])
TypeError: 'Collection' object is not callable. If you meant to call the 'findAndModify' method on a 'Collection' object it is failing because no such method exists.
>>> mm.delme2.findAndModify({'query': {'metrictypeid':20}, 'sort': { 'start' : 1}, 'update' : { '$push' : { 'vals' : 5 } }, 'new': True});
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/path/not/important/here/pylibs/lib/python2.6/site-packages/pymongo-2.4.1-py2.6-linux-x86_64.egg/pymongo/collection.py", line 1405, in __call__
    self.__name.split(".")[-1])
TypeError: 'Collection' object is not callable. If you meant to call the 'findAndModify' method on a 'Collection' object it is failing because no such method exists.
>>> mm.delme2.find()

>>> pprint([x for x in mm.delme2.find()])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'module' object is not callable
>>> pprint.pprint([x for x in mm.delme2.find()])
[{u'_id': ObjectId('5150c9bd64524c04169fd612'),
  u'metrictypeid': 20,
  u'start': 1000,
  u'vals': [3, 4]},
 {u'_id': ObjectId('5150c9cd64524c04169fd613'),
  u'metrictypeid': 20,
  u'start': 1010,
  u'vals': [3, 4]},
 {u'_id': ObjectId('5150c9d264524c04169fd614'),
  u'metrictypeid': 20,
  u'start': 1020,
  u'vals': [3, 4]},
 {u'_id': ObjectId('5150c9d764524c04169fd615'),
  u'metrictypeid': 20,
  u'start': 1030,
  u'vals': [3, 4]},
 {u'_id': ObjectId('5150c9dd64524c04169fd616'),
  u'metrictypeid': 10,
  u'start': 1030,
  u'vals': [3, 4]}]
>>> mm.delme2.findAndModify({'metrictypeid':20}, { '$push' : { 'vals' : 5 }});
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/path/not/important/here/pylibs/lib/python2.6/site-packages/pymongo-2.4.1-py2.6-linux-x86_64.egg/pymongo/collection.py", line 1405, in __call__
    self.__name.split(".")[-1])
TypeError: 'Collection' object is not callable. If you meant to call the 'findAndModify' method on a 'Collection' object it is failing because no such method exists.
>>> mm.delme2.findAndUpdate()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/path/not/important/here/pylibs/lib/python2.6/site-packages/pymongo-2.4.1-py2.6-linux-x86_64.egg/pymongo/collection.py", line 1405, in __call__
    self.__name.split(".")[-1])
TypeError: 'Collection' object is not callable. If you meant to call the 'findAndUpdate' method on a 'Collection' object it is failing because no such method exists.
>>> mm.findAndUpdate()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/path/not/important/here/pylibs/lib/python2.6/site-packages/pymongo-2.4.1-py2.6-linux-x86_64.egg/pymongo/collection.py", line 1401, in __call__
    self.__name)
TypeError: 'Collection' object is not callable. If you meant to call the 'findAndUpdate' method on a 'Database' object it is failing because no such method exists.
>>> db
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'db' is not defined
>>> mm.delme2.find_and_modify({'metrictypeid':20}, { '$push' : { 'vals' : 5 }});
{u'vals': [3, 4], u'start': 1000, u'_id': ObjectId('5150c9bd64524c04169fd612'), u'metrictypeid': 20}
>>> pprint.pprint([x for x in mm.delme2.find()])
[{u'_id': ObjectId('5150c9cd64524c04169fd613'),
  u'metrictypeid': 20,
  u'start': 1010,
  u'vals': [3, 4]},
 {u'_id': ObjectId('5150c9d264524c04169fd614'),
  u'metrictypeid': 20,
  u'start': 1020,
  u'vals': [3, 4]},
 {u'_id': ObjectId('5150c9d764524c04169fd615'),
  u'metrictypeid': 20,
  u'start': 1030,
  u'vals': [3, 4]},
 {u'_id': ObjectId('5150c9dd64524c04169fd616'),
  u'metrictypeid': 10,
  u'start': 1030,
  u'vals': [3, 4]},
 {u'_id': ObjectId('5150c9bd64524c04169fd612'),
  u'metrictypeid': 20,
  u'start': 1000,
  u'vals': [3, 4, 5]}]
>>> mm.delme2.find_and_modify(query={'metrictypeid':20}, update={'$push':{'vals':6}}, sort={'start':1},new=True);
/path/not/important/here/pylibs/lib/python2.6/site-packages/pymongo-2.4.1-py2.6-linux-x86_64.egg/pymongo/collection.py:1364: DeprecationWarning: Passing mapping types for `sort` is deprecated, use a list of (key, direction) pairs instead
  DeprecationWarning)
{u'vals': [3, 4, 5, 6], u'start': 1000, u'_id': ObjectId('5150c9bd64524c04169fd612'), u'metrictypeid': 20}
>>> mm.delme2.find_and_modify(query={'metrictypeid':20}, update={'$push':{'vals':6}}, sort=['start',1],new=True);
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/path/not/important/here/pylibs/lib/python2.6/site-packages/pymongo-2.4.1-py2.6-linux-x86_64.egg/pymongo/collection.py", line 1357, in find_and_modify
    kwargs['sort'] = helpers._index_document(sort)
  File "/path/not/important/here/pylibs/lib/python2.6/site-packages/pymongo-2.4.1-py2.6-linux-x86_64.egg/pymongo/helpers.py", line 68, in _index_document
    for (key, value) in index_list:
ValueError: too many values to unpack
>>> mm.delme2.find_and_modify(query={'metrictypeid':20}, update={'$push':{'vals':6}}, sort=[['start',1]],new=True);
{u'vals': [3, 4, 5, 6, 6], u'start': 1000, u'_id': ObjectId('5150c9bd64524c04169fd612'), u'metrictypeid': 20}
>>> mm.delme2.find_and_modify(query={'metrictypeid':20}, update={'$push':{'vals':7}}, sort=[['start',1]],new=True);
{u'vals': [3, 4, 5, 6, 6, 7], u'start': 1000, u'_id': ObjectId('5150c9bd64524c04169fd612'), u'metrictypeid': 20}
>>> mm.delme2.find_and_modify(query={'metrictypeid':20}, update={'$push':{'vals':7}}, sort=[['start',-1]],new=True);
{u'vals': [3, 4, 7], u'start': 1030, u'_id': ObjectId('5150c9d764524c04169fd615'), u'metrictypeid': 20}


Friday, March 08, 2013

Trying to build Python and a set of Python libraries on an AIX 6.1 box.

Starting with both gcc and IBM xlc compilers handy, this will be a set of notes, as opposed to a strict recipe. 

Have in /opt/freeware/bin:  gcc, make, tar.

Python notes:

    AIX:    A complete overhaul of the shared library support is now in
            place.  See Misc/AIX-NOTES for some notes on how it's done.
            (The optimizer bug reported at this place in previous releases
            has been worked around by a minimal code change.) If you get
            errors about pthread_* functions, during compile or during
            testing, try setting CC to a thread-safe (reentrant) compiler,
            like "cc_r".  For full C++ module support, set CC="xlC_r" (or
            CC="xlC" without thread support).
   
    AIX 5.3: To build a 64-bit version with IBM's compiler, I used the
            following:
   
            export PATH=/usr/bin:/usr/vacpp/bin
            ./configure --with-gcc="xlc_r -q64" --with-cxx="xlC_r -q64" \
                        --disable-ipv6 AR="ar -X64"
            make
So, I know a couple of things from previous adventures here.  First, I need working libraries for a variety of things:
* readline
* sqlite3
* zlib

LIBPATH=/opt/freeware/lib:/opt/freeware/lib64:/usr/lib:/usr/local/lib:/opt/recon/dcm/platforms/AIX/lib:/opt/recon/lib:

I've run into problems when compiling with xlc. The tests don't pass, and I get an exit status 11.  yuck.  More later.

MongoDB - How to Easily Pre-Split Chunks

Okay, so I'm doing some performance testing on MongoDB, spinning up a new configuration, and then throwing lots of data at it to see what the characteristics are.

Early on, I discovered that database locking was severely limiting performance.  MongoDB locks the database/collection (as of MongoDB 2.2.3) during every write operation.  However, it only locks it for the specific shard being written to.  So a massive number of writes calls for a large number of shards, to minimize the lock/unlock bottleneck.  That's the theory.

In practice, this indeed proved correct.  We went from running 2 shards (on 4 boxes: 2 primaries and 2 replicas) to running 48 shards.  YES, they all magically cooperate nicely and share memory and CPU equally.  Actually, the boxes each had 196 GB of memory and 32 cores, so we weren't limited by machine capacity.  YMMV.

BTW, at peak operations, we're only seeing CPU of 50% per daemon, so we're not CPU limited. 

But, during my tests, I had to spin up first 4 shards, then 8, then 24, then 48.  Waiting for each test to settle into a steady state took time.  Each shard had to be active and equally used.  Likewise, there had to be enough data for it to be reading and writing at the same time, where the data was read out to send to the replicas.

The startup was handicapped because all the activity landed on 1 shard and none on the others.  Since I was maxing out the capacity, there wasn't any spare IO available to do the splits and balancing.

To fix this, I found that I could add chunk split-points ahead of time, a process called 'pre-splitting chunks'.  To do this:
  • Turn off the balancer.
  • Run the command 'split' with a middle value, multiple times according to your number of shards.  
  • Turn on the balancer again.

I had 48 shards on a key with range 000-999, so I pretended I had 50 shards and gave each one a slice of 20 key values (1000 / 50 = 20).

No, you don't have to stop the database from doing anything, like failover to secondaries or anything like that.

So, my commands were:

$ mongo --port 99999999 --host myhost
> use config
> sh.stopBalancer();
> use admin
> db.runCommand({split: "myDb.myCollection", middle:{"key1":0,'key2':1}});

> db.runCommand({split: "myDb.myCollection", middle:{"key1":20,'key2':1}});
> ... (46+ more of these)
> use config
> sh.startBalancer();
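Typing 48+ split commands by hand gets old, so here's a hedged sketch of generating them instead. It assumes key1 is a plain number running 0-999 with a split every 20 and that key2 is always 1, as in my commands above; split_commands is a made-up helper name. If your shard key is a zero-padded string, the values would need quoting.

```python
def split_commands(namespace, key_min=0, key_max=1000, step=20):
    """Build one mongo shell split command per split point."""
    template = 'db.runCommand({split: "%s", middle: {"key1": %d, "key2": 1}});'
    return [template % (namespace, point)
            for point in range(key_min, key_max, step)]

commands = split_commands("myDb.myCollection")

# Paste the output into the mongo shell between stopBalancer/startBalancer.
print("\n".join(commands))
```

The numbers are written without leading zeros on purpose: the shell is JavaScript, where a literal like 020 can be read as octal.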



NOTE:  Always leave the balancer off more than 1 minute or it doesn't count.  That is, if you're 'bouncing' the balancer, leave it off for a full 60+ seconds or the bounce doesn't do anything.

Friday, January 04, 2013

MongoDB Shard Stats One-Liner

I've been testing, and I want to get the shard stats in one line.  That is, I get this:

    krice@zaphod:~/$ mongo --host boxname303.example.com --port 40000 --eval "sh.status()"
    MongoDB shell version: 2.0.6
    connecting to: boxname303.example.com:40000/test
    --- Sharding Status ---
      sharding version: { "_id" : 1, "version" : 3 }
      shards:
            {  "_id" : "shard-00",  "host" : "shard-00/boxname301.example.com:30001,boxname303.example.com:30000" }
            {  "_id" : "shard-01",  "host" : "shard-01/boxname302.example.com:30101,boxname304.example.com:30100" }
            {  "_id" : "shard-02",  "host" : "shard-02/boxname301.example.com:30201,boxname303.example.com:30200" }
            {  "_id" : "shard-03",  "host" : "shard-03/boxname302.example.com:30301,boxname304.example.com:30300" }
            {  "_id" : "shard-04",  "host" : "shard-04/boxname301.example.com:30401,boxname303.example.com:30400" }
            {  "_id" : "shard-05",  "host" : "shard-05/boxname302.example.com:30501,boxname304.example.com:30500" }
            {  "_id" : "shard-06",  "host" : "shard-06/boxname301.example.com:30601,boxname303.example.com:30600" }
            {  "_id" : "shard-07",  "host" : "shard-07/boxname302.example.com:30701,boxname304.example.com:30700" }
            {  "_id" : "shard-08",  "host" : "shard-08/boxname301.example.com:30801,boxname303.example.com:30800" }
            {  "_id" : "shard-09",  "host" : "shard-09/boxname302.example.com:30901,boxname304.example.com:30900" }
            {  "_id" : "shard-10",  "host" : "shard-10/boxname301.example.com:31011,boxname303.example.com:31010" }
            {  "_id" : "shard-11",  "host" : "shard-11/boxname302.example.com:31111,boxname304.example.com:31110" }
            {  "_id" : "shard-12",  "host" : "shard-12/boxname301.example.com:31211,boxname303.example.com:31210" }
            {  "_id" : "shard-13",  "host" : "shard-13/boxname302.example.com:31311,boxname304.example.com:31310" }
            {  "_id" : "shard-14",  "host" : "shard-14/boxname301.example.com:31411,boxname303.example.com:31410" }
            {  "_id" : "shard-15",  "host" : "shard-15/boxname302.example.com:31511,boxname304.example.com:31510" }
      databases:
            {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
            {  "_id" : "megamaid",  "partitioned" : true,  "primary" : "shard-00" }
                    megamaid.metricValue chunks:
                                    shard-05        2
                                    shard-13        2
                                    shard-10        2
                                    shard-15        2
                                    shard-01        2
                                    shard-08        2
                                    shard-04        2
                                    shard-07        2
                                    shard-12        2
                                    shard-03        2
                                    shard-09        2
                                    shard-06        2
                                    shard-11        2
                                    shard-14        2
                                    shard-00        2
                                    shard-02        5
                            too many chunks to print, use verbose if you want to force print
            {  "_id" : "test",  "partitioned" : false,  "primary" : "shard-14" }
    krice4@zaphod:~/$



But, I want just the numbers so I can see if I'm balanced.  Here's the one-liner:

alias shardbal="mongo --host boxname.example.com --port 40000 --eval \"sh.status()\" | egrep 'shard\-[0-9]{2}[^\"/]' | awk '{print \$2}' | tr '\n' ',' | xargs echo \"Shard Balance is: \" "
krice4@zaphod:~/$ shardbal
Shard Balance is:  4,4,4,3,4,4,4,4,4,4,4,4,4,4,4,4,
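If the egrep/awk chain is hard to read, the same filtering can be sketched in Python. This assumes the sh.status() text looks like the output above, with each chunk count on its own line as a shard name followed by a number; chunk_counts is a hypothetical helper and the sample text is trimmed from that output.

```python
import re

# A chunk-count line is just "shard-NN" followed by a number.
CHUNK_LINE = re.compile(r"^\s*(shard-\d{2})\s+(\d+)\s*$")

def chunk_counts(status_text):
    """Return (shard, chunks) pairs in the order they appear."""
    counts = []
    for line in status_text.splitlines():
        match = CHUNK_LINE.match(line)
        if match:
            counts.append((match.group(1), int(match.group(2))))
    return counts

sample = """\
      shards:
            {  "_id" : "shard-00",  "host" : "shard-00/boxname301.example.com:30001,boxname303.example.com:30000" }
                    megamaid.metricValue chunks:
                                    shard-05        2
                                    shard-00        2
                                    shard-02        5
"""

counts = chunk_counts(sample)
print("Shard Balance is: " + ",".join(str(n) for _, n in counts))
```

Keeping the shard name next to each count makes it easy to see which shard is the odd one out, which the bare list of numbers doesn't tell you.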