Wednesday, January 07, 2015

Custom Python Decorators Considered Harmful

In line with the classic post about GOTO being harmful, I must extend this to include the 'Decorators' language feature of Python.

"Just because you can do a thing doesn't mean you should do a thing."

Decorators, IMHO, as a general rule, are Evil in the classical sense: they tempt you with easy riches, but there are complications and ambiguities down the road that make life difficult.

I recently read a blog post about someone advocating for creating a @retry decorator that would automatically retry a function a certain number of times until it returned success.

As decorated, the method would fetch a URL a max number of times or until it got a certain HTTP success code. I've seen such methods used before at various places, but have found them to be VERY troublesome.

Test driven (TDD) requires we create a method that can be tested through all its code paths, or at least most of them. Decorators make writing tests more complicated as well.

So, here's my reasoning:

  • The code without a decorator, retrying n times is very obvious in function.
  • Presumably one would want to use a decorator so the same decorator could be used on multiple functions, yet the undecorated version can be parameterized easily, also.
  • Modifying this code will happen an industry average of 6 more times;
  • Each time it's modified, the person reading it will have to understand not just loops, but decorators and how they can go wrong;
  • Some of the people who come after you, who modify this code, will do it incorrectly and this will result in more time spent;
  • Writing unit tests for decorated code is usually far more difficult, since how do you test a decorator without testing the code it decorates?
  • People will be tempted to extend this concept and make more decorators that do far more things (database operations, disk writes, etc.) and this sets a bad precedent that decorators are in frequent use.
  • Some decorators are fine: @staticmethod, @classmethod, @property.  These are common and useful.  Further, some libraries like Twisted have some predefined that are useful.

As a general rule, wherever I see custom decorators used, I'm sorely tempted to yank them for the future good of humanity. Of course, I'm not free to do that a lot of the time, but it's a temptation.

TL;DR version: If you have custom decorated code, yank if possible; Don't write custom decorators, it makes for crazymaking code.

I think frequently people write this stuff just to prove they're smarter than I am. Perhaps that's true. But, I'm smart enough to realize that the simplest possible code, even if it takes more lines, is the best possible code.

-- Kevin J. Rice

9 years in Python and counting, and loving it (except rampant use of decorators and dependency injection).

Saturday, December 13, 2014

Pylint Broken on install: "control_pragmas = {'disable', 'enable'}"

So, wanting better Python code and having used pychecker for years and years, I was recently pointed to pylint again as a tool-of-choice.

Installing it was simple:

$ sudo pip install pylint
Downloading/unpacking pylint
Downloading pylint-1.4.0-py2.py3-none-any.whl (412kB): 412kB downloaded
Downloading/unpacking astroid>=1.3.2 (from pylint)
Downloading astroid-1.3.2-py2.py3-none-any.whl (163kB): 163kB downloaded
Downloading/unpacking six (from pylint)
Downloading six-1.8.0-py2.py3-none-any.whl
Downloading/unpacking logilab-common>=0.53.0 (from pylint)
Downloading logilab-common-0.63.2.tar.gz (196kB): 196kB downloaded
Running setup.py (path:/tmp/pip_build_root/logilab-common/setup.py) egg_info for package logilab-common
/usr/lib64/python2.6/distutils/dist.py:266: UserWarning: Unknown distribution option: 'test_require'
warnings.warn(msg)
package init file './test/__init__.py' not found (or not a regular file)
Downloading/unpacking unittest2>=0.5.1 (from logilab-common>=0.53.0->pylint)
Downloading unittest2-0.8.0-py2.py3-none-any.whl (94kB): 94kB downloaded
Downloading/unpacking argparse (from unittest2>=0.5.1->logilab-common>=0.53.0->pylint)
Downloading argparse-1.2.2-py2.py3-none-any.whl
Installing collected packages: pylint, astroid, six, logilab-common, unittest2, argparse
Compiling /tmp/pip_build_root/pylint/pylint/lint.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/lint.py', 585, 37, " control_pragmas = {'disable', 'enable'}\n"))

Compiling /tmp/pip_build_root/pylint/pylint/reporters/text.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/reporters/text.py', 136, 18, " for attr in ('msg', 'symbol', 'category', 'C')})\n"))

Compiling /tmp/pip_build_root/pylint/pylint/test/functional/abstract_abc_methods.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/test/functional/abstract_abc_methods.py', 6, 31, 'class Parent(object, metaclass=abc.ABCMeta):\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/functional/abstract_class_instantiated_py2.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/test/functional/abstract_class_instantiated_py2.py', 15, 34, 'class GoodClass(object, metaclass=abc.ABCMeta):\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/functional/abstract_class_instantiated_py3.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/test/functional/abstract_class_instantiated_py3.py', 14, 34, 'class GoodClass(object, metaclass=abc.ABCMeta):\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/functional/class_members_py27.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/test/functional/class_members_py27.py', 34, 38, 'class TestMetaclass(object, metaclass=ABCMeta):\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/functional/class_members_py30.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/test/functional/class_members_py30.py', 34, 38, 'class TestMetaclass(object, metaclass=ABCMeta):\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/functional/defined_and_used_on_same_line.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/test/functional/defined_and_used_on_same_line.py', 29, 20, "with open('f') as f, open(f.read()) as g:\n"))

Compiling /tmp/pip_build_root/pylint/pylint/test/functional/no_name_in_module.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/test/functional/no_name_in_module.py', 15, 26, "print('hello world', file=sys.stdout)\n"))

Compiling /tmp/pip_build_root/pylint/pylint/test/functional/old_style_class_py27.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/test/functional/old_style_class_py27.py', 10, 29, 'class NotOldStyle2(metaclass=type):\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/functional/unbalanced_tuple_unpacking_py30.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/test/functional/unbalanced_tuple_unpacking_py30.py', 9, 20, ' first, second, *last = (1, 2, 3, 4)\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/functional/undefined_variable_py30.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/test/functional/undefined_variable_py30.py', 8, 19, ' def test(self)->Undefined: # [undefined-variable]\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/functional/yield_outside_func.py ...
SyntaxError: ("'yield' outside function", ('/tmp/pip_build_root/pylint/pylint/test/functional/yield_outside_func.py', 2, None, 'yield 1 # [yield-outside-function]\n'))

/tmp/pip_build_root/pylint/pylint/test/input/func_assert_2uple.py:4: SyntaxWarning: assertion is always true, perhaps remove parentheses?
assert (1 == 1, 2 == 2), "no error"
/tmp/pip_build_root/pylint/pylint/test/input/func_assert_2uple.py:5: SyntaxWarning: assertion is always true, perhaps remove parentheses?
assert (1 == 1, 2 == 2) #this should generate a warning
/tmp/pip_build_root/pylint/pylint/test/input/func_assert_2uple.py:7: SyntaxWarning: assertion is always true, perhaps remove parentheses?
assert (1 == 1, ), "no error"
/tmp/pip_build_root/pylint/pylint/test/input/func_assert_2uple.py:8: SyntaxWarning: assertion is always true, perhaps remove parentheses?
assert (1 == 1, )
/tmp/pip_build_root/pylint/pylint/test/input/func_assert_2uple.py:9: SyntaxWarning: assertion is always true, perhaps remove parentheses?
assert (1 == 1, 2 == 2, 3 == 5), "no error"
/tmp/pip_build_root/pylint/pylint/test/input/func_assert_2uple.py:11: SyntaxWarning: assertion is always true, perhaps remove parentheses?
assert (True, 'error msg') #this should generate a warning
Compiling /tmp/pip_build_root/pylint/pylint/test/input/func_bad_cont_dictcomp_py27.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/test/input/func_bad_cont_dictcomp_py27.py', 8, 9, ' for x in range(3)}\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/input/func_bad_exception_context_py30.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/test/input/func_bad_exception_context_py30.py', 14, 25, ' raise IndexError from 1\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/input/func_continue_not_in_loop.py ...
SyntaxError: ("'continue' not properly in loop", ('/tmp/pip_build_root/pylint/pylint/test/input/func_continue_not_in_loop.py', 8, None, 'continue\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/input/func_e0108.py ...
SyntaxError: ("duplicate argument '_' in function definition",)

Compiling /tmp/pip_build_root/pylint/pylint/test/input/func_exec_used_py30.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/test/input/func_exec_used_py30.py', 6, 22, "exec('a = 1', globals={})\n"))

Compiling /tmp/pip_build_root/pylint/pylint/test/input/func_keyword_repeat.py ...
SyntaxError: ('keyword argument repeated', ('/tmp/pip_build_root/pylint/pylint/test/input/func_keyword_repeat.py', 8, None, 'function_default_arg(two=5, two=7)\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/input/func_kwoa_py30.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/test/input/func_kwoa_py30.py', 3, 15, 'def function(*, foo):\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/input/func_loopvar_in_dict_comp_py27.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/test/input/func_loopvar_in_dict_comp_py27.py', 8, 28, ' return {x: lambda: x for x in range(10)}\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/input/func_noerror_mcs_attr_access.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/test/input/func_noerror_mcs_attr_access.py', 14, 29, 'class Test(object, metaclass=Meta):\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/input/func_noerror_unused_variable_py30.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/test/input/func_noerror_unused_variable_py30.py', 12, 21, ' nonlocal attr\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/input/func_return_outside_func.py ...
SyntaxError: ("'return' outside function", ('/tmp/pip_build_root/pylint/pylint/test/input/func_return_outside_func.py', 3, None, 'return\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/input/func_return_yield_mix_py_33.py ...
SyntaxError: ("'return' with argument inside generator",)

Compiling /tmp/pip_build_root/pylint/pylint/test/input/func_set_literal_as_default_py27.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/test/input/func_set_literal_as_default_py27.py', 5, 23, 'def function1(value={1}):\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/input/func_syntax_error.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/test/input/func_syntax_error.py', 1, 9, 'def toto\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/input/func_undefined_metaclass_var_py30.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/test/input/func_undefined_metaclass_var_py30.py', 8, 20, 'class Bad(metaclass=ABCMet):\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/input/func_unused_import_py30.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/test/input/func_unused_import_py30.py', 10, 21, 'class Meta(metaclass=abc.ABCMeta):\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/input/func_used_before_assignment_py30.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/test/input/func_used_before_assignment_py30.py', 10, 20, ' nonlocal cnt\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/input/func_w0705.py ...
SyntaxError: ("default 'except:' must be last", ('/tmp/pip_build_root/pylint/pylint/test/input/func_w0705.py', 28, None, '__revision__ += 1\n'))

Compiling /tmp/pip_build_root/pylint/pylint/test/input/syntax_error.py ...
Sorry: IndentationError: ('expected an indented block', ('/tmp/pip_build_root/pylint/pylint/test/input/syntax_error.py', 2, 5, "print('hop')\n"))
Compiling /tmp/pip_build_root/pylint/pylint/utils.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/pylint/pylint/utils.py', 59, 26, 'MSG_TYPES_LONG = {v: k for k, v in six.iteritems(MSG_TYPES)}\n'))

Compiling /tmp/pip_build_root/astroid/astroid/tests/testdata/python3/data/module.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/astroid/astroid/tests/testdata/python3/data/module.py', 55, 32, " print('yo', end=' ')\n"))

Compiling /tmp/pip_build_root/astroid/astroid/tests/testdata/python3/data/module2.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/astroid/astroid/tests/testdata/python3/data/module2.py', 100, 22, "print('bonjour', file=stream)\n"))

Compiling /tmp/pip_build_root/astroid/astroid/tests/unittest_brain.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/astroid/astroid/tests/unittest_brain.py', 71, 33, ' self.assertSetEqual({"a", "b", "c"}, set(base.instance_attrs))\n'))

Compiling /tmp/pip_build_root/astroid/astroid/tests/unittest_inference.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/astroid/astroid/tests/unittest_inference.py', 492, 44, ' self.assertSetEqual({n.__class__ for n in xxx.infered()},\n'))

Compiling /tmp/pip_build_root/astroid/astroid/tests/unittest_modutils.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/astroid/astroid/tests/unittest_modutils.py', 248, 41, " {os.path.join(package, x) for x in ['__init__.py', 'module.py', 'module2.py', 'noendingnewline.py', 'nonregr.py']})\n"))

Compiling /tmp/pip_build_root/astroid/astroid/tests/unittest_scoped_nodes.py ...
SyntaxError: ('invalid syntax', ('/tmp/pip_build_root/astroid/astroid/tests/unittest_scoped_nodes.py', 556, 39, " expected_methods = {'__init__', 'class_method', 'method', 'static_method'}\n"))

Running setup.py install for logilab-common
/usr/lib64/python2.6/distutils/dist.py:266: UserWarning: Unknown distribution option: 'test_require'
warnings.warn(msg)
package init file './test/__init__.py' not found (or not a regular file)
package init file './test/__init__.py' not found (or not a regular file)
changing mode of build/scripts-2.6/pytest from 644 to 755
package init file './test/__init__.py' not found (or not a regular file)
Installing /usr/lib/python2.6/site-packages/logilab_common-0.63.2-py2.6-nspkg.pth
changing mode of /usr/bin/pytest to 755
package init file './test/__init__.py' not found (or not a regular file)
Successfully installed pylint astroid six logilab-common unittest2 argparse
Cleaning up...
krice4@graph301p:~$ dir
Thinking this would be good, I did the normal first-use invocation:

[esm@graph301p:bin]$ pylint
Traceback (most recent call last):
  File "/usr/bin/pylint", line 11, in
    sys.exit(run_pylint())
  File "/usr/lib/python2.6/site-packages/pylint/__init__.py", line 22, in run_pylint
    from pylint.lint import Run
  File "/usr/lib/python2.6/site-packages/pylint/lint.py", line 585
    control_pragmas = {'disable', 'enable'}
                                ^
SyntaxError: invalid syntax
[esm@graph301p:bin]$ dir

No joy.  Thinking my install failed, I turned to the intertubewebs and found that the Pylint people decided to not support Python 2.6 anymore.  It's apparently a Python 2.7 game only now.


I found out from here:  https://bitbucket.org/logilab/pylint/issue/390/py26-compatiblity-broken

Not to worry, though, pychecker still works.

Sunday, November 30, 2014

Python Coding Best Practice Guide - Practical Goodness (prevent over-smart show-offs from killing your project)

Over the last 8? 9? years of coding in Python, I've had to deal with code written by a variety of people:

  • Some of these people were very good at coding so I could quickly understand what they were doing and modify it as needed.
  • Some of these people were very smart.
  • These two groups overlapped only haphazardly.

My definition of good code is stuff that the average joe programmer, maybe 2 years out of a C.S. degree or so (or the equivalent industry experience), can understand quickly and modify to meet the business-demand of the moment.

Bad code is that which has to be read, re-read, analyzed, tried out interactively to see how parts work, analyzed again, and drawn out with data-flow diagrams because it requires a lot of brainpower to hold all of the parts in your head at once.

Maybe I'm not that smart. Maybe I am. I like to think I'm smart. But, when it's 2 am and I have to figure something out Right Now to make it work before I can go back to sleep, or if I'm really working on a giant project and I only have an hour or so to make a small mod someone wants - that's not when I want to be analyzing something to trace back where the problem is.

Back 15 years ago, when I was full-time coding in C (or C++, somewhat rarely), I had a whole list of things that I thought the designers of C, Kernigan and Richie, got wrong. I still remember some of that list. But, the important part is the adage:

Just because you CAN do a thing doesn't mean you SHOULD do a thing.

In C, the verboten (or at least things I avoided) included:

  • extensive use of #define and #ifdef for things other than constants and MAX functions. Smart people show off and write complex crap.
  • unions. Yes, unions. You could declare a struct that has an alternate layout as a union, and switch between the two. It's ripe for stupidity.
  • lots of mallocs and frees. I put everything I could on the stack because if there was an exit path I missed, that memory wasn't getting freed anytime soon.
  • ...etc.

Some of this translates below into Python. I think the corollary to IFDEF in Python is decorators. You're doing the same thing - putting all sorts of logic into a wrapper. That is, you have to figure out the wrapper before they can figure out your code. if it's simple, great, go ahead. If not, you're shooting the person who comes after you.

Some decorators are golden - @property and @classmethod are great, no problem there because you can create getters/setters quickly and singletons, etc. But, a frequent example is something like, @deprecated which writes a lot message with the call stack before turning around and calling the function. BUT, it's just as easy to insert a function call at the top of the function, something like, GlobalCallDeprecated() which does the same thing. When a function works, why use a decorator? People like to show off how smart they are. Whup-tee-do, great, you're smart, so am I, get over it and write something I don't have to spend time figuring out.

With that in mind: Below is the list we're using here at (Large.Fortune-50.Company.Infrastructure.Dept.). This department generates graphs, thus the references to 'graph' boxes.


1.  pep-8
2.  use 4-space indentation.  Anyone touching code should use reindent to fix this.
    There is a program called 'reindent' available on all graph boxes.  it rewrites the file with 4 space indent.
3.  spacing (pep-8) is important, once space around equals in assignment, but none in calls:
    a = 5
    self.runThis(5, 6, abc=12)
4.  run pychecker on all code
        * alias pychecker="pychecker --limit=9999"
        * some unused vars are okay
5.  MOST except the shortest sets of code should be in classes that do nothing but can be invoked from a test.
    MOST outer scope routines should contain try/except blocks.
    When creating something that produces multiple outputs (measurements, for example), each set of measurements
      should be created in a separate try/except block, so a failure in one area does not lead to total failure.
    MOST classes should start with a capital letter, e.g., CamelCase.
    MOST modules should start with a lowercase letter, e.g., camelCase.py
    MOST module globals should be all caps with underscores, e.g., ALL_CAPS
    MOST methods should be lowercase first letter, e.g., camelCase() or camel_case()
    Slight preference for camelCase vs. under_scored var and method names, but both are okay.
    VERY FEW global variables should be used, if any, and all should be at the top of a file.
    Globals should be used in the init so they can be overriden by tests and other code.
    Filenames and/or explicit paths should be at the top, as globals, so they can be found / changed easily.


    import this
    import traceback

    VERY_FEW_GLOBALS = 1

    class Something(object):

        def __init__(self, logger=None):
            if logger:
                self.log = logger
            else:
                self.setupLogging()
            self.var1 = 'c'
            self.var2 = 'ff'
            self.veryFewGlobals = VERY_FEW_GLOBALS

        def blah1(self, instr):
            instr += 'a'
            self.var1 = instr

        def blah2(self, instr):
            instr += 'b'
            self.var1 = instr

        def main(self):
            self.log.info("starting.")  #whatever
            try:
                x = 'abc'
                self.blah1(x)
                self.var2 = self.blah2(x)
                self.var1 = str(self.var1) + ''
            except Exception, e:
                tb = traceback.format_exc()
                self.log.error("Something.main(): Exception: e: %s, tb: %s" % (e, tb))
            return

    if __name__ == '__main__':
        s = Something()
        s.main()

6.  All code should have at least two tests: a trivial test, and an instantiation test.
    Tests should be in a file named test_doSomething.py in a subdirectory tests.
    Monkey-patch / duck-punch methods to reduce and simplify the test.

    # for one test, run:  nosetests metricType:TestMetricType.test_simpleRetrieve
    class TestSomething(unittest.TestCase):

       def test_trivial(self):
           return True

       def test_instantiate(self):
           s = Something()
           assert s

       def test_blah1(self):
           s = Something()
           s.var1 = 'ddd'
           s.blah1('bb')
           assert s.var1 == 'abcab'

       def myroutine(self, instr):
           something = {
               'metricType'  : 'poweriq',
               'metric'      : metricName,
               'timestamp'   : timestamp,
               'value'       : value,
               'context'     : context,
           }

3.  Function calls with lots of parameters should not be indented to the max level - it makes for
    lots of whitespace, all the code on the right side of the screen, too-long lines, etc.

    # BAD - discouraged.
    someVarNameHere = self.callingFunctionNowHere(exceptionallyLongVarName=1,
                                                  freakishlyLongExceptionallyLongVarName=2,
                                                  additionallyLongExceptionallyLongVarName=4,
                                                  somethihng=5)
    # PREFERRED:
    someVarNameHere = self.callingFunctionNowHere(
        exceptionallyLongVarName=1,
        freakishlyLongExceptionallyLongVarName=2,
        additionallyLongExceptionallyLongVarName=4,
        somethihng=5
        )

    # ALSO POSSIBLE:
    someVarNameHere = self.callingFunctionNowHere(exceptionallyLongVarName=1,
        freakishlyLongExceptionallyLongVarName=2,
        additionallyLongExceptionallyLongVarName=4, somethihng=5 )

14. Comments added to the end of multiple lines are discouraged:

    # BAD - Discouraged
    var1 = self.callingSomethingHere(12)   # ESOC-123334
    var2 = self.callingSomethingHere(14)   # ESOC-123334

    # Preferred:

    # added for ESOC-123334:
    var1 = self.callingSomethingHere(12)
    var2 = self.callingSomethingHere(14)

15. Exceptionally long quotes should use the triple-quote concept to prevent problems with
    escaping quotes:

    x = ''' very long
        lines go here and
        continue at indentations
        presuming that's not a problem.
        '''

16.  It is generally quite possible to avoid line-continuation characters.
     Generally, long lines end up being inside a list.
     As with the above function calls with many params, just open the paren
     and keep adding variables.

17.  Line length limits of 80 or even 120 characters are far less than the width of our
     modern screens.  However, some limit should apply since the eye has a hard time aligning
     to very, very long lines.
     There's no hard/fast limit, it's up to us individually, but 150 characters is probably a good limit
     except in very unusual circumstances.

18.  Logging lines should start with the name of the routine, so we know where the line came from.
     self.log.info("someRoutine():  started routine to figure out blah.")

19.  Libraries incorporated into other programs should have the class name also.
     self.log.info("sendReconEvent.connect(): connected to %s" % (pformat(self.destination)))

20.  Getting command line options should always be its own routine.
     This is because test cases cannot parse or understand parseOptions.  Example:

     # pulled out so can be overridden in test routines, e.g., m.getOptionParser = lambda x: None
     def getOptionParser(self):
         return OptionParser()

     def getOptions(self):
         parser = self.getOptionParser()
         parser.add_option("-d", "--debug",   action="store_true", dest="debug",   default=False, help="debug mode for this program, writes debug messages to logfile." )
         parser.add_option("-v", "--verbose", action="store_true", dest="verbose", default=False, help="verbose mode for this program, prints a lot to stdout." )
         parser.add_option("-p", "--processes", action="store", type="int", dest="processes", default=10, help="number of processes to run.")
         (options, args)             = parser.parse_args()
         self.verbose                = options.verbose
         self.debug                  = options.debug
         self.numProcesses           = options.processes
         if (self.verbose):
             print "verbose=%s, debug=%s, processes=%s" % (self.verbose, self.debug, self.numProcesses)
         return

21.  Except in test routines and sorts, the use of lambda functions is strongly discouraged.  Example:
     return [lambda x, i=i : i * x for i in range(5)]   # WTF? What does it do? Tested: [x(2)for x in f]=>[0,2,4,6,8] Why? Dafuq?

22.  In general, the use of yield is troublesome and should be avoided except when strongly needed.
     Especially grievous is the use of multiple yield statements in one method / function, since tracing
     code execution paths with this is vastly nasty.

23.  In code, avoid the use of 'from blah import *' for many, many, many reasons.  Most proximate:  pychecker can't check the code now.

24.  Decorators:
     * Writing custom decorators is strongly discouraged as it obfuscates code.  
     * Generally acceptable decorators are ones that are included in standard libraries (incl. Django)
     * Decorators should be used as tools, not as weapons of mass confusion.

25.  Any but the most simplistic use should be avoided of the following:
        * getattr and setattr methods

26.  The overall architectural goal of all scripts should be SIMPLICITY of code readability by zombies (the on-call guy at 3 am). 
     This is versus other considerations like execution speed, code size, 'elegance', etc.

27.  No file or class should be named the same as any standard library method.

28.  No variables anywhere should be named with reserved words or the names of standard libraries.  Examples:
     type = 'a'
     resource = 'Something'
     int = 7
     long = True
     path = 'blah/blah1'   # os.path is common, sometimes people do 'from os import path' and then you're in trouble.

29.  The use of **args is discouraged.  It is way too easy to mess up.  It has to be called exactly correctly and used
     within a method exactly correctly.  That is, calls with foo(context) are bad but easy, must do foo(**context) instead,
     but you will not get an error message

     #So, instead of:
     context = { foo: 7, bar: 8, baz: 10 }
     someClassInstance.funcOne(baz=6, **context)
     def funcOne(baz=8, bar=9, **args):
         # magic happened, baz is 6 or 10, but probably not 8.  Good luck.

     # do this, it's utterly obvious even at a glance:
     miscVals = { foo: 7, bar: 8 }
     someClassInstance.funcOne(baz=8, context=miscVals)
     def funcOne(baz=8, bar=9, context={}):
         if baz != 8 and context.get('baz', None):  print "WHICH ONE?"
         baz = context.get('baz', None) or baz
         bar = context.get('bar', None) or bar

30.  To close db handles, instead of __del__ use atexit.register() as in:

     import mysql
     import atexit
     class Bar(object):
         def __init__(self):
             self.dbHandle = mysql.openConnection(...) # or something
             atexit.register(self.cleanup)
         def cleanup(self):
             try:
                 self.dbHandle.close()
                 print "db handle closed."
             except:
                 print "FYI, db close failed."

31.  Additional rules: http://docs.python-guide.org/en/latest/writing/style/
     with the exception of using lambdas, map, and filter.

32.  In a perfect world, we clean up our code to bring it to clarity and obviousness right away.
     Short of that, we should endeavor to clean up our code any time we touch it.
     This includes making scripts use classes and adding a test_blah.py file with a couple of tests in it.
     Before committing, we should always run pychecker.

33.  Functions and methods should not be absurdly long.  
     If they are too long, they should be broken up.
     A handy library for reducing method size is counterStats which allows obj=CounterStats(['x']); obj.x = 5 coding.




Thursday, September 04, 2014

SOLVED: MongoDB - Pymongo ConnectionFailure: function takes at most 4 arguments (5 given)

I was getting this odd message, leading to pymongo fail the Connection() and MongoClient() instantiation.  No connection?  Really?  Getting this message:

[user@boxnamehere:logs]$ python
Python 2.6.6 (r266:84292, Jul 20 2011, 10:22:43)
[GCC 4.4.5 20110214 (Red Hat 4.4.5-6)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from pymongo import MongoClient
>>> mc = MongoClient('servername.here', 11222)
Traceback (most recent call last):
  File "", line 1, in
  File "/usr/lib64/python2.6/site-packages/pymongo/mongo_client.py", line 369, in __init__
    raise ConnectionFailure(str(e))
pymongo.errors.ConnectionFailure: function takes at most 4 arguments (5 given)

So, what's wrong?  It's tough to find out.  I went in, backed-up and modified the files in the lib64/python2.6/site-packages/pymongo and bson/ dirs.  I imported traceback, added tb=format_exc() to each level of the try/except needed.  This gave me a really long traceback but at least I could see what was happening.

Once modified, my test program now produces:

user@boxname:~$ ./testPymongo.py
starting.
Traceback (most recent call last):
  File "./testPymongo.py", line 5, in
    mc = MC(host='servername.goes.here', port=11222, socketTimeoutMS=15000)
  File "/usr/lib64/python2.6/site-packages/pymongo/mongo_client.py", line 373, in __init__
    raise ConnectionFailure(msg)
pymongo.errors.ConnectionFailure: function takes at most 4 arguments (5 given) Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/pymongo/mongo_client.py", line 867, in __find_node
    member, nodes = self.__try_node(candidate)
  File "/usr/lib64/python2.6/site-packages/pymongo/mongo_client.py", line 714, in __try_node
    {'ismaster': 1})
  File "/usr/lib64/python2.6/site-packages/pymongo/mongo_client.py", line 695, in __simple_command
    response = helpers._unpack_response(response)['data'][0]
  File "/usr/lib64/python2.6/site-packages/pymongo/helpers.py", line 117, in _unpack_response
    compile_re)
TypeError: function takes at most 4 arguments (5 given)
, Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/pymongo/mongo_client.py", line 367, in __init__
    self._ensure_connected(True)
  File "/usr/lib64/python2.6/site-packages/pymongo/mongo_client.py", line 936, in _ensure_connected
    self.__ensure_member()
  File "/usr/lib64/python2.6/site-packages/pymongo/mongo_client.py", line 807, in __ensure_member
    member, nodes = self.__find_node()
  File "/usr/lib64/python2.6/site-packages/pymongo/mongo_client.py", line 900, in __find_node
    raise AutoReconnect(', '.join(errors))
AutoReconnect: function takes at most 4 arguments (5 given) Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/pymongo/mongo_client.py", line 867, in __find_node
    member, nodes = self.__try_node(candidate)
  File "/usr/lib64/python2.6/site-packages/pymongo/mongo_client.py", line 714, in __try_node
    {'ismaster': 1})
  File "/usr/lib64/python2.6/site-packages/pymongo/mongo_client.py", line 695, in __simple_command
    response = helpers._unpack_response(response)['data'][0]
  File "/usr/lib64/python2.6/site-packages/pymongo/helpers.py", line 117, in _unpack_response
    compile_re)
TypeError: function takes at most 4 arguments (5 given)

So, we see the problem is that the Pymongo version is calling, at about line 85 of /usr/lib64/python2.6/site-packages/pymongo/helpers.py

result["data"] = bson.decode_all(response[20:],
                                  as_class, tz_aware, uuid_subtype,
                                  compile_re)


But, when I go into /usr/lib64/python2.6/site-packages/bson/_init_.py I see function defined differently:

def decode_all(data, as_class=dict,
               tz_aware=True, uuid_subtype=OLD_UUID_SUBTYPE):
    """Decode BSON data to multiple documents.


So, I could fix this in the bson code, by adding an optional param.

Instead, though, I'm going to just remove the extraneous param since I'm thinking I don't need it.

Changeset: One file modified: /usr/lib64/python2.6/site-packages/pymongo/helpers.py

[user@server:pymongo]$ diff helpers.py helpers.py.orig
16,117c116,117
<                                      as_class, tz_aware, uuid_subtype
<                                     )
---
>                                      as_class, tz_aware, uuid_subtype,
>                                      compile_re)

*Solution*

Turns out, the problem isn't  with the code in pymongo.  It's that MongoDB Corp. has chosen to bundle their own version of the 'bson' package in with pymongo, and that version doesn't engage in happy-kindergarden-play with the pypy version of bson.

So, to fix this, do the following (omitting the $ prompts so you can cut and paste).  Note important steps of NOT BELIEVING PIP that it really uninstalled everything:


sudo pip uninstall bson
sudo pip uninstall pymongo
cd /usr/lib64/python2.6/site-packages
# Now, remove old versions of pymongo and bson.  Pip doesn't delete everything, dammit.
sudo rm -rf bson pymongo*
sudo pip install pymongo

Done!  Pymongo installs its own, proper version of bson.

Restart your processes and re-run your tests, it should work now.  Pymongo will connect to your mongo cluster properly, etc.

Note that the above problem occurs especially frequently if the box is under heavy load because (as we noted above) the initial connection raised an AutoReconnect. 

Enjoy your newfound powers in peace and with goodwill towards all.


Tuesday, September 02, 2014

SOLVED: Skype crashes on startup on Ubuntu 14.04

Just had a problem with Skype crashing immediately upon startup on my Ubuntu 14.04 box (as of Sept. 2, 2014).  This was an unexpected, sudden, and seemingly a random failure.

Skype would startup with the blue skype starting screen, open the contacts window for several seconds, then the window would close with no message.

Needing a message, I invoked from the command line and saw messages:

me@boxname:~/.Skype$ skype
Gtk-Message: Failed to load module "overlay-scrollbar"
Gtk-Message: Failed to load module "unity-gtk-module"
(skype:3825): Gtk-WARNING **: Unable to locate theme engine in module_path: "murrine",
(skype:3825): Gtk-WARNING **: Unable to locate theme engine in module_path: "murrine",
... bunch of these ...

(skype:3825): Gtk-WARNING **: Unable to locate theme engine in module_path: "murrine",
(skype:3825): Gtk-WARNING **: Unable to locate theme engine in module_path: "murrine",
Gtk-Message: Failed to load module "canberra-gtk-module"
Corrupt JPEG data: 3014 extraneous bytes before marker 0xd9
Aborted (core dumped)
me@boxname:~$


Found this set of directions: 

http://community.skype.com/t5/Linux/Skype-is-crashing-with-Corrupt-JPEG-data/td-p/3455587

Worked perfectly.

me@boxname:~$  cd
me@boxname:~$ tar -czf my.dot.skype.dir.tgz ~/.Skype
me@boxname:~$ sqlite3 ~/.Skype/my.profile.name/main.db
SQLite version 3.8.2 2013-12-06 14:53:30
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> select * from Messages where type = 68;
... bunch of crap here ...

sqlite> select count(*) from Messages where type = 68;
25
sqlite> quit
me@boxname:~$ skype
... same error messages as above, without the last one about 'Aborted (core dumped)'.
 Success!

Thursday, August 28, 2014

SOLVED: diff 2 files, get results without markup

So, I want do a logrotate and remove files that are old, keeping only the latest 3 files.  So, script starts with:

$ fullLoglist="/tmp/full_loglist"
$ keepLoglist="/tmp/keep_loglist"
$ rmLoglist="/tmp/rm_loglist"

$ ls -lrt ${d}/rs*log* > ${fullLoglist}

 
First, I create a file with all the files listed:


$ cat ${fullLoglist}
graphdb_1j/rs-47-a/rs-47-a.log
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T13-25-13
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T14-46-46
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T14-48-44
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T15-14-35
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T15-35-11
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T19-48-56
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-28T13-44-21


I only want the top 3 files, though, and to rm the older ones.  So, I extract the ones I want to keep:

$ cat ${fullLoglist} | head -3 > $keepLoglist
$ cat $keepLoglistgraphdb_1j/rs-47-a/rs-47-a.log
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T13-25-13
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T14-46-46



Now, I need to get a diff of the rest of them so I know what to call rm on.  But how?  If I just do a diff, I get:

$ diff /tmp/full_loglist /tmp/keep_loglist
4,8d3
< graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T14-48-44
< graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T15-14-35
< graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T15-35-11
< graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T19-48-56
< graphdb_1j/rs-47-a/rs-47-a.log.2014-08-28T13-44-21



I DON'T WANT the extra markup.  I want diff without markup.  I try to google:

linux diff without markup
linux diff without greater-than less-than
linux diff line format markup
linux diff supress markup
linux diff only different lines

I try various options, like:

$ diff --line-format "%L" --suppress-common-lines /tmp/full_loglist /tmp/keep_loglist
graphdb_1j/rs-47-a/rs-47-a.log
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T13-25-13
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T14-46-46
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T14-48-44
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T15-14-35
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T15-35-11
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T19-48-56
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-28T13-44-21

This is WRONG, it prints all the lines, not just the different ones. Ug. 

SOLUTION ONE:  comm -3

I found two solutions.  The first is to use comm, and pass a -3 option.  This prints just the different lines.  I hadn't ever heard of comm before, but it's nice:

$ comm -3 /tmp/full_loglist /tmp/keep_loglist
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T14-48-44
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T15-14-35
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T15-35-11
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T19-48-56
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-28T13-44-21

SOLUTION TWO:  sort | uniq -u

I don't care about the ordering of these things.  So, I can use the simple solution of sort and uniq -u.  The uniq command normally prints all unique lines, removing duplicates.  But, an option is -u, which only prints lines that occur once-and-only-once.

$ cat /tmp/full_loglist /tmp/keep_loglist | sort  | uniq -u
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T14-48-44
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T15-14-35
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T15-35-11
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-27T19-48-56
graphdb_1j/rs-47-a/rs-47-a.log.2014-08-28T13-44-21


SOLVED!   No >, no < signs with diff and no line number markup.


Script to implement this process, for your edification:

#!/bin/bash

cd /opt/storage
dirlist=`ls -d graphdb_[1234][ij]/rs-*`
fullLoglist="/tmp/full_loglist"
keepLoglist="/tmp/keep_loglist"
rmLoglist="/tmp/rm_loglist"

for d in $dirlist
do     
  echo "---------------------------"
  echo "dir: ${d}"
  ls ${d}/rs*log* > ${fullLoglist}
  ls -al ${fullLoglist}
  echo "full files: `cat ${fullLoglist}`"
  cat ${fullLoglist} | head -3 > $keepLoglist
  echo "keep files: `cat ${keepLoglist}`"
  cat $keepLoglist $fullLoglist | sort | uniq -u > $rmLoglist
  echo "rm   files: `cat ${rmLoglist}`"
  if [ -s $rmLoglist ]
    then
        echo "Removing files...."
        cat $rmLoglist | xargs rm -v
    else
        echo "No files to remove this time."
    fi
done


Done.








Tuesday, August 26, 2014

SOLVED: Multiprocess updates of shared dict-of-dicts

Earlier, I posted the following:

I've run into my very first bug in Python.  It's a known, existing bug, but it's still the first time I've actually encountered one spontaneously.

I should caveat that I occassionally forget things, and I've been using Python 8 years, so it could be this has happened before but I've forgotten it.

The problem is as follows:

* I'm working in multiple processes, so I'm sharing a data structure across processes.  it's a dict.  So, I do

from multiprocessing import manager
d = manager.dict()

and share the sucker around.  Then, I do:

self.ddict.setdefault(mname, {})
#self.ddict[mname][ts] = val
self.log.error("updating mname: %s w/ ts: %s, val: %s" % (mname, ts, val))
self.ddict[mname].update({ts:val})
self.log.error("ddict %s, mname at %s" % (self.ddict, self.ddict[mname]))


output: 

2014-08-26 11:20:57,221 MainThread ERROR updating mname: a.b.c.d w/ ts: 1409070057, val: 1.0
2014-08-26 11:20:57,221 MainThread ERROR ddict {'a.b.c.d': {}}, mname at {}
2014-08-26 11:20:57,222 MainThread ERROR ddict: {'a.b.c.d' {}}


According to http://bugs.python.org/issue6766, this is a known bug.  I'm using Python2.6 and cannot upgrade. 

Damnit.  Looking for a workaround.


SOLVED:

Problem is that the shared memory 'watcher' manages the data structure and shares around updates to everyone using it. This means that this manager has to know when the data struct is updated in one process and tell the other processes to pull in the updates.

If you're using a dict-of-dicts DOD, the inner dict just looks like a blob that process A changes by direct access to the object, without informing DOD of the change. So, to fix this do:

dod = Manager.dict()
process ONE: dod['a'] = { 'x' : 12 }
# process TWO sees this update of a = x:12
process TWO: dod['a']['x'] = 13
# process ONE has no idea this happened, manager doesn't bring in the update.
# to fix:
process TWO: y=dod['a'];y['x']=13;dod=y
# this informs manager that dod is different.