Monthly Archive:: May 2010

PyCassa vs Lazyboy (updated)


As Hans points out in the comment below, it appears pycassa natively supports authentication with org.apache.cassandra.auth.SimpleAuthenticator. Lazyboy on the other hand doesn’t by default.

It’s not too hard to do it though. Intuitively, we could do something like this.

NB: Untested code!! I might create a patch for this when I get the time, so this is just an outline.

# Add this to lazyboy's connection package
from cassandra.ttypes import AuthenticationRequest

Then, in lazyboy’s _connect() function, add a parameter called logins: a dict that maps each keyspace to its credentials, like the following.

# logins format
{'Keyspace1' : {'username':'myuser', 'password':'mypass'}}

def _connect(self, logins=None):
    """Connect to Cassandra if not connected."""

    client = self._get_server()
    if client.transport.isOpen() and self._recycle:
        if (client.connect_time + self._recycle) > time.time():
            return client
    elif client.transport.isOpen():
        return client

    try:
        # Assumption: the transport has to be opened before logging in
        client.transport.open()
        # Login code
        # Remember that client is an instance of Cassandra.Client(protocol)
        if logins is not None:
            for keyspace, credentials in logins.items():
                request = AuthenticationRequest(credentials=credentials)
                # Log in once per keyspace, so this call stays inside the loop
                client.login(keyspace, request)
        client.connect_time = time.time()
    except thrift.transport.TTransport.TTransportException as ex:
        raise exc.ErrorThriftMessage(
            ex.message, self._servers[self._current_server])

    return client
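
The login loop in the outline can be tried without a Cassandra server by swapping in a stand-in client. This is only a sketch: FakeClient, login_all and the second keyspace are hypothetical names made up for illustration, and a plain dict stands in for AuthenticationRequest. It shows the one behaviour that matters here, which is that login is called once per keyspace.

```python
class FakeClient:
    """Hypothetical stand-in for Cassandra.Client; it only records login calls."""

    def __init__(self):
        self.calls = []

    def login(self, keyspace, request):
        self.calls.append((keyspace, request))


def login_all(client, logins):
    """Call client.login once per keyspace; logins=None means no authentication."""
    if logins is None:
        return
    for keyspace, credentials in logins.items():
        # The real code would build AuthenticationRequest(credentials=credentials)
        request = {"credentials": credentials}
        client.login(keyspace, request)


logins = {
    "Keyspace1": {"username": "myuser", "password": "mypass"},
    "Keyspace2": {"username": "other", "password": "secret"},  # made-up entry
}
client = FakeClient()
login_all(client, logins)
print(client.calls)
```

Keeping the loop in a small helper like this would also make the eventual patch easier to test on its own.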

Original Post
I’ve been trying to work out which Python library for talking to Cassandra is currently the more fully featured.

From Reddit,

API-wise, both look like they are pretty much basic wrappers around the Cassandra Thrift bindings. I’d prefer lazyboy over pycassa though, given that firstly, it’s being used in production right now at Digg, and because it looks like lazyboy’s connection code is more featured than pycassa.


The connection code (Lazyboy) seems to be much more suited for use in production (use of auto pooling, auto load balancing, integrated failover/retry, etc.) (than PyCassa)

Thanks to GitHub, I was able to do some analysis of their traffic and commits,

Traffic Data



Commit Data



A larger number of people know about LazyBoy, but code commits on it are currently at a standstill. Pycassa, on the other hand, seems to be growing at a pretty fast rate.

It looks like LazyBoy is probably a better library to start with, for now. I’ll talk about my experiences with both in another post.

Thrush Operators in Clojure (->, ->>)

I was experimenting with some sequences today, and ran into a stumbling block: Using immutable data structures, how do you execute multiple transformations in series on an object, and return the final value?

For instance, consider a sequence of numbers,

user> (range 90 100)
(90 91 92 93 94 95 96 97 98 99)

How do you transform them so that each number is incremented by 1 and then converted to its text representation?


Imperatively speaking, you would loop over each element and update an accumulator in place, so that the last iteration leaves the desired result. Something like,

>>> s = ""
>>> a = [i for i in range(90,100)]
>>> a
[90, 91, 92, 93, 94, 95, 96, 97, 98, 99]

>>> for i in range(0,len(a)):
...   s += chr(a[i]+1)
...
>>> s
'[\\]^_`abcd'

If you knew about list comprehensions (Python’s take on map), this could be achieved with something like,

>>> ''.join([chr(i+1) for i in range(90,100)])
'[\\]^_`abcd'

The easiest way to do this in Clojure is using the excellently named Thrush operators (-> and ->>). According to the doc for ->,

Threads the expr through the forms. Inserts x as the
second item in the first form, making a list of it if it is not a
list already. If there are more forms, inserts the first form as the
second item in second form, etc.

It is used like this,

user> (->> (range 90 100) (map inc) (map char) (apply str))
"[\\]^_`abcd"

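Basically, the line (-> 7 (- 3) (- 6)) means that 7 is substituted as the first argument to -, to become (- 7 3). This result is then substituted as the first argument to the second -, to get (- 4 6), which returns -2. The ->> variant works the same way, except that it inserts the value as the last argument of each form, which is where map and apply expect their sequence.

user> (-> 7 (- 3) (- 6))
-2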
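
To make the threading rule concrete, here is a rough Python sketch of the same idea. The helpers thread_first and thread_last are made-up names, not part of any library, and they only approximate what the Clojure macros do: each form is a tuple of a function plus its extra arguments, and the running value is inserted either first or last.

```python
import operator


def thread_first(value, *forms):
    """Like ->: pass the running value as the FIRST argument of each form."""
    for form in forms:
        func, *args = form if isinstance(form, tuple) else (form,)
        value = func(value, *args)
    return value


def thread_last(value, *forms):
    """Like ->>: pass the running value as the LAST argument of each form."""
    for form in forms:
        func, *args = form if isinstance(form, tuple) else (form,)
        value = func(*args, value)
    return value


def inc(n):
    return n + 1


# (-> 7 (- 3) (- 6))  =>  (- (- 7 3) 6)
first_result = thread_first(7, (operator.sub, 3), (operator.sub, 6))

# (->> (range 90 100) (map inc) (map char) (apply str))
last_result = thread_last(range(90, 100), (map, inc), (map, chr), "".join)

print(first_result)
print(last_result)
```

The real operators are macros that rewrite the code before it runs, whereas this sketch just calls functions at runtime, but the argument placement is the same.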


Stock Crash

This is what the stock market looked like at 2pm today.

From the Reuters article,

The Dow suffered its biggest ever intraday point drop, which may have been caused by an erroneous trade entered by a person at a big Wall Street bank, multiple market sources said.

And the suspected cause? A UI glitch!

In one of the most dizzying half-hours in stock market history, the Dow plunged nearly 1,000 points before paring those losses—all apparently due to a trader error.

According to multiple sources, a trader entered a “b” for billion instead of an “m” for million in a trade possibly involving Procter & Gamble, a component in the Dow. (CNBC’s Jim Cramer noted suspicious price movement in P&G stock on air during the height of the market selloff.)

Sources tell CNBC the erroneous trade may have been made at Citigroup.

“We, along with the rest of the financial industry, are investigating to find the source of today’s market volatility,” Citigroup said in a statement. “At this point we have no evidence that Citi was involved in any erroneous transaction.”

According to a person familiar with the probe, one focus is on futures contracts tied to the Standard & Poor’s 500 stock index, known as E-mini S&P 500 futures, and in particular a two-minute window in which 16 billion of the futures were sold.

Citigroup’s total E-mini volume for the entire day was only 9 billion, suggesting that the origin of the trades was elsewhere, according to someone close to Citigroup’s own probe of the situation. The E-minis trade on the CME.

C++ Modulus Operator weirdness

It’s surprising that the modulus (%) operator in C++ wraps around as expected for positive operands, but not for negative ones. When working on some code, I expected,

-1 % 3 = 2
0 % 3 = 0
1 % 3 = 1
2 % 3 = 2

but ended up with,

-1 % 3 = -1
0 % 3 = 0
1 % 3 = 1
2 % 3 = 2

As a result, you either need to check whether the result is negative and adjust it accordingly,

result = n % 3;
if( result < 0 ) result += 3;

Or, a better solution might be to change the expression such that the negative case never arises,

{ int n = 0; int inc = -1; cout << (20 + n + inc) % 10; }  // prints 9
{ int n = 9; int inc = 1;  cout << (20 + n + inc) % 10; }  // prints 0
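
The difference between the two behaviours can be sketched in a few lines of Python. Python’s own % already rounds toward negative infinity (so -1 % 3 is 2 there), which means the sketch has to emulate the truncating remainder that C++11 and later specify. The names c_mod and wrap are hypothetical helpers for illustration only.

```python
def c_mod(a, b):
    """Remainder that truncates toward zero, so the result takes the sign of a."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def wrap(n, m):
    """Map any integer n into the range [0, m), assuming m is positive."""
    r = c_mod(n, m)
    # Same adjustment as the 'if (result < 0) result += 3;' fix above
    return r + m if r < 0 else r


truncated = [c_mod(n, 3) for n in range(-1, 3)]
wrapped = [wrap(n, 3) for n in range(-1, 3)]

print(truncated)  # [-1, 0, 1, 2]
print(wrapped)    # [2, 0, 1, 2]
```

The offset trick, adding a multiple of the modulus before taking the remainder, only works while that offset is at least as large as the most negative value the expression can reach. The explicit check in wrap has no such limit.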

Hope this helps someone out there!