Monthly Archive:: June 2010

Mutable vs Immutable datastructures – Serialization vs Performance

In my last post, I was playing around with methods to serialize Clojure data structures, especially a complex record that contains a number of other records and refs. Chas Emerick and others mentioned in the comments there, that putting a ref inside a record is probably a bad idea – and I agree in principle. But this brings me to a dilemma.

Lets assume I have a complex record that contains a number of "sub" records that need to be modified during a program's execution time. One scenario this could happen in is a record called "Table", that contains a "Row" which is updated (Think database tables and rows). Now this can be implemented in two ways,

  • Mutable data structures – In this case, I would put each row inside a table as a ref, and when the need to update happens, just fine the row ID and use a dosync – alter to do any modifications needed.

    • The advantage is that all data is being written to in place, and would be rather efficient.
    • The disadvantage however, is that when serializing such a record full of refs, I would have to build a function that would traverse the entire data structure and then serialize each ref by dereferencing it and then writing to a file. Similarly, I'd have to reconstruct the data structure when de-serializing from a file.

 
{:filename "tab1name",
 :tuples
 #},
      :tup #}
     {:recordid nil,
      :tupdesc
      {:x
       #},
      :tup #}}>,
 :tupledesc
 {:x
  #}}

	

  • Immutable data structures – This case involves putting a ref around the entire table data structure, implying that all data within the table would remain immutable. In order to update any row within the table, any function would return a new copy of the table data structure with the only change being the modification. This could then overwrite the existing in-memory data structure, and then be propagated to the disk as and when changes are committed.

    • The advantage here is that having just one ref makes it very simple to serialize – simply de-ref the table, and then write the entire thing to a binary file.
    • The disadvantage here is that each row change would make it necessary to return a new "table", and writing just the "diff" of the data to disk would be hard to do.

 
#

So at this point, which method would you recommend?

Serializing Clojure Datastructures

I’ve been trying to figure out how best to serialize data structures in Clojure, and discovered a couple of methods to do so. (Main reference thanks to a thread on the Clojure Google Group here )

(def box {:a 1 :b 2})

(defn serialize [o filename]
  (with-open [outp (-> (File. filename) java.io.FileOutputStream. java.io.ObjectOutputStream.)]
    (.writeObject outp o)))

(defn deserialize [filename]
  (with-open [inp (-> (File. filename) java.io.FileInputStream. java.io.ObjectInputStream.)]
    (.readObject inp)))

(serialize box "/tmp/ob1.dat")
(deserialize "/tmp/ob1.dat")

This works well for any Clojure data structure that is serializable. However, my objective is slightly more intricate – I’d like to serialize records that are actually refs. I see a few options for this,

– Either use a method that puts a record into a ref, rather than a ref into a record and then use the serializable, top level map
– Write my own serializer to print this to a file using clojure+read
– Use Java serialization functions directly.

Thoughts?

Ode to an Orange

A whiff of citrus – vibrant,
shiny, dimpled and thick,
your fingers move, probing
textural ecstacy,
as your tastes await
the sweet tartness within.
Peel away the layers
softly, envelop a piece,
let your tongue steep
in a myriad of flavors,
with the lingering scent
of summer under a blue sky,
look around,
and all is well again.

Stack implementation in Clojure II – A functional approach

My last post on the topic was creating a stack implementation using Clojure protocols and records – except, it used atoms internally and wasn’t inherently “functional”.

Here’s my take on a new implementation that builds on the existing protocol and internally, always returns a new stack keeping the original one unmodified. Comments welcome!

(ns viksit-stack
  (:refer-clojure :exclude [pop]))

(defprotocol PStack
  "A stack protocol"
  (push [this val] "Push element in")
  (pop [this] "Pop element from stack")
  (top [this] "Get top element from stack"))

; A functional stack record that uses immutable semantics
; It returns a copy of the datastructure while ensuring the original
; is not affected.
(defrecord FStack [coll]
  PStack
  (push [_ val]
	"Return the stack with the new element inserted"
	(FStack. (conj coll val)))
  (pop [_]
       "Return the stack without the top element"
	 (FStack. (rest coll)))
  (top [_]
       "Return the top value of the stack"
       (first coll)))

; The funtional stack can be used in conjunction with a ref or atom

viksit-stack> (def s2 (atom (FStack. '())))
#'viksit-stack/s2
viksit-stack> s2
#
viksit-stack> (swap! s2 push 10)
#:viksit-stack.FStack{:coll (10)}
viksit-stack> (swap! s2 push 20)
#:viksit-stack.FStack{:coll (20 10)}
viksit-stack> (swap! s2 pop)
#:viksit-stack.FStack{:coll (10)}
viksit-stack> (top @s2)
10

Resolving Chrome’s SSL Error

I recently started getting a number of SSL related errors on accessing https links with Google Chrome on Ubuntu. One looks like,

107 (net::ERR_SSL_PROTOCOL_ERROR)

The top link on Google’s search results is pretty fuzzy, so here’s the solution that works for me.

Go to Settings -> Options -> Under the hood, and enable both SSL 2.0 and SSL 3.0. This should allow Chrome to talk to the server with either protocol.

There’s also a DEFLATE bug that got fixed to solve this issue in release 340 something. http://codereview.chromium.org/1585041

Stack implementation in Clojure using Protocols and Records

I was trying to experiment with Clojure Protocols and Records recently, and came up with a toy example to clarify my understanding of their usage in the context of developing a simple Stack Abstract Data Type.

For an excellent tutorial on utilizing protocols and records in Clojure btw – check out Kotka.de – Memoize done right .


;; Stack example abstract data type using Clojure protocols and records
;; viksit at gmail dot com
;; 2010

(ns viksit.stack
  (:refer-clojure :exclude [pop]))

(defprotocol PStack
  "A stack protocol"
  (push [this val] "Push element into the stack")
  (pop [this] "Pop element from stack")
  (top [this] "Get top element from stack"))

(defrecord Stack [coll]
  PStack
  (push [_ val]
	(swap! coll conj val))
  (pop [_]
       (let [ret (first @coll)]
	 (swap! coll rest)
	 ret))
  (top [_]
       (first @coll)))

;; Testing
stack> (def s (Stack. (atom '())))
#'stack/s
stack> (push s 10)
(10)
stack> (push s 20)
(20 10)
stack> (top s)
20
stack> s
#:stack.Stack{:coll #}
stack> (pop s)
20

More tutorial links on Protocols,

[1] http://blog.higher-order.net/2010/05/05/circuitbreaker-clojure-1-2/
[2] http://freegeek.in/blog/2010/05/clojure-protocols-datatypes-a-sneak-peek/
[3] http://groups.google.com/group/clojure/browse_thread/thread/b8620db0b7424712