Reorganizing data with list & dict comprehensions

While writing scripts, I frequently need to rearrange sets of data into a more "process-friendly" format. A common case is turning a list (array) into a dictionary (associative array) or vice versa. More often than not, I need to access list elements by a key, but since they aren't set up in a dictionary, I have to pull out a looping technique to reorganize the data first.

Take the following set of data for example:

[[1, 'John Smith', 'admin'],
 [2, 'Jane Doe', 'superuser'],
 [3, 'Sam Jones', 'user']]

What we have here is a few rows of user data. In this example, the data is in a Python list (which in PHP would be an array).

In PHP, this would look something like (using print_r):

Array (
    [0] => Array (
        [0] => 1
        [1] => John Smith
        [2] => admin
    )
    [1] => Array (
        [0] => 2
        [1] => Jane Doe
# (...etc...)

So, what if I found myself writing some code that needed to be able to access each record by its first value, which in this case would be the user_id?

Well, in PHP the easiest way to make this happen would be a good old-fashioned foreach loop:

foreach ($row as $val) {
    $out[$val[0]] = $val;
}

Not very "sexy" for something I find myself having to do more often than I would like, but it works nonetheless.

Now, here's where the beauty of Python kicks in. Python has a built-in feature called "comprehensions", which lets you modify and/or convert sets of data quickly and easily.

Remember what our original data set looks like? Here's a refresher:

>>> print row
[[1, 'John Smith', 'admin'],
 [2, 'Jane Doe', 'superuser'],
 [3, 'Sam Jones', 'user']]

Now, in Python all we need to do is use a dictionary comprehension:

>>> out = {val[0]: val for val in row} # Python 2.7+ and 3.X
# (or, for older Python 2.X)
>>> out = dict((val[0], val) for val in row) # Python 2.4+

>>> print out
{1: [1, 'John Smith', 'admin'],
 2: [2, 'Jane Doe', 'superuser'],
 3: [3, 'Sam Jones', 'user']}

See how easy that was?
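For anyone following along on Python 3, the same transformation can be sketched as a small standalone script. The names row and out follow the session above; role_by_name, and the idea of keying on the name column, are illustrative additions of mine rather than part of the original example.

```python
# Rows of user data: [user_id, name, role]
row = [
    [1, 'John Smith', 'admin'],
    [2, 'Jane Doe', 'superuser'],
    [3, 'Sam Jones', 'user'],
]

# Key each record by its first value (the user_id).
out = {val[0]: val for val in row}

# Illustrative addition: key by name instead, keeping only the role.
# Unpacking each row in the for clause avoids index lookups.
role_by_name = {name: role for _user_id, name, role in row}

print(out[2])                     # [2, 'Jane Doe', 'superuser']
print(role_by_name['Sam Jones'])  # user
```

The key expression can be anything hashable, so the same one-liner works for whichever column you need to look records up by.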

And what if I had a dictionary in the above format and needed to turn it back into the type of list I had previously? Well, I could simply use a list comprehension to regenerate this set of data as a list:

>>> print [val for val in out.values()]
[[1, 'John Smith', 'admin'],
 [2, 'Jane Doe', 'superuser'],
 [3, 'Sam Jones', 'user']]
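One caveat worth sketching: before Python 3.7, dictionaries did not promise any particular ordering, so the rebuilt list is not guaranteed to come back in the original order. Sorting by key makes the round trip deterministic. The snippet below is a minimal Python 3 sketch using the example data from this post, deliberately written out of order; the names unordered and rows are my own.

```python
# The dictionary from above, deliberately written out of order.
out = {
    3: [3, 'Sam Jones', 'user'],
    1: [1, 'John Smith', 'admin'],
    2: [2, 'Jane Doe', 'superuser'],
}

# Same result as [val for val in out.values()], in the dict's own order.
unordered = list(out.values())

# Rebuild the list ordered by user_id, regardless of dict ordering.
rows = [out[key] for key in sorted(out)]

print(rows)
```

If the order of the records doesn't matter to you, list(out.values()) is the shortest spelling.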

This is an extremely simple comprehension example. For a deeper look, check out Wikipedia's Python syntax page or the official Python documentation.


The great web technology shootout - Round 1: A quick glance at the landscape

A lot of the information below is out of date. Please see the new framework shootout page for the latest benchmarks.

Recently I went on a benchmarking spree and decided to throw ApacheBench at a bunch of the different web development technology platforms I interact with on a day-to-day basis. The results were interesting enough to me that I decided I'd take a post to share them here.

Disclaimer: The following test results should be taken with a *massive* grain of salt. If you know anything about benchmarking, you will know that the slightest adjustments have the potential to change things drastically. While I have tried to perform each test as fairly and accurately as possible, it would be foolish to consider these results as scientific in any way. It should also be noted that my goal here was not to see how fast each technology performs at its most optimized configuration, but rather what a minimal out-of-the-box experience looks like.

Test platform info:

  • The hardware was an Intel Core2Quad Q9300, 2.5GHz, 6MB Cache, 1333MHz FSB, 2GB DDR RAM.
  • The OS was CentOS v5.3 32-bit with a standard Apache Webserver setup.
  • ApacheBench was used with only the -n and -c flags (1000 requests for the PHP frameworks, 5000 requests for everything else).
  • Each ApacheBench test was run 5-10 times, with the "optimum average" chosen as the numbers represented here.
  • The PHP tests were done using the standard Apache PHP module.
  • The mod_wsgi tests were done in daemon mode ...

Why I've fallen in love with Python

Now that I'm using Python for a large percentage of my development, I thought it would be fun to highlight a few reasons why Python has become my new language of choice.

In an effort to help you understand where I'm coming from, let me briefly rehash some of my programming history: I spent much of the '90s doing dynamic web development using Perl (weren't those the days). I eventually migrated to PHP, which usually made things much easier on the web, and subsequently replaced most of my console scripting with Bash shell scripts. However, I'm kind of a hack and love languages, so I have occasionally been known to write something in C, and although I'm not a complete stranger to Java and Ruby, I never really felt like I "clicked" with either of those languages.

Ok, now that I've hopefully convinced you that I'm not just a fly-by-night programmer, let me show you some Python code. Brace yourself, as this article is bound to get lengthy...

Reason #1: "Whitespace done right" is actually a good thing

The first thing that people either absolutely love or adamantly hate about Python is the fact that its syntax is heavily tied to proper usage of whitespace. At first glance, this causes many curious onlookers from other languages to shy away from Python and continue in their brace-encapsulated bondage. I'll admit, at first I wasn't too wild about these new restrictions either ...