How Does Safari 5's JavaScript Performance Stack Up?

Apple has made some strong claims about the performance of Safari 5 in comparison to not just Safari 4, but to all other major browsers. This piqued my curiosity as to the validity of the claim. To test, I used Mozilla's Dromaeo, WebKit's SunSpider and Google's V8 benchmark suite. All tests were done on an otherwise idle 2.8 GHz Intel Core i7 iMac with 8 GB of RAM running Mac OS X 10.6.3.

The result? It depends on who you ask. Chrome beat out Safari 5 in 2 out of 3 of the tests. The one test Safari won was SunSpider, which was developed by the WebKit team and seems like a fair test for comparing Chrome and Safari. For all of the boasting the Opera guys do about Opera's performance, I found it to be in the middle of the pack. Firefox is clearly the last horse in the race.

For the purpose of what I was interested in, I'll be highlighting only the "total" scores or timings from each benchmark.

SunSpider

I ran the latest version of the SunSpider benchmark. Safari 5 was 1.4 times faster than Safari 4, 1.27 times faster than Opera and only 1.15 times faster than Chrome. Firefox was the real loser in performance: going by the timings listed below, Safari 5 was 2.96 times faster than Firefox 3.6.3.

V8 Benchmark Suite

I would expect that Chrome would shine in Google's own benchmark suite and indeed it did. Chrome was 1.46 times faster than Opera and 1.61 times faster than Safari 5. It was 2.29 times faster than Safari 4. At 11.78 times faster, it outpaced Firefox handily.

Dromaeo

For the Dromaeo benchmark, I used the recommended tests option. This appears to be the most comprehensive of all of the tests, or at least it took the longest of all the benchmarks. I didn't test Safari 4 with Dromaeo, but given that V8 tested Safari 5 as 1.42 times faster than Safari 4 and SunSpider said Safari 5 was 1.4 times faster than Safari 4, I don't feel too bad about omitting it. Okay I feel a little bad and wish I had remembered about Dromaeo before upgrading to Safari 5.

Dromaeo had Chrome winning out over Opera at 2.22 times faster. In addition Chrome was 6.67 times faster than Firefox and 2.27 times faster than Safari 5.

 

Benchmark Data

These numbers represent the overall performance metric for each test.

  • Chrome 5.0.375.70
    • Dromaeo: 903.35 runs/s
    • SunSpider: 265.4ms +/- 6.6%
    • V8 Benchmark Suite: 6,966
  • Firefox 3.6.3
    • Dromaeo: 135.42 runs/s
    • SunSpider: 678.2ms +/- 1.3%
    • V8 Benchmark Suite: 591
  • Opera 10.53
    • Dromaeo: 405.27 runs/s
    • SunSpider: 290.4ms +/- 2.5%
    • V8 Benchmark Suite: 4,755
  • Safari 4.0.5
    • SunSpider: 319.6ms +/- 3.2%
    • V8 Benchmark Suite: 3,041
  • Safari 5.0
    • Dromaeo: 397.35 runs/s
    • SunSpider: 228.8ms +/- 2.4%
    • V8 Benchmark Suite: 4,325
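
For anyone who wants to check the "times faster" figures, here is a minimal sketch in Python that recomputes them from the totals listed above. The dictionary layout and the times_faster() helper are illustrative assumptions, not part of any benchmark suite. SunSpider reports a time, so lower is better, while Dromaeo and the V8 suite report scores where higher is better. The results are rounded to two decimals, so the last digit can differ slightly from the figures quoted in the text.

```python
# Totals copied from the Benchmark Data list above.
RESULTS = {
    "Chrome 5.0.375.70": {"dromaeo": 903.35, "sunspider": 265.4, "v8": 6966},
    "Firefox 3.6.3": {"dromaeo": 135.42, "sunspider": 678.2, "v8": 591},
    "Opera 10.53": {"dromaeo": 405.27, "sunspider": 290.4, "v8": 4755},
    "Safari 4.0.5": {"sunspider": 319.6, "v8": 3041},
    "Safari 5.0": {"dromaeo": 397.35, "sunspider": 228.8, "v8": 4325},
}

# SunSpider is a timing in milliseconds; the others are scores.
LOWER_IS_BETTER = {"sunspider"}


def times_faster(winner, loser, benchmark):
    """Return how many times faster `winner` is than `loser`."""
    a = RESULTS[winner][benchmark]
    b = RESULTS[loser][benchmark]
    # Timings: slower time / faster time. Scores: higher / lower.
    ratio = b / a if benchmark in LOWER_IS_BETTER else a / b
    return round(ratio, 2)


if __name__ == "__main__":
    for browser in sorted(RESULTS):
        if browser != "Safari 5.0":
            ratio = times_faster("Safari 5.0", browser, "sunspider")
            print("SunSpider: Safari 5.0 vs %s: %.2fx" % (browser, ratio))
```

A ratio below 1.0 means the first browser was actually slower on that benchmark.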

Benchmark Links

Dromaeo Results

Sunspider Results

Filed under  //

Comments [5]

Countdown to pgCon

As you may be aware, pgCon 2010, the PostgreSQL conference, is just a few months away.  It is an important year for pgCon with the upcoming release of PostgreSQL 9.0.  pgCon's roots are in the PostgreSQL Anniversary Summit, where hackers from all over the world flew in to celebrate PostgreSQL's 10 years as open source.  pgCon has kept with the tradition of catering to the core of the PostgreSQL community, hackers and DBAs alike.

As the largest PostgreSQL conference, pgCon can always be expected to showcase new features and the latest thinking on PostgreSQL performance and scaling, not to mention the many talks aimed at all levels of PostgreSQL professionals and enthusiasts.  To get ready for pgCon this year, I thought I might point out the talks not to miss.

One of the biggest talks will have to be Heikki Linnakangas's introduction of the new built-in replication features in PostgreSQL, Hot Standby and Streaming Replication.  If you're not familiar with these features, they are truly game changing for Postgres.  Building on the conceptual foundation of Warm Standby and Point-in-Time Recovery log shipping, Hot Standby and Streaming Replication turn idle Warm Standby boxes into active read-only slaves.  Married with an HA solution, Hot Standby becomes an immediate failover solution with the ability to disconnect from the Hot Standby master server and turn on writes.  In an effort to speed up the whole log-shipping paradigm, the streaming replication feature means no more external commands for copying log segments across the wire to Warm Standby servers.  By combining these two features, one of the most sought-after features for PostgreSQL, native replication, becomes a reality.  Heikki brings to this talk an intimate knowledge of the implementation of these two new features, making this talk a must-see.

If you haven't been to pgCon before and are a PostgreSQL user, DBA or enthusiast like myself, this is the year not to miss.  As for the other significant talks for pgCon 2010, stay tuned as I will highlight my must-see list of talks in the coming weeks.


Moving from Tumblr

I decided to move all of my blogging from my two Tumblr blogs to Posterous.  While I did need to massage the content of my latest blog post on Tornado a bit, I think that's more of a me thing than a Posterous thing.  What's strange is that I put in a meta refresh to auto-redirect from Tumblr to Posterous, but it didn't seem to like that very much.  I think I'll need to write some JavaScript to do the trick.


Tornado Tip: Variables & Functions in Tornado Templates

Tornado has a very fast and flexible template system that is reminiscent of other template systems, such as that found in web.py.  I have found that in practical use of the template system, the documentation is thin in some areas.  A good example is in what variables and functions are exposed to the templates by default.

To that end, here is a list of the variables and functions available to a template as defined in both template.py and web.py:

  • _ (underscore)
    An alias for the locale.translate() function.
  • current_user
    The current_user object as returned by RequestHandler.get_current_user().
  • datetime
    The datetime module.   Example: {{ datetime.date.today().year }} returns the current year.
  • escape
    A function that escapes a string so it is valid within XML or XHTML.
  • handler
    The request handler that called the self.render function to process the template.  Example: {{ handler.static_url('foo') }} runs the static_url function in RequestHandler, returning the full URL path for foo with the static URL prefix prepended.
  • json_encode
    A function that JSON-encodes the given Python object.
  • locale
    The locale value as returned by RequestHandler.get_user_locale().
  • request
    The request object that is passed into the Request Handler.
  • reverse_url 
    Given a full module and class (myapp.Homepage) it will look at the handler to URI mapping and return the URL for the given class.
  • squeeze
    A function that replaces all sequences of whitespace chars with a single space.
  • static_url
    The value of the static_url property passed in to the application settings.
  • url_escape
    A function that returns a valid URL-encoded version of the given value.
  • xsrf_form_html
    If using the cross-site request forgery protection, this function returns the hidden input field containing the xsrf variable.
This is the current list as of today's master branch on GitHub.  If you're using 0.2, there are a few functions that are not in that version.  Did I miss one or get something wrong?  Please let me know.


Web Application Development with Tornado

I recently finished a rewrite of privatepaste.com in Python using Tornado as the web framework. There are multiple reasons why I decided to use Tornado instead of something like Django, CherryPy or web.py, all of which I’ve previously used.  One of the main reasons for my switch to Tornado was its feature-rich yet lightweight nature. In addition, the benchmarks and asynchronous HTTP server were intriguing.

Tornado is distributed as a set of loosely coupled Python modules. It’s up to the developer to decide which aspects of Tornado to use. It’s also up to the developer to write the core application, which is responsible for running the web application as a daemon. To create a base-level application, all that is required is tornado.httpserver and tornado.web. If you’ve ever programmed with web.py, many of the conventions should be familiar to you.

The base principle in writing an application is to map a URI to a class. In that class you provide functions for the HTTP methods you intend to support. Because Tornado at its core is an HTTP server, you must implement every HTTP method you intend to support for a URI. For example, if you are writing a CMS, you would not only implement the GET function, but you’d also want to implement a HEAD function so that browsers can fetch cache information for your content.
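
The pattern can be sketched without Tornado at all. The following is a framework-free illustration of mapping URI patterns to classes with one function per HTTP method. The Handler, PageHandler, ROUTES and dispatch names are hypothetical and exist only for this example. In Tornado itself the corresponding pieces are tornado.web.Application and tornado.web.RequestHandler.

```python
import re

SUPPORTED_METHODS = ("GET", "HEAD", "POST")


class Handler(object):
    """Hypothetical base class: every HTTP method answers 405 by default."""

    def get(self, *args):
        return 405, "Method Not Allowed"

    def head(self, *args):
        return 405, "Method Not Allowed"

    def post(self, *args):
        return 405, "Method Not Allowed"


class PageHandler(Handler):
    """A CMS page that supports GET and HEAD but not POST."""

    def get(self, slug):
        return 200, "<h1>%s</h1>" % slug

    def head(self, slug):
        # Same status as GET, but no body is returned.
        return 200, ""


# URI pattern -> handler class
ROUTES = [(re.compile(r"^/page/([a-z0-9-]+)$"), PageHandler)]


def dispatch(method, path):
    """Find the class mapped to `path` and call its `method` function."""
    if method.upper() not in SUPPORTED_METHODS:
        return 405, "Method Not Allowed"
    for pattern, handler_class in ROUTES:
        match = pattern.match(path)
        if match:
            handler = handler_class()
            return getattr(handler, method.lower())(*match.groups())
    return 404, "Not Found"
```

Calling dispatch("POST", "/page/about") falls through to the base class and returns a 405, which is the point: a method you do not implement for a URI is a method that URI does not support.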

Being loosely coupled, Tornado leaves it up to the developer to implement everything from session handling and authentication to localization and the data layer.  Some may consider this an issue, but not to worry, Tornado includes modules to help.  There are authentication mix-ins for Google, Twitter, Facebook and FriendFeed.  To achieve authentication with Tornado, you would extend the tornado.web.RequestHandler class and override get_current_user() to handle the authentication functionality.

Localization is handled in a similar fashion.  While there is some magic under the covers, it generally leaves localization implementation up to the developer.  By extending get_user_locale(), the developer returns a locale object which has been initialized with the appropriate language.  As with other modules, the meat of the documentation is in the locale and web classes.

Documentation is one of the key drawbacks of using Tornado.  If you cannot dive into other people’s code to find what you need, you’re going to have a difficult time with Tornado.  Much of the initial time that I spent with Tornado was in the Tornado code itself, figuring out how to access different parts of the data within the HTTP request, the templating system and the application class.  The documentation provided is deceptively simple, and indeed, for a Hello World application the documentation is sufficient and accurate.  It’s when you’re knee deep in code that you’ll find yourself having to go beyond the documentation to get what you need.

The template engine is full-featured and has yet to leave me wanting.  While there has been some back and forth on the mailing lists about the speed of the template engine, it has proven important to turn off debug mode when comparing template engines.  I have found the template engine to be very fast, even when extending other templates and including modules.

The biggest hurdle, which isn’t uncommon in any web application, is right-sizing processes to serve your application.  Because your tornado app runs as a stand-alone HTTP server that is directly coupled to your application classes, you need to run multiple processes to serve multiple requests.  Like FriendFeed, I use Tornado behind a web server using a reverse proxy module.  However, instead of Nginx, I am using Cherokee.  I use Python’s multiprocessing module to spawn multiple HTTPServer instances with my application.  Each instance has its own port number and the reverse proxy server uses these backends in a round-robin pool to provision requests.

When coming from a CGI-based backend, you have to think a bit more about how you size your backends.  Because your web server front-end can’t spawn new backends on demand, you’ll need to make sure that you have enough backends to provision your maximum number of simultaneous requests.  There are changes in the master branch of Tornado on GitHub to make Tornado fork on its own, spawning a process per CPU core, but this will not change the scaling concern, as the same principles apply.
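
As a rough illustration of the sizing question, here is a small sketch in plain Python. It is not Cherokee or Tornado configuration. The port numbers, traffic figures and function names are made up for the example, and the estimate assumes each backend handles one blocking request at a time, so the number of backends needed is roughly the number of requests in flight at peak (arrival rate multiplied by time per request).

```python
import itertools
import math

# Hypothetical backend ports, one per Tornado process.
BACKEND_PORTS = [8000, 8001, 8002, 8003]


def backends_needed(peak_requests_per_second, seconds_per_request):
    """Estimate requests in flight at peak, and so blocking backends needed."""
    return int(math.ceil(peak_requests_per_second * seconds_per_request))


def assign(requests, ports):
    """Pair each incoming request with the next backend, round-robin."""
    pool = itertools.cycle(ports)
    return [(request, next(pool)) for request in requests]


if __name__ == "__main__":
    # 100 requests/s at 0.25 s each means about 25 requests in flight.
    print(backends_needed(100, 0.25))
    for request, port in assign(["req-%d" % i for i in range(6)], BACKEND_PORTS):
        print("%s -> 127.0.0.1:%d" % (request, port))
```

With four backends in the pool, the fifth request lands back on the first port. If that backend is still busy with a blocking call, the request waits, which is why the pool has to be sized for peak concurrency rather than average load.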

Asynchronous request handling is one of the more often touted features of Tornado.  It’s important to understand exactly how async requests fit into your application development model.  Because each Tornado backend is single-threaded, it is important to think about the blocking areas of your application, such as database calls, to determine if you can benefit from the async functionality.  To truly benefit from the async server, you’ll need to use a fully async model for any type of operation that would normally be blocking.
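
To make the cost of a blocking call concrete, here is a small sketch that uses Python’s standard asyncio module rather than Tornado’s own IOLoop. The single-threaded event loop principle is the same, but none of this is Tornado code, and the handler names and sleep durations are invented for the example. time.sleep() stands in for a blocking database call and asyncio.sleep() stands in for a non-blocking one.

```python
import asyncio
import time


async def blocking_handler():
    # A blocking call holds the only thread, so no other handler can
    # make progress until it returns.
    time.sleep(0.1)


async def non_blocking_handler():
    # Awaiting yields control to the event loop while "waiting on I/O".
    await asyncio.sleep(0.1)


async def serve(handler, requests=5):
    """Run `requests` handlers concurrently and return elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(handler() for _ in range(requests)))
    return time.perf_counter() - start


def compare():
    """Return (blocking_seconds, non_blocking_seconds) for five requests."""
    blocking = asyncio.run(serve(blocking_handler))
    non_blocking = asyncio.run(serve(non_blocking_handler))
    return blocking, non_blocking


if __name__ == "__main__":
    print("blocking: %.2fs, non-blocking: %.2fs" % compare())
```

The five blocking handlers run back to back, while the five non-blocking handlers overlap. A single blocking call anywhere in the request path puts you back in the first case, which is why a partially async application sees little benefit.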

An example where the async functionality shines is the authentication mix-ins for Google, Facebook, FriendFeed and Twitter.   When you use these mix-ins, you specify a callback function to call once the HTTPClient class has returned a result from your call.

Because I use psycopg2, a blocking PostgreSQL driver, for my database access, I generally do not use the async functionality.  For me this is not an issue, as in full-featured applications I still see requests complete in as little as 1ms from start to finish.  Of course, your own application has as much impact on performance as Tornado itself, if not more.

If you’re just getting started with Tornado, be sure to check out the demo code.  If you’re looking for a little more structure in getting started, check out Tinman, a meta-framework on top of Tornado.

Edit: Changed to reflect a misstatement about the new forking changes coming up for 0.3.


The Attention Deficit Disorder Guide to RabbitMQ

RabbitMQ has been one of my interests of late, as I’ve identified it as part of our technology path at work. There are other very good resources that dive pretty deep into RabbitMQ and how to use it. The goal of this guide is to help you get on your feet quickly and easily. It assumes a couple of things:

  • You already know about message queues and have some experience or knowledge on the subject.
  • You know what AMQP is.
  • You are already interested in RabbitMQ enough to try it out.

If you’re good on those things, let’s get started…

RabbitMQ is written in Erlang. As such, you should have already downloaded and installed Erlang as a first step.

Download RabbitMQ and install it, which is pretty easy.  I like to set up RabbitMQ in an /opt/rabbitmq directory. To do that, I set some environment variables before compiling (bash assumed):

export TARGET_DIR=/opt/rabbitmq
export SBIN_DIR=/opt/rabbitmq/sbin
export MAN_DIR=/opt/rabbitmq/man

Then I compile and install with “make install.” Because I like to run as my own user or a service user, I’ll chown -R myuser /opt/rabbitmq as appropriate.

There are a few other things we need to do, including making the log directory and the directory RabbitMQ will use to store its data:

mkdir /var/log/rabbitmq
chown myuser /var/log/rabbitmq
mkdir /var/lib/rabbitmq
chown myuser /var/lib/rabbitmq

Now as “myuser” we can “cd /opt/rabbitmq/sbin” and run “./rabbitmq-server” and what you should see is:

RabbitMQ 1.6.0 (AMQP 8-0)
Copyright (C) 2007-2009 LShift Ltd., Cohesive Financial Technologies LLC., and Rabbit Technologies Ltd.
Licensed under the MPL. See http://www.rabbitmq.com/

node  : rabbit@binti
log  : /var/log/rabbitmq/rabbit.log
sasl log  : /var/log/rabbitmq/rabbit-sasl.log
database dir: /var/lib/rabbitmq/mnesia/rabbit

starting database …done
starting core processes …done
starting recovery …done
starting persister …done
starting guid generator …done
starting builtin applications …done
starting TCP listeners …done

If you have the hang of starting RabbitMQ and now want to run it in the background, instead do: “./rabbitmq-server -detached”.

Once we’ve gotten this far, we’ve got our broker up and running and now we’ll need some way to talk to it. For the purposes of this article, I’m going to talk about amqplib and Python. There are AMQP libraries for just about every relevant language at this point. RabbitMQ 1.6.0 implements the AMQP 0.8 standard. The easiest way to install amqplib is a simple “easy_install amqplib”.

But before we dive into code, there are a few key concepts we need to talk about:

Queues: You should get these already: one puts a message in a queue and a consumer app receives it somewhere else.

Exchanges: These are a little more tricky than queues. I like to think of them as namespaces.  One of the keen things about RabbitMQ exchanges is that different exchanges will get a different Erlang process, which should help make better use of your available hardware resources. There are three types of exchanges that we need to talk about:

Direct: a direct exchange means that when you put a message in, it goes to one consumer, and that consumer is the only one that will get that message routed through the exchange.

Fanout: a fanout exchange sends your message to every consumer that is listening to a particular exchange / queue combination.

Topic: this type of exchange allows you to do neat things like listen to the same queue across exchanges on one consumer, multiple queues in one namespace in a consumer and other wildcard type trickery.

Bindings: In RabbitMQ you bind your exchanges and queues together in unique combinations which determine how messages are routed to what consumers.
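
To make the three exchange types and bindings a little more concrete, here is a toy model in plain Python. It is not RabbitMQ or amqplib code, and the Exchange class and topic_matches() function are invented for this illustration. It follows the standard AMQP routing rules as I understand them: an exchange copies a message to queues (which consumers then read from), a direct exchange matches the routing key exactly, a fanout exchange ignores it, and a topic exchange treats keys as dot-separated words where * matches exactly one word and # matches zero or more.

```python
def topic_matches(pattern, routing_key):
    """Match an AMQP-style topic pattern against a routing key."""
    def match(p, k):
        if not p:
            return not k
        if p[0] == "#":
            # '#' may swallow zero words, or one word and then try again.
            return match(p[1:], k) or (bool(k) and match(p, k[1:]))
        if k and p[0] in ("*", k[0]):
            return match(p[1:], k[1:])
        return False
    return match(pattern.split("."), routing_key.split("."))


class Exchange(object):
    """Toy exchange that only works out which queues get a message."""

    def __init__(self, kind):
        self.kind = kind      # 'direct', 'fanout' or 'topic'
        self.bindings = []    # (binding_key, queue_name) pairs

    def bind(self, queue_name, binding_key=""):
        self.bindings.append((binding_key, queue_name))

    def route(self, routing_key):
        """Return the sorted names of the queues that would get a copy."""
        if self.kind == "fanout":
            matched = [queue for _, queue in self.bindings]
        elif self.kind == "direct":
            matched = [queue for key, queue in self.bindings
                       if key == routing_key]
        else:
            matched = [queue for key, queue in self.bindings
                       if topic_matches(key, routing_key)]
        return sorted(set(matched))


if __name__ == "__main__":
    topic = Exchange("topic")
    topic.bind("everything", "test.#")
    topic.bind("one_level", "test.*")
    print(topic.route("test.messages"))
    print(topic.route("test.messages.urgent"))
```

In this model, a queue bound with test.# sees every message whose key starts with test, while a queue bound with test.* only sees keys that have exactly one more word.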

Memory: As of RabbitMQ 1.6.0 all messages are kept in memory. If you have nothing consuming your messages and you send too many of them, you’ll run out of memory.

Monitoring: The main install has the app rabbitmqctl, which you can use to inspect the various parts of RabbitMQ. This isn’t very good for remote monitoring or visualization. For that there’s a great project called Alice, which is also Erlang based.

Speed: There are two ways to get messages from RabbitMQ: basic_get and basic_consume.

basic_get is where your app, on a message by message basis, asks RabbitMQ for a message. This is the slower of the two methods and will not allow single consumer applications to scale to a very high transaction rate.  Note that RabbitMQ will not register these connections as a consumer and you will not see them in list_queues or in Alice as such.

basic_consume is where your app registers itself with RabbitMQ as a consumer and RabbitMQ will send messages to you as fast as you’re able to consume them.

Durability: If you want to have the definitions of your queues and exchanges hang around if you have to restart RabbitMQ you need to define them as durable.

Auto-Delete: If you want your queues and exchanges to exist even when there are no consumers waiting for messages on them, you need to turn auto-delete off.

Persistence: If you do not tell RabbitMQ that you want it to hang on to your messages if it reboots, it will not do so. You must set the delivery mode of a message to “2” to tell it to persist it until it is consumed.

Auto-Ack: You can tell RabbitMQ to automatically acknowledge receipt of a message, or you can do it yourself. This is a boolean setting that you use when you’re consuming messages via basic_get or basic_consume.

Queue and Exchange definitions: By default, queues and exchanges do not exist until you connect a consumer to them. You can cheat and do this in your code that enqueues your messages.

Now that we have that out of the way, here’s some sample Consumer code:

#!/usr/bin/env python
""" Sample Consumer Code """

import amqplib.client_0_8 as amqp


# This is the function that basic_consume will send messages to
def process_message( message ):
    """ Callback function used by channel.basic_consume """
    print('Received: %s' % message.body)


# Rabbit Server to connect to
host = '127.0.0.1'
port = 5672

# Exchange and queue information
exchange_name = 'test'
exchange_type = 'direct'
queue_name = 'messages'
routing_key = 'test.messages'

# The main loop runs while this is True; cancel_processing() clears it
process_messages = True

# Connect to Rabbit
connection = amqp.Connection( host = '%s:%s' % ( host, port ),
                              userid = 'guest',
                              password = 'guest',
                              ssl = False,
                              virtual_host = '/' )

# Create a channel to talk to Rabbit on
channel = connection.channel()

# Create our exchange
channel.exchange_declare( exchange = exchange_name,
                          type = exchange_type,
                          durable = True,
                          auto_delete = False )

# Create our Queue
channel.queue_declare( queue = queue_name,
                       durable = True,
                       exclusive = False,
                       auto_delete = True )

# Bind to the Queue / Exchange
channel.queue_bind( queue = queue_name,
                    exchange = exchange_name,
                    routing_key = routing_key )

# Let AMQP know to send us messages
consumer_tag = channel.basic_consume( queue = queue_name,
                                      no_ack = True,
                                      callback = process_message )


# This might be called from somewhere like a signal handler.  It is
# defined before the main loop so that it exists while the loop runs.
def cancel_processing():
    """ Stop consuming messages from RabbitMQ """
    global process_messages

    # Do this so we exit our main loop
    process_messages = False

    # Tell the channel you don't want to consume anymore
    channel.basic_cancel( consumer_tag )


# Loop while process_messages is True
while process_messages:

    # Wait for a message
    channel.wait()

# Close the channel
channel.close()

# Close our connection
connection.close()

Note that a lot of what is in that example is comments and whitespace for ease of reading; the actual implementation is pretty darn simple.

Now that we have a consumer going let’s send some messages in:

#!/usr/bin/env python
import amqplib.client_0_8 as amqp

# Connect
connection = amqp.Connection( host = "localhost:5672",
                              userid = "guest",
                              password = "guest",
                              virtual_host = "/",
                              insist = False )

# Create our channel
channel = connection.channel()

# We've already declared our queue, exchange and binding in our consumer,
# so just send the messages
for i in range(0, 10):
    message = amqp.Message("Test message %i!" % i)
    # A delivery mode of 2 asks RabbitMQ to persist the message
    message.properties["delivery_mode"] = 2
    channel.basic_publish( message,
                           exchange = "test",
                           routing_key = "test.messages" )

# Close the channel and the connection when we're done
channel.close()
connection.close()

That’s it! If we did this right, you’ve now set up RabbitMQ, sent some messages and consumed them on the other end of the pipe.

If I’ve kept you this long and you’re still interested, but still have questions, I highly recommend this article which goes much more in depth and has been a valuable guide for me.

If you’re into both python and RabbitMQ, you might want to check out my consumer framework “rejected.py,” it’s on GitHub.

I hope you enjoyed the first of my A.D.D. Guides. I’d be happy to answer any questions and would appreciate feedback so I may improve this and future articles to come.


jQuery Tip: How to break out of .each()

I’m working on a dynamic page using jQuery where I have an unordered list of 12 items, one of which will randomly have the class “selected.” My goal was to find out the position in the list of the “selected” item. I hunted around the various properties I could get from a jQuery selector and didn’t find what I was looking for, so I ended up with the solution illustrated by the following HTML and JavaScript.

<ul id="mylist">
  <li>Item #1</li>
  <li>Item #2</li>
  <li class="selected">Item #3</li>
  ...
  <li>Item #12</li>
</ul>

Using the following JavaScript, I was able to determine the position of the li with the selected class:

var selected = 0;
// Iterate through each item in the list.  If we find the selected item, return false to break out of the loop
$('ul#mylist li').each(function(index){
  if ( $(this).hasClass('selected') )
  {
    selected = index;
    return false;
  }
});
console.debug('Selected position is: ' + selected);

I had spent so much time looking for a function or selector to help me with this issue that I overlooked the obvious: I could keep track of my position in the .each() function and then break out from there. But I was faced with a problem: I didn’t know how to break out of an each. The context of it is fairly odd from a procedural standpoint. Thankfully, Rey Bango had the solution: returning false.

As illustrated in the above code snippet, returning false from an .each() will act like a traditional loop break. If you have any solutions for solving my original problem, I’d love to hear them.


Golconde 0.4 Released

I am pleased to announce the first beta release of Golconde, 0.4.

Golconde is a queue-based replication solution for PostgreSQL written in Python 2.6.

It is designed to be loosely coupled and to rely upon existing enterprise messaging systems that have STOMP protocol support. Designed to scale easily and with multi-data center implementations in mind, the application and message queues for distribution live outside of the database. Decoupling Golconde from PostgreSQL differentiates it from existing replication solutions: the workload moves from the database tier, where CPU, RAM and IO overhead can be very expensive, to a commodity layer where the operational cost of performing the data distribution work is much lower. In a typical Golconde target database, the PostgreSQL operational overhead is similar to the canonical database write workload.

For more information, including downloads, please visit http://code.google.com/p/golconde/


Golconde 0.3 Released

Dubbed a test release, I posted 0.3 today which contains the fully functioning golconde daemon, examples in the test directory for configuration and use, and the ability to use triggers to enqueue messages for distribution.

It can be downloaded here.

I consider this the first stable test release and will be using it as a foundation for subsequent releases.  The roadmap is as follows:

0.4 - Client classes to abstract the enqueue process from the protocol level.
0.5 - AMQP support in addition to STOMP support and possibly 0MQ support.
0.6 - Two-Phase Commit like behavior on non-trigger application flow via rollback commands.

In addition I have already added additional documentation to the Golconde wiki and will be adding more as time permits.  If you have an opportunity to play with or test it, I’d love to hear your thoughts.


Initial ext3 vs ext4 Results

We’ve started to do some internal benchmarking of ext3 vs ext4 at myYearbook.com to see if what we’ve seen and heard about ext4 was really true.  While the following benchmark is not in-depth, it does represent our initial findings, which match our anecdotal findings.  If all of these findings hold true, we expect them to have a large impact on our PostgreSQL OLTP workload where machines are IO bound.

The test platform:

  •  Dell r905
  •  Quad, Quad Core AMD Opteron(tm) Processor 8360 SE 
  •  128GB RAM 
  •  Red Hat Enterprise Linux Server release 5.3
  •  2x Dell MD1120 DAS Arrays, 1 Perc 6/E per DAS
  •  48 Seagate 2.5” SAS 15k RPM 72GB Drives
  •  Each MD1120 has 2x RAID10 LUNs
  •  Linux kernel RAID0 across all 4 LUNs

The Results:

Test               EXT4DEV         EXT3
Initial write    214776.55    199546.18
Rewrite          305409.42    340091.72
Read             361373.46     91159.31
Re-read         9440588.47     93897.52
Random read     1452105.32     52234.47
Mixed workload  1327560.92    276443.58
Random write     101430.37     92115.62

Graph:

Larger values are better.  As you can see, ext4 is faster than ext3 in every test except rewrite.  In the case of re-read, the gap was so large that we initially thought the results were wrong and double checked the findings.
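
To put numbers on the differences, here is a minimal Python sketch that divides each ext4dev figure by the matching ext3 figure. The dictionary names and the speedups() helper are only for illustration, and the values are copied from the results above in whatever unit the benchmark reported.

```python
# Figures copied from the results above (larger is better).
EXT4DEV = {
    "Initial write": 214776.55,
    "Rewrite": 305409.42,
    "Read": 361373.46,
    "Re-read": 9440588.47,
    "Random read": 1452105.32,
    "Mixed workload": 1327560.92,
    "Random write": 101430.37,
}

EXT3 = {
    "Initial write": 199546.18,
    "Rewrite": 340091.72,
    "Read": 91159.31,
    "Re-read": 93897.52,
    "Random read": 52234.47,
    "Mixed workload": 276443.58,
    "Random write": 92115.62,
}


def speedups():
    """Return ext4dev throughput divided by ext3 throughput per test."""
    return {test: EXT4DEV[test] / EXT3[test] for test in EXT4DEV}


if __name__ == "__main__":
    # Print from the smallest ratio to the largest.
    for test, ratio in sorted(speedups().items(), key=lambda item: item[1]):
        print("%-15s %7.2fx" % (test, ratio))
```

A ratio above 1.0 means ext4dev was faster in that test, and a ratio below 1.0 means ext3 was.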

Kudos to the ext4 team.  I’ll post more results as I find them.
