2011-05-16 11:17:57 +00:00
#+TITLE: libpsyc Performance Benchmarks

In this document we present the results of performance benchmarks
of libpsyc compared with libjson-glib and libxml2.

* Procedure
We'll use typical XMPP messages ("stanzas" in Jabber
lingo) and compare them with equivalent JSON encodings and with the
verbose and compact PSYC formats.

In some cases we will additionally compare PSYC packets to
a more efficient XML encoding based on PSYC methods, to have
a more accurate comparison of the actual PSYC and XML
syntaxes, rather than the protocol structures of PSYC and XMPP.

* The Benchmarks
** A presence packet
Since presence packets are by far the dominant messaging content
in the XMPP network, we'll start with one of them.
Here's an example from section 4.4.2 of RFC 6121.

#+INCLUDE: packets/presence.xml src xml

And here's the same information in a JSON rendition:

#+INCLUDE: packets/presence.json src js

Here's the equivalent PSYC packet in verbose form
(since it is a multicast, the individual recipients do not
need to be mentioned):

#+INCLUDE: packets/presence.psyc src psyc

And the same in compact form:

#+BEGIN_SRC psyc
:c psyc://example.com/~juliet

=da 4
np
|
#+END_SRC

** An average chat message

XMPP:

#+INCLUDE: packets/chat_msg.xml src xml

JSON:

#+INCLUDE: packets/chat_msg.json src js

PSYC:

#+INCLUDE: packets/chat_msg.psyc src psyc

Why doesn't PSYC have an id? Because packet counting from contexts
and circuits is automatic: the packet already has a number just by
being there.

Also, PSYC by default doesn't mention a "resource" in XMPP terms;
instead it allows for more addressing schemes than just PSYC.

** A new status update activity
Example taken from http://onesocialweb.org/spec/1.0/osw-activities.html.
You could call this XML namespace hell:

#+INCLUDE: packets/activity.xml src xml

http://activitystrea.ms/head/json-activity.html proposes a JSON encoding
of this. We'll have to add a routing header to it.

#+INCLUDE: packets/activity.json src js

http://about.psyc.eu/Activity suggests a PSYC mapping for activity
streams. Should a "status post" be considered equivalent to a presence
description announcement or just a message in the "microblogging" channel?
We'll use the latter here:

#+INCLUDE: packets/activity.psyc src psyc

** A message with JSON-unfriendly characters
#+INCLUDE: packets/json-unfriendly.xml src xml
#+INCLUDE: packets/json-unfriendly.json src js
#+INCLUDE: packets/json-unfriendly.psyc src psyc

** A message with XML-unfriendly characters
#+INCLUDE: packets/xml-unfriendly.xml src xml

** A message with PSYC-unfriendly strings
#+INCLUDE: packets/psyc-unfriendly.xml src xml
#+INCLUDE: packets/psyc-unfriendly.json src js
#+INCLUDE: packets/psyc-unfriendly.psyc src psyc

** A packet containing a JPEG photograph
... TBD ...

** A random data structure
In this test we'll not consider XMPP at all and simply compare the
efficiency of the three syntaxes at serializing the kind of information
typically stored in a user database. We'll again start with XML:

#+INCLUDE: packets/user_profile.xml src xml

In JSON this would look like this:

#+INCLUDE: packets/user_profile.json src js

Here's a way to model this in PSYC:

#+INCLUDE: packets/user_profile.psyc src psyc

* Results

Parsing time of 1 000 000 packets in milliseconds:

| input:    | PSYC   |         | JSON   |           |            | XML    |          |
| parser:   | strlen | libpsyc | json-c | json-glib | libxml sax | libxml | rapidxml |
|-----------+--------+---------+--------+-----------+------------+--------+----------|
| presence  |     30 |     246 |   2463 |     10197 |       4997 |   7557 |     1719 |
| chat msg  |     41 |     320 |        |           |       5997 |   9777 |     1893 |
| activity  |     42 |     366 |   4666 |     16846 |      13357 |  28858 |     4419 |
| user prof |     55 |     608 |   4715 |     17468 |       7350 |  12377 |     2477 |
|-----------+--------+---------+--------+-----------+------------+--------+----------|
| /         | <      | >       | <      | >         | <          |        | >        |

These tests were performed on a 2.53 GHz Intel(R) Core(TM)2 Duo P9500 CPU.

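To make the totals in the results table easier to read, the following sketch converts them into an average time per packet and into a ratio relative to libpsyc. The numbers are copied from the table; the names ~TIMES_MS~, ~per_packet_us~ and ~relative_to_libpsyc~ are made up for this illustration and are not part of the benchmark tools.

#+BEGIN_SRC python
# Parsing times in milliseconds for 1 000 000 packets, copied from the
# results table. None marks the two cells the table leaves empty.
PACKETS = 1_000_000
PARSERS = ["strlen", "libpsyc", "json-c", "json-glib",
           "libxml sax", "libxml", "rapidxml"]
TIMES_MS = {
    "presence":  [30, 246, 2463, 10197, 4997, 7557, 1719],
    "chat msg":  [41, 320, None, None, 5997, 9777, 1893],
    "activity":  [42, 366, 4666, 16846, 13357, 28858, 4419],
    "user prof": [55, 608, 4715, 17468, 7350, 12377, 2477],
}


def per_packet_us(total_ms):
    """Average time per packet in microseconds for a total in milliseconds."""
    return total_ms * 1000 / PACKETS


def relative_to_libpsyc(packet):
    """Map each parser to its time divided by the libpsyc time for one row."""
    row = dict(zip(PARSERS, TIMES_MS[packet]))
    base = row["libpsyc"]
    # Parsers without a measurement are left out rather than guessed.
    return {name: ms / base for name, ms in row.items() if ms is not None}


if __name__ == "__main__":
    for packet in TIMES_MS:
        ratios = relative_to_libpsyc(packet)
        line = ", ".join(f"{name} {ratio:.1f}x" for name, ratio in ratios.items())
        print(f"{packet}: {line}")
#+END_SRC

For the presence packet, for instance, 246 ms for a million packets works out to 0.246 microseconds per packet for libpsyc, and the rapidxml time is about seven times the libpsyc time.
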
* Conclusions
... TBD ...

* Criticism
Are we comparing apples and oranges? Yes and no; it depends on what you
need. XML is a syntax best suited for complex structured data in
well-defined formats, and especially good for text mark-up. JSON is a syntax
intended to hold arbitrarily structured data suitable for immediate
inclusion in JavaScript source code. The PSYC syntax is an evolved
derivative of RFC 822, the syntax used by HTTP and e-mail, and is therefore
limited in the kind and depth of data structures that can be represented
with it, but in exchange it is highly performant at doing just that.

So it is up to you to find out which of the three formats best fulfils your
requirements. We use PSYC for the majority of messaging where
JSON and XMPP aren't efficient and opaque enough, but we employ XML and
JSON as payloads within PSYC for data that doesn't fit the PSYC model.
For some reason all three formats are being used for messaging, although
only PSYC was actually designed for that purpose.

* Caveats
In every case we'll compare performance of parsing and re-rendering
these messages, but consider also that the application-level processing
of an XML DOM tree is more complicated than just accessing
certain elements in a JSON data structure or PSYC variable
mapping.

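The difference in access effort can be sketched in a few lines of Python. The XML and JSON strings below are hypothetical samples written for this illustration, not the contents of the packet files used in the benchmark, and the PSYC mapping simply reuses the compact variable names from the compact presence packet shown earlier. Python's ~xml.etree.ElementTree~ stands in for a DOM here; it is not the libxml2 API.

#+BEGIN_SRC python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data for illustration only; these are NOT the
# contents of packets/presence.xml or packets/presence.json.
XML_TEXT = '<presence from="juliet@example.com/balcony"><show>away</show></presence>'
JSON_TEXT = '{"from": "juliet@example.com/balcony", "show": "away"}'
# Compact PSYC variable names taken from the compact presence packet.
PSYC_VARS = {"c": "psyc://example.com/~juliet", "da": "4"}


def xml_show(text):
    """Walk the element tree: find the child element, then read its text."""
    root = ET.fromstring(text)
    node = root.find("show")
    return node.text if node is not None else None


def json_show(text):
    """Parse into a dictionary and look the key up directly."""
    return json.loads(text).get("show")


def psyc_availability(variables):
    """A PSYC variable mapping is a flat lookup by variable name."""
    return variables.get("da")


if __name__ == "__main__":
    print(xml_show(XML_TEXT), json_show(JSON_TEXT), psyc_availability(PSYC_VARS))
#+END_SRC

With XML the application has to know whether a value lives in an attribute, a child element or a text node; with the other two it is a keyed lookup.
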
For a speed check under real-world conditions that also takes into account
the complexity of processing incoming messages, we should compare
the performance of a chat client using the two protocols,
for instance by using libpurple with XMPP and PSYC accounts.
For this purpose we first need to integrate libpsyc into libpurple.

* Futures
After a month of development libpsyc is already performing pretty
well, but we presume various optimizations, like rewriting parts
in assembler, are possible.

* Appendix
** Tools used

libpsyc:

: test/testStrlen -sc 1000000 -f $file
: test/testPsycSpeed -sc 1000000 -f $file
: test/testJson -snc 1000000 -f $file
: test/testJsonGlib -snc 1000000 -f $file

xmlbench:

: parse/libxml-sax 1000000 $file
: parse/libxml 1000000 $file
: parse/rapidxml 1000000 $file
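
As a convenience, the command lines above can be generated for a given packet with a small script. This is only a sketch: the tool names and flags are the ones listed above, but the pairing of each tool with a file extension is an assumption based on the "input:" row of the results table, and the ~packets/~ path prefix is assumed from the include directives. The script only builds and prints the commands; it does not run them or change into the libpsyc or xmlbench source trees.

#+BEGIN_SRC python
import shlex

PACKETS = 1000000

# (tool, flags, assumed input file extension)
LIBPSYC_TESTS = [
    ("test/testStrlen", "-sc", "psyc"),
    ("test/testPsycSpeed", "-sc", "psyc"),
    ("test/testJson", "-snc", "json"),
    ("test/testJsonGlib", "-snc", "json"),
]
XMLBENCH_TESTS = ["parse/libxml-sax", "parse/libxml", "parse/rapidxml"]


def libpsyc_command(tool, flags, path):
    """Build the argument list for one of the libpsyc test programs."""
    return [tool, flags, str(PACKETS), "-f", path]


def xmlbench_command(tool, path):
    """Build the argument list for one of the xmlbench parsers."""
    return [tool, str(PACKETS), path]


def all_commands(name):
    """All seven command lines for one packet, e.g. name='presence'."""
    commands = [
        libpsyc_command(tool, flags, f"packets/{name}.{ext}")
        for tool, flags, ext in LIBPSYC_TESTS
    ]
    commands += [
        xmlbench_command(tool, f"packets/{name}.xml") for tool in XMLBENCH_TESTS
    ]
    return commands


if __name__ == "__main__":
    for argv in all_commands("presence"):
        print(shlex.join(argv))
#+END_SRC
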