From 254bb58a7cdf25ec6a774e851a94980efde3b73b Mon Sep 17 00:00:00 2001
From: Gabor Adam Toth
Date: Mon, 16 May 2011 13:17:57 +0200
Subject: [PATCH] bench: converted to org, added packet extraction script, added first results

---
 bench/Makefile      |  28 +++++
 bench/benchmark.org | 270 ++++++++++++++++++++++++++++++++++++++++++++
 bench/results.org   |  55 +++++++++
 3 files changed, 353 insertions(+)
 create mode 100644 bench/Makefile
 create mode 100644 bench/benchmark.org
 create mode 100644 bench/results.org

diff --git a/bench/Makefile b/bench/Makefile
new file mode 100644
index 0000000..d886661
--- /dev/null
+++ b/bench/Makefile
@@ -0,0 +1,28 @@
+ORG_PATH = /usr/share/emacs/site-lisp/org-mode
+INIT = (setq load-path (cons \"/usr/share/emacs/site-lisp/org-mode\" load-path)) (require 'org-install)
+
+wiki2org:
+	perl -pe '\
+	  s/^= (.*) =\s*$$/#+TITLE: $$1\n/; \
+	  s/^== (.*) ==\s*$$/* $$1/; \
+	  s/^=== (.*) ===\s*$$/** $$1/; \
+	  s/^{{{/#+BEGIN_SRC/; \
+	  s/^}}}/#+END_SRC/ \
+	' benchmark.wiki >benchmark.org
+
+tangle:
+	emacs -Q --batch --eval \
+	  "(progn ${INIT} (find-file \"benchmark.org\") \
+	   (setq org-babel-tangle-pad-newline nil org-src-preserve-indentation t) \
+	   (org-babel-tangle) (kill-buffer))"
+	perl -pi -e 'print "\n" unless $$p; $$p=1' packets/user_profile.psyc
+
+html:
+	emacs -Q --batch --eval \
+	  "(progn ${INIT} (find-file \"benchmark.org\") \
+	   (org-export-as-html-batch) (kill-buffer))"
+
+pdf:
+	emacs -Q --batch --eval \
+	  "(progn ${INIT} (find-file \"benchmark.org\") \
+	   (org-export-as-pdf org-export-headline-levels) (kill-buffer))"
diff --git a/bench/benchmark.org b/bench/benchmark.org
new file mode 100644
index 0000000..7532afa
--- /dev/null
+++ b/bench/benchmark.org
@@ -0,0 +1,270 @@
+#+TITLE: libpsyc Performance Benchmarks
+
+In this document we present the results of performance benchmarks
+of libpsyc compared with libjson-glib and libxml2.
+
+* Procedure
+We'll use typical messages from XMPP ("stanzas" in Jabber
+lingo) and compare them with equivalent JSON encodings and
+verbose and compact PSYC formats.
+
+In some cases we will additionally compare PSYC packets to
+a more efficient XML encoding based on PSYC methods, to have
+a more accurate comparison of the actual PSYC and XML
+syntaxes, rather than the protocol structures of PSYC and XMPP.
+
+* The Benchmarks
+** A presence packet
+Since presence packets are by far the dominant messaging content
+in the XMPP network, we'll start with one of them.
+Here's an example from section 4.4.2 of RFC 6121.
+
+#+BEGIN_SRC xml :tangle packets/presence.xml
+
+ away
+
+#+END_SRC
+
+And here's the same information in a JSON rendition:
+
+#+BEGIN_SRC js :tangle packets/presence.json
+["presence",{"from":"juliet@example.com/balcony","to":"benvolio@example.net"},{"show":"away"}]
+#+END_SRC
+
+Here's the equivalent PSYC packet in verbose form
+(since it is a multicast, the single recipients do not
+need to be mentioned):
+
+#+BEGIN_SRC psyc :tangle packets/presence.psyc
+:_context psyc://example.com/~juliet
+
+=_degree_availability 4
+_notice_presence
+|
+#+END_SRC
+
+And the same in compact form:
+
+#+BEGIN_SRC psyc
+:c psyc://example.com/~juliet
+
+=da 4
+np
+|
+#+END_SRC
+
+** An average chat message
+
+XML:
+
+#+BEGIN_SRC xml :tangle packets/chat_msg.xml
+
+ Art thou not Romeo, and a Montague?
+
+#+END_SRC
+
+PSYC:
+
+#+BEGIN_SRC psyc :tangle packets/chat_msg.psyc
+:_source psyc://example.com/~juliet
+:_target psyc://example.net/~romeo
+
+_message_private
+Art thou not Romeo, and a Montague?
+|
+#+END_SRC
+
+** A new status update activity
+Example taken from http://onesocialweb.org/spec/1.0/osw-activities.html
+You could call this XML namespace hell:
+
+#+BEGIN_SRC xml :tangle packets/activity.xml
+
+
+
+
+
+ to be or not to be ?
+ http://activitystrea.ms/schema/1.0/post
+
+ http://onesocialweb.org/spec/1.0/object/status
+ to be or not to be ?
+
+
+
+ http://onesocialweb.org/spec/1.0/acl/action/view
+
+
+
+
+
+
+
+
+#+END_SRC
+
+http://activitystrea.ms/head/json-activity.html proposes a JSON encoding
+of this. We'll have to add a routing header to it.
+
+#+BEGIN_SRC js :tangle packets/activity.json
+["activity",{"from":"hamlet@denmark.lit/snsclient"},{"verb":"post",
+"title":"to be or not to be ?","object":{"type":"status",
+"content":"to be or not to be ?","contentType":"text/plain"}}]
+#+END_SRC
+
+http://about.psyc.eu/Activity suggests a PSYC mapping for activity
+streams. Should a "status post" be considered equivalent to a presence
+description announcement or just a message in the "microblogging" channel?
+We'll use the latter here:
+
+#+BEGIN_SRC psyc :tangle packets/activity.psyc
+:_context psyc://denmark.lit/~hamlet#_follow
+
+:_subject to be or not to be ?
+:_type_content text/plain
+_message
+to be or not to be ?
+|
+#+END_SRC
+
+** A message with JSON-unfriendly characters
+#+BEGIN_SRC xml :tangle packets/json-unfriendly.xml
+
+ "Neither, fair saint, if either thee dislike.", he said.
+And
+the
+rest
+is
+history.
+
+#+END_SRC
+
+** A message with XML-unfriendly characters
+#+BEGIN_SRC xml :tangle packets/xml-unfriendly.xml
+
+ Wherefore art thou, Romeo?
+
+ PročeŽ jsi ty, Romeo?
+
+
+#+END_SRC
+
+** A message with PSYC-unfriendly strings
+#+BEGIN_SRC xml :tangle packets/psyc-unfriendly.xml
+
+ I implore you with a pointless
+newline in a header variable
+ Wherefore art thou, Romeo?
+|
+And for practicing purposes we added a PSYC packet delimiter.
+
+#+END_SRC
+
+** A packet containing a JPEG photograph
+... TBD ...
+
+** A random data structure
+In this test we'll not consider XMPP at all and simply compare the
+efficiency of the three syntaxes at serializing a typical user database
+record. We'll again start with XML:
+
+#+BEGIN_SRC xml :tangle packets/user_profile.xml
+
+ Silvio Berlusconi
+ Premier
+ I
+
+ Via del Colosseo, 1
+ 00100
+ Roma
+
+ http://example.org
+
+#+END_SRC
+
+In JSON this would look like this:
+
+#+BEGIN_SRC js :tangle packets/user_profile.json
+["UserProfile",{"Name":"Silvio Berlusconi","JobTitle":"Premier","Country":"I","Address":
+{"Street":"Via del Colosseo, 1","PostalCode":"00100","City":"Roma"},"Page":"http://example.org"}]
+#+END_SRC
+
+Here's a way to model this in PSYC:
+
+#+BEGIN_SRC psyc :tangle packets/user_profile.psyc
+
+:_name Silvio Berlusconi
+:_title_job Premier
+:_country I
+:_address_street Via del Colosseo, 1
+:_address_code_postal 00100
+:_address_city Roma
+:_page http://example.org
+_profile_user
+|
+#+END_SRC
+
+* Conclusions
+... TBD ...
+
+* Criticism
+Are we comparing apples and oranges? Yes and no; it depends on what you
+need. XML is a syntax best suited for complex structured data in
+well-defined formats - especially good for text mark-up. JSON is a syntax
+intended to hold arbitrarily structured data suitable for immediate
+inclusion in JavaScript source code. The PSYC syntax is an evolved
+derivative of RFC 822, the syntax used by HTTP and E-Mail, and is therefore
+limited in the kind and depth of data structures that can be represented
+with it, but in exchange it is highly performant at doing just that.
+
+So it is up to you to find out which of the three formats fulfils your
+requirements the best. We use PSYC for the majority of messaging where
+JSON and XMPP aren't efficient and opaque enough, but we employ XML and
+JSON as payloads within PSYC for data that doesn't fit the PSYC model.
+For some reason all three formats are being used for messaging, although
+only PSYC was actually designed for that purpose.
+
+* Caveats
+In every case we'll compare performance of parsing and re-rendering
+these messages, but consider also that the applicative processing
+of an XML DOM tree is more complicated than just accessing
+certain elements in a JSON data structure or PSYC variable
+mapping.
+
+For a speed check in real-world conditions that also considers the
+complexity of processing incoming messages we should compare
+the performance of a chat client using the two protocols,
+for instance by using libpurple with XMPP and PSYC accounts.
+For this purpose we first need to integrate libpsyc into libpurple.
+
+* Futures
+After a month of development libpsyc is already performing pretty
+well, but we presume various optimizations, like rewriting parts
+in assembler, are possible.
+
diff --git a/bench/results.org b/bench/results.org
new file mode 100644
index 0000000..a81c3ae
--- /dev/null
+++ b/bench/results.org
@@ -0,0 +1,55 @@
+#+TITLE: Benchmark results
+#+OPTIONS: ^:{}
+
+* libpsyc
+
+: ./testPsyc -snqc 1000000 -f $file
+
+- presence: 597 ms
+- chat_msg: 714 ms
+- user_profile: 1806 ms
+- activity: 903 ms
+
+* libjson
+
+: ./testJson -snqc 1000000 -f $file
+
+- presence: 3247 ms
+- user_profile: 5847 ms
+- activity: 5768 ms
+
+* rapidxml
+
+: ./rapidxml 1000000 $file
+
+- presence: 1719 ms
+- chat_msg: 1893 ms
+- user_profile: 2477 ms
+- activity: 4419 ms
+
+* rapidxml fast mode
+
+: fast_mode=1 ./rapidxml 1000000 $file
+
+- presence: 1643 ms
+- chat_msg: 1799 ms
+- user_profile: 2218 ms
+- activity: 4001 ms
+
+* libxml
+
+: ./libxml 1000000 $file
+
+- presence: 7557 ms
+- chat_msg: 9777 ms
+- user_profile: 12377 ms
+- activity: 28858 ms
+
+* libxml sax
+
+: ./libxml-sax 1000000 $file
+
+- presence: 4997 ms
+- chat_msg: 5997 ms
+- user_profile: 7350 ms
+- activity: 13357 ms
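
The timings recorded in results.org may be easier to compare as ratios against libpsyc. The following is a minimal Python sketch for doing that, not something the patch itself ships: the numbers are copied verbatim from results.org, while the names `TIMES_MS`, `slowdown` and `report` are hypothetical helpers introduced only for illustration. libjson has no chat_msg measurement in the results, so that combination is skipped rather than guessed.

```python
# Timings in milliseconds for 1,000,000 iterations, copied from results.org.
# The dictionary layout and the helper names are illustrative assumptions.
TIMES_MS = {
    "libpsyc": {"presence": 597, "chat_msg": 714, "user_profile": 1806, "activity": 903},
    "libjson": {"presence": 3247, "user_profile": 5847, "activity": 5768},
    "rapidxml": {"presence": 1719, "chat_msg": 1893, "user_profile": 2477, "activity": 4419},
    "rapidxml fast mode": {"presence": 1643, "chat_msg": 1799, "user_profile": 2218, "activity": 4001},
    "libxml": {"presence": 7557, "chat_msg": 9777, "user_profile": 12377, "activity": 28858},
    "libxml sax": {"presence": 4997, "chat_msg": 5997, "user_profile": 7350, "activity": 13357},
}

BASELINE = "libpsyc"


def slowdown(library, packet):
    """Return the library's time divided by the libpsyc time for one packet.

    Returns None when either measurement is missing from the results.
    """
    measured = TIMES_MS[library].get(packet)
    baseline = TIMES_MS[BASELINE].get(packet)
    if measured is None or baseline is None:
        return None
    return measured / baseline


def report():
    """Build one formatted line per (library, packet) pair that was measured."""
    lines = []
    for library in TIMES_MS:
        for packet in TIMES_MS[BASELINE]:
            ratio = slowdown(library, packet)
            if ratio is not None:
                lines.append(f"{library:20s} {packet:13s} {ratio:6.2f}x")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

Ratios above 1 mean the library took longer than libpsyc on the same packet. These are single-run figures for parsing and re-rendering only, so the caveats stated in benchmark.org still apply.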