PostgreSQL connection pooling for mod_php

In a quest for better performance with postgres, I’ve been looking for connection pooling tools. There are a few quirks that I tend to require be met. First, it must run on Solaris. This isn’t so much a quirk, since the server runs Solaris and is SPARC hardware, and I’m not going to install a second server in colo just to accomodate software that doesn’t work on Solaris/SPARC. Additionally, I refuse to install GCC, so it must build with Sun Studio, which is much more GCC compatible that it used to be, but still isn’t GCC. Also, I want it to be reasonably simple to install and setup. I am willing to consider prebuilt packages from sunfreeware. If I get desperate enough, maybe even blastwave. Unfortunately, none of the top choices appear to be on sunfreeware.

The top choices appear to be:

  • pgpool
  • This is the classic choice, building and install is easy, but setup is very arcane.

  • pgbouncer
  • This looks like it should be simple to install and setup, but the configure script refuses to find my libevent install.

  • SQLRelay
  • Works for many databases, unlike the others, including sqlite. However, it requires the rudiments library from the same author, and this library won’t build because the autoconf stuff doesn’t understand anything but GCC.

So, I haven’t broken down to checking out blastwave yet, but so far none of the normal choices are working out for PostgreSQL connection pooling.

Then, I made a small breakthrough when I found that PHP has pg_pconnect. pg_pconnect does some background bookeeping to keep connections open after you call pg_close, and return the same connection if the arguments are the same. Practically, this means that if you use a PHP system that keeps persistant php interpreters (say, mod_php in Apache, which is what I use for PHP), then you have effectively gotten connection pooling for PHP only.

This is a big help already, but I still need a solution that helps out with python.

Yes, I am working on a little web development on vacation.

Posted in Programming, System Administration | Comments Off on PostgreSQL connection pooling for mod_php

How to reset a wordpress user password via SQL.

I found I had forgotten an admin password on a WordPress site I run. After figuring out how to reset it, I thought I would stick it here so that I can find it myself again in the future.


UPDATE wp_users SET user_pass=MD5('secret_password_here') WHERE user_login = 'yourself';

Posted in System Administration | Comments Off on How to reset a wordpress user password via SQL.

Databases for simple web development

I have log been a fan of PostgreSQL over MySQL, believe that PostgreSQL is more feature complete and generally as fast or faster, with obvious caveats about being used appropriately, of course, and not to mention no real comparative testing. Every body gets to have an untested opinion, right?

I did end up doing some performance testing though. What I learned is that both are reasonable fast at simple queries. Great. However opening a new connection to MySQL is much faster than opening a new connection to PostgreSQL. Once the connection is open, but seem equally fast for very simple tests.

Why this matters though is that simple web development in many languages with the most common tools don’t do connection pooling. If you want to just whip up an example PHP program using mod_php, then every page load will result in a new connection. The same goes for mod_python or mod_wsgi (as well as frameworks sitting on top of those plugins). Using each of these common tools with PostgreSQL results in a slow web site. This was driven home when I upgraded from a single 550mhz UltraSPARC II to a dual 1.1Ghz UltraSPARC3 III, and still certain web apps I’ve been tinkering with writing using PostgreSQL for the database are slow.

Certainly there are ways around this. Using a database connection pooling tool for starters, would certainly cure the problem. Also, choosing something that keeps your script running (or a least your database connections open) would also help. Or even writing your application as stand alone program that keeps the database connections open and talks to the web server via JSON-RPC or XML-RPC. But, to quickly whip something out MySQL may be simpler.

Of course, for some applications Sqlite could be a contender. Certainly it is very fast, and very simple to use. For a scalable web site though, it is probably out of the question. There is a reason that Django defaults to using Sqlite first though. And there are also, those less traditional database servers like CouchDB or memcachedb which seem to generally have very fast connection times.

This is a bit disappointing though. AOLServer used to offer connection pooling built into the web server. Of course, I certainly don’t want to use TCL as my development language, but still that would be nice to have.

Meanwhile, can anyone suggest a good Solaris and PostgreSQL connection pooling library?

Posted in Programming, System Administration | Comments Off on Databases for simple web development

toStr – A small C++ utility function.

This can be used as string(“bob”) + toStr(5) without declaring the type, presuming that the type can correctly be inferred by the compiler. Obviously, T must support operator<<.

template <class T>
static inline std::string toStr(T v)
{
  std::stringstream s;
  s << v;
  return s.str();
}
Posted in Programming | Comments Off on toStr – A small C++ utility function.

Emacs Mode for Protobuf editing

I like using Google’s Protocol Buffers (aka protobuf). It is faster and more bandwidth/disk efficient than JSON, but perhaps not quite as simple or flexible.

In protobuf you have messages. In messages, everything is a key/value pair. Keys are tagged to show the type of the value (int, float, string, sub-message), then followed by the value. Keys can be used repeatedly to create something like arrays. The big difference between a protobuf message and a JSON associative array is that in JSON the keys are strings, while in protobuf, the key is a number. Protobuf uses .proto files to describe the messages, primarily to map a text name for keys to a number, and also to describe the type of the field, specify a default value, and to flag it as required/optional/repeated. The messages in the files look something like this:


message Person {
        required int32 id = 1;
        required string name = 2;
        optional string email = 3;
}

I also like Emacs a lot, and so obviously, I will use Emacs to edit these .proto files. Here is the start of a major mode for editing proto files. Just paste it into a file named protobuf.el, then (require ‘protobuf) in you .emacs file.


(define-derived-mode protobuf-mode c-mode
  "Protocol Buffer" "Major mode for editing Google Protocol Buffer files."
  (setq fill-column 80
          tab-width 4))

(add-to-list 'auto-mode-alist '("\\.proto$" . protobuf-mode))
(provide 'protobuf)

This is my first major mode. It is derived from C mode, and I still haven’t figured out how to add syntax rules for the =1 type stuff required by every line. I hope to get back and flesh this out further eventually.

Posted in Emacs | Comments Off on Emacs Mode for Protobuf editing

RESTful message queuing in Python

Alright.  Pass 1 is done.  Here is a link to it.  The server is in Python.  Clients in PHP and Python are provided.  It follows this design document.  On a quad Opteron, it gets about 600 short messages a second.  It isn’t threaded.  Next step on this project is to redo this in Erlang.  And then maybe C++ for the heck of it.

This uses WSGI, specifically the wsgi reference server, so in theory it shouldn’t be hard to adopt to other wsgi servers, like mod_wsgi.  However, beware of problems of thread safety.  Also beware that wsgi servers that use multiple processes will require some sort of external data store instead of process local memory.

(edit: Download link was moved to github, design doc link was also changed)

Posted in Programming | Comments Off on RESTful message queuing in Python

A new message queuing system

First, why a new one?  Because I haven’t found any that do what I need and look simple and well supported.  Besides, it seems like a reasonable learning experience.

The initial summary of what I need is a light weight method for PHP (in the form of scripts running in mod_php) to send messages to a back end program written in Python. However, in the future other languages may be used, so general cross language compatibility is important. That means either bindings exist for every conceivable language, or bindings are trivial to write.  Also, the solution must run on Solaris.

Comparison:

At work I use Apache QPID, which is an AMQP implementation. I can’t find any PHP client for AMQP though. I was able to find discussion that suggested that the AMQP protocol is too heavyweight for PHP being run in mod_php.

Looking at other solutions, I don’t want to run anything that requires Java for the server. That rules out Apache ActiveMQ and other JMS systems. I also believe that XMPP is too heavy weight to parse.  I also found some systems written in Perl, Ruby, and PHP, but they looked rather slap dash, and I don’t particularly want to use those languages.  The initial requirement for supporting PHP is only because I’m working on a PHP web app that I forked.  I do not want to add any new PHP code bases that I need to maintain.  Besides, once I start looking at fringe choices, it gets to be a lot easier to justify writing my own, particularly if I am going to use it as a learning project to get more familiar with, say, Erlang.

Summary:

It is to be a RESTful design.  I will be using JSON for the payloads, but I haven’t decided yet if it makes sense to force this, or if it makes sense to allow all ascii data.  Queues will be single read, meaning that if multiple end points need to get the same message, then there will need to be a separate queue for each end point.  Initially there will be no security model or persistence.  Commands will be standard HTTP verbs.  If possible I will try to make response codes valid HTTP response codes.

Goals:

Run on Solaris

Support Python, PHP, and Javascript as clients.

Limitations:

Initially this will not support persistence.

Also, this will not support any security.

Commands:

PUT queueName

Create a new queue. What the responses will be still need to be decided.

Responses:

201 created, entity required, probably just a confirmation message

409 already existed.

POST queueName

The body of the post will be the contents of the message.

403 queueName wasn’t found.

201 created, and perhaps the entity will be an id for the message.

GET

Gets that do not match the following patterns will be answered with a 404.

GET msg/queueName

Get the next message from the queue queueName.  Here I need some way to to return a message ID in addition to the message body.  It may make sense for the response to be JSON: {‘id’: integer, ‘content’: <valid JSON here>}

If the response is as proposed, the the contents of the POST must be valid JSON as well.

403 queueName wasn’t found.

200, the message

GET queues

Get a list of the created queues.

200

DELETE queueName/integer

Delete a message identified with integer from queue queueName.

204 deleted, no entity required

403, queuename or integer not found.

DELETE queueName

Delete a queue and all the messages in it.

204 Deleted, no entity in response

403 queueName wasn’t found

Posted in Programming | Comments Off on A new message queuing system

Thread Worker Pooling in Python

The worker pool pattern is a fairly common tool for writing multi-threaded programs.  You divide your work up into chunks of some size and you submit the to a work queue.  Then there is a pool of threads that watch that queue for tasks to execute, and when complete, they add the jobs into the finished queue.

Here is the file.

Thanks to Global Interpreter Lock, threads are of somewhat limited usefulness in Python.  I foresee myself mostly using this for network limited tasks, like downloaded a large quantity of RSS feeds.  My idea is that tasks put into the system shouldn’t modify global state, so if I actually needed this for computational tasks, it may be feasible to build it on forks instead, or perhaps the 2.6 multiprocessing system.  However, I still use a lot of systems with only python 2.3 installed, so I’m not likely to want to write 2.6 specific code anytime soon.

Many of the thread pool systems I seem have you specify a single function for the pool, then you just enqueue the inputs.  Mine is different in that each item in the queue can be a different function.  I haven’t actually used it this way though, so it is possible that the extra flexibility is generally wasted.

Python’s lambda seem rather limited.  It is limited to a single expression.  I suppose that this is what Lisp and Scheme do as well, but their expressions offer things like progn.  My first idea is that the task to execute would be a function with no arguments.  I was picturing using a lambda to wrap up whatever I wanted to do.

Now, I still offer that via addTask and assume it internally, but I also offer addTaskArgs, and it takes a function reference, and either an argument list (as a list) or a named argument list (as a dict) and wraps it in a lambda to enqueue.

I now find that my knowledge about how to unit test threaded code is rather limited, and the included unit tests are extremely thin.

Posted in Programming | Comments Off on Thread Worker Pooling in Python

Braided Cinnamon Bread

Preheat oven to 400 degrees.

Follow a basic milk bread recipe (flour, sugar, salt, yeast, water, milk, shortening) through the first kneading and rising stage.

Split dough into three equal portions

Roll out each portion, into a roughly 4:3 shape. Cover with a mixture of sugar, butter, and cinnamon, then roll into a tube.

Fan out three tubes, pinch ends together and braid.

Allow to rise again until double in size.

Bake until lightly golden.

Top with milk/powdered sugar glaze (roughly 1tbsp milk to 1/2 cup powdered suger, 2-4 multiples of that formula will be needed).

Serve. It took 4 hands and a lot of care to move that monster to the platter seen in the first picture.

Posted in Cooking | Comments Off on Braided Cinnamon Bread

Making Miro work with USB sound devices on Ubuntu

On Ubuntu (and possibly other linux distributions) Miro refuses to work with a secondary sound card, it will only work with the primary one despite what the ALSA default is set to, unlike most programs which offer some way to override the default.

Potentially, the second sound card in question could be a PCI card or something else, but based on other people’s experience (like my own) it is usually a USB sound card that is causing trouble. See here (note, the suggested fix didn’t work for me, just like it didn’t the original poster there) and here (they mention fixing it in the trunk, but that doesn’t help me until a new release comes out).

Some people actively want both the onboard sound and the USB or PCI device working, but if you are willing to sacrifice on-board sound, I found a work around. In my case, the on-board sound is worthless. It has some terrible humming/buzzing in the background so I never ever want to use it again.

The solution is to find what the module is that supplies your on-board sound. In my case, the on-board sound is a VT8233, so when I looked at the output from lsmod, it was obvious that the module for this sound device was the snd_via82xx module.

Then, open the /etc/modprobe.d/blacklist file to edit it:
sudo pico /etc/modprobe.d/blacklist
and add the line:
blacklist snd_via82xx
Then reboot.

Now, the USB audio device will be the first audio device.

Posted in System Administration | Comments Off on Making Miro work with USB sound devices on Ubuntu