Wednesday, July 16, 2008

Minimize your sessions

I was debugging some session issues and saw a 2K session size. The main cause was the serialization of objects in the session. I find this session size very large and problematic, although for a small number of requests it isn't much of an issue.

Assuming a session is 2K and is read and written on every request, and there are 30 req/sec, you have roughly 7 MB per minute of traffic on the wire. On top of that, 2K sessions mean 2 MB of memory with only a thousand sessions. Larger sessions also mean more memory copying (in the kernel) from the socket to user space, and probably another copy from the web server to the framework. Race conditions also become more of an issue because of the increased latency from all this overhead.
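The numbers above can be checked with a quick back-of-the-envelope sketch. It assumes 1K means 1024 bytes and that every request does exactly one session read and one session write; the function names are just illustrative.

```python
# Back-of-the-envelope session cost estimate.
# Assumptions: 1K = 1024 bytes, one read plus one write per request.
SESSION_BYTES = 2 * 1024
REQUESTS_PER_SEC = 30
TRANSFERS_PER_REQUEST = 2  # one read + one write


def traffic_bytes_per_minute(session_bytes, req_per_sec,
                             transfers=TRANSFERS_PER_REQUEST):
    """Bytes moved between the app and the session store in one minute."""
    return session_bytes * transfers * req_per_sec * 60


def memory_bytes(session_bytes, session_count):
    """Memory needed to hold session_count sessions of the given size."""
    return session_bytes * session_count


per_minute = traffic_bytes_per_minute(SESSION_BYTES, REQUESTS_PER_SEC)
print(f"{per_minute / 2**20:.2f} MiB per minute on the wire")
print(f"{memory_bytes(SESSION_BYTES, 1000) / 2**20:.2f} MiB for 1000 sessions")
```

Under these assumptions the traffic works out to about 7 MiB per minute and the memory to just under 2 MiB per thousand sessions.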

With objects, serialization and deserialization are expensive. On top of that, if you change a class, all the session data must be flushed because the old objects no longer match the new class definition.

Lessons:
  1. Don't store objects in session
  2. Minimize session size
  3. If you are using memcache or in-memory session storage, don't store anything that should be persisted in the data store.
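A minimal sketch of lessons 1 and 2, not taken from any particular framework: keep only a key in the session and look the rest up in the data store. `DATA_STORE`, `fat_session`, `thin_session` and `load_user` are hypothetical names, and JSON stands in for whatever serializer the session layer actually uses.

```python
import json

# Hypothetical stand-in for the persistent data store
# (a real application would use a database here).
DATA_STORE = {
    42: {
        "name": "example user",
        "preferences": {"theme": "dark"},
        "history": list(range(200)),
    },
}


def fat_session(user_id):
    """Anti-pattern: copy the whole record into the session."""
    return {"user": dict(DATA_STORE[user_id], id=user_id)}


def thin_session(user_id):
    """Store only the key; everything else is looked up when needed."""
    return {"user_id": user_id}


def load_user(session):
    """Resolve the key in a thin session back to the stored record."""
    return DATA_STORE[session["user_id"]]


fat = json.dumps(fat_session(42))
thin = json.dumps(thin_session(42))
print(len(fat), "bytes vs", len(thin), "bytes")
```

The thin session also sidesteps the class-change problem, since nothing in it depends on how the application's classes are defined.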

Tuesday, July 15, 2008

frisky async server and src site launch

Alright, finally, after a couple of weeks of getting bits and pieces of time to work on it, I am going to share the proof-of-concept web server I have been hacking on for fun. The original server, called FAWPS, was hacked up a long time ago, and I have blogged about it before. I was trying to make FAWPS a bit easier to use and, down the road, to provide enough of a framework to have a high performance async WSGI framework that is still easy to hack. Also, I haven't been able to get in touch with the maintainer of FAWPS, so I created a little project called Frisky. This is just the proof of concept that I am releasing. I created the src site (src.hackingthought.com) to host a number of little code projects that I can share, assuming someone would like to see them, but also to run Frisky as a web server. It has been a blast to work on!
I think if you have some static files you need to serve up, or you need high performance Python web serving, then give it a try. On my laptop it is getting thousands of requests/second with dynamic templates. With straight cached content (images, html) I am seeing around 4.5K req/sec. That blows the doors off any other web server I have tried with a scripting language. It uses FAWPS, which in turn uses the libevent HTTP server.
One of the things I learned at the Google I/O conference is that speed matters. If you have used Dojo or another JavaScript framework in development and you are waiting for Rails or a Python framework to load all the JavaScript files, you know speed matters. So now I need to get WSGI frameworks hooked into it so I can use it for static file serving until I can get a high performance datastore.

Will Object Databases save us from ORMs?

Well, I am trying hard to figure out Object Databases and how to use them. As my previous post humbly points out, I am being dated by ORMs. Not that I haven't written my own ORM in Python in the past (2 different times). My favorite ORM is SQLAlchemy, which is great because it provides a relational mapping to tables. After the dark days of EJB and trying to figure out how to write a better ORM, I came to the conclusion that relational data are not OBJECTS or classes; a table is not composed in another table, it is related! Round peg, square hole is what I started to see, and now it seems so clear to me.
There is a problem though. We don't have a language agnostic Object Database (that I know of). The benefit of an RDBMS is that the application data usually changes little, even though languages and frameworks go out of fashion very quickly. To get the same from an Object Database we would need to store objects in something like CORBA IDL, which describes a language agnostic object syntax. Are we doing all this work so that we can store the methods along with the data? We don't need to store the method definitions though, right? So really all we need to do is store the data structure, which our Relational Databases already do very well. Or do they? The advantage of what I have seen from Object Databases (including Google's App Engine storage) is that they provide storage of the data. So really I am looking for data structure storage that is language agnostic, high performance, transactional and easy to use. Does that exist? Well, off to find something (CouchDB?) that I can use.
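To make "store the data structure, not the methods" concrete, here is a minimal sketch using JSON as the language agnostic format (CouchDB documents are JSON). The `Point` class and its `to_document` / `from_document` helpers are hypothetical, and nothing here covers the transactional or performance requirements.

```python
import json


class Point:
    """Hypothetical class: two fields plus one method."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def add(self):
        return self.x + self.y

    def to_document(self):
        # Only the data is stored; the method stays in the code.
        return {"type": "point", "x": self.x, "y": self.y}

    @classmethod
    def from_document(cls, doc):
        # Any language that reads JSON could rebuild its own type here.
        return cls(doc["x"], doc["y"])


stored = json.dumps(Point(2, 3).to_document(), sort_keys=True)
restored = Point.from_document(json.loads(stored))
print(stored)
print(restored.add())
```

The stored document carries no trace of Python, so a class rename or a switch of language only touches the mapping functions, not the stored data.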

Pet peeve: plural database table names

I was reading this post and the post it links to, and it started to come to me why plural table names drive me crazy. I have a pet peeve about database schemas. I guess this is a sign I am getting old (mature developer). Way back in the dark ages we described our data structures (schema) in a specific way. I guess in the old days we defined our data structures in C like so:

struct example {
    int x;
    int y;
};

Then, as we learned about objects (C++, Java, Objective-C), we started adding functions (methods)
to our data structures.

class Example {
public:
    Example();   // constructor must share the class name
    ~Example();  // destructor
    int add();
    int x;
    int y;
};

However, the data schema was still singular. And so whenever I would see a table schema:

CREATE TABLE examples (x INT, y INT);

I associated it with an MS Access database schema that got upgraded to an RDBMS. The individual who created the database probably never had a formal data structure education. So usually the schema had little if any normalization to it, and generally I would need to migrate the entire database schema at some point. Fortunately for me, I would see this more and more, because novice tools such as PHP and ASP were allowing more novice computer users to create web applications using an RDBMS (or just a DBMS, as MySQL was at the time). This was a wonderful thing: more applications online means I am more in demand! WOOT! However, I keep seeing this more and more frequently. And now that I am doing more work with Rails, I am starting to wonder why all these table names are plural. In the first tutorial under documentation on the main Rails site, I am seeing a possible answer.

Plural table names I guess are all the rage. Do they teach this in data structure classes? I mean should I expect to see classes in Ruby:

class Examples < ActiveRecord::Base
# Uh what?
end

None of the examples show that... so why do they define the relation schema as plural? Well, I am sure my Python peeps are old school like me and SQLAlchemy wouldn't do that... DOH! Even SQLAlchemy. Python, "Et tu, Brute"?

Guess I am just dated. In the days of ORMs it may make more sense to use plural names, because we only reference myobject.examples.query(foo). And since the ORM schema is autogenerated, developers are not writing "Database Schemas" any more, so the data definition is just whatever is generated by the ORM.
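For what it is worth, here is a small sketch of the old school convention using only Python's standard library sqlite3 module: a singular class name mapped to a singular table name. The `Example` class and the `example` table are illustrative, not something generated by any ORM.

```python
import sqlite3


class Example:
    """Singular class name, mirroring the struct earlier in the post."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


conn = sqlite3.connect(":memory:")
# Singular table name: each row describes one Example.
conn.execute("CREATE TABLE example (x INTEGER, y INTEGER)")
conn.executemany("INSERT INTO example (x, y) VALUES (?, ?)", [(1, 2), (3, 4)])

# Hand-written mapping from rows to objects, no ORM involved.
rows = [Example(x, y)
        for x, y in conn.execute("SELECT x, y FROM example ORDER BY x")]
print([(e.x, e.y) for e in rows])
conn.close()
```

The schema here names the shape of one record, the same way the struct and the class do; the collection only shows up in the query.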