Thursday, May 31, 2007

Getting AuthKit to work more like TG Identity

One of the things I have been struggling with on Pylons is the support for AuthKit. TurboGears has such a great framework called identity. AuthKit is under heavy development, which might be part of the reason I am having so much trouble. However, I think that as part of the Pylons community we need to come up with a standard solution that has standard implementations. One of the main things is being able to ask the framework whether a user is in a group:
u = get_user()
u.has_role('admin') or u.in_group('admin')

This user object needs to be accessible anywhere in the application. Right now I am working on placing the object in the namespace of my templates, since a lot of the UI is based on the roles a user has. The current implementation would end up working something like this.

import authkit

# Proposed API: assumes AuthKit provided an AuthUser base class.
class AppUser(authkit.AuthUser):
    """
    The AuthUser would have has_role, authenticated, etc.
    Also you could override the password verification methods and all that
    instead of plugging it into globals somewhere.
    """

    def __init__(self):
        authkit.AuthUser.__init__(self)
        self.roles = ['some list, property, or function call']

AuthKit has some classes that provide some amount of support for this but I think my solution would be much easier to use in templates.
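
To make the idea concrete, here is a minimal, self-contained sketch of what I mean. Nothing in it comes from AuthKit or TurboGears: the AppUser class, template_namespace and render_menu are hypothetical names made up for illustration, and render_menu just stands in for what a real template would do with the user object.

```python
class AppUser:
    """Hypothetical user object; not part of AuthKit."""

    def __init__(self, name, roles=(), groups=()):
        self.name = name
        self.roles = frozenset(roles)
        self.groups = frozenset(groups)

    def has_role(self, role):
        return role in self.roles

    def in_group(self, group):
        return group in self.groups


def template_namespace(user):
    """Build the dict that would be handed to the template engine."""
    return {"user": user}


def render_menu(namespace):
    """Stand-in for a template: show the Admin link only to admins."""
    user = namespace["user"]
    items = ["Home"]
    if user.has_role("admin"):
        items.append("Admin")
    return " | ".join(items)
```

With something like this, a template only ever has to ask user.has_role('admin') and never needs to know where the roles came from.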

Friday, May 25, 2007

OpenID, AuthKit

OpenID seems to really hit a niche, especially with all this cross-commenting on blogs. I hope that blogger.com will support it soon. It looks like the web 2.0 frameworks have taken to it: Pylons, TurboGears and Symfony all support plugins of some kind for it, as far as I know.
AuthKit has support for it, but I am kind of blah on the whole AuthKit Pylons integration. Mainly, I have not seen anything that makes it easy to use role-based control in templates, and I haven't come up with a good way to do it yet. I guess I could copy how TurboGears does it and stuff information into the namespace of the template engine to support this.
The great part of Pylons is the amount of control it gives you over the application. It takes a lot longer to create an initial project, but I think for large projects it provides the flexibility you need.

Wednesday, May 16, 2007

Event Architecture Experiment

The benchmarks are impressive. In this situation:
I was able to get over 800 requests per second. Every request multicast an event to Spread. A Parallel Python process with 0 workers listened and sent the request on to a ppserver process with 2 workers. Each worker wrote the message to a file. The largest Python process was 7 MB, so this looks to be a pretty tight operation.
Notes:
  • If I don't have the ppserver processes write to a file, I get over 1,000 requests per second.
  • If I have them increment a memcache counter, I get ~900 requests per second.

Apache ab tool output:
Concurrency Level: 100
Time taken for tests: 9.325 seconds
Complete requests: 10000
Failed requests: 0
Broken pipe errors: 0
Total transferred: 1120773 bytes
HTML transferred: 110066 bytes
Requests per second: 1072.39 [#/sec] (mean)
Time per request: 93.25 [ms] (mean)
Time per request: 0.93 [ms] (mean, across all concurrent requests)
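
For anyone wondering how the three summary figures relate, they can be recomputed from the request count, the total time and the concurrency level. This is just a small sketch of the arithmetic; ab_summary is a made-up helper name, not part of the ab tool.

```python
def ab_summary(complete_requests, seconds, concurrency):
    """Recompute ab's summary figures from its raw totals."""
    requests_per_second = complete_requests / seconds
    # Mean time per request across all concurrent requests, in ms.
    ms_across_all = seconds / complete_requests * 1000
    # Mean time per request as seen by one of the concurrent clients, in ms.
    ms_per_request = ms_across_all * concurrency
    return (
        round(requests_per_second, 2),
        round(ms_per_request, 2),
        round(ms_across_all, 2),
    )
```

Calling ab_summary(10000, 9.325, 100) gives back the same requests per second and time per request figures shown in the output above.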


That's no benchmark!
There are so many reasons this is not a true benchmark. However, without a lot of hardware (which I am lacking) this will do just fine IMO. If I had access to hardware to test this on, I would:
  1. Run multiple FAPWS servers
  2. Spread on its own host
  3. Event listener on its own host
  4. ppserver on its own host
  5. Memcached on its own host

Tuesday, May 15, 2007

WSGI Async Framework

I have been playing around with an asynchronous web framework called FAPWS. It took me a while to stub my toe on this one, but eventually I found it and have been very impressed with it so far. William, the creator of FAPWS (I am the first to say the project needs a name change), is a great guy who has really taken in all the feedback I have given him and responded as quickly as he could. He made the FAPWS server able to serve WSGI-compliant applications, so Pylons (as well as many other WSGI frameworks) can be driven async by FAPWS. First, this illustrates the power of WSGI. Second, kudos to William.

Template Performance
Seeing the blazing speeds of FAPWS, I had to find out how fast it really was. I am a huge template fan for web applications. Once you have templates you will never go back. So I pulled out my favorite template language (Genshi) and benchmarked it. The code just displayed the Unix time on a web page. The first round, with I think Python 2.4 and Genshi 0.3, was 765.10 requests a second. Now with Python 2.5 and Genshi 0.4 I am getting 1209.19 requests a second. None of this is really that useful officially, because Apache's benchmarking tool 'ab' was running on my localhost along with the server and I wasn't doing all the things that I am supposed to do to make sure I had a true benchmark. Still, this is blazing fast! In theory, if I load balanced my laptop with something like lighty I would be able to do over 2K requests a second!
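
To give a feel for the kind of page being measured, here is a rough sketch of a WSGI application that renders the Unix time through a template. It is not the code I benchmarked: it uses string.Template from the standard library as a stand-in for Genshi, and the names PAGE and app are made up for illustration.

```python
import time
from string import Template

# Stand-in for a Genshi template; only the $now placeholder is filled in.
PAGE = Template("<html><body><p>Unix time: $now</p></body></html>")


def app(environ, start_response):
    """Plain WSGI callable: render the page and return it as bytes."""
    body = PAGE.substitute(now=int(time.time())).encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

Because it is an ordinary WSGI callable, any WSGI server should be able to host it; for a quick local try, wsgiref.simple_server.make_server from the standard library will serve it.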

Insane Performance
William has been emailing me about how fast he has gotten FAPWS to work, and it may be close to the fastest web server out there. It is still in the very early stages of development, but I think this is going to be an awesome tool for high-performance WSGI applications. As time goes on, more backend frameworks like SQLAlchemy will support async IO, providing a faster overall system.

FAPWS + Pylons
I have been using Pylons lately because of its philosophy of WSGI compliance. It is also a much simpler framework than TurboGears. If you have a large application, I find it better to have a stripped-down framework. If you are prototyping and need an application yesterday, then I still think TG is the best way to have something up and running in 5 minutes. Pylons, being so simple, can sit under FAPWS. So far I have only seen 25% performance increases; however, there are a lot of integration optimizations I have not done yet. Right now I have a database retrieving some of the data, so the bottleneck is the database queries. Once I add caching to the system we should see a sharp increase in performance, because there won't be as much IO blocking.
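
The caching I have in mind could look roughly like the sketch below: keep a query result around for a few seconds so that most requests never touch the database. This is an assumption about the approach, not code from the application. TTLCache and get_or_load are hypothetical names, and a real setup would more likely use something like memcached than an in-process dict.

```python
import time


class TTLCache:
    """Tiny in-process cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expiry time, value)

    def get_or_load(self, key, loader):
        """Return the cached value, calling loader() only on a miss."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()  # the slow part, e.g. a database query
        self._store[key] = (now + self._ttl, value)
        return value
```

A controller would then wrap its query in something like cache.get_or_load('front_page', run_query), where run_query is whatever function actually talks to the database.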

Guido, threads and thank you Parallel Python

Parallel Python is a little-known library that is utterly amazing. Some of you may have been following the ongoing Python GIL drama: Guido just made a statement about thread support in Python 3000 (the next major version of Python). I was once one of the brainwashed, thinking that threads and Java were the solution to all multitasking issues. Then I had one, two, three, oh my gosh, I am in a deadlock explosion. Yes, those days of the JBoss deadlock exception were really just horrible. As the "Enterprise" community came up with more solutions to fatten an already bloated container, it became embarrassingly clear that threads were not the answer to a heavily loaded web application.

The Past
I don't claim to be a very bright programmer, and every time I tried to figure out how to cluster the threads, with all the performance hacks I had put in (or the caching and other libraries I was using), I couldn't. The only solution was to buy a bigger machine. I then spent some time on another project that was using another vendor whose name starts with a 'B'. I thought, well, they are a big-money product that I am sure has figured out how to cluster these threaded containers... Boy, was I wrong. They hadn't really figured out how to do it either. Sure, they supported a cluster, but the problem was that you had to code like you were not in a threaded environment. OK, so if the solution is to code like you're not in a threaded environment, then what is the point of using threads? Clearly only the uber "coders" should be coding Java and threads, and the rest of us can stick to our simpler, faster to develop, maintain, run and cluster Python (add dynamic language here).
When I returned to the project it was time to oust JBoss, Java and all the threading nightmares it put on us. So we created a custom Python framework. Performance was up 10x, with about 3x faster development. It was clear to me then that this threading thing was for the birds.

The Problem
So now we know that using processes instead of threads is better. I may not have presented a great argument, but I have learned my lesson. With threads out, I had been doing lots of process programming but hadn't yet really developed any applications that communicated back and forth. So now I needed a producer-consumer and I had no idea how to write it. I quickly realized why threads were so common: they were easy. Then I found Parallel Python, which makes it painfully easy to take existing code and run it using multiple processes, on multiple hosts. I dynamically sent a function to another host in about 3 lines of code!
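
For comparison, here is roughly what a process-based producer-consumer looks like on a single machine. To be clear, this is not Parallel Python: it is a sketch that uses the standard library's multiprocessing module instead, it does not cover multiple hosts, and the names consumer and run_pipeline are made up for illustration.

```python
import multiprocessing as mp


def consumer(inbox, outbox):
    """Worker process: square each number until the None sentinel arrives."""
    for item in iter(inbox.get, None):
        outbox.put(item * item)


def run_pipeline(items):
    """Producer side: feed items to one worker process and collect results."""
    items = list(items)
    # Assumption: a POSIX host. The "fork" start method keeps this sketch
    # runnable without an `if __name__ == "__main__":` guard.
    ctx = mp.get_context("fork")
    inbox, outbox = ctx.Queue(), ctx.Queue()
    worker = ctx.Process(target=consumer, args=(inbox, outbox))
    worker.start()

    for item in items:
        inbox.put(item)
    inbox.put(None)  # sentinel: tell the worker to stop

    # Collect the results before joining so the worker can always drain
    # its output queue and exit cleanly.
    results = [outbox.get() for _ in items]
    worker.join()
    return results
```

With a single worker the results come back in the order they were sent, so run_pipeline([1, 2, 3]) returns the squares in order. The two processes share nothing but the queues, which is the whole point: there are no locks to get wrong.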