Monday, February 3, 2014

crypter: my attempt at improving security


Legolas: It is not the eastern shore that worries me. A shadow and a threat have been growing in my mind. Something draws near, I can feel it.... Orcs or the NSA? Either way, the lack of online security has been a growing concern for me. I have been researching how to improve the security of the software systems I work on. Initially my main motivation was to give users the same level of security that I would expect from the products I use. It is surprising how much FUD there is in the computer security industry. Fear sells computer "security" products, and this fear seems to have infiltrated software development culture. The common belief is that there are "secure" systems and insecure systems, and that the "secure" ones have a highly trained group of mathematicians keeping them "secure". In reality there is a spectrum of security, and more complex security does not mean greater security. Just as almost any user can use sftp (ssh), developers can use simple APIs that implement the right encryption to increase the security of their apps.

A lot of this fear is based on improper implementation of security. However, exploiting a security hole in an encryption algorithm is a lot harder than it may sound. Even a bad implementation that leaves a security hole is better than no encryption at all because the bar is so much higher. My goal is simply to keep raising the bar of security on the systems I work on. Even though I think it is going to be a challenge to defend against someone like the NSA (who have unlimited resources), we can build reasonably secure systems to keep out the Orcs.

Crypter is a tool I built to help encrypt data from the command line. What I wanted was a single binary that could encrypt and decrypt using symmetric key encryption. Originally my goal was that any code or data transmitted over the wire would be encrypted. It would then be double encrypted: once by ssl or ssh and once by the symmetric key encryption.
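To make "symmetric key encryption" a bit more concrete, here is a minimal sketch in Go using only the standard library. It is not crypter's actual code: the choice of AES-GCM, the nonce-prepended layout, and the function names encrypt and decrypt are all my assumptions for illustration.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
	"io"
)

// newGCM builds an AES-GCM AEAD from a 16, 24, or 32 byte key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encrypt seals plaintext under key. A fresh random nonce is generated
// for every call and prepended to the output (an assumed layout).
func encrypt(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return nil, err
	}
	// Seal appends ciphertext and tag to its first argument,
	// so the result is nonce || ciphertext || tag.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// decrypt reverses encrypt and fails if the data was modified.
func decrypt(key, data []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(data) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ciphertext := data[:gcm.NonceSize()], data[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, nil)
}

func main() {
	// Hypothetical key; a real tool would load it from a key file.
	key := make([]byte, 32)
	if _, err := io.ReadFull(rand.Reader, key); err != nil {
		panic(err)
	}
	sealed, err := encrypt(key, []byte("setup script contents"))
	if err != nil {
		panic(err)
	}
	opened, err := decrypt(key, sealed)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(opened))
}
```

An authenticated mode like GCM also rejects data that was tampered with in transit, which fits the "raise the bar" goal without adding much code.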

Practically this means using crypter to encrypt my code and push it to a secure s3 bucket. The servers that deploy it then download it from s3, decrypt it, and deploy the code (or binaries). This is also how I encrypt backups, and before I send any data over the wire to production servers I try to make sure it goes through crypter. What is nice about crypter is that it compiles to a single binary and does not have any dependencies. So for devops tasks with a fresh server instance you can scp crypter and the data to the new server and then decrypt the data on the server with no dependencies. Once the setup script is decrypted you can run it to finish the rest of the server setup.
As I said, I am trying to improve security, so please send me any feedback, bugs or suggestions.

Wednesday, January 15, 2014

mdserve a Markdown binary utility for README.md written in Go

Problem: README.md (markdown docs) formatting

Markdown is a pretty good format for docs so I try to use it as much as possible. I notice that often what looks fine in vim doesn't render very well as HTML once it gets pushed to github or rendered by the doc generator. Commonly I end up touching up docs days or even weeks later to fix simple formatting, which takes extra time.

Solution: mdserve http server for a markdown file

Basically I am lazy. I could generate the html and then open it in a browser. However, each time I changed the file I would have to regenerate the html and reload the browser page. What I wanted was a program where I could just reload the browser and it would redisplay my changes. This isn't exactly what github or a doc generator would do, but it catches all of the formatting issues I normally have.

Give it a try:

go get github.com/lateefj/mdserve


Friday, January 10, 2014

codap a library I created to port some concurrent access patterns from Go to Python

I was building an application that used a couple of different data stores, MongoDB and S3, and there came times when I needed to optimize the performance of a couple of operations. It seemed that accessing the IO concurrently boiled down to 3 very simple patterns.

Key / Value (Dictionary, Map etc.): It was very common that I would need to get data from data stores or third party services and put it into a dictionary to be rendered with a template or marshaled to JSON for a web service. I used this pattern everywhere, and it is even used to implement the Ordered List pattern.

Ordered List: In some cases the data that was being retrieved needed to maintain its ordering.  Usually this was because it was sorted in some way.

First Reply: In other cases, especially when dealing with many large files that then needed processing, the order didn't matter. I either just needed the first one that came back, or needed to get all of the files and start processing them as soon as possible. This came in very handy specifically when I was doing encryption / decryption and compression. All the files were going to end up encrypted and compressed in the same file or stream, so it didn't matter which one was first, just that it happened as fast as possible.
The complex part was measuring performance. The database was consistent, however both S3 and third party services had a large range in response times. My guess is that this was network traffic in AWS. In general, though, I found the patterns very beneficial if the IO operations were large, or if there were many operations even when they were small. Large operations could read or write concurrently, so the whole batch tended to last only as long as the largest operation; for 3 large IO operations it took only 1/3 to 1/2 of the time. With many small operations the performance was more complex, however in general I would see a 40% improvement in performance.
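Since these patterns came from Go in the first place, here is a rough Go sketch of the three of them. It is not codap's API: the function names fetchMap, fetchOrdered and firstReply are made up for illustration, and the fetch functions stand in for real IO calls to a data store or service.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchMap (Key / Value) runs every fetch concurrently and stores each
// result under its key.
func fetchMap(fetchers map[string]func() string) map[string]string {
	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		out = make(map[string]string, len(fetchers))
	)
	for key, fetch := range fetchers {
		wg.Add(1)
		go func(key string, fetch func() string) {
			defer wg.Done()
			v := fetch()
			mu.Lock() // maps are not safe for concurrent writes
			out[key] = v
			mu.Unlock()
		}(key, fetch)
	}
	wg.Wait()
	return out
}

// fetchOrdered (Ordered List) runs every fetch concurrently but returns
// the results in the same order as the input.
func fetchOrdered(fetchers []func() string) []string {
	out := make([]string, len(fetchers))
	var wg sync.WaitGroup
	for i, fetch := range fetchers {
		wg.Add(1)
		go func(i int, fetch func() string) {
			defer wg.Done()
			out[i] = fetch() // each goroutine writes only its own slot
		}(i, fetch)
	}
	wg.Wait()
	return out
}

// firstReply (First Reply) yields results in completion order, so the
// caller can start processing as soon as anything comes back.
func firstReply(fetchers []func() string) <-chan string {
	ch := make(chan string, len(fetchers)) // buffered: senders never block
	var wg sync.WaitGroup
	for _, fetch := range fetchers {
		wg.Add(1)
		go func(fetch func() string) {
			defer wg.Done()
			ch <- fetch()
		}(fetch)
	}
	go func() {
		wg.Wait()
		close(ch) // lets the caller range over the channel
	}()
	return ch
}

func main() {
	a := func() string { return "a" }
	b := func() string { return "b" }
	fmt.Println(fetchMap(map[string]func() string{"first": a, "second": b}))
	fmt.Println(fetchOrdered([]func() string{a, b}))
	for v := range firstReply([]func() string{a, b}) {
		fmt.Println(v)
	}
}
```

In each case the total wait is roughly the slowest single fetch rather than the sum of all of them, which is where the gains on large IO operations come from.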

Please try it out and give me feedback and any notes you have. Works in Python 3!

pip install codap

Github: codap

Tuesday, January 7, 2014

Go Web Service Testing meets the Gorilla First Iteration

After making some headway with McTest in my previous post, I was still not able to test any of the routing rules because I use Gorilla Mux, which is an awesome library. A bug could live in the routing code, so it was time to tackle the 800lb Gorilla (if you will). Since my goal was to try and get 100% test coverage for this simple web service, it had to be soup to nuts.

First let's take a look at the request handler code:

The InitRest function has the route mapping code. It needs to be called before any test code can run.


Now the test code is an entirely different beast (Gorilla):

There is some funkiness in getting this working. It is far from elegant or nice code. This is mostly a hack I came up with after reading the Gorilla context code. The first hack is the constant bits that are set at the top of the handler code. They are needed to associate the variables with the request. The second is that instead of using the documented function mux.Vars we need to use context.Get. Not the end of the world, but I would like it to be a lot cleaner, so if anyone has suggestions on cleaner ways to do this let me know.

Saturday, January 4, 2014

Go Web Service Testing

An email came in a couple of weeks back that opened up a discussion about testing web services in Go (golang).  As I have been writing mostly Go over the last 6 months, it was something I had been thinking about for a while. Very little code is in the request handlers, however I admit there is enough to hold a bug. With 1.2 being released with a coverage tool I realized I had a lot of egg on my face. It was time to write some publicly accessible web services and try for complete code coverage. This post is about a small lib I wrote to save a bit of boilerplate code.

First let's take a look at a simple request handler:


The test code:


There are two functions that save some typing: AssertCode and AssertBody. If the results don't match what is passed in as a parameter then the function will call test.Errorf and return false.

The complete example is here and the project is McTest. Although this is a trivial example, I hope to cover a more complex one next that takes routes into consideration. I would love to get any feedback or thoughts on better ways to do this.

Saturday, September 28, 2013

Part 3 (Final) Dart vs Go vs Python

It has been a long time since I wrote a blog post, for a lot of reasons. The first is that I didn't write much on a sailing trip I took for 7 months last year (Tickety Boo). That, combined with moving to San Francisco and getting a new job, has taken a toll on my blogging.
A reader emailed me to suggest I update the numbers on this trivial benchmark I did almost a year and a half ago! Dart has significantly changed, Go released 1.1 and PyPy has had a number of stability releases. So I ran these tests on my macbook pro and on a linux node I have in the cloud, but that doesn't really matter because the code is here: https://bitbucket.org/lateefj/dart_compare so YMMV on your own hardware. I had started to add Rust to the benchmarks, but at the moment I want to get this update out and move on to some other hacking blog posts that are on my mind.

Since the last post Dart has improved the most. Specifically in the JSON benchmark it is now second in CPU time only to Python (which is a thin wrapper around C). Dart's memory usage is also greatly reduced. The graphs are here:





The Dart IO library has had some changes in the last year so I had to make some changes to that code. I think the code makes more sense now. Off the top of my head, Python and Go didn't need any changes.

I wish I had the exact same environment to test on, but I didn't, so I am not going to get too specific about any conclusions. Dart has improved by leaps and bounds. This isn't completely surprising if you follow the language https://www.dartlang.org/performance/. Go has also made measurable improvements in its performance, as expected with a point release. PyPy also seems to have shaken out some performance and fixed some memory issues. Yay for the ecosystem!
As always I look forward to any feedback.

Wednesday, May 1, 2013

Notes from my first real Dart Web App

I mostly build interfaces for IE. Well, they have to be compatible with IE7+, so like most web developers I build them on Chrome and then fix the IE issues. Since Dart (or rather dart2js) only works with IE9+, I didn't have any projects that I could use it for. Finally I have a contact database to build that only needs to support HTML5 browsers! I figured I would give Dart a real shakedown. I didn't want to build everything from scratch, so I was lucky that DWT (Dart Web Toolkit) existed.

Dart

The language's type system is a little Java-ish, but that makes it really trivial to learn if you know Java. I find this mainly annoying with generics, but it is just one of those things that is here to stay I suppose. It is dynamic enough to do things like for(var x in [foo, bar, bat]) {x.show();}. After writing Python for almost a decade I have a hard time living without that level of flexibility. I like the Javascript bits since they make first-pass coding really fast, and then the code can be statically typed on future passes. I love being able to organize the code in files that make sense and not worry about some complex build process, since dart2js will optimize it for me (dependency management is so much better done by machines).

DWT

I spent a couple of years writing GWT code, and it has its ups and downs, which I have written about extensively. I have also written a good mountain of Javascript and more recently Coffeescript. DWT gives that rich API from GWT, and I have found the static typing makes it easy to read libraries and to code consistently. The result is fast development combined with a great widget library and code that is not a complete mess to look at. Honestly I have been amazed at how easy it is to write DWT. I have found it faster to write DWT than GWT, mainly because Dart is not the pig Java is. It has been relatively easy to style it to match the existing application using bootstrap.

var's, typedef's and dynamic's oh my!

Oh how I love var. Even though I write statically typed code all the time, I love being able to just type var when I am in the zone. It saves me from having to look up a type, or create one, and lose the flow. Many times I am not sure exactly what the type is going to be, so it is better to have a placeholder and then fix the var types during a code cleanup pass.
dynamic is the type that json.parse returns. I have found this extremely useful for two purposes. I like being able to use it for configuration, like I do with the DataTable in dwt_lhj. JSON from the backend is another thing that I often don't think needs a formally designed data structure.
typedef is how to define a function type. Say I have a function that takes two ints and I need to call it once the server responds. For example the Paginator that I wrote in the dwt_lhj tables code:
typedef SelectedPage(int page, int start);
When the user selects a different page then the paginator will call this function. Compile time failure if it doesn't work woot!

dwt_lhj

These are a handful of things I wrote that seemed to be missing. I am sure eventually there will be better implementations, however I figured I would share them in case someone else has similar needs.
UL and LI elements are missing in GWT as well as in DWT. Since I use them a lot for navigation (bootstrap) and for making lists of things, I created some pretty simple wrappers. They are very self explanatory and easy to use with very little code.
Displaying tabular data is another thing that is currently not implemented in DWT. GWT calls them cell widgets. These are a pretty large undertaking from a code perspective to implement for DWT. I have used Cell Widgets a lot in GWT and I find them very tedious to code. I have been using Datatables.net, which is a pretty nice Javascript library, but I have found the API to be pretty complex for the functionality I am looking for (maybe that is just the docs). I was hoping for something in the middle that I could build quickly but that would leave a lot of flexibility to change the behavior in the future. So I built one. I also built a simple paginator that I use with the bootstrap paginator css classes. I am not really happy with the code, but it works and I look forward to improving it in the future.
I wrote a regex routing library because I couldn't get the existing https://github.com/dart-lang/route library to work. Based on the tickets, it seems to exclude the features I needed. I decided to write a simple one that uses the DWT History package, however I think it could quickly be improved and migrated to be a more general purpose library. When I get some better docs and examples up and someone finds it useful, I will publish it in the pub.dartlang.org repository.

Mopping up

So far development in Dart has been pretty fantastic. There are a couple of bumps in the road, with bugs in DWT and outdated examples of Dart code online. I think in a year most of this will get sorted out. There seems to be some amount of effort going into coding Dart for the backend. In general I don't find much advantage in this. I find languages like Python and Go to be much better for solving those types of problems. As a "unicorn" (I code both front end and backend) I find certain languages have strengths that suit either front end or backend problems (Dart's craptastic support for concurrency is, IMO, one reason I wouldn't want to use it for backend coding). For building a large code base of widgets Dart seems to be fitting the bill nicely. This app will hopefully go into production in June and I will follow up with another post then.