NationJS talk on Node.js now on Vimeo

It’s a bit late, but finally my NationJS talk is on Vimeo:

Node.js + WebSockets + Wiimote = Fun from Andrew Brampton on Vimeo.

Slides here: http://bramp.github.io/nodewii-talk/

Modern Software Development Life Cycle

I was recently asked by a friend at a start-up how to ensure better quality in their product. They were looking for advice on the QA process, but after digging a little I found they needed improvements to their full software development life cycle (SDLC). After a few emails back and forth I ended up writing what’s below. It has plenty of references for further reading, and I thought it would be good to share.

Generally people view the SDLC as a pipeline, and there are different ways to manage the pipeline: Scrum, Kanban, Waterfall, etc. Each has its pros and cons, and all can help your quality, but I’ll address that later.

The pipeline typically consists of the following steps: Requirements, Design, Development, Testing, Deployment. At each stage you can ensure quality in your product. However, you should consider this an iterative process, always going back to the beginning to re-evaluate your thoughts, findings, etc.

Requirements
Firstly, it sounds like your customers weren’t getting what they were expecting. I can’t stress how important correct requirements gathering can be. Office Space may have made fun of this, but you should be sitting with your client, understanding their use cases, and understanding why they want what they want. These are all important to building a good product. Some would argue you should only listen to your clients in moderation, but if you only have a couple of clients, and they are paying you for the work, then you should listen.

Once you think you know what they want, wireframe it, mock it up, write a document, and get the client to sign off on it. The sign-off is key as it ensures both parties are in agreement as to what is being delivered. A good product manager would be doing the bulk of this phase.

At this point, you might also have time estimates, so you know how long it will take and how much it will cost. Setting correct expectations with clients on timing is always important. Sometimes people don’t care how long it will take as long as your estimate is accurate. Missing a deadline is never good.

Design
Once you know what you want, design it: diagram the flows, create a database schema, define the API endpoints, and maybe even build a proof of concept to learn the technology.

Learning to design software takes practice, and I don’t think it is something you can learn from reading alone. However, sites like High Scalability show you how others have solved problems, and there are certainly many books on the topic: Software Design, Design Patterns, Architectures, etc.

One way to make your design work easier is to use a framework. A good framework will force you to break your code into layers, such as controllers, services and data access. This helps to keep your project well organised, and has many additional benefits, such as making your code testable, and giving you access to large pools of plugins and to developers who already know your framework.

Development
What do developers spend most of their time doing, reading code or writing code? Contrary to what you may think you pay them for, they spend most of their time reading code. Not just other people’s code, but their own code. Most developers forget what they wrote the previous day.

So to help developers you should do everything to keep code clean, readable and maintainable. That doesn’t just mean adding comments here and there, but using various simple techniques such as sensible variable names, short functions (that do one thing), keeping the code well indented, etc. There are a few great books on the topic.

Clean, simple code is very important: it makes the developer’s job easier, reducing mistakes and bugs. I actually like to track the lines of code my team writes over time. Not in the traditional IBM KLOC way, but instead looking for the number to decrease over time. This can happen when we realise things are redundant, find libraries that take the burden of the work, or simplify the design once we have a better understanding. There are even tools to help you measure how complex your software is!

Never reinvent the wheel. There are thousands of awesome open source projects out there, and one of them will likely solve whatever problem you have. Whoever solved the problem more than likely spent more time thinking about it than you, otherwise they wouldn’t have deemed it worth sharing online. This typically means you get a lot of value for free that you don’t have to maintain.

You should focus your effort on adding business logic and value to your product, not on implementing a clever caching algorithm, or figuring out the ins-and-outs of how SMTP works. Those problems are worth solving, but not now, and not unless you could gain measurable value.

Testing
To keep your pipeline quick and efficient, you should be automating as much as possible. Testing is one area you can easily automate, but sadly many people leave this as an afterthought. Concepts like Test Driven Development (TDD) are useful for ensuring tests get written upfront and code is well designed. Even without TDD you should be writing unit tests, integration tests, and maybe later, performance tests.

Unit tests are very simple and should each test one unit of code. Let’s consider a system that accepts user input, validates it, and if needed displays an error. The unit tests here would create fake input and test the function under each condition. If the function depends on some underlying system (such as a database), that complexity should be mocked. That is, rather than using a real database, the test uses a fake system underneath that behaves like a real database but is under your control. The end goal is that a unit test should test one thing, and do it quickly. If a single test takes more than 100ms you are doing it wrong. Some will even argue a developer must run all unit tests before checking any code in.
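As a rough illustration of the validation example above, here is a minimal sketch in JavaScript. The function validateSignup, the fakeStore helper, its emailExists method and the error strings are all hypothetical names invented for this sketch, and the "database" is a hand-written fake rather than any particular mocking library.

```javascript
const { deepStrictEqual } = require("node:assert");

// Hypothetical unit under test: validates sign-up input and returns a list of
// error messages. `userStore` stands in for whatever talks to the database.
function validateSignup(input, userStore) {
  const errors = [];
  if (!/^[^@\s]+@[^@\s]+$/.test(input.email)) {
    errors.push("invalid email");
  } else if (userStore.emailExists(input.email)) {
    errors.push("email already registered");
  }
  if (input.password.length < 8) errors.push("password too short");
  return errors;
}

// Fake store: answers like the real data layer would, but entirely in memory
// and under the test's control. It also records which emails were looked up.
function fakeStore(existingEmails) {
  const calls = [];
  return {
    calls,
    emailExists(email) {
      calls.push(email);
      return existingEmails.includes(email);
    },
  };
}

// Each test checks exactly one condition.
const tests = {
  "accepts valid input": () => {
    const errors = validateSignup({ email: "a@example.com", password: "longenough" }, fakeStore([]));
    deepStrictEqual(errors, []);
  },
  "rejects a malformed email without touching the store": () => {
    const store = fakeStore([]);
    const errors = validateSignup({ email: "not-an-email", password: "longenough" }, store);
    deepStrictEqual(errors, ["invalid email"]);
    deepStrictEqual(store.calls, []);
  },
  "rejects an email that is already registered": () => {
    const errors = validateSignup({ email: "a@example.com", password: "longenough" }, fakeStore(["a@example.com"]));
    deepStrictEqual(errors, ["email already registered"]);
  },
  "rejects a short password": () => {
    const errors = validateSignup({ email: "a@example.com", password: "short" }, fakeStore([]));
    deepStrictEqual(errors, ["password too short"]);
  },
};

for (const [name, run] of Object.entries(tests)) {
  run(); // throws on the first failed assertion
  console.log(`ok - ${name}`);
}
```

Because nothing here touches a real database or the network, the whole file runs in a few milliseconds, which is what makes it practical to run on every check-in.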

With mocking/stubbing and other techniques, you should be able to test many layers of your application. However, your application most likely depends on external processes, and this is where integration testing comes in. Typically, this is testing that your database behaves how it should, and that the code you have written interacts with it correctly. Since it depends on external applications, integration testing usually takes longer to run, and is more complex to set up. In many cases an application like Jenkins or Bamboo is used to help automate the testing.

There are other classes of testing, such as performance testing, acceptance testing, and web based testing. Performance testing measures latency, throughput, etc, and graphs this over time to ensure that no new code is negatively impacting performance. Acceptance testing is as simple as verifying that all your requirements are actually satisfied, and can be automated. Finally, web based testing (for lack of a better name) uses software like Selenium, which fires up a real browser and automates clicking on buttons and interacting with your UI. I’m personally not a fan of Selenium as good unit/integration tests can catch most of those issues.

Once you have written tests there are numerous tools to help you measure your coverage: how many functions/lines of code did you actually test? This software can help you target your most critical functions, and ensure things are being tested as expected.

Last, but not least, is QA/QC. Actual humans in the loop, following test plans, and actually validating that the application does what it’s expected to do. This is as simple as described, and should be repeatable and auditable.

In fact, there is one more step: User Acceptance Testing, or in other words, putting the product in front of your client before you go live. Set up a staging environment, or as some call it a UAT environment. This mirrors your production env, but allows clients to play with new features before they are rolled out. This is a good way to make the client feel part of the process, and to get regular feedback. Make it clear that the UAT env is for testing, and that all data gets wiped every couple of weeks. Let them do your QA for you :)

Bug Tracking
While conducting QA/UAT/etc you should certainly be logging all defects to a bug tracking database. This enables you to regularly prioritise what needs to get fixed, it allows users to track the status of their bug, and it also means things don’t get forgotten about.

Deployments
Finally, once your code has been written, it must be pushed out into production. Some will tell you that you should no longer do deployments manually, and you should use automation tools such as Chef/Puppet/Capistrano, and I would agree. It makes the deployments testable, repeatable, and predictable. You remove a large amount of human error from the process. However, when things do go wrong, they typically go wrong fast and the damage is widespread. So make sure you test your deployment scripts, as you would test your code.

SDLC
I mentioned there were different techniques for the SDLC: Agile based approaches (Scrum, Kanban, etc), Waterfall, etc. The SDLC should allow for continuous integration, constantly running the pipeline and revalidating each step. Some will argue Agile is the way to go, and I would tend to agree. Agile seems to prefer short iterations with constant feedback. Feedback should be frequent and rapid. If you break some code, a unit test should notify a human quickly, and not at the end of a development cycle. QA should be done in an agile manner, testing as soon as the feature is complete. This allows a human to quickly test the new feature and give feedback to the developers shortly after the code was written.

Different teams, and different projects, require different SDLCs. I personally have a team working on two week Scrum sprints, with deployments happening at the end of each. In other cases, I have projects with far less rigorous schedules.

I highly recommend The Phoenix Project. It talks about the SDLC, and is a good read (even for non-technical readers).

Finally, I’d like to quickly introduce the newer concept of Continuous Delivery. This extends continuous integration by making your pipeline end at deployment. Everything from code check-in to being live in a production environment should be as automated as possible. Companies like Etsy and Facebook like to advertise that they deploy numerous times a day.

Grabbing a Certificate with OpenSSL and importing it into Java

Occasionally I have to grab an SSL cert from a server, and turn it into something that Java can use. Here are the quick instructions:

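# Store the certs presented by a web server
# (stdin is redirected from /dev/null so s_client exits instead of waiting for input)
openssl s_client -showcerts -connect www.google.com:443 </dev/null > www.google.com.pem

# Convert it from PEM format to DER format
# (openssl x509 reads only the first certificate in the file, which is the server's own)
openssl x509 -in www.google.com.pem -inform PEM -out www.google.com.der -outform DER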

# Import it into your keystore
sudo /usr/java6/bin/keytool -import -alias www.google.com -file www.google.com.der -keystore /usr/java6/jre/lib/security/cacerts

# The keystore password is by default "changeit"

SMS Character Count

It is commonly known that Twitter allows 140 character messages, and some will tell you that a single SMS message is limited to 160 characters. However, it’s not as simple as that. In the US a single SMS message can contain 140 bytes of data, into which, if using GSM encoding, we can squeeze up to 160 7-bit characters. Those 7-bit GSM characters don’t match up with normal ASCII characters, and even worse, not all characters take 7 bits; some take up 14 bits (for example the { character)!

When we start talking about messaging in non-Latin scripts, such as Chinese, then a different encoding must be used. In the SMS world the encoding of choice is UCS-2, which uses 16 bits per character. This limits a single-part message to 70 characters (down from 160).

On top of that, most SMS clients will let you send concatenated SMS messages. That is, multiple message parts that appear as one long SMS message. A two-part message allows up to 304 characters, not the 320 (160×2) you might expect. This is due to the overhead required to store metadata about each part.

This all makes it very hard to count how long an SMS message will be, what characters are allowed, and how many parts it will take. To help with these issues, I’ve created this simple tool which allows you to type out your message, and see how well it’ll fit.

SMS Character Count

http://bramp.net/sms/
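The counting rules described above can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions, not the code behind the tool: the names smsInfo, gsmCosts and packParts are made up here, and the size of the concatenation header is a parameter. A 7-byte header leaves 152 GSM characters per part, which matches the 304 figure for two parts; a 6-byte header, which I believe is the more common case, leaves 153 per part (and 67 rather than 66 UCS-2 characters).

```javascript
// GSM 03.38 default alphabet: each of these costs one septet (7 bits).
const GSM_BASIC =
  "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?" +
  "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà";
// Extension table: sent as an escape plus a character, so two septets each.
const GSM_EXTENDED = "^{}\\[~]|€\f";

const PAYLOAD_BYTES = 140; // user data available in one SMS part

// Cost of every character in septets, or null if any character is outside GSM.
function gsmCosts(text) {
  const costs = [];
  for (const ch of text) {
    if (GSM_BASIC.includes(ch)) costs.push(1);
    else if (GSM_EXTENDED.includes(ch)) costs.push(2);
    else return null;
  }
  return costs;
}

// Number of parts needed, filling each part greedily and never splitting
// one character (e.g. an escape pair) across two parts.
function packParts(costs, singleLimit, multiLimit) {
  const total = costs.reduce((sum, c) => sum + c, 0);
  if (total <= singleLimit) return 1;
  let parts = 1;
  let used = 0;
  for (const cost of costs) {
    if (used + cost > multiLimit) {
      parts += 1;
      used = 0;
    }
    used += cost;
  }
  return parts;
}

// headerBytes is the assumed per-part concatenation header size (6 or 7).
function smsInfo(text, headerBytes = 6) {
  const room = PAYLOAD_BYTES - headerBytes; // bytes left per concatenated part
  const sum = (costs) => costs.reduce((s, c) => s + c, 0);
  const gsm = gsmCosts(text);
  if (gsm !== null) {
    const single = Math.floor((PAYLOAD_BYTES * 8) / 7); // 160 septets
    const multi = Math.floor((room * 8) / 7);
    return { encoding: "GSM-7", units: sum(gsm), parts: packParts(gsm, single, multi) };
  }
  // UCS-2 fallback: count 16-bit units, two for characters outside the BMP.
  const ucs2 = Array.from(text, (ch) => ch.length);
  return {
    encoding: "UCS-2",
    units: sum(ucs2),
    parts: packParts(ucs2, PAYLOAD_BYTES / 2, Math.floor(room / 2)),
  };
}

console.log(smsInfo("Hello {world}"));
console.log(smsInfo("你好"));
```

The sketch ignores national language shift tables and treats any character outside the GSM tables as forcing the whole message into UCS-2, which is a simplification.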

Alignment of Raphaël Paper.text(…) and Paper.print(…)

Working with Raphaël I noticed the alignment of text drawn with the Paper.text(…) and Paper.print(…) methods differed. The documentation wasn’t helpful in explaining the difference, so I wrote a simple test to work out their behaviour, and then a small method to normalise them.

Most starred project this week, and second most forked.

After my js-sequence-diagrams project made it onto Hacker News, its popularity took off.

github-most-starred

Draw UML Sequence Diagrams with JavaScript

I’m happy to announce one of my projects, js-sequence-diagrams. This uses JavaScript to draw UML sequence diagrams in SVG format. Here is an example:

js-sequence-diagram example

You can alter the diagram in real time, and I even have a simple jQuery plugin to make this easy to use on your own sites.

<script src="sequence-diagram-min.js"></script>
<div class="diagram">A->B: Message</div>
<script>
$(".diagram").sequenceDiagram({theme: 'hand'});
</script>

MongoDB Compression

For a while people have wanted MongoDB to compress their data, or at least compress their field names. This would be beneficial not only in reducing the amount of disk space required, but also, in theory, in improving performance as we trade disk I/O for CPU time. I thought this would be a fun project to investigate, so I started by working out if this would actually be useful.

How many ways are there to say phone number?

In the various systems I’ve worked on, I have seen far too many terms to describe a phone number. I thought I’d catalogue them!

Invalid IP range checking defeated by DNS

I’ve seen a particular kind of vulnerability in a few different applications but I’m not sure of an appropriate name for it. So I thought I’d write about it, and informally call it the “DNS defeated IP address check”. Basically, if you have an application that can be used as a proxy, or can be instructed to make web requests, you don’t want it fetching files from internal services.
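To make the idea concrete, here is a minimal sketch of how such a check can be defeated, and one way to tighten it. It assumes the vulnerable check only inspects the hostname text for an internal IPv4 literal. The function names, and the hostname intranet.example.com in the comments, are hypothetical, and only IPv4 is handled.

```javascript
const dns = require("node:dns").promises;
const net = require("node:net");

// True for IPv4 addresses in the loopback, private (RFC 1918) or link-local ranges.
function isInternalIPv4(ip) {
  const [a, b] = ip.split(".").map(Number);
  return (
    a === 127 ||
    a === 10 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
  );
}

// Naive check: only looks at the hostname text. A DNS name such as
// "intranet.example.com" that resolves to 10.0.0.5 is not an IP literal,
// so it passes even though it points at an internal service.
function naiveIsAllowed(hostname) {
  return !(net.isIPv4(hostname) && isInternalIPv4(hostname));
}

// Stricter check: resolve the name first, then test every address it maps to.
// `lookup` is injectable so the check can be exercised without real DNS.
async function resolvedIsAllowed(hostname, lookup = dns.lookup) {
  const records = await lookup(hostname, { all: true, family: 4 });
  return records.every((record) => !isInternalIPv4(record.address));
}
```

Even with the stricter check, the request should then be made to the address that was validated. If the name is resolved a second time when connecting, the answer could have changed in between.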