Recommended Tech Podcasts

I think podcasts are a great way of keeping up with a topic in that otherwise dead brain time when you're travelling to work, washing the dishes or cleaning the floor. Here are a few of the best I've found over the last few years, none of which focus on any one particular technology.

Security Now (feed)

Since 2005, Steve Gibson and Leo Laporte have been talking security each week. You’ll get a summary of any high-impact or interesting security news, deep dives on technical topics and listener Q&As. You also get detailed show notes and full transcriptions of each podcast at grc.com, a service that has proved useful more than once in referring back to something I’d heard.

This is the place I first heard about Heartbleed and Shellshock. Steve’s discussion of HTTP/2 is both in-depth and straightforward, explaining a few details I’d missed in my own reading. The politics around security, privacy, advertising and encryption are also often a topic of discussion, and he recently explained how to use three dumb routers to securely operate IoT devices at home.

Episodes

Weekly, 1-2 hours. Summary of news early in the episode, deep dives later.

Recommended For

If you work in tech, you should be listening to this. If you don't, but you have any interest at all in computers, you'll probably get a lot out of it too.

Software Engineering Radio (feed)

‘The Podcast for Professional Software Developers’ has been working with IEEE Software since 2012, but has been broadcasting interviews with software industry luminaries since 2006. This is where I first learnt about REST, way back in 2008. More recently, the episodes on Redis, innovating with legacy systems, and marketing myself (which is why I’m making an effort to blog regularly!) really got me thinking.

Episodes

A little variable in timing, but normally at least one per month. 1-2 hours per episode, short introduction then straight on to the interview.

Recommended For

No prizes for guessing 'Software Developers'. I think this is a great podcast for broadening your awareness of what's going on outside whatever area you're focussing on.

CodePen Radio (feed)

CodePen lets you write and share code with others, but that’s largely incidental to the podcast. Instead, the founders Chris Coyier, Alex Vasquez and Tim Sabat talk about the challenges and choices they face building and running CodePen. One of the things I like is the discussion of mistakes and compromises – it’s food for thought and makes me feel better about the mistakes and compromises I make!

They cover a variety of topics around running a site like CodePen. They talk about how their ‘Recent Activity’ feature works, switching from running their own database to using Amazon’s RDS, and how they deal with edge cases. They also talk about the business side of things, like hiring people and getting funding.

Episodes

2-4 episodes per month. A minute or two for introductions, moving on to main topic.

Recommended For

Anyone after detailed, practical insights into building and operating a small, successful tech company in 2016. If this is something you do or want to do, I'd listen to this.

Developer Tea (feed)

Jonathan Cutrell produces ten-minute interviews and advice snippets for developers. He’s talked about prototypes, focus and ensuring professionalism. I think of this one as the super-short-form version of SERadio.

Episodes

10 minutes, 2-3 times weekly. Short intro, then content.

Recommended For

Software developers, maybe designers. The short format might work for you or not – I personally find it doesn't seem to stick as well as the longer podcasts. I think a lot of the advice here is aimed at early-career developers, but it's still worthwhile later in your career if you have time.

Wrapping Up

Have I missed any great podcasts along these lines? Let me know!

Node.js Microservice Optimisations

A few performance, scalability and availability tips for running Node.js microservices.

Unlike monolithic architectures, microservices typically have a relatively small footprint and achieve their goals by collaborating with other microservices over a network. Node.js has strengths that make it an obvious implementation choice, but some of its default behaviour could catch you out.


Cache your DNS results

Node does not cache the results of DNS queries. That means that every time your application uses a DNS name, it might be looking up an IP address for that name first.

It might seem odd that Node handles DNS queries like this. The quick version – the system calls that applications can use don’t expose important DNS details, preventing applications from using TTL information to manage caching. If you’re interested, Catchpoint has a nice walkthrough of why DNS works the way that it does and why applications typically work naively with DNS.

Never caching DNS lookups is going to really hurt your application's performance and scalability. I think the simplest solution from a developer's perspective is to add your own naive DNS cache. There are even libraries to help, like dnscache. I'd tend to err on the side of short cache expiry, particularly if you don't own the DNS names you're looking up. Even a 60-second cache will have a big impact on a system that's doing a lot of DNS lookups.
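By way of illustration, a naive cache using the dnscache module boils down to a couple of lines; the ttl and cachesize values here are examples rather than recommendations:

var dnscache = require('dnscache')({
  enable: true,
  ttl: 60,        // cache entries expire after 60 seconds
  cachesize: 1000 // maximum number of entries to keep
});

// dnscache wraps the built-in dns module, so lookups made after this point
// (including those made on your behalf by http requests) can hit the cache
require('dns').lookup('example.com', function (err, address) {
  if (err) throw err;
  console.log('resolved to ' + address);
});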

An alternative, if you are running in an environment where you have sufficient control, is to add a caching DNS resolver to your system. This might be a little more complex, but it's a better solution for some scenarios as it can take advantage of the full DNS records, avoiding the hardcoded expiry. BIND, dnsmasq and Unbound are solutions in this space, and a little Google-fu should find you tutorials and walkthroughs.
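As a rough sketch of the dnsmasq approach, a local caching forwarder only needs a few lines in /etc/dnsmasq.conf (the upstream resolver address is just an example), plus pointing /etc/resolv.conf at 127.0.0.1:

# cache up to 1000 records locally and forward misses to an upstream resolver
listen-address=127.0.0.1
cache-size=1000
server=8.8.8.8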

Reuse HTTP Connections

Based on the network traffic I’ve seen from applications and test code, Node’s global HTTP agent disables HTTP Keep-Alive by default, always sending a Connection:close request header. That means that whether the server you’re talking to supports it or not, your Node application will create and destroy an HTTP connection for every request you make. That’s a lot of potentially unnecessary overhead on your service and the network. I’d expect a typical microservice to be talking frequently to a relatively small set of other services, in which case keep-alive might improve performance and scalability.

Enabling keep-alive is straightforward if it makes sense to do so: either pass the option to a new agent, or set the http.globalAgent.keepAlive and http.globalAgent.keepAliveMsecs parameters as appropriate for your situation.
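For example, a dedicated agent with keep-alive switched on might look something like this (the host, port and timing values are placeholders rather than recommendations):

var http = require('http');

// an agent that keeps idle sockets open so they can be reused
var keepAliveAgent = new http.Agent({
  keepAlive: true,
  keepAliveMsecs: 30000
});

http.get({
  host: 'some-other-service', // placeholder for a service you call frequently
  port: 8080,
  path: '/status',
  agent: keepAliveAgent
}, function (res) {
  res.resume(); // drain the response so the socket can go back into the pool
});

If most of your traffic goes to the same handful of services, a per-destination agent like this keeps the socket pool warm without changing the behaviour of the global agent.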

Tell Node if it’s running in less than 1.5G of memory

According to RisingStack, Node assumes it has 1.5G of memory to work with. If you're running with less, you can configure the allowed sizes of the different memory areas via V8 command line parameters. Their suggestion is to configure the old generation space by adding the "--max_old_space_size" flag, with a numeric value in megabytes, to the startup command.

For 512M of available memory, they suggest a 400M old generation space. I couldn't find a great deal of information about the memory settings and their defaults in V8, so I'm using 80% as a rule-of-thumb starting point.
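As a concrete example, for the 512M case they describe, the startup command would look something like this (the script name is a placeholder):

node --max_old_space_size=400 server.js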

Summary

These tips might be pretty obvious – but they’re also subtle and easy to miss, particularly if you’re testing in a larger memory space, looping back to localhost or some local container.


Continuous Integration for Researchers?

TL;DR

Could tailored continuous integration help scientific researchers avoid errors in their data and code?

Computer Error?

Nature reported on the growing problem of errors in the computer code produced by researchers back in 2010. Last year, news hit the press about an error made in an Excel spreadsheet that undermined public policy in the UK. Mike Croucher discusses several more examples of bad code leading to bad research in his talk ‘Is your Research Software Correct?’.

It seems odd that computers are involved in these kinds of errors – after all, we write instructions down in the form of programs, complete and unambiguous descriptions of our methods. We feed the programs to computers and they do exactly what the programs tell them to do. If there's an error, the scientific method should catch it when other researchers fail to reproduce the results. So why are errors slipping through?

That’s the question that Mike and I were chewing over between talks at TEDxSHU in December 2015. I think the talks I heard there inspired me to think harder about trying to find an answer. It seems like the first step to solving the problem is reproducing results.

Reproducibility Fail

My MSc. dissertation involved processing a load of data that I was given and running programs that I’d written to draw conclusions. Although my dissertation ran to many thousands of words, it was a fairly shallow description – my interpretation, in fact – of what the data said and what the code did. I can’t give you the data or the code as there were privacy and intellectual property concerns about both.

If I'm going to tear it apart, my dissertation really describes what I intended to tell a computer to do to execute my experiment. Then it claims success based on what happened when the computer did what I actually told it to do.

If you had my code, you could run it on your own data and see if my conclusions held up. You could inspect it for yourself. You could see the tests I wrote and maybe write some yourself if you had concerns. You could see exactly what versions of what library code I was using – maybe there have been bugs discovered since that invalidate my conclusions. If you had my data you could check that my answers were at least correct at the time and are still correct on more recent versions of the libraries.

If you had my code and my data, you still wouldn't know what kind of computer I did the work on or how it was set up. Even that could change the result – remember the Pentium bug? Finally, if you had all that information, you'd still have to get hold of everything you need, wire it all up and do your verifications. That's quite a time and cost commitment, assuming you can still get hold of all that stuff months or years later.

Continuous Integration to the Rescue?

I’m sure I’ve just skimmed the surface of the problem here – I’m not a researcher myself, nor am I claiming that my dissertation was in any way equivalent to an academic paper. It’s just an example I can talk about,  and it’s enough to give me an idea. It sounds a little like the “works on my machine” problem that used to be rife in software development. One of the tools we use to solve it is “continuous integration”.

Developers push their code to a system that “builds” it independently, in a clean and consistent environment (unlike a developer’s computer!). “Building” might involve steps like getting libraries you need, compiling and testing your code. If that system can’t independently build and test your code, then the build breaks and you fix it.

A solution along these lines would have to automatically verify that everything needed to get the code running – the code itself, configuration parameters, libraries and their versions, and so forth – is present and correct. If the solution could also accept data and results, and then verify that the code runs against the data to produce those results, then it seems like we've demonstrated reproducibility.
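To make that concrete, here's a hypothetical sketch of the core check such a service might run on each upload. The command, file names and use of a checksum comparison are illustrative assumptions, not a description of any real system:

var crypto = require('crypto');
var fs = require('fs');
var execSync = require('child_process').execSync;

function sha256(path) {
  return crypto.createHash('sha256').update(fs.readFileSync(path)).digest('hex');
}

// re-run the researcher's entry point against the uploaded data
// in a clean environment (command and paths are made up)
execSync('node analysis.js --data data/input.csv --out build/output.csv');

if (sha256('build/output.csv') === sha256('expected/output.csv')) {
  console.log('Reproduced: the code regenerates the published results');
} else {
  console.error('Build broken: the output differs from the published results');
  process.exit(1);
}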

Setting up your own CI server isn't necessarily straightforward, but Codeship, SnapCI and the like show that hosted versions of such solutions work, offer high levels of privacy and (IMHO) simplify the user experience dramatically. A solution like one of these, but tailored to the needs and skills of researchers, might help us start to solve the problem.

Tailored CI for Researchers

I think that the needs of a researcher might differ a little from those of a software developer. What kinds of tailoring am I talking about? How about:

  • quick, easy uploading of code, data and results, every effort to make it “just work” for a researcher with minimal general computing skills
  • built-in support for common research computing platforms like MATLAB and Mathematica
  • simple version control applied automatically behind the scenes – maybe by default each upload of code, data and results is a new commit on a single branch
  • maybe even entirely web-based development for the commonly-taken paths (taking cloud9 as inspiration)
  • support taking your code and data straight into big cloud and HPC compute services
  • enable more expert users to take more control of the build and test process for more unusual situations
  • private by default with ability to share code, data and results with individuals or groups
  • ability to allow individuals or groups to execute your code on their data, or their code on your data, without actually seeing any of your code or data
  • what-if scenarios, for example, does the code still produce the correct results if I update a library? How about if I run it on a Mac instead of a Windows machine?
  • support for academic scenarios like teams that might be researching under a grant but then move on to other things
  • support for important publication concerns like citations
  • APIs to allow integration with other academic services like figshare and academic journal systems

I think that's the idea, in a nutshell. I'm not sure whether it's already been done or is being done now, or if not, what could happen next, so I'm punting it into the public domain. If you have any comments or criticism, or if there's anything I've skimmed over that you'd like me to talk about more, please leave me a comment or ping me on Twitter.

Finishing my MSc. Dissertation

I finished my dissertation a couple of months ago, and have since graduated. Finishing was a great feeling, but I certainly remember the time when I thought I was losing control of the whole thing. I thought my experiments would fail to produce any positive results, and I lost any confidence that I would finish at all. It was a time of sleepless nights and distracted days, but I learned I'm not alone in feeling that way whilst trying to get a dissertation to come together. To anyone else who's in that place: try not to get too stressed and negative about it. Stay focussed on what you want to achieve and keep going. If I can do it, you can – it will come together.

Here's the final result of all that work, Pattern Recognition in Computer System Events – Paul Brabban, published here in the School of Computer Science library. If you want to read it, I'd suggest having a skim over the introduction and then maybe skipping to the conclusions. If you're still interested, the detail is in the middle sections, and if you want to try to reproduce my work, there's an appendix detailing some of the implementation choices I made.

I’m lucky to have had such great tuition and support at Manchester, not to mention the excellent supervision I received for my project from Dr. Gavin Brown. I was also very happy to receive some great feedback from my external examiner,  Professor Muffy Calder at the University of Glasgow. I couldn’t have done the project without the support of the industry partner, so thanks to them and their representatives. My mum and stepbrother painstakingly proofread my later drafts and picked out any number of grammatical errors, and my wife, my friends and my family supported me and listened to me going on and on about computer science geekery.

My eternal gratitude to everyone I’ve mentioned and anyone I’ve forgotten!

A few weeks with the System76 Gazelle Pro

After writing about choosing and unboxing, I was going to write this post after two weeks of using my new laptop. It's been over a month, because I've been busy with a Coursera course and – well – the laptop has just kinda worked. In fact, it's been so uneventful that there's not all that much to write about, but I've now tried three distributions on it.

Ubuntu 12.10

It arrived as described with Ubuntu installed, and pretty much everything worked, as you'd expect. The only problem I could find was pointed out thanks to @TechHomeBacon on Twitter:

@brabster @system76 comes out of the box saying graphics “unknown”

— Tech Home The Bacon (@TechHomeBacon) April 25, 2013

However, the folks @System76 replied, explaining how to resolve the issue:

@techhomebacon @brabster 'sudo apt-get install mesa-utils' fixes the description. mesa-utils isn't installed by default.

— System76 (@system76) April 25, 2013

A minor niggle. As I said in my previous post, I’m not a fan of the Unity desktop so enough of that – the first thing I did was start again and install Kubuntu.

Kubuntu 12.10

The install of Kubuntu, a derivative of Ubuntu based on the KDE desktop, was uneventful. There were no problems and everything – sound, graphics, touchpad – worked out of the box. Not much to say, but Ubuntu and Kubuntu share the same underlying distribution and I'm already familiar with both, so I decided to try something a little more challenging.

Arch

Arch Linux is a fairly popular lightweight distribution more geared to folks who like to get their hands dirty, so the setup is more involved and exposes more of what's going on. It's not based on Ubuntu, and this machine wasn't built with Arch in mind. I should also mention that I've never used Arch before, so I was expecting more problems.

The setup was certainly more interesting, but entirely due to the more involved nature of Arch and my lack of general smarts. The hardware worked just fine, picking up the right packages without any special configuration. Dammit, still nothing juicy to talk about!

I have also noticed a couple of things working that often don't work properly elsewhere. First, Ctrl-F7 toggled my display between the laptop panel and an external monitor out of the box, which is fantastically helpful as I'm constantly plugging in an external monitor. Next, my USB hub has an ethernet port and sound hardware on board – these also both worked out of the box.

In Conclusion

So far, I would recommend it to a friend.

All the hardware works under all three distributions. Although I bought the lowest-spec i7 processor and the Intel graphics hardware is relatively modest, KDE is a joy to use, silky smooth through all the desktop effects. It’s very quiet in normal use with no discernible fan noise. The laptop keyboard has enough space and tactile feedback to be comfortable in use for extended periods – this is of course subjective, but it works well for me. The display panel is clear and bright when the ambient light isn’t so bright as to cause excessive reflections, as you’d expect.

An Aside

I find it surprising that people still write articles criticising Linux as not ready for the desktop, or the casual user. Quotes such as "Is it bad if I say that I was impressed that sound worked right out of the box?" in a recent Ars Technica article brought this to mind as I bought this laptop. My experience with a multitude of distributions over the past few years leads me to the opposite view – that many distributions tend to work without fuss and seem quite capable of meeting the needs of a typical, casual user. I may try and talk my wife (a Windows 7 user when she's not tapping and swiping on her iPad) into trying out a suitable distribution for a while, to see the experience from a more casual perspective…

Unboxing my System76 Gazelle Pro

In a previous post, I explained my reasoning behind purchasing a Gazelle Pro laptop from System76. Having never bought direct from a US company before, I had reservations: whether the machine would survive the trip in one piece, and how tax would work on the import.

TL;DR: a good experience with nothing particularly bad to note, but things to be aware of if you’re considering buying one of these:

  • Check how to pay taxes if you’re importing – you might need cash, cheque or some other antiquated mode of disbursement on delivery
  • The system comes with a US power adapter rather than one for your region but it can be worked around
  • It’s not as light or as thin as an ultrabook
  • The gloss flat panel is – well – glossy

System76 emailed me when I placed my order, then again to confirm that my payment and address had been validated and that my machine was being assembled and tested, and finally to confirm that it was on its way, with UPS tracking information. I ordered on the 30th March and it shipped on the 5th April. Not too shabby, given that the Easter holidays were in there, and within the 6-10 business days promised. So far so good.

It arrived at my door on the 10th April, exactly when the UPS tracking site said it would. The courier asked for payment of taxes on the doorstep and required payment by cash or cheque. You remember cheques, right? My grandfather swore by them.

Fortunately, I could lay my hands on my chequebook (after blowing the dust off it), because who keeps £150-ish in cash lying about? If I hadn't been able to pay by one of these methods, the package would have gone back with the courier for redelivery the following day, which would have been a pain in the backside. A bit of potential annoyance there: it's a shame UPS don't tell you on their otherwise very handy tracking site how much you're going to need to pay, and that you'll need cash or a cheque ready to take your package.

So – check exactly how you’re going to need to pay taxes. UK folks, right now, keep your cheque book handy or make sure you’ve got the cash to cover it.

Anyway. Now, I've got a package in my grubby little mitts. The outer packaging contains another cardboard box. Taking a knife to the tape reveals that inside, the laptop is cradled in a couple of foam holders, with the power brick stashed down the side. A photo follows – nothing fancy, but who cares about fancy packaging anyway? So long as the kit is in one piece.

System 76 Gazelle Pro packaging

We unpack, to find a laptop with protective plastic covers, a power brick and cable and a US keyboard component. I had the UK keyboard fitted, explaining the spare part.

Contents of the Gazelle Pro packaging

Ah – the power supply cable is for a US power outlet. Not much use for me here in the UK. It could have been a bit of a problem, but fortunately these days most laptop power bricks have a standard three-pin adapter cable between the wall socket and the brick. I swapped in my old brick's UK adapter cable and we're in business, but it's something you might need to bear in mind.

Something that's clear from the System76 brochureware, and again on removing the unit from its packaging, is that it's no Macbook Air-style ultrabook. It's not particularly light or thin, but then it's also not as expensive as those kinds of machines. To my eye it's much more a workhorse than a fashion accessory, but I like that.

The Gazelle Pro out of its protective foam packaging

Booting up confirms that the machine works perfectly and that I have the hardware spec I asked for. I cut a corner to keep the cost down a little and went for the standard glossy screen. Was that a mistake? You be the judge. Here’s the screen with the power off, indoors but with bright sunlight streaming through the window nearby.

Reflection from the Gazelle Pro gloss screen in bright sunlight when switched off

Here it is at the Ubuntu login screen, again in bright sunlight.

The Gazelle Pro glossy screen at the login prompt in bright sunlight

I’m not sure how the matte panel would fare, but this unit, as is typical of glossy panels, isn’t going to work well in bright light. Still, I bought the unit knowing that this would be the case, so I’d generally be using it in much more subdued lighting conditions. Things are much better after drawing the curtains.

The Gazelle Pro gloss panel in subdued lighting


So after unboxing, I’m pretty happy. There were only a couple of minor, easily resolvable problems to do with shipping over from the US, and I have the machine I paid for. Next time, the verdict after I’ve installed a different Linux distribution (just can’t get on with the Gnome 3 Unity interface, sorry!) and used the system in anger to do some work.

Why I bought a System76 Gazelle Pro laptop

My laptop is a little underpowered these days and I've been having a bit of trouble with up-to-date support for the AMD Radeon graphics hardware it packs, so I've been thinking about upgrading for a few months. I wanted to get a machine designed for Linux, rather than buying a Windows machine and installing my distribution du jour on it. There are a couple of reasons for this desire. First, it seems to be getting more difficult to be sure that a machine designed for Windows is going to work well with a Linux distro, thanks to features like NVIDIA Optimus and UEFI secure boot, and second, I object to paying for an operating system I have no intention of using. I'd rather my money went to the projects and supporters of the open source communities that provide the operating system I choose to use.

The only viable options I found for a well-specified bit of laptop kit designed for Linux were from System76, ZaReason and Dell. There are others providing Linux laptops, but mostly as cheap or refurbished options.

I have a couple of specific requirements other than good Linux support. I want a 15.6 inch 1080p flat panel: my eyesight is pretty good, and I value screen real estate because I use software like photo editing suites and development environments that have big, complicated user interfaces. Having run short of memory on a couple of projects recently, I want at least 8GB of memory and a decent processor. I'd like a fast hard disk or an SSD, and I also want to avoid NVIDIA and AMD graphics hardware and stick with Intel graphics, as I don't do anything that needs epic graphics power and I'd rather have graphics hardware with a good reputation for long-term Linux support.

ZaReason, a US-based company, offers the Verix 530 which comes close but packs NVIDIA graphics hardware and needs both the memory and hard drive boosting to meet my spec, bumping up the price. Dell only offers one Linux laptop which is a bit pricey in comparison to the others and doesn’t have many customisation options. In only offering one machine and whacking a “Dell Recommends Windows” banner on the pages for their Linux machine, Dell’s not building my confidence that they really know what they’re doing with Linux.

System76 won my business with their Gazelle Pro. It comes close out of the box and I can customise the couple of other options I need without breaking the bank. The important options I chose are:

  • 15.6″ 1080p Full High Definition LED Backlit Display with Glossy Surface (1920 x 1080)
  • Intel HD Graphics 4000
  • 3rd Generation Intel Core i7-3630QM Processor (2.40GHz 6MB L3 Cache – 4 Cores plus Hyperthreading)
  • 8 GB Dual Channel DDR3 SDRAM at 1600MHz – 2 X 4GB
  • 500 GB 7200 RPM SATA II HDD
  • International UK Keyboard Layout – Including Pound, Euro, and Alt GR keys

It’s a shame they’re based out of the US as it adds shipping time and cost on. I also wasn’t sure exactly what happens about paying UK taxes on the import. I put the order in last week and the machine arrived today. Next up, unboxing and first impressions!

A disaster, minimised!

I've not been blogging these last few months, what with all my spare time going into trying to do some proper computer science and then writing my dissertation. Last night, I had a catastrophe – I noticed something in my results that should be impossible and traced it back to a subtle bug that compromised all my results to date! Pretty nasty at this stage in the project…

The effect was subtle and I didn't think it would alter my conclusions. That said, to ignore it and continue wouldn't be right. The alternative, explaining the bug and its effects in my dissertation, is not something I wanted to have to do either.

After a few minutes of sitting with my head in my hands, I decided to fix it and start again. After all, it's just compute time and elbow grease – it's not like I just threw away a month's time on the LHC or anything! It turns out my decision to script everything paid off: I could throw a few tens of hours of compute time at reproducing all my data, then a couple of hours with the charting software, and I was good to go. The choice of LaTeX also turned out to be even more of a winner, as I was able to rebuild my document with the new figures and any layout modifications required almost trivially.

I was right – the conclusions do not change. However, they are now more striking, and there are no oddities that I can't explain. Tips of the day for those doing work like this:

  • script everything you can – just in case you need to redo stuff
  • use LaTeX – because you can swap out every figure for a new version easily

There are plenty of other reasons for applying these two tips, but those are two I hadn't thought of before yesterday.


Scripting Java with JavaScript

Java programs run on the Java Virtual Machine, a kind of virtual computer that hides many of the differences between the different kinds of computers it’s running on. Folks have been writing implementations of other languages that run on this virtual machine for a while now – besides JVM-specific languages like Scala and Groovy, you can also get ports of existing languages like JRuby, Jython and JavaScript.

Conveniently, the Java 6 specification (released way back in September 2006) requires official scripting support in the javax.script package, and a slightly stripped-down build of Mozilla Rhino, a JavaScript implementation, ships with the JVM.

I’ve been meaning to take a look at this for a while now, and I decided to use these facilities to solve a problem I was having in my MSc. project.

My project consists of runnable experiments that produce some kind of results over sets of data. I want to have fully set up experiments ready to run so that I can repeat or extend the experiment very easily without having to refer to notes or other documentation, which involves programs that accept configuration information and wire up components.

The Java code to do this kind of thing tends to be very verbose – lots of parsing, type-checking and an inability to declare simple data structures straight into code. It’s tedious to write and then hard to read afterwards. Using JavaScript to describe my experiment setup looked like a good solution.

Example: creating a data structure that provides two named date parameters in Java, as concisely as I can:

package com.crossedstreams.experiment;

import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

public class RunExperiment {
  public static void main(String[] args) throws Exception {
    SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

    Map<String, Date> config = new HashMap<String, Date>();

    config.put("start", format.parse("2012-02-01 00:00:00"));
    config.put("end", format.parse("2012-02-08 00:00:00"));

    // do stuff with this config object...
  }
}

That’s a lot of code just to name a couple of dates! The amount of code involved hides the important stuff – the dates. Now, achieving the same with JavaScript…

var config = {
  start: new Date("February 1, 2012 00:00:00"),
  end: new Date("February 8, 2012 00:00:00")
}

// do stuff with this config

When there are many parameters and components to deal with, it gets tough to stay on top of everything. Some of what I'm doing involves defining functions to implement filters and generate new views over data elements, and JavaScript helps again here, letting me define my implementations inline as part of the configuration:

filter: new com.crossedstreams.Filter({
  accept: function(element) {
    return element == "awesome";
  }
})

This approach isn't without problems. For example, there's some ugliness when it comes to using Java collections in JavaScript and JavaScript objects as collections. To be expected, I guess – they are different languages that work in different ways, so there's going to be some ugliness at some of the interfaces, maybe even some interpretation questions that don't have one right answer.
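As an illustration of the kind of thing I mean, here's a sketch of two helpers for copying between Java lists and JavaScript arrays. It assumes Rhino's usual wrapping of Java objects, where methods like size() and get() can be called directly on the wrapped list:

// copy a java.util.List into a plain JavaScript array
function toJsArray(javaList) {
  var result = [];
  for (var i = 0; i < javaList.size(); i++) {
    result.push(javaList.get(i));
  }
  return result;
}

// and the other direction: build a java.util.ArrayList from a JavaScript array
function toJavaList(jsArray) {
  var list = new java.util.ArrayList();
  for (var i = 0; i < jsArray.length; i++) {
    list.add(jsArray[i]);
  }
  return list;
}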

Nothing I’ve come up against so far can’t be fairly easily overcome when you figure it out. I think that using Java to build components to strict interfaces and then configuring and wiring them up using a scripting language like JavaScript without leaving the JVM can be a pretty good solution.


Setting up my Project Website

One of the assessed deliverables for my MSc project is a project website, so I’ve been having a bit of a setup session this weekend.

The objectives set for the website are a little… what’s the word… vague? See what you think:

A multipage website summarizing the work so far.
– Objectives
– Deliverables
– Plan
– Literature

That's it as far as I can tell. Exactly how will the delivered work be assessed? Your guess is probably about as good as mine. Having looked at the discussion forum for the module (the full-timers did this in the first half of the year – I've been told I set my own deadlines when it comes to the project stuff, as I'm not a full-time student), it seems the marking scheme was quite severe, with many complaints about low marks and little evident explanation, so I'll make some enquiries before I start work on the content proper.

Back in April, I asked how the website deliverable should be ‘handed in’ and was told that a zip with some files in it would be fine.

Screw that.

I mean, seriously – the world has moved on. To be even vaguely interesting, I’m thinking about reusing relevant content from this blog, and some of the tooling I’m using like Ganttproject saves XML data that’s crying out for some transformation and JavaScript magic.  I have my own domain name and there’s an opportunity here to learn some stuff about infrastructure (and I am doing this MSc. to learn stuff in the first place), so I’ve been setting up a server. Again, checking back on the forums, some of the other students went the same route and there’s no evidence of it harming their chances. I think hosting the project website as a subdomain of crossedstreams.com makes sense – I already own the domain name and subdomains are a simple matter of extra DNS records, which is dead easy to set up with my provider, getNetPortal.

I shan’t be hosting my site on getNetPortal though. As I spend most of my professional life working on the Java EE platform, Java is the obvious choice. Why not use a different language for the experience? Whilst I’ve got the time to learn a bit about hosting a public-facing website, I’m not sure I’ll have the time to learn a new way of creating websites that I’ll be happy with… not to mention that there’s a toolset and delivery pipeline that varies from platform to platform. Playing about with Erlang or some such will have to wait for another day.

GetNetPortal do host Java web applications, but it’s a shared Tomcat environment with a bunch of limitations as well as apparently risks to other people’s app availability if I deploy more than three times in a day. So where else can I go? Other specialised hosting companies are out there, but they’re not exactly cheap…

So I’ve provisioned myself a server on Amazon’s Elastic Compute Cloud (Amazon EC2). Amazon provide a bunch of images themselves and one of them happens to be a Linux-based 64bit Tomcat 7 server. Time between me finding the image I wanted and having a working server available? About five minutes. No matter how you cut it, that’s pretty awesome. To be honest, the biggest challenge was choosing an image – there’s a huge number to choose from and I tried a couple of other images that weren’t as well set up before settling on the Amazon-provided one. The best thing – EC2 is pay-as-you-go, at dirt cheap rates for low utilisation.

For those of you who haven’t seen EC2, here’s a couple of screenshots that might help explain what it’s all about. First up, let’s take a look at the application server I provisioned.

AWS Management Console with my instances

Checking my bill tonight, I can see an itemised account of exactly what I’ve been billed for. Being able to see this level of detail should let me stay in control of what I’m spending.

Amazon Web Services - Billing

The rest of my time has been spent having a look around my new server. I set up Tomcat to host a placeholder app in the root context, and iptables to route traffic from the privileged ports 80 and 443 to the ports Tomcat is listening on (8080 and 8443), avoiding the need to install a dedicated webserver or run Tomcat with root privileges. I also set up some self-signed SSL certificates – I'll need those so that I can bring up apps that require logon; without SSL, those usernames and passwords would be floating around the internetz in the clear, negating the point of their existence. Finally, I scripted up the setup process in case I need to set this stuff up again.
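For reference, the port redirection part boils down to a couple of iptables rules along these lines (a sketch rather than the exact commands):

# redirect the privileged HTTP and HTTPS ports to the ports Tomcat listens on
iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-ports 8080
iptables -t nat -A PREROUTING -p tcp --dport 443 -j REDIRECT --to-ports 8443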

Now, I can tick off the project tasks around setting up hosting nice and early. Quite a productive weekend!