Pattern-Based Software Dev – Day 1

I got a couple of great surprises this morning on turning up in Manchester for the module starting today.

First up, the lectures were originally timetabled for a 9:30am start and have been moved to 11:00am. That gives me loads of time between arriving in Manchester at 08:00 and starting lectures to eat, get to the library, do any admin stuff that’s easier when I’m onsite, and generally chill out before getting started.

Second – I signed up for ‘IBM Patterns for e-Business Applications’ because I wanted to get some Software Engineering coverage as part of my MSc, and there was some coverage of design patterns in the syllabus for this module in 2009. I was in two minds about it; studying something with ‘IBM’ on it didn’t seem entirely right for an academic course.

To my surprise, the course has been re-branded ‘Pattern-Based Software Development’ overnight, and a complete re-write of the lectures has started to appear, focusing on understanding and applying some of the GoF design patterns – pretty much the exact course I wanted to take. I’ve studied and applied some of the GoF patterns before, and I’m really looking forward to learning the syllabus and having my work critically reviewed.

As an aside, it looks like the Manchester CS department is completely re-working its taught MSc in Advanced Computer Science, organising the modules into ‘pathways’ like Artificial Intelligence and Natural Language Processing. Looks like a good move to me – helpful for students choosing modules.

The lecture material introduced the Strategy, State, Proxy and Item Description patterns. The first three are pretty well known, but it’s the first time I’ve come across the last one.

Coursework material involves UML class diagrams and designing a system to solve a loosely defined business problem. Unfortunately, it seems that good UML tools are tough to find. After a few days of battling with the Eclipse project’s UML2 plugin, I’ve come to the conclusion that I don’t much like it for simple diagramming. I’ve tried a few other tools with limited success, and there are just a couple left to try. It might be that you do have to pay $$$ to get a good one – but we’ll see.

Finishing up the Machine Learning Module

Well, the Machine Learning exam was this morning… another 5:30 am start to get to Manchester in plenty of time.

My Top Tip for distance learning today has to be: if you have to attend classes, labs, exams – you know, stuff that you can’t really afford to miss – aim to be there an hour early.

Today, I didn’t realise that the exam wasn’t in the same building where I’d had every lecture, lab and exam so far. In fact, it was on the other side of the campus, and it’s not a small University. I was very glad of having 45 minutes in hand between checking my information and the exam starting! Totally my own fault, of course – focussed on studying and the date and time of the exam, I made an assumption – but these things happen. If you’re there early and everything works out fine, you have time to relax and centre yourself. On the other hand, if there is a problem, you’ll be very glad of that time.

The course itself was a fascinating introduction to several aspects of automated learning. Starting out with linear and nonlinear classifiers, moving on to decision trees, then probabilistic classifiers, unsupervised learning and finally sequence learning, we covered a large body of material with significant maths prerequisites.

Most of the material was quite approachable (now that I’m largely over my irrational fear of mathematical symbols – I wonder if there’s an official phobia for that?), with the notable exception of the probabilistic stuff. I’m not sure why I had such a problem with it and even after some serious digging in books I’m still not totally clear on some of it. More work needed there in the future, I fear.

Funny thing about the maths stuff – it has taken/is taking me a lot of effort to penetrate the notation. Once I can read it, though, the concept hiding underneath tends to be fairly intuitive. Go figure.

So how did the exam go? As with the last one, I can easily imagine how it might have been much tougher. Feels like it went OK, but you never know do you?

Anyway, now the immediate study pressure is off for a few weeks I’m hoping to catch up on some reading (right now, a quarter of the way through Code Complete 2, by Steve McConnell – I’d like to finish that off) and get a few more blog posts in.

Top 5 Cool Machine Learning Links

I’ve seen so much awesome stuff in my forays into Machine Learning as part of the course I’m doing that I thought I’d present, for your entertainment and information, my top 5 machine learning resources. (Kayla Harris suggested this infographic if you’re looking for a quick introduction to how ML is used in industry.)

No, come back – some of this stuff is actually quite cool, I promise!

Here goes, in no particular order:

How to make $1M with Machine Learning and Netflix

Netflix offered a $1M prize for a team that could beat their film recommendation technology by a significant margin. The netflixprize page tells the official story, and the presentation attached to this blog post is well worth a look.

Detexify – Handwritten symbol recognition

For those of you who use LaTeX, you’ll know the pain of trying to find the code for a symbol if you can’t remember it. Detexify lets you draw a symbol on the screen, then uses machine learning techniques to recognise it and tell you which codes it thinks you need. The accuracy is astonishing – a really good showcase for the potential of the techniques.

[Figure: Detexify in action]

Lego Mindstorms Robots that Learn

This JavaWorld article takes Lego Mindstorms and adds a pinch of Machine Learning to make a robot that learns to follow a path on the ground.

I highly recommend this article for a casual read; it’s very nicely written and accessible, but it does delve into the theory and mathematical foundations of the Perceptron algorithm at its heart.

Machine Learning at videolectures.net

There are 794 presentations and lectures – that’s not a typo, seven hundred and ninety-four – on every aspect of machine learning you could dream of here, at videolectures.net, from a range of sources. Many are quite approachable for the layperson.

The Singularity Summit

To wrap up, the Singularity Summit seems to be the forum for the players in the general Artificial Intelligence arena to talk about the past, future and philosophical implications of AI.

The Conversations Network hosts a free podcast series for the summit – personally, I really enjoyed James Hughes’ twenty-odd minute talk, in which he answers one of the great unanswered questions – if you’re standing on a railway bridge, are you safer stood next to an artificial intelligence or a human being?

That’s All Folks

I hope there’s something in there that’s given you some food for thought. If you have any stuff that you think is awesomely cool in this space, drop me a comment so I can check it out!

Machine Learning – After the Project

My project for machine learning got handed in on time. It took hours to strip it down from the 8 pages I had when I’d finished to the 6 pages the spec asked for. Careful removal of unnecessary waffle and merging of plots and charts was the order of the day.

I ended up scaling back my plans to explore text mining or ensemble learning to a simple comparison of some of the learning algorithms we learnt about on the course, with some exploration of slightly more advanced statistical comparison methods than we covered. The thinking was that it’d be better to try and demonstrate sound understanding of the basic algorithms and the experimental method – time will tell whether that was the right call.

Unlike how I used to do this kind of work as an undergraduate (starting with a couple of days to go before the deadline), this time I used most of the three weeks allowed to explore the options, work on the software, gather results and produce the paper. Hopefully the work will show in the result, but I suspect it’s more a case of getting the approach right to minimise the time spent figuring out what to do.

But I guess it’s another example of work expanding to fill the time allowed!

Machine Learning – Day 5

So that’s the end of the taught course in Machine Learning, finishing up learning about Markov Chains and Hidden Markov Models.

Yep, those are just links to the Wikipedia articles, and it’s quite possible that if you clicked on them and you’re anything like me, the crazy-looking maths stuff at the other end made the sensible part of your brain run off and sit in a corner, clutching its knees and rocking gently back and forth.

Probably muttering to itself.

To be honest, I can’t really explain what this stuff is about just yet – I’ve had a lot crammed into my head over the past few weeks, and I think I need a really good night’s sleep before I can comprehend the deeper aspects of this last bit. Suffice to say for now that it seems like some really interesting and powerful ideas are in play, and when I’ve got my head round it I’ll blog up my thoughts.

I’ve now got one more homework assignment on today’s material to complete by next Wednesday, and the project we’ve been assigned to do is then due on Friday 6th November – a nice surprise, as a typo on the schedule had us believe it was due on the previous Tuesday.

I’m sorry the taught part of the course is done, to be honest. Although I’m not sure I could have taken any more at the pace it was being taught, I’ve thoroughly enjoyed the material.

In fact, I’d say I feel a little inspired.

And, as James Brown might say – it feels good.

Machine Learning – Day 4

Day 4 covered methods of automatically identifying clusters in data – and some of the issues that arise using those techniques.

Doing this automatic identification is called unsupervised learning, because it doesn’t depend on having a set of labelled data examples to hand. The learning is done purely based on the statistical and probabilistic properties of the data itself.
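
To make that a bit more concrete, here’s a minimal Matlab sketch of k-means, one of the classic clustering algorithms. I’m not claiming this is exactly the version from the lectures – it’s just the standard textbook form:

```matlab
% Minimal k-means sketch. X is an n-by-d data matrix, k the number of
% clusters; idx holds the cluster assigned to each row, C the centres.
function [idx, C] = simple_kmeans(X, k)
    n = size(X, 1);
    p = randperm(n);
    C = X(p(1:k), :);                      % initialise centres on random points
    for iter = 1:100                       % cap the number of iterations
        % Assignment step: give each point to its nearest centre
        D = zeros(n, k);
        for j = 1:k
            diffs = X - repmat(C(j, :), n, 1);
            D(:, j) = sum(diffs .^ 2, 2);  % squared Euclidean distance
        end
        [dists, idx] = min(D, [], 2);
        % Update step: move each centre to the mean of its members
        newC = C;
        for j = 1:k
            members = X(idx == j, :);
            if ~isempty(members)
                newC(j, :) = mean(members, 1);
            end
        end
        if isequal(newC, C), break; end    % no movement: converged
        C = newC;
    end
end
```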

I’ve got to say, I’m struggling with the probabilistic side of things – my intuition isn’t helping me much, so I’ve been doing the books thing to try and really get my head round it. So little time…

We also covered some techniques for reducing the dimensionality of data – say a dataset has a thousand properties, and the computational overhead of processing increases with the number of properties. You’ll need some way of reducing the number of properties whilst retaining the maximal information they encode. We were looking at selecting features a couple of weeks ago, but today we looked at PCA – Principal Component Analysis, a technique to ‘project’ information into a smaller number of dimensions, or properties.

I quite like this paper on PCA, if you’re looking for somewhere to get an idea of what it’s about.

And that, if you were reading this blog a few weeks ago, is where the eigenvalues and eigenvectors come in.
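
For the curious, the core recipe is short enough to sketch in Matlab: centre the data, take the eigenvectors of its covariance matrix, and project onto the few with the largest eigenvalues. This is the textbook form, not necessarily exactly how the lecture notes derive it:

```matlab
% Minimal PCA sketch: project n-by-d data X down to r dimensions.
function [Y, W] = simple_pca(X, r)
    n = size(X, 1);
    Xc = X - repmat(mean(X, 1), n, 1);        % centre each column
    [V, D] = eig(cov(Xc));                    % eigen-decomposition of covariance
    [vals, order] = sort(diag(D), 'descend'); % largest eigenvalues first
    W = V(:, order(1:r));                     % the top r principal components
    Y = Xc * W;                               % the projected data
end
```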

We also have a project to complete in the next couple of weeks, so time is very much of the essence right now. I suspect that as with the Semi-Structured Data and the Web course last year, the deeper concepts behind some of this material will only become clear to me when I’ve completed the set material and start to work on the revision of what we’ve covered.

Back in the day, revision was time off between lessons and exams – these days, not so much!

Machine Learning – Day 3

Getting through the coursework was a challenge – my computers have never worked so hard.

The last section involved exhaustively searching for the optimal settings of two parameters in an algorithm, where each run of the computation over the data set took a few seconds. Searching over 25 possible settings per parameter doesn’t sound like a lot, but two of ’em means 625 runs – times a few seconds is quite a wait.

Oh, wait – there was also a requirement to randomly shuffle the input data for the algorithm ten times and do some simple stats, to give some chance of reducing potential bias brought about by the order in which the data appears in the data set. So that’d be 10 runs per pair of parameter settings, which is 6250 runs. Or several hours with a CPU core maxed out and a nice, toasty computer.

But hey. I got some neat 3-d mesh plots out of it, showing the performance of the algorithm over the parameter search space. Proper science, this! Sure it has faults, but Matlab’s plotting functionality is pretty neat and easy to use. Plots like the following are a doddle:

[Figure 1. Gratuitous use of a Matlab 3D plot for fun and profit]
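
For a flavour of what that sweep looked like, here’s a hedged Matlab sketch – evaluate() and data are hypothetical stand-ins for the real algorithm and data set, which I won’t reproduce here:

```matlab
% Hypothetical parameter sweep: evaluate() returns an accuracy for one run.
p1vals = 1:25;                        % 25 candidate settings per parameter
p2vals = 1:25;
acc = zeros(numel(p1vals), numel(p2vals));
for i = 1:numel(p1vals)
    for j = 1:numel(p2vals)
        runs = zeros(1, 10);
        for s = 1:10                  % 10 shuffles to reduce ordering bias
            shuffled = data(randperm(size(data, 1)), :);
            runs(s) = evaluate(p1vals(i), p2vals(j), shuffled);
        end
        acc(i, j) = mean(runs);       % simple stats over the shuffles
    end
end
mesh(p1vals, p2vals, acc');           % the gratuitous 3D plot
xlabel('parameter 1'); ylabel('parameter 2'); zlabel('mean accuracy');
```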

The goal of the exercise was to identify the most relevant ‘features’ in the data set for assigning the data to an appropriate class. Imagine you had a big table of information about a set of people, where the first column (call it ‘Feature 1’) was their heights, the second was the time of day the height was measured, and you were trying to predict their sex. You and I would guess that height would be a good indicator of sex and the time of day would be irrelevant, but we’d be cheating by applying information about the meaning of the data that exists in our heads and wasn’t supplied with the problem.

By applying a variety of methods to our table of data, a computer might be able to recognise relationships between the features and what we’re trying to predict, without knowing anything else about the information. In doing so, it could remove the features that don’t appear to have any relationship, saving the computational effort and time that would otherwise be spent processing useless data. The approaches that can be applied are various, and some degree of tuning is needed to ensure that removing features doesn’t compromise the goal in subtle ways.
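
As a toy illustration, here’s one such method sketched in Matlab – a simple correlation-based filter, which is just one of many approaches and not necessarily the one from the coursework:

```matlab
% Correlation-based feature filter: X is n-by-d numeric data, y an n-by-1
% vector of class labels encoded as numbers.
d = size(X, 2);
scores = zeros(1, d);
for f = 1:d
    c = corrcoef(X(:, f), y);         % 2-by-2 correlation matrix
    scores(f) = abs(c(1, 2));         % strength of the linear relationship
end
[ranked, order] = sort(scores, 'descend');
keep = order(1:5);                    % e.g. retain the five best features
Xreduced = X(:, keep);                % height stays, time-of-day goes
```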

Today’s lectures moved on to machine learning techniques using the perplexing mathematics of probability (perplexing for my tiny brain, at any rate), in preparation for the last two weeks, where unsupervised learning is the order of the day. The usual lab afternoon was focussed on kicking off a three-week project: applying the techniques we’re learning to do something with a bunch of data, written up in the style of a research paper.

Time to polish up the LaTeX from last year then…

Machine Learning – Day 2

Day 2 of the Machine Learning MSc module at Manchester saw us learning about Decision Trees and the role that entropy, linear correlation and mutual information can play.

It’s all about categorical data (like a name, one of a set of fixed values), whereas last week was about the automated classification of continuous data (like temperature, a smooth range of values). The algorithms we looked at automatically build decision trees, using the inherent statistical and probabilistic properties of a set of data to try and maximise decision accuracy with the minimum overhead of computation and memory.
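
To give a flavour, here’s a from-scratch Matlab sketch of the entropy and information gain calculations at the heart of decision-tree building – assuming categorical values encoded as integers, and not the library code we’re actually using:

```matlab
% Entropy of a categorical variable x: H(x) = -sum over values v of
% p(v) * log2(p(v)). Zero for a constant, maximal for a uniform spread.
function H = simple_entropy(x)
    vals = unique(x);
    H = 0;
    for i = 1:numel(vals)
        p = sum(x == vals(i)) / numel(x);
        H = H - p * log2(p);
    end
end

% Information gain of feature x about labels y: H(y) minus the entropy
% left in y once x is known. Bigger gain = better split candidate.
function IG = info_gain(x, y)
    vals = unique(x);
    Hcond = 0;
    for i = 1:numel(vals)
        mask = (x == vals(i));
        Hcond = Hcond + (sum(mask) / numel(x)) * simple_entropy(y(mask));
    end
    IG = simple_entropy(y) - Hcond;
end
```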

Today’s stuff didn’t seem too tricky, and last week’s lab assessment went pretty well.

This week, we need to use the mi() and h() functions from a Matlab Mutual Information library here. Sounds great, but I’m getting errors about undefined symbols that may be related to the 64-bit OS on this machine, so I’ll need to try a few options to work around that. Need to get that working!

Well, it’s been a long day so I’ll call a close here. Cheers!

Machine Learning – Day 1

So I made it in on time for the first day of my Machine Learning course. The train was fantastic, particularly in comparison to the tiny cattle carriage that I ended up in last Wednesday. Tip of the day – even on the same routes, not all trains are equal!

After the usual stop at the butty shop for a sausage and egg sandwich plus a coffee, I was in room 2.15 and ready for action.

So what’s Machine Learning then? Sounds very Skynet and The Matrix, doesn’t it? Dr. Gavin Brown started out explaining how ML is a subset of the field of Artificial Intelligence, which focuses on software that can analyse and adapt to the underlying patterns in the information it sees.

Other areas under the AI banner include reasoning, robotics and vision, to name but a few. This breakdown of the big, amorphous ‘thinking machines’ field of the 60s into these sub-areas is why we have made huge leaps forward in the field over the past couple of decades.

What progress? Today, Machine Learning technology finds use everywhere – some examples are recommendations in the Amazon online store, tuning computer games, filtering spam emails and detecting fraud in banking. If you’d like to know more about the motivation behind studying this stuff, you can check out these introductory slides.

The format for this module is very different to the Semi-Structured Data and the Web module. It’s still every Tuesday for five weeks, but there are no full days of lectures. Instead, the mornings are lectures and the afternoons are lab sessions.

Assessment is also different – there’s still an exam, but the coursework consists of assessed verbal lab reports for 20% and a project for 30%, with the exam counting for the remaining 50%. Whereas in the last module we were assigned to groups of two and much of the coursework was joint in nature, this time it’s all individual work.

The labs use a matrix-based programming language called Matlab. Takes a bit of getting used to, but usable enough when you start to get the hang of it.

Day 1 covered terminology, the ‘Perceptron’ algorithm (which will find a dividing line between two classes of data, if one exists) and Support Vector Machines (which try to find the ‘best’ such line, using hairy maths). If you’re interested in knowing more, Googling for ‘Machine Learning’ and these terms will turn up papers, communities and video lectures. It looks like a really active area of research.
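
For the curious, the Perceptron is simple enough to sketch in a few lines of Matlab. This is the standard textbook version (labels encoded as +1/-1), not our lab code:

```matlab
% Standard perceptron sketch: X is n-by-d, y is n-by-1 with labels +1/-1.
% Finds a separating hyperplane if the data are linearly separable
% (and would loop forever if they aren't - hence the pass cap).
function w = simple_perceptron(X, y)
    Xb = [X, ones(size(X, 1), 1)];       % append a bias input
    w = zeros(size(Xb, 2), 1);
    for pass = 1:1000                    % cap the number of passes
        mistakes = 0;
        for i = 1:size(Xb, 1)
            if y(i) * (Xb(i, :) * w) <= 0    % misclassified (or on the line)
                w = w + y(i) * Xb(i, :)';    % nudge the boundary towards it
                mistakes = mistakes + 1;
            end
        end
        if mistakes == 0, break; end     % everything correct: done
    end
end
```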

I get the feeling the focus is to be on understanding what’s going on more than any implementation details. That’s a good and a bad thing for me – I know implementation, and you can largely tell if you’ve got an implementation functionally correct by whether it does what it’s supposed to do.

This time it might be a bit less clear cut whether I’m right or wrong before I get to the assessment phase!

Induction Week 2009

Before each academic year, there’s ‘Induction Week’, where alongside the orientation stuff going on for the new students, the academics running the courses sell their wares to the students who’ve signed up to do an MSc. There’s a choice of 25-odd courses which seem to cluster around formal methods, artificial intelligence, high performance computing and the semantic web.

This year, I’ve saved up a few days’ holiday to let me attend the Wednesday and Thursday, when the course talks are going on. It also lets me sort out library books, admin stuff and the like. The 05:45 starts to get to Manchester on time are painful, mind.

I’ve transferred most of the introductory talks I was interested in seeing to my Google Calendar, so that I had my agenda for the day on my phone. That saved me potentially missing anything I wanted to see without me having to sit in the same room all day. In theory anyone who’s interested in what’s going on should be able to view my MSc calendar here. I haven’t tried to give out links to a personal public Google calendar before though, so let me know if you want to look and it doesn’t work.

It certainly felt very different this year from last. Knowing where everything is and seeing a few friendly faces makes everything much easier and more comfortable.

As for the courses, I confirmed what I want to study this year, so it’s time to get stuck into maths and Matlab ready for Machine Learning, which starts on Tuesday.