Well, the website is up…

After a week of beavering away with JSP, HTML and CSS, I reckon my project website is about ready.

I was in Manchester on Tuesday, meeting my project supervisor and one of the guys who runs the taught module associated with the project. There don’t seem to be any problems, and it helped to clarify some of the vagueness I referred to previously.

So, the website content needed to include a statement of the project aims and objectives, a summary of the progress to date, the project plan (significantly cut down from the previous detailed exposition) and a summary of the literature search so far – bringing together what I’d already done, about a week’s work so far. I also decided to take a middle road between the simplistic html-in-a-zip approach and an all-singing-all-dancing one. I’m not going to get any more marks for going nuts on this thing, so I just took the aspects that mitigate risks or save time – for example, using a custom tag library to template out the elements that would otherwise need to be duplicated, thus saving time especially when they needed to be changed. I also decided not to compromise on the HTML/CSS separation, again in the interests of making changes to stylistic aspects as simple as possible.

All three elements of the project to date save data in a text-based format: the summary is written in LaTeX; the plan saves an XML document; and the website of course is a structure made up of HTML, CSS and JSP files. This means that all three play nicely with a version control system, and I decided to give Git a whirl at the outset. In a nutshell, I’ve been making small changes, then storing those changes along with messages as part of a ‘commit’ process. These messages can be extracted, providing a kind of timeline of what I’ve been doing for the past few weeks much better than I would have done in my own notes. I can take those timestamped messages and push them into the website during the build process, then use a simple renderer to print them out on the site when certain links are clicked. Seemed like a good way to augment the ‘summary to date’ deliverable.
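
As a rough illustration of pulling commit messages out into a timeline, here is a minimal Python sketch. It is not the build step actually used for the site: the function names, the separator character and the plain-text rendering are all assumptions made for the example, and it relies only on the standard `git log` options `--date` and `--pretty=format:`.

```python
import subprocess

# Assumed field separator: the ASCII unit-separator character,
# which is unlikely to turn up inside a commit message.
SEP = "\x1f"

def parse_log(raw):
    """Turn 'timestamp<SEP>subject' lines into (timestamp, subject) pairs."""
    entries = []
    for line in raw.splitlines():
        if SEP not in line:
            continue  # skip blank or malformed lines
        timestamp, subject = line.split(SEP, 1)
        entries.append((timestamp, subject))
    return entries

def read_commit_log(repo_dir="."):
    """Ask git for one line per commit: author date, separator, subject."""
    completed = subprocess.run(
        ["git", "log", "--date=iso", "--pretty=format:%ad%x1f%s"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_log(completed.stdout)

def render_timeline(entries):
    """Render the entries as plain text, in the order git returned them."""
    return "\n".join(f"{timestamp}  {subject}" for timestamp, subject in entries)
```

Calling `render_timeline(read_commit_log("."))` inside a Git working copy would give one timestamped line per commit, newest first, which a build script could then write into a file for the site to display.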

I’ve also spent a few hours updating and tidying up this blog, as I’ve linked appropriate posts into the site as another way of tracking progress. On top of that, my hosting provider took it down over the weekend, and there was a nasty surprise with my original EC2 instance… maybe good for another post.

A Very Geeky Dilemma

A new module has appeared on the University of Manchester CS horizon, and it’s tempting me away from wrapping up the taught course with my previous front-runner ‘Ontology Engineering for the Semantic Web‘.

Yep, COMP61032 ‘Optimization for Learning, Planning and Problem Solving‘ has appeared in my field of vision and it looks a bit hardcore. It’s part of the ‘Learning from Data’ theme – I guess optimisation is a natural partner to machine learning approaches, owing to the need to chew up a whole lot of information as quickly as possible.

Why is it tempting? Lots of algorithms and computational complexity going on – it’s one of those modules that’s shouting “Bet you can’t pass me”. More than that though, it’s modules with that computational theory slant that have shown me moments of catch-your-breath clarity in the way that messy practicality distils to elegant mathematical beauty. It’s a great sense of satisfaction when you persevere and get to see it.

So – Ontology engineering, or Optimisation? Hey, I warned you it was geeky.

Optimization for learning, planning and problem-solving

Machine Learning Turing Lecture in Manchester

Dr. Christopher Bishop will be giving the Turing Lecture this year on the topic of Machine Learning.

Dr. Bishop is a highly respected figure in the Machine Learning discipline and wrote Pattern Recognition and Machine Learning, a great place to start if you’re interested in the subject. It’s certainly on my bookshelf.

He’s giving the lecture in London, Cardiff, Edinburgh and Manchester, and Manchester’s lecture is on the 17th March.

Machine Learning – Day 5

So that’s the end of the taught course in Machine Learning, finishing up learning about Markov Chains and Hidden Markov Models.

Yep, those are just links to the Wikipedia articles, and it’s quite possible that if you clicked on them and you’re anything like me, the crazy-looking maths stuff at the other end made the sensible part of your brain run off and sit in a corner, clutching its knees and rocking gently back and forth.

Probably muttering to itself.

To be honest, I can’t really explain what this stuff is about just yet – I’ve had a lot crammed into my head over the past few weeks, and I think I need a really good night’s sleep before I can comprehend the deeper aspects of this last bit. Suffice to say for now that it seems like some really interesting and powerful ideas are in play, and when I’ve got my head round it I’ll blog up my thoughts.

I’ve now got one more homework assignment on today’s material to complete by next Wednesday, and the project we’ve been assigned to do is then due on Friday 6th November – a nice surprise, as a typo on the schedule had us believe it was due on the previous Tuesday.

I’m sorry the taught part of the course is done, to be honest. Although I’m not sure I could have taken any more at the pace it was being taught, I’ve thoroughly enjoyed the material.

In fact, I’d say I feel a little inspired.

And, as James Brown might say – it feels good.

Machine Learning – Day 4

Day 4 covered methods of automatically identifying clusters in data – and some of the issues that arise using those techniques.

Doing this automatic identification is called unsupervised learning, because it doesn’t depend on having a set of labelled data examples to hand. The learning is done purely based on the statistical and probabilistic properties of the data itself.
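
To make the idea concrete, here is a minimal sketch of one well-known clustering method, k-means, in plain Python rather than the Matlab used in the labs. The choice of k-means, the toy points and the starting centroids are assumptions for illustration only; notice that no labels are supplied anywhere.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its points.
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its old centroid
                centroids[i] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, clusters

# Made-up data: two obvious blobs, and no labels at all.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
print(centroids)
print(clusters)
```

The result depends on where the centroids start, which is one of the issues that can arise with this kind of technique.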

I’ve got to say, I’m struggling with the probabilistic side of things – my intuition isn’t helping me much, so I’ve been doing the books thing to try and really get my head round it. So little time…

We also covered some techniques involving reducing the dimensionality of data – say a dataset has a thousand properties, and the computational overhead of processing increases with the number of properties. You’ll need some way of reducing the number of properties, whilst retaining as much of the information they encode as possible. We were looking at selecting features a couple of weeks ago, but today we looked at PCA – Principal Component Analysis, a technique to ‘project’ information into a smaller number of dimensions, or properties.

I quite like this paper on PCA, if you’re looking for somewhere to get an idea what it’s about.

And that, if you were reading this blog a few weeks ago, is where the eigenvalues and eigenvectors come in.
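
Here is a small sketch of that connection, under some loud assumptions: the five 2-D points are invented, the data is reduced from two properties to one, and the top eigenvector is found with simple power iteration rather than a library routine. The eigenvector of the covariance matrix with the largest eigenvalue is the direction along which the data varies most, and projecting onto it is the PCA step.

```python
import math

def covariance_matrix(points):
    """Mean and 2x2 sample covariance matrix of a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    return (mean_x, mean_y), [[sxx, sxy], [sxy, syy]]

def multiply(m, v):
    """Multiply a 2x2 matrix by a 2-vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])

def top_eigenpair(m, steps=100):
    """Power iteration: keep applying m and renormalising the result."""
    v = (1.0, 0.0)  # assumed not to be orthogonal to the answer
    for _ in range(steps):
        w = multiply(m, v)
        norm = math.hypot(w[0], w[1])
        v = (w[0] / norm, w[1] / norm)
    w = multiply(m, v)
    eigenvalue = v[0] * w[0] + v[1] * w[1]  # Rayleigh quotient for a unit vector
    return eigenvalue, v

# Invented data with two properties per item.
points = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0)]
mean, cov = covariance_matrix(points)
eigenvalue, direction = top_eigenpair(cov)

# Project each centred point onto the principal direction: two numbers become one.
projected = [(x - mean[0]) * direction[0] + (y - mean[1]) * direction[1] for x, y in points]
print(direction, eigenvalue)
print(projected)
```

The eigenvalue is the variance of the projected values, so comparing eigenvalues indicates how much information each direction retains.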

We also have a project to complete in the next couple of weeks, so time is very much of the essence right now. I suspect that as with the Semi-Structured Data and the Web course last year, the deeper concepts behind some of this material will only become clear to me when I’ve completed the set material and start to work on the revision of what we’ve covered.

Back in the day, revision was time off between lessons and exams – these days, not so much!

Machine Learning – Day 3

Getting through the coursework was a challenge – my computers have never worked so hard.

The last section involved exhaustively searching for the optimal settings for two parameters in an algorithm, where each run of the computation over the data set took a few seconds. Searching over 25 possible settings doesn’t sound like a lot, but with two parameters that means 25 × 25 = 625 runs – and at a few seconds each, that’s quite a wait.

Oh, wait – there was also a requirement to randomly shuffle the input data for the algorithm ten times and do some simple stats, to give some chance of reducing potential bias brought about by the order in which the data appears in the data set. So that’d be 10 runs per pair of parameter settings, which is 6250 runs. Or several hours with a CPU core maxed out and a nice, toasty computer.
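
The arithmetic is easy to check with a sketch of the search loop. This is not the coursework code: `toy_score` is a made-up stand-in for the real computation, and the function and parameter names are hypothetical. Only the shape of the loop (25 settings for each of two parameters, ten shuffles per pair) comes from the description above.

```python
import itertools
import random

def grid_search(evaluate, settings_a, settings_b, data, shuffles=10, seed=0):
    """Exhaustively try every pair of settings, averaging over shuffled data."""
    rng = random.Random(seed)
    results = {}
    runs = 0
    for a, b in itertools.product(settings_a, settings_b):
        scores = []
        for _ in range(shuffles):
            shuffled = data[:]      # copy, so the original order is untouched
            rng.shuffle(shuffled)   # reduce bias from the order of the data
            scores.append(evaluate(shuffled, a, b))
            runs += 1
        results[(a, b)] = sum(scores) / len(scores)
    best = max(results, key=results.get)
    return best, results, runs

def toy_score(data, a, b):
    """Made-up stand-in for the real algorithm: peaks at a=7, b=18."""
    return -((a - 7) ** 2 + (b - 18) ** 2)

best, results, runs = grid_search(toy_score, range(25), range(25), list(range(100)))
print(best, runs)  # prints: (7, 18) 6250
```

The `results` dictionary holds one averaged score per pair of settings, which is the kind of grid a 3-D mesh plot can be drawn from.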

But hey. I got some neat 3-d mesh plots out of it, showing the performance of the algorithm over the parameter search space. Proper science, this! Sure it has faults, but Matlab’s plotting functionality is pretty neat and easy to use. Plots like the following are a doddle:

Figure 1. Gratuitous use of a 3D plot for fun and profit

The goal of the exercise was to identify the most relevant ‘features’ in the data set for assigning the data into an appropriate class. Imagine you had a big table of information about a set of people, where the first column (could call it ‘Feature 1’) was their heights, the second was the time of day the height was measured, and you were trying to predict their sex. You and I would guess that height would be a good indicator of sex and the time of day would be irrelevant, but we’d be cheating by applying information about the meaning of the data that exists in our heads and wasn’t supplied with the problem.

By applying a variety of methods to our table of data, a computer might be able to recognise relationships between the features and what we’re trying to predict, without knowing anything else about the information. In doing so, it could remove the features that do not appear to have any relationship and save the computational effort and time that would otherwise be spent processing useless data. The approaches that can be applied are various, and some degree of tuning needs to be applied to ensure that removing features doesn’t compromise the goal in subtle ways.
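
One of the simplest ways to score a feature is its linear correlation with the thing being predicted. The sketch below applies that to the height and time-of-day example; the numbers are invented purely for illustration, and Pearson correlation is just one possible scoring method, not necessarily the one used in the coursework.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Invented table: two features per person, plus the class to predict (1 or 0).
heights_cm = [178, 182, 175, 185, 162, 158, 165, 160]
hour_measured = [9, 14, 11, 16, 9, 14, 11, 16]
label = [1, 1, 1, 1, 0, 0, 0, 0]

# Score each feature by the strength of its correlation with the label.
scores = {
    "height": abs(pearson(heights_cm, label)),
    "time of day": abs(pearson(hour_measured, label)),
}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
print(scores)
```

The computer ranks height first without knowing what either column means. A feature that only matters in combination with another one would slip through a per-feature score like this, which is one of the subtle ways removing features can go wrong.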

Today’s lectures moved on to machine learning techniques using the perplexing mathematics of probability (perplexing for my tiny brain, at any rate), in preparation for the last two weeks where unsupervised learning is the order of the day. The usual lab afternoon was focussed on kicking off a three-week project involving applying the techniques we’re learning to do something with a bunch of data in the style of a research paper.

Time to polish up the LaTeX from last year then…

Machine Learning – Day 2

Day 2 of the Machine Learning MSc module at Manchester saw us learning about Decision Trees and the role that entropy, linear correlation and mutual information can play.

It’s all about categorical data (like name, a set of fixed values), whereas last week was about the automated classification of continuous data (like temperature, a smooth range of values). The algorithms we were looking at automatically build decision trees, using the inherent statistical and probabilistic properties of a set of data to try and maximise the decision accuracy with the minimum overhead of computation and memory.
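
For a feel of how entropy and mutual information guide the choice of which question a decision tree asks first, here is a minimal Python sketch. The two functions are written from the standard textbook definitions and are not the API of any particular library; the toy columns are invented.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy, in bits, of a list of categorical values."""
    n = len(values)
    return -sum((count / n) * math.log2(count / n) for count in Counter(values).values())

def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

# Invented categorical data: which feature tells us more about 'play'?
outlook = ["sunny", "sunny", "rain", "rain"]
coin = ["heads", "tails", "heads", "tails"]
play = ["no", "no", "yes", "yes"]

print(mutual_information(outlook, play))  # 1.0 bit: outlook determines play
print(mutual_information(coin, play))     # 0.0 bits: coin tells us nothing
```

A tree-building algorithm can split on the feature with the highest mutual information with the class, then repeat the process on each branch.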

Today’s stuff didn’t seem too tricky, and last week’s lab assessment went pretty well.

This week, we need to use the mi() and h() functions from a Matlab Mutual Information library here. Sounds great, but I’m getting errors about undefined symbols when I use it, which may be related to the 64-bit OS on this machine, so I’ll need to try a few options to work around that. Need to get that working!

Well, it’s been a long day so I’ll call a close here. Cheers!

Machine Learning – Day 1

So I made it in on time for the first day of my Machine Learning course. The train was fantastic, particularly in comparison to the tiny cattle carriage that I ended up on last Wednesday. Tip of the day – even on the same routes, not all trains are equal!

After the usual stop at the butty shop for a sausage and egg sandwich plus a coffee, I was in room 2.15 and ready for action.

So what’s Machine Learning then? Sounds very Skynet and The Matrix, doesn’t it? Dr. Gavin Brown started out explaining how ML is a subset of the field of Artificial Intelligence, which focuses on software that can analyse and adapt to the underlying patterns in the information it sees.

Other areas under the AI banner include reasoning, robotics and vision, to name but a few. This breakdown of the big, amorphous ‘thinking machines’ field as it was in the 60s into these sub-areas is why we have made huge leaps forward in the field over the past couple of decades.

What progress? Today, Machine Learning technology finds use everywhere – some examples are the Amazon online store (selecting your recommended stuff), tuning computer games, filtering spam emails and fraud detection in banking. If you’d like to know more about the motivation behind studying this stuff, you can check out these introductory slides.

The format for this module is very different to the Semi-Structured Data and the Web module. It’s still every Tuesday for five weeks, but there are no full days of lectures. Instead, the mornings are lectures and the afternoons are lab sessions.

Assessment is also different – there’s still an exam, but the coursework consists of assessed verbal lab reports for 20% and a project for 30%. The exam counts for 50%. Whereas in the last module we were assigned to groups of two and much of the coursework was joint in nature, this time it’s all individual work.

The labs use a matrix-based programming language called Matlab. Takes a bit of getting used to, but usable enough when you start to get the hang of it.

Day 1 covered terminology, the ‘Perceptron’ algorithm (will find a dividing line between two classes of data, if one exists) and Support Vector Machines (tries to find the ‘best’ such line, using hairy maths). If you’re interested in knowing more, Googling for ‘Machine Learning’ and these terms will find papers, communities and video talks and lectures. It looks like a really active area of research.
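
As a taste of the first of these, here is a minimal sketch of the classic perceptron learning rule in Python, not the Matlab code from the labs. The six points and their labels are made up, and the function names are hypothetical. The loop nudges the line every time it gets a point wrong and stops once a full pass makes no mistakes, which is guaranteed to happen eventually only if a dividing line exists.

```python
def train_perceptron(samples, labels, epochs=100, learning_rate=1.0):
    """Learn weights w and bias b so that sign(w.x + b) matches each label (+1 or -1)."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for (x, y), target in zip(samples, labels):
            activation = w[0] * x + w[1] * y + b
            if target * activation <= 0:  # wrong side of the line, or exactly on it
                w[0] += learning_rate * target * x
                w[1] += learning_rate * target * y
                b += learning_rate * target
                mistakes += 1
        if mistakes == 0:
            break  # every point is on the correct side
    return w, b

def predict(w, b, point):
    """Classify a point as +1 or -1 according to which side of the line it falls on."""
    return 1 if w[0] * point[0] + w[1] * point[1] + b > 0 else -1

# Made-up, linearly separable data: two classes of 2-D points.
samples = [(2, 3), (3, 3), (3, 4), (-1, -1), (-2, -1), (-1, -3)]
labels = [1, 1, 1, -1, -1, -1]
w, b = train_perceptron(samples, labels)
print(w, b)
print([predict(w, b, s) for s in samples])
```

The line it finds is just the first one that works. Picking the ‘best’ such line, with the widest margin between the classes, is where Support Vector Machines come in.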

I get the feeling the focus is to be on understanding what’s going on more than any implementation details. That’s a good and a bad thing for me – I know implementation, and you can largely tell if you’ve got an implementation functionally correct by whether it does what it’s supposed to do.

This time it might be a bit less clear cut whether I’m right or wrong before I get to the assessment phase!

Eigenvalues and Eigenvectors

Yeah, exactly. Eigen-whats?

Welcome to the primer material for the Machine Learning module. It looks pretty mathsy, specifically Linear Algebra (think matrix algebra and Eigen-dooflabs), Differentiation and Integration and some probability and information theory.

Yeah, it looks tough. But I’m intrigued, too. Studying the material, I can’t wait to find out how these things actually apply to machine learning. Something inside my head that romantically pursues elegance in this stuff is thinking of some analogy of resonance and harmonics – but applied to learning algorithms. Probably way off base, but hey. Soon I will be highly learned in these things.

The tutors actually have a dedicated website for the course here, which is where all the primer material, previous years’ notes and past exam papers can be found. It looks like a great resource for prospective students like myself, so hats off to the tutors on this one.

Category: Machine Learning

Top 5 Cool Machine Learning Links

How to make $1M with Machine Learning and Netflix

Detexify – Handwritten symbol recognition

Lego Mindstorms Robots that Learn

Machine Learning at videolectures.net

The Singularity Summit

That’s All Folks