My last exam… hopefully

My MSc. consists of six taught modules, and I sat the exam for module #6, Optimization for Learning, Planning and Problem Solving, this morning. It seemed to go pretty well, with nothing in there that I hadn’t prepared for, so with any luck there’ll be no resits and that was my last exam. At least for this MSc, anyway.

I usually post up about each day as I’m doing a module, but I didn’t this last time. The module was pretty heavy on the coursework, involving a bigger than usual time investment, plus trying to balance that with my day job and my dissertation project is tough going. To be honest, trying to split my focus over these things and still retain some semblance of a home and social life was taxing, and it felt a little like it was maybe a bit too much. That’s a depressing feeling, but hey. With this last module down, there’s one less thing I need to split my time over.

The optimization module was actually very good, covering a pretty wide range of material in enough depth to be implementable. The lecturer, Dr Joshua Knowles, made all the course materials available at the site I linked to above, along with details about further reading, self-test questions, background materials and the like, broken down by week. If you want to know what a CS module at Manchester is like, I don’t think you can do better than familiarising yourself with the background stuff on there and then trying to follow the course in sequence, completing the coursework as you go.

I might post up more about how I found the course sometime later. Right now, it’s time to get back on top of my project.

A Very Geeky Dilemma

A new module has appeared on the University of Manchester CS horizon, and it’s tempting me away from wrapping up the taught course with my previous front-runner, ‘Ontology Engineering for the Semantic Web’.

Yep, COMP61032 ‘Optimization for Learning, Planning and Problem Solving’ has appeared in my field of vision and it looks a bit hardcore. It’s part of the ‘Learning from Data’ theme – I guess optimisation is a natural partner to machine learning approaches, owing to the need to chew up a whole lot of information as quickly as possible.

Why is it tempting? Lots of algorithms and computational complexity going on – it’s one of those modules that’s shouting “Bet you can’t pass me”. More than that though, it’s modules with that computational theory slant that have shown me moments of catch-your-breath clarity in the way that messy practicality distils to elegant mathematical beauty. It’s a great sense of satisfaction when you persevere and get to see it.

So – Ontology engineering, or Optimisation? Hey, I warned you it was geeky.

Optimization for learning, planning and problem-solving

Text Mining – Day 4

Between prep for my MSc. project, getting married, being snowed under at work, starting my next MSc. module and being full of cold, there hasn’t been much time for blogging…

So today was day 4 of the Text Mining module. As a friend put it, “Text Mining? What – like using grep?”

Text Mining is defined as finding previously unknown information in unstructured data. Unknown – as in never explicitly written down.

So by ‘text’, we mean un- or partially-structured data, like Word documents or this blog page. There’s some structure here, with headings, subheadings, lists and the like, but it’s not ‘structured’ in the sense that database tables are, with fields and columns and a type system.

Tools like grep can match words (more generally, expressions describing relatively simple patterns of characters called regular expressions), so whilst they’re fairly easy to use (so long as you don’t try to push them too far), they are limited in the complexity of what they can do.
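To make that concrete, here’s a rough grep-like sketch in Python using the standard `re` module. The `grep` function and the sample lines are made up for illustration; this isn’t how grep itself is implemented.

```python
import re

def grep(pattern, lines):
    """Return the lines that contain a match for the regular expression."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]

# Hypothetical sample text, one "document" per line.
lines = [
    "I caught a fish",
    "We went fishing at dawn",
    "I fish every weekend",
    "No catch today",
]

# \b marks a word boundary, so this matches the whole word "fish" only.
matches = grep(r"\bfish\b", lines)
print(matches)
```

The pattern only sees characters, so it returns both "I caught a fish" and "I fish every weekend" without any idea that one is a noun and the other a verb.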

For example, you can’t easily use grammatical ideas, like identifying documents that are about fish (a fish), but not fishing (I fish). You can’t search for documents related to a concept, and recognising generic names or technical terms is out. You can’t build structures like indices to help with searches, which means that over reasonably large collections of documents, grep is too slow to be very useful.
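As a rough idea of the kind of index structure meant here, this is a minimal inverted index sketch in Python: a map from each word to the documents containing it, so a search becomes a lookup rather than a scan of every document. The function names and sample documents are my own invention, and real text mining tools do far more (stemming, ranking and so on).

```python
from collections import defaultdict
import re

def build_index(docs):
    """Map each lower-cased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(doc_id)
    return index

def search(index, *words):
    """Return ids of documents containing every one of the given words."""
    sets = [index.get(word.lower(), set()) for word in words]
    return set.intersection(*sets) if sets else set()

# Hypothetical documents keyed by id.
docs = {
    1: "Grilled fish recipe",
    2: "Fishing trip report",
    3: "Fish and chips",
}
index = build_index(docs)
print(search(index, "fish"))
print(search(index, "fish", "chips"))
```

The index is built once, and each query after that only touches the entries for the words asked about.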

I’m still getting my head around how it hangs together, but text mining seems like a set of gloriously messy, pragmatic and seemingly pretty successful ways to let computers listen in on the languages that humans have evolved.

Logic and Applications – Tough Exam!

I took the Logic and Applications exam last Friday. I think I’m ready now to talk about the ordeal…

It wasn’t so bad really, I guess. I made a bad call as to which questions to answer (it was one of those answer three of four kind-of-things) and ran out of time. One of the questions I initially chose had what was for me a brick wall towards the midway point, and on a two hour exam, spending 20-25 mins heading down a dead end isn’t the best idea!

I guess the two frustrations I felt with this exam were firstly that the course covered so much material so quickly, but each of the topics turned out to be a bit of a rabbit-hole when I got to thinking about it during the revision process – the more I thought about it, the more questions I found!

On top of that, one of the key aspects of a course like this is transformation of formulae into alternative forms which have properties we want – usually, more efficient solving algorithms. These transformations are rather like the algebraic manipulation of mathematical formulae we did at school – progressing in unit steps, painstakingly copying out each new form as you go. That consumes a lot of time, especially when the formulae don’t give out easily, but it doesn’t really seem to prove much about the student’s skills – the pages-of-transforms kind of work was all hammered pretty hard in the coursework, after all. Then again, maybe I just screwed something up early doors and that led to the extensive transform.

The course was new this year anyway, so maybe it takes a little time for the exams to settle in terms of difficulty. Or I’m just a dumbass. Anyway, it’s too late to worry about all that now. Hopefully, I passed – that’s the main thing, right?

Preparing for the Logic and Applications Exam

It feels like a long time between finishing the Logic and Applications course back in early November and the exam, which is next week on the 27th January. In between, I’ve done a little work on my project proposal, but certainly since late December I’ve been focussing more on preparing for the exam.

It’s always a bit surprising when I start revising how much stuff we covered in a five-week course and this one was no exception. The syllabus is here on the UoM CS website. It’s also a new course this year, so there aren’t any specific past papers (exam papers from previous years) to get a feel for how the exam will be phrased and what kind of content has been examined before.

The nearest course in previous years was the Automated Reasoning course, which covered similar stuff but also included some aspects of logic programming in Prolog. In this course we used theorem provers SPASS and MiniSAT for the small amount of experimental work involved. Hopefully there won’t be any ‘remember-the-syntax’ style questions…

Logic and Applications Day 5

So I’ve finished the Logic and Applications module now – the last coursework has been submitted and I’ve been enjoying a couple of days of doing things other than schoolwork!

The course was very focused on satisfiability – given a set of logical statements, is it possible to find an interpretation, a set of assignments for the logical statements, that satisfies them? That might not sound particularly useful, but it’s easy to express a question about a system this way.

As an example, consider the Minesweeper game. (Sorry if you’ve not come across this game before; if so, check out the Wikipedia page for an example.) Initially, we know very little and what we do know isn’t directly useful – how many squares there are. Once we’ve tried a couple of random clicks, we have some information we can use and express in these logics.

Perhaps we know that the square at 7 across and 12 up contains a mine and that the square at 8 across and 12 up has one adjacent mine. The question we might ask is whether 8,12 contains a mine and this is where satisfiability comes into play. Given logical statements encoding the rules of minesweeper, and the two details we mentioned specific to this game, is it possible that 8,12 contains a mine?

In other words, is the set of logical statements describing the rules, this game (to this point) and a statement expressing that 8,12 contains a mine satisfiable? The answer of course is easy for a human to figure out.
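Here’s a small Python sketch of that question, checked by brute force. The encoding is my own simplification rather than anything from the course: it only looks at the 3x3 block of squares around 8,12, gives each square one true/false variable meaning ‘contains a mine’, and encodes the rule that a revealed square showing a number is not itself a mine.

```python
from itertools import product

CENTRE = (8, 12)
# The 3x3 block around the centre; each cell gets an "is a mine" variable.
CELLS = [(x, y) for x in (7, 8, 9) for y in (11, 12, 13)]
NEIGHBOURS = [cell for cell in CELLS if cell != CENTRE]

def rules_and_facts(mine):
    """True if an assignment (cell -> bool) agrees with what we know so far."""
    return (
        mine[(7, 12)]                                   # known mine at 7,12
        and not mine[CENTRE]                            # a numbered square holds no mine
        and sum(mine[c] for c in NEIGHBOURS) == 1       # the "1" shown at 8,12
    )

def satisfiable(extra):
    """Try every assignment; True if one fits the rules, facts and extra claim."""
    for values in product([False, True], repeat=len(CELLS)):
        mine = dict(zip(CELLS, values))
        if rules_and_facts(mine) and extra(mine):
            return True
    return False

print(satisfiable(lambda mine: mine[CENTRE]))        # can 8,12 hold a mine?
print(satisfiable(lambda mine: mine[(9, 12)]))       # can 9,12 hold a mine?
print(satisfiable(lambda mine: not mine[(9, 12)]))   # can 9,12 be clear?
```

With nine variables there are only 512 assignments to try, so trying them all is fine for a toy like this.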

The ability to automate the search for satisfiability allows these kinds of problems to be solved quickly and accurately using tried and tested reasoning tools. The general approach we used on the course was to search for a satisfying interpretation, which quickly leads to a problem – combinatorial explosion. The number of possible interpretations of a set of logical statements grows exponentially as the number of logical variables grows.
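To put numbers on that growth, here’s a quick Python sketch. With n true/false variables there are 2^n possible interpretations, and the `interpretations` helper (a made-up name) simply enumerates them.

```python
from itertools import product

def interpretations(variables):
    """Yield every assignment of True/False to the given variable names."""
    for values in product([False, True], repeat=len(variables)):
        yield dict(zip(variables, values))

# Enumerating is fine for a handful of variables...
small = list(interpretations(["p", "q", "r"]))
print(len(small))

# ...but the count doubles with each extra variable.
for n in (10, 20, 40, 80):
    print(n, "variables:", 2 ** n, "interpretations")
```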

As such, a big focus on the course was algorithms and approaches to battle this exponential complexity – there are many techniques to reduce the space of possible interpretations and bring the complexity problem under control. That said, one of the impressions I come away with is that for a given problem no generalized technique will be able to guarantee to determine satisfiability in a reasonable time.
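As one illustration of cutting down the search, here’s a minimal sketch of a DPLL-style procedure with unit propagation in Python. I’m picking this as a standard textbook example; it isn’t necessarily one of the techniques from the course, and it leaves out nearly everything a real solver does. Clauses are lists of non-zero integers, where 3 means ‘variable 3 is true’ and -3 means ‘variable 3 is false’.

```python
def simplify(clauses, literal):
    """Assume literal is true: drop satisfied clauses, shrink the others."""
    result = []
    for clause in clauses:
        if literal in clause:
            continue                          # clause already satisfied
        result.append(clause - {-literal})    # the opposite literal can't help
    return result

def _dpll(clauses):
    # Unit propagation: a one-literal clause forces that literal to be true.
    while True:
        if any(len(clause) == 0 for clause in clauses):
            return False                      # empty clause: contradiction
        units = [next(iter(c)) for c in clauses if len(c) == 1]
        if not units:
            break
        clauses = simplify(clauses, units[0])
    if not clauses:
        return True                           # every clause satisfied
    # Otherwise guess a literal and try both truth values.
    literal = next(iter(clauses[0]))
    return _dpll(simplify(clauses, literal)) or _dpll(simplify(clauses, -literal))

def dpll(clauses):
    """Return True if the clause set (a list of lists of ints) is satisfiable."""
    return _dpll([frozenset(clause) for clause in clauses])

print(dpll([[1, 2], [-1, 2]]))           # satisfiable, e.g. with 2 true
print(dpll([[1, 2], [-1, 2], [-2]]))     # forcing 2 false leaves 1 and not-1
```

Each forced literal removes whole branches without any guessing, although in the worst case the search is still exponential.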

A final point, and an important one, is that the techniques we used were based almost entirely on mathematical proofs, that is to say that we are able to deduce that the techniques are correct and have certain other properties using formal methods. Of course, mathematical proof is formed by logical deduction too, so there’s a certain recursive nature to all this.

It’ll be good to get started on a fairly relaxed process of revision for all this stuff. The sheer volume of information thrown at us on these modules is huge, and it always feels a little like riding a rollercoaster, so it’s good to get back in there and review the material at a more relaxed pace.

No Building Web Applications

I tried the first couple of weeks of the Building Web Applications module, but I won’t be continuing with it.

I was hoping that there’d be some deep insight into the pros and cons of Java web applications, the JSF framework and RESTful Web Services, but it was pretty clear that we weren’t going to cover enough ground to make the course worthwhile. There were also some pretty idiosyncratic approaches to writing Java code going on too, so it was all a bit strange.

I could have carried on through the module as it should have been easy to get a good grade but then I’m doing the course for the challenge and opportunities to learn more than for the letters after my name. It’d be a shame to lose out on learning something else.

The software engineering modules have been a little disappointing, to be honest – where I found the Computer Science modules assumed a challenging knowledge of maths and computing, the software engineering modules seem to assume little or no prior knowledge. Maybe that’s just my perception, having been working on-and-off in software development for the past 4-5 years.

On the bright side, I did get a lead on what looks like an excellent book for getting stuck into JSF 1.2: Core JavaServer Faces by Geary and Horstmann.

That means that I just need to revise the Patterns for e-Business course for this set of exams. In Autumn I’ll continue with Computer Science modules and it’ll be once more unto the breach with the whole logic thing, so I’ve pulled a couple of books out of the library to get started with that again.

Manchester University’s CS Legacy

When I chose Manchester University for my Computer Science MSc, it was partially because of its reputation but I realized I didn’t actually know anything specific about that legacy.

I thought I’d find out a little more about some of the computing cornerstones that were laid in Manchester’s labs. Did you know that the first Random Access Memory was created there? Fast, random access memory is a core part of computer systems today. Having enough of it is crucial to making your laptop or desktop run all those applications quickly for you.

The Williams (or Williams-Kilburn) Tube was the first random access memory that could be accessed at speeds suitable for a computer. It was the ancestor of the multi-gigabyte cards you’ll find in your computer today.

Back in the days before TVs were two inches thick, the moving pictures on the screen were drawn by magnetic fields and streams of electrons in a glass tube called a Cathode Ray Tube, or CRT. Did you ever hold your hand near the screen of a CRT television and feel the static tingle? Somewhere around 1946, Tom Kilburn and Freddie Williams at Manchester University used the charge on a CRT’s phosphorescent coating to store ones and zeroes (effectively as dots), where they could be detected by a ‘pickup plate’ which lay over the ‘screen’.

As the electron beam hit the screen, a positive charge would be left behind at that position. Not for long mind you, as the charge would dissipate, but the information read by the pickup plate was used to refresh the tube before the charge had chance to leak away. This refreshing process is still required by the RAM chips in your computer today.

If you’re interested in knowing more, you can read all about it on Wikipedia and computer50.org, the sources I used to get this information.

To test the Williams Tube, the folks at Manchester built the first stored-program computer, a pretty important milestone in its own right. Maybe more on that some other time.

Pattern-Based Software Dev – Day 4

The material for day 4 focussed on Business Process Modelling. This sits orthogonally to Patterns for e-Business, defining business functions over their architecture.

There are two notations for Business Processes put forward – BPMN and UML Use Case/Activity Diagrams. My part of the coursework assignment is to apply BPM to some processes on the johnlewis.com website, for which I’ve chosen Activity Diagrams and Visual Paradigm for UML. I did take a look at the implementations of BPMN, but I found a familiar pattern – they either didn’t work or cost $$$. Fortunately, VP is still serving me well.

The lab session was spent working with my team on the coursework and setting up tasks for the rest of the week. As we’re producing a large report and taking different sections, we’ve set up a Google Docs site to drop working drafts onto to help us collaborate. It’s the first time I’ve used Google Docs like this and so far I like it. It’s responsive and intuitive, and it’s easy to share a folder with a group of people, so for this kind of work it’s looking good.

In other news, the marks for the Machine Learning module are in and I’m very happy to have passed! That’s two modules, or one-quarter of my MSc done.

Pattern-Based Software Dev – Day 3

Today was hand-in day for the first part of this module’s coursework – to design a shop, based on four requirements, using Object-Oriented design principles. Specifically, we had to use the State, Strategy and Item Description patterns, although I also worked in a couple of other patterns I like – Decorator (solves problems of composing functionality using recursion) and Iterator (hides implementation details of collections behind a simple object you can only iterate over).
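For a flavour of those last two patterns, here’s a small sketch in Python. It isn’t my coursework design, and every class name and price in it is invented for illustration. The decorators wrap anything with a `price()` method and delegate inwards, which is where the recursion comes from, and `Basket` only exposes iteration, never its internal list.

```python
from abc import ABC, abstractmethod

class Priced(ABC):
    """Anything that can report a price in pence."""
    @abstractmethod
    def price(self) -> int: ...

class Item(Priced):
    def __init__(self, name, pence):
        self.name = name
        self._pence = pence

    def price(self):
        return self._pence

class PercentOff(Priced):
    """Decorator: wraps another Priced object and discounts its price."""
    def __init__(self, inner, percent):
        self._inner = inner
        self._percent = percent

    def price(self):
        return self._inner.price() * (100 - self._percent) // 100

class GiftWrapped(Priced):
    """Decorator: wraps another Priced object and adds a flat fee."""
    def __init__(self, inner, fee=150):
        self._inner = inner
        self._fee = fee

    def price(self):
        return self._inner.price() + self._fee

class Basket:
    """Iterator pattern: callers can loop over lines but never see the list."""
    def __init__(self):
        self._lines = []

    def add(self, priced):
        self._lines.append(priced)

    def __iter__(self):
        return iter(self._lines)

basket = Basket()
basket.add(Item("tea", 200))
basket.add(GiftWrapped(PercentOff(Item("mug", 1000), 10)))
total = sum(line.price() for line in basket)
print(total)  # 200 + (900 + 150) = 1250
```

Because each decorator is itself a `Priced`, they can be stacked in any order without `Item` or `Basket` knowing about them.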

I quite enjoyed having a simple pet problem like this, with a real reason to work through some aspects of it. If you’re interested in having a go yourself, the assignment reminded me of the first pragprog Code Kata – Supermarket Pricing.

As a result of the coursework, I’ve found a UML tool I can live with – Visual Paradigm. I’ll probably do a bit of a review and compare with the other tools I tried soon, but suffice to say it was by some margin the most pleasant and easy-to-use tool of the 5 or 6 I tried. £40 needed to get rid of the invasive watermark, but it looks like when it comes to CASE tools you get what you pay for.

The lectures are proving tricky to keep on top of – the pace is kinda slow (maybe that’s just me), so I find myself struggling to maintain attention. Still, the lecture notes are very detailed, so I spent a few hours reviewing last week’s notes and creating myself some revision material. I’ve been using a piece of software called Freemind to do ‘mind-mapping’, something I found out about in a presentation by Steve Brett at last year’s unsheffield unconference. It seems to work pretty well for the way I do revision. Here’s a screenshot if you’ve not seen a mind map before.

Freemind Screenshot

So anyway, it’s all good. Coursework part 2 starts now, two more lectures in this module.