Ubuntu, Fedora or Mint?

About a month ago, after I finished my last module, I upgraded to the latest Ubuntu release, 11.04 or ‘Natty Narwhal’. My first impressions over the course of a week or two were sufficient to have me go looking elsewhere.

There were some big problems.

Ubuntu 11.04

The new Unity interface, whilst it’s very pretty, is totally unfamiliar and feels rather like a toy. The menus I used to start applications from are gone, the taskbar I used to see what was running and place shortcuts on is gone. Now to start a program there’s a glossy, full screen… thing… it’s a bit like a menu… but takes up the whole screen with big Fisher-Price icons. To see what’s running at a glance… I can’t. The idea where the title bar of a window with the window buttons and menus isn’t attached to the window and appears at the top of the screen… seriously? I hear that this idea is nicked from Apple – but it really doesn’t work for me.

I guess the idea is that you type the name of the application instead of finding it in the menus. Nicked from Windows 7, I think. If I want to find and launch applications by typing their names, I use the command line – I’m not sure I get how search instead of menus is a step forward.

Then there was the speed, or rather, the total lack thereof. Using my computer went from effortless to wading through treacle. In snowshoes. I notice performance tips and tweaks guides for 11.04 starting to appear out there, so it’s not just me. The poor performance was the dealbreaker.

Fedora 15

I downloaded Fedora 15, having previously been a user of that distro. I know that 15 ships with Gnome 3, but I didn’t realise it would be so similar to Unity, with all the same bizarre UI quirks. On the bright side, it was a lot snappier… but all in, still not really usable.

Mint

So yesterday, I pulled Linux Mint 11 off the shelf and I’m happy to say that it is a joy to use. Menus, task bars, windows that work properly, fast, easy to set up. Back to business as usual. If you’re not loving the Gnome 3/Unity thing, I can recommend Mint (so far, based on 24h usage… mileage may vary!)

Serious or Casual?

With my immediate problems addressed, the direction that Gnome and Unity are taking for Linux is interesting. Are we seeing the Linux windowing systems fragment into serious and casual use cases? I can see how the new UI might be familiar and easy for someone who is used to their tablet or their smartphone. Maybe it’s also a good angle for relatively small screen devices like netbooks and tablets – certainly the apparent ‘every pixel is precious’ mindset doesn’t make much sense on a big widescreen monitor.

I expect that broadening the appeal of an operating system is a good thing, and perhaps Ubuntu and Fedora are setting their stalls out as ‘for the casual user’. If that’s so, then thank goodness for distros like Mint that give folks who use their computers to do work the power of old(er) school Linux without the pain.

Essays on the State of the Art and Future of Text Mining

The coursework for this Text Mining module has been quite challenging. Each week we had a task to complete, along the lines of evaluating the training of a part-of-speech tagger (a piece of software that tries to tag words with the part of speech they serve), or creating a named entity recogniser (a piece of software that tries to work out that some sequences of words have meaning above their component parts – for example, “New York” means something different to “new” and “York”) using various methods. As I’ve worked through, though, the goals have become clear – we were building up components that could work in sequence to process text. Neat.
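As a toy illustration (nothing like the statistical methods from the coursework, and the function name here is made up for the sketch), a naive named entity recogniser might simply treat runs of capitalised words as candidate entities:

```python
import re

# Hypothetical toy recogniser: two or more capitalised words in a row
# are taken as one candidate entity, so "New York" is recognised as a
# unit rather than as the separate words "New" and "York".
def find_entities(text):
    return re.findall(r"(?:[A-Z][a-z]+\s)+[A-Z][a-z]+", text)

find_entities("I flew from New York to Los Angeles.")
# -> ["New York", "Los Angeles"]
```

Real recognisers are trained on annotated text and use far richer features than capitalisation, but the sketch shows the shape of the problem: grouping words into spans that carry their own meaning.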

One aspect of the coursework that was unusual was that it is all to be handed in together at the end, rather than week by week. If I’m honest it’d probably have been a little easier if I’d done the coursework in step with the lecture days – I actually fell a little behind because of various commitments.

Then there was the essay. A 3,000 word essay on the state of the art of text mining and my views for the future of the field.

I’ve not written an essay for at least 15 years now, and getting started was a real challenge. Text mining and the Semantic Web, maybe? Sentiment analysis is the future? I was pulling my hair out, trying to find an angle that I could argue cleanly through, citing academic research and the like. I’ve been screwing up outlines on bits of paper for about a week now!

That said, when I headed into Manchester yesterday and sat in my lectures, I had something of an epiphany. I guess the problem was that I feel the field has huge untapped potential, and I struggle to argue through a point of view I care about when I can’t see the current approaches panning out. I’m going to take a bit of a risk, and write an essay that (constructively) criticises some aspects of text mining today, proposing and arguing through a slightly different approach.

We’ll see how it goes – the last few bits of paper have so far avoided a one-way ticket to the bin. Hopefully I can produce a well-argued, reasonably interesting essay that I’ll get some marks for!

Text Mining – Day 4

Between prep for my MSc. project, getting married, being snowed under at work, starting my next MSc. module and being full of cold, there hasn’t been much time for blogging…

So today was day 4 of the Text Mining module. As a friend put it, “Text Mining? What – like using grep?”

Text Mining is defined as finding previously unknown information in unstructured data. Unknown – as in never explicitly written down.

So by ‘text’, we mean un- or partially-structured data, like Word documents or this blog page. There’s some structure here: headings, subheadings, lists and the like. But it’s not ‘structured’ in the sense that database tables are, with fields, columns and a type system.

Tools like grep can match words (or, more generally, relatively simple patterns of characters described by regular expressions), so whilst they’re fairly easy to use (so long as you don’t try to push them too far), they are limited in the complexity of what they can do.

For example, you can’t easily use grammatical ideas, like identifying documents that are about fish (a fish), but not fishing (I fish). You can’t search for documents related to a concept, and recognising generic names or technical terms is out. You can’t build structures like indices to help with searches, which means that over reasonably large collections of documents, grep is too slow to be very useful.
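The fish/fishing problem is easy to see with Python’s re module standing in for grep – a pattern over characters alone can’t tell the noun from the verb:

```python
import re

# grep-style search finds the string "fish", but it can't distinguish
# the noun ("a fish") from the verb ("I fish").
docs = ["I caught a fish yesterday.", "I fish every weekend."]
pattern = re.compile(r"\bfish\b")

matches = [doc for doc in docs if pattern.search(doc)]
# both documents match, even though only the first is *about* fish
```

Telling the two apart needs grammatical information – exactly the kind of thing a part-of-speech tagger provides and grep doesn’t.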

I’m still getting my head around how it hangs together, but text mining seems like a set of gloriously messy, pragmatic and seemingly pretty successful ways to let computers listen in on the languages that humans have evolved.

Logic and Applications – Tough Exam!

I took the Logic and Applications exam last Friday. I think I’m ready now to talk about the ordeal…

It wasn’t so bad really, I guess. I made a bad call as to which questions to answer (it was one of those answer-three-of-four kind of things) and ran out of time. One of the questions I initially chose had what was, for me, a brick wall towards the midway point, and in a two-hour exam, spending 20-25 mins heading down a dead end isn’t the best idea!

I guess the first of the two frustrations I felt with this exam was that the course covered so much material so quickly, yet each of the topics turned out to be a bit of a rabbit hole when I got to thinking about it during the revision process – the more I thought about it, the more questions I found!

On top of that, one of the key aspects of a course like this is transformation of formulae into alternative forms which have properties we want – usually, more efficient solving algorithms. These transformations are rather like the algebraic manipulation of mathematical formulae we did at school – progressing in unit steps, painstakingly copying out each new form as you go. That consumes a lot of time, especially when the formulae don’t give out easily, but it doesn’t really seem to prove much about the student’s skills – the pages-of-transforms kind of work was all hammered pretty hard in the coursework, after all. Then again, maybe I just screwed something up early doors and that led to the extensive transform.
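One of those unit steps – distributing OR over AND on the way to conjunctive normal form – is mechanical enough to sketch in a few lines of Python. This is a toy of my own (formulae as nested tuples), not anything we used on the course:

```python
# One CNF transformation step: distribute OR over AND, so that
# (A & B) | C  becomes  (A | C) & (B | C).
# Formulas are nested tuples: ("and", p, q), ("or", p, q), or a variable name.

def distribute(p, q):
    """Return the CNF-friendly form of (p OR q)."""
    if isinstance(p, tuple) and p[0] == "and":
        return ("and", distribute(p[1], q), distribute(p[2], q))
    if isinstance(q, tuple) and q[0] == "and":
        return ("and", distribute(p, q[1]), distribute(p, q[2]))
    return ("or", p, q)

distribute(("and", "A", "B"), "C")
# -> ("and", ("or", "A", "C"), ("or", "B", "C"))
```

The machine does in microseconds what takes pages by hand – which is rather the point of my grumble about examining it on paper.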

The course was new this year anyway, so maybe it takes a little time for the exams to settle in terms of difficulty. Or I’m just a dumbass. Anyway, it’s too late to worry about all that now. Hopefully, I passed – that’s the main thing, right?

Why I didn’t write any software for Windows Mobile

A few years ago, around 2006 at a guess, I saved up a bit of my hard-earned dollar and bought a Dell Axim X51v. It was a wonderful little device for the time and I fancied having a go at writing software for it.

So I went to the Microsoft website to find out how to do that, where I was confronted with a request for more cash. In order to write a line of code for Windows Mobile at that time, you had to shell out for licenses to use Microsoft’s IDE and developer tools. That’s on top of whatever fees that MS was getting from Dell and the license I’d bought with the device to actually run Windows Mobile.

Naturally, I baulked at the idea and never gave it a go.

Nor have I bought anything from Microsoft since – although that wasn’t a conscious decision. It’s just that since then, there hasn’t been anything that I wanted to do in terms of development that mandated some kind of payment. Case in point – my faithful little HTC Magic, succeeded by my Samsung Galaxy S. These phones are thoroughly awesome bits of kit which run on Android technology, and recently I had my first dabble in Android development.

Of course, everything you need to write software for Android is freely available on the web, and you can expect a post or two about how that’s going.

Out of curiosity, I checked back in on Microsoft, and it sure looks like you can write for Windows Mobile these days for free. Would it still cost money to write for Windows Mobile if the competition wasn’t giving away their goodies for free? I also had a look at Apple’s tooling to build stuff for the iPhone but I couldn’t work out if it’s free right now or not. (I couldn’t be bothered to look for more than a minute or two to be honest – any readers know?)

I wonder if my decisions since then would have played out any differently if I’d been able to just download the stuff I needed to have a go back in ’06? Who knows, I might have gotten hooked on Microsoft tools like Visual Studio.

Preparing for the Logic and Applications Exam

It feels like a long time between finishing the Logic and Applications course back in early November and the exam, which is next week on the 27th January. In between, I’ve done a little work on my project proposal, but certainly since late December I’ve been focussing more on preparing for the exam.

It’s always a bit surprising when I start revising how much stuff we covered in a five-week course, and this one was no exception. The syllabus is here on the UoM CS website. It’s also a new course this year, so there aren’t any specific past papers (exam papers from previous years) to get a feel for how the exam will be phrased and what kind of content has been examined before.

The nearest course in previous years was the Automated Reasoning course, which covered similar stuff but also included some aspects of logic programming in Prolog. In this course we used the SPASS theorem prover and the MiniSAT SAT solver for the small amount of experimental work involved. Hopefully there won’t be any ‘remember-the-syntax’ style questions…
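For a feel of what MiniSAT decides (far more cleverly than this), here’s a brute-force satisfiability check in Python, using the usual integer encoding of clauses – positive numbers for variables, negative for their negations. A toy sketch of the problem, not of how MiniSAT actually works:

```python
from itertools import product

# (x1 OR NOT x2) AND (x2 OR x3), each clause a list of integer literals
clauses = [[1, -2], [2, 3]]

def satisfiable(clauses, n_vars):
    """Try every truth assignment; return True if one satisfies all clauses."""
    for bits in product([False, True], repeat=n_vars):
        assign = {i + 1: b for i, b in enumerate(bits)}
        if all(any(assign[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False

satisfiable(clauses, 3)  # -> True (e.g. x1=False, x2=False, x3=True works)
```

Exhaustive search like this blows up exponentially, which is exactly why the clever clause-learning search inside solvers like MiniSAT matters.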