London, day 0.5

September 3rd, 2010, 12:16 am UTC by Greg

As I write this, it is the morning of our first full day in London. We got in yesterday mid-day.

So far, so good. We are staying in a hotel just around the corner from Trafalgar Square, which is within walking distance of a lot of stuff.

We found the hotel with not more than 10 minutes of walking in the wrong direction. After checking in, we walked the neighbourhood for a bit and walked by Buckingham Palace.

My initial impressions of London:

  • I miss streets that meet at right angles. Take for example this intersection near our hotel. It appears on our pocket map as five streets that come together, but on the ground it is 100 m of roundabout where “we want to go straight” is not a useful thing to have deduced from the map.
  • When in China, the dominant feeling was “wow, everything’s big”. Tiananmen Square, for example, is almost incomprehensibly huge. Here: everything so far has been smaller than I imagined it. Buckingham Palace: not all that big. I think I should be more in the New York mindset: everything very dense and close together.
  • British pubs are funny places. Need to investigate further.

Europe!

August 31st, 2010, 11:09 am UTC by Greg

Earlier this year, Kat got invited to speak at a conference in England, which is awesome. What’s more awesome (from my perspective, at least) is that I’m not teaching in the fall. If you put two and two together, you can see that we have half of our trip to England paid for, and time to spend if we go.

So, we’re going.

We leave tomorrow, and are seeing London, Brighton (where the conference is), Barcelona, and a Mediterranean cruise. On the way back, we’re making a pit stop in Ontario to see my parents. All of that will take most of September: we return Sept 25.

I don’t have much to say about it at this point, other than this is why I haven’t been returning anybody’s emails: too much to get ready before we go, and no time to see anybody either.

We have given preference to hotels with internets, so there is some hope we’ll post some updates during the journey.

CMPT 470: feedback wanted

August 26th, 2010, 4:21 pm UTC by Greg

Along with my first offering of CMPT 383, I just finished my 13th offering (!) of CMPT 470. I haven’t changed the backbone of the course much in that time: it mostly feels good to me, and other than moving with shifting web technologies, I haven’t felt the need to change the course style.

But now I’m taking a good hard look at the course. I still like the overall flow, but there are some things I want to change.

I did a survey of the current students to get some feedback, but they lack perspective, having just finished the course. I figure I can get some eyeballs from course alumni here and am looking for some more meaningful feedback.

Question 1: Weekly Exercises and Grading Scheme

When I did CMPT 383, I gave weekly exercises, thinking that they might feel a little bit hand-holdey for an upper-division course. Much to my surprise, they worked better there than they do in 120 and 165: more-senior students are in a much better position to appreciate the micro-lessons that the exercises encapsulate and better understand why they are helpful. It’s also a chance to give problems on everything, not just a few things in major assignments.

I have realized that I want to do weekly exercises in CMPT 470, replacing the three assignments. The problem is: the assignments are worth 30% of the course. The weekly exercises would receive minimal marking and feedback (likely marking scheme: 2=most/everything correct, 1=some stuff done, 0=little/nothing done). With that little “grading”, 30% is too much to give to them: 20% is more reasonable.

So, I have 10% of the final grade to reallocate somewhere. Any suggestions about where an extra 10% of weight should be distributed? (The old grading scheme is online.)

[To give you an idea, I'm imagining that some of the exercises will be like "learn these three important CSS techniques and use each to style this sample page"; "find security holes in this sample mini-app I have created for you"; "pick Rails/Django/whatever and do the tutorial on their site"; "deploy your tutorial code on your group's web server"; "do something with jQuery"]

Question 2: Content

I have certainly done my best to keep with the times, and talk about new web-related topics as they have become relevant. But like I said before: the overall backbone of the course has remained the same.

Are there things that I should have spent more lecture time on than I did? Things that took up too much time?

I definitely want to move JavaScript stuff a little earlier in the course: it deserves to be at least a little more front-and-centre than it has been.

Question 3: Other Stuff?

I have a few other smaller tweaks in mind, and am open to other feedback.

In particular, I plan to (explicitly) open the technology evaluation to a wider array of technologies: JavaScript frameworks, databases. This past semester, I started to realize that the server-side frameworks (Django, Rails, Cake, …) are all fundamentally the same (at the depth that’s possible in the techeval). There are other pieces of technology that are more interesting choices at this point, and they might as well evaluate those.

I’m happy to take any half-baked thoughts on any of this here, or by email.

And that’s how you teach CMPT 383

August 22nd, 2010, 10:48 pm UTC by Greg

I have now completed my first offering of CMPT 383, Comparative Programming Languages.

I had forgotten how much work a new course prep is, particularly as I am anal-retentive enough to not be able to make much use of any other instructor’s course materials. Other instructors just do things… wrong. The only way for a course to feel right is to do it my way, for myself. Giving lectures from somebody else’s notes is like wearing somebody else’s underwear: technically probably just fine, but you just feel dirty.

That’s not to say other people who teach the same courses I do do a bad job: they are generally excellent instructors teaching excellent courses. They just do it wrong, is all.

But, looking at my plan for 383, I came in pretty close to the plan. The final balance of topics was more like 6 weeks, 4 weeks, 3 weeks, but that’s astonishingly close for somebody who usually just stops somewhere around the midterm and thinks “does that feel like about half of the material? Okay good.”

Overall, I’m very happy with it. First offerings of a course are supposed to be bumpy and full of things that you wish you could have done better. Honestly, this was one of my favourite course offerings ever: there are tweaks I’d do for my next offering, but all are fairly minor.

Specifics:

  • The weekly exercises were (to my mind, at least) a total win. My goal throughout was basically to say “remember that thing I talked about this week? Practice it” and I think it worked for the students. I liked them to the point that I’m planning that every course I teach from now on will have weekly exercises, including 470. (More on 470 in a later post.)
  • Some of the more involved examples I put together were among my favourite learning objects ever. (God, I can’t believe I just used the term “learning objects”. I have become everything I hate.)
  • I think I actually convinced them that Haskell was practical. Was that irresponsible?
  • Prolog sucks, but I’m still convinced it’s a worthwhile exercise.
  • The “language concepts” section felt a bit like a laundry list of topics. I don’t know that there’s really any way around that. Maybe I could re-order things a bit so they flow together better.
  • The project was interesting for all concerned. I’d probably cut down to three or four language choices in the future, just to keep the TA from losing his mind.
  • I’m not particularly happy with the exams, but I’m never happy with my exams.
  • Ted was an invaluable sounding board throughout the semester, taking time he didn’t have to listen to my meanderings on the course. Thanks be to Ted, who will do an excellent job teaching the course in the fall. (Excellent, but wrong.)

The feedback I have had from the student side has been very good so far (with the real teaching evaluations still outstanding). I have never before had so many students who had nothing to do with a course talk to me about it. Random students in the hall thought my project was a good idea; everybody and their dog knew about my first assignment; people with friends in the course want to know when I’m teaching it again.

I’ll take that as creating a “buzz” and call it a good thing.

P ≠ NP

August 7th, 2010, 8:21 pm UTC by Greg

An email I was recently forwarded (a couple of steps removed) from Vinay Deolalikar from HP Labs:

Dear Fellow Researchers,

I am pleased to announce a proof that P is not equal to NP, which is attached in 10pt and 12pt fonts.

The proof required the piecing together of principles from multiple areas within mathematics. The major effort in constructing this proof was uncovering a chain of conceptual links between various fields and viewing them through a common lens. Second to this were the technical hurdles faced at each stage in the proof.

This work builds upon fundamental contributions many esteemed researchers have made to their fields. In the presentation of this paper, it was my intention to provide the reader with an understanding of the global framework for this proof. Technical and computational details within chapters were minimized as much as possible.

This work was pursued independently of my duties as a HP Labs researcher, and without the knowledge of others. I made several unsuccessful attempts these past two years trying other combinations of ideas before I began this work.

Comments and suggestions for improvements to the paper are highly welcomed.

The paper is about 100 pages, and looks serious (but being a decade away from last thinking about complexity, I am unable to give any more useful evaluation than that). I’ll refrain from posting the paper itself.

Deciding whether P ≠ NP is a Millennium Prize Problem, and I don’t think I’d get much argument if I said it is the biggest open problem in computing science.

Update: I see Deolalikar has uploaded the paper. I should point out that in the email thread I got, Stephen Cook said “This appears to be a relatively serious claim to have solved P vs NP.”

Update: Huh, slashdotted. I think “broke” the story is a little strong, but anyway… any media wanting comment on this story, I’d suggest my colleagues David Mitchell (whose work was cited by Deolalikar in this paper), Valentine Kabanets, or Pavol Hell (who also do research in this area).

Update 08/09: Richard Lipton is posting excellent commentary in his blog.

Everything I know about databases is wrong. Also, right.

June 24th, 2010, 12:48 pm UTC by Greg

I have been teaching CMPT 470 for six years now, with my 13th offering going on right now. Anybody doing that is going to pick up a thing or two about web systems.

I was there for the rise of the MVC frameworks and greeted them with open arms. I watched Web 2.0 proclaim “screw it, everything is JavaScript now” and listened with suspicion, but interest. I am currently watching HTML5/CSS3 develop with excitement but wondering why nobody is asking whether IE will support any of it before the sun burns out.

There’s another thing on the horizon that is causing me great confusion: NoSQL.

The NoSQL idea is basically that relational databases (MySQL, Oracle, MSSQL, etc.) are not the best solution to every problem, and that there is a lot more to the data-storage landscape. I can get behind that.

But then, the NoSQL aficionados keep talking. “Relational databases are slow” they say. “You should never JOIN.” “Relational databases can’t scale.” These things sound suspicious. Relational databases have a long history of being very good at their job: these are big assertions that should be accompanied by equally-big evidence.

So, I’m going to try to talk some of this through. Let’s start with the non-relational database types. (I’ll stick to the ones getting a lot of NoSQL-related attention.)

Key-value stores
(e.g. Cassandra, Memcachedb) A key-value store sounds simple enough: it’s a collection of keys (that you look up with) and each key has an associated value (which is the data you want). For Memcachedb, that’s exactly what you get: keys (strings) and values (strings/binary blobs that you interpret to your whim).

Cassandra adds another layer of indirection: each “value” can itself be a dictionary of key-value pairs. So, the “value” associated with the key “ggbaker” might be {"fname":"Greg", "mi":"G", "lname":"Baker"}. Each of those sub-key-values is called a “column”. So, the record “ggbaker” has a column with name “fname” and value “Greg” (with a timestamp). Each record can have whatever set of columns is appropriate.
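To make the two shapes concrete, here is a minimal sketch in Python using plain dictionaries. None of this is the real Memcachedb or Cassandra API: `set_column` and `get_column` are hypothetical helpers, and the "newest timestamp wins" rule is an assumption made for the illustration of what the per-column timestamp is for.

```python
import time

# A plain key-value store (Memcachedb-style): string keys, opaque values.
kv_store = {}
kv_store["ggbaker"] = b'{"fname":"Greg","mi":"G","lname":"Baker"}'

# A Cassandra-style store: each key maps to a dictionary of columns,
# and each column holds a (value, timestamp) pair.
column_store = {}

def set_column(store, key, name, value, timestamp=None):
    """Write one column of one record; an older timestamp never overwrites a newer one."""
    if timestamp is None:
        timestamp = time.time()
    record = store.setdefault(key, {})
    current = record.get(name)
    if current is None or timestamp >= current[1]:
        record[name] = (value, timestamp)

def get_column(store, key, name):
    """Return the column's value, or None if the record or column is missing."""
    column = store.get(key, {}).get(name)
    return None if column is None else column[0]

set_column(column_store, "ggbaker", "fname", "Greg", timestamp=1)
set_column(column_store, "ggbaker", "lname", "Baker", timestamp=1)
set_column(column_store, "ggbaker", "fname", "Gregory", timestamp=0)  # older write, ignored
```

The key-value store only ever hands back the whole blob, while the column version lets you read or write one named piece of a record at a time.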

Document stores
(e.g. CouchDB, MongoDB) The idea here is that each “row” of your data is basically a collection of key-value pairs. For example, one record might be {"fname":"Greg", "mi":"G", "lname":"Baker"}. Some other records might be missing the middle initial, or have a phone number added: there is no fixed schema, just rows storing properties. I choose to think of this as a “collection of JSON objects that you can query” (but of course the internal data format is probably not JSON).

Mongo has a useful SQL to Mongo chart that summarizes things nicely.
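The “collection of JSON objects that you can query” picture can be sketched with a list of Python dictionaries. This is only a toy model, not how CouchDB or MongoDB work internally: `find` is a hypothetical helper, and every record other than the first is made-up sample data.

```python
# A toy "document store": a list of dicts with no fixed schema.
people = [
    {"fname": "Greg", "mi": "G", "lname": "Baker"},
    {"fname": "Alice", "lname": "Example"},                      # no middle initial
    {"fname": "Bob", "lname": "Example", "phone": "555-0100"},   # extra property
]

def find(collection, **criteria):
    """Return the documents whose fields match every given criterion."""
    return [
        doc for doc in collection
        if all(key in doc and doc[key] == value for key, value in criteria.items())
    ]

same_family = find(people, lname="Example")
with_initial = find(people, mi="G")
```

Documents that lack a queried property simply never match, which is the schema-free behaviour described above.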

Tabular
(e.g. BigTable, HBase) The big difference here seems to be that the tabular databases use a fixed schema. So, I have to declare ahead of time that I will have a “people” table and entries in there can have columns “fname”, “lname”, and “mi”. Not every column has to be filled for each row, but there’s a fixed set.

There are typically many of these “tables”, each with their own schema.
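A sketch of the fixed-schema idea as described here, again in plain Python rather than any real BigTable or HBase client: the `Table` class and its `insert` method are hypothetical, and the second row is made-up sample data.

```python
class Table:
    """A toy fixed-schema table: rows may omit columns but not invent new ones."""

    def __init__(self, columns):
        self.columns = frozenset(columns)
        self.rows = []

    def insert(self, **values):
        # Reject any column that was not declared ahead of time.
        unknown = set(values) - self.columns
        if unknown:
            raise KeyError(f"columns not in schema: {sorted(unknown)}")
        self.rows.append(dict(values))

# The database is a set of named tables, each with its own schema.
tables = {"people": Table(["fname", "mi", "lname"])}
tables["people"].insert(fname="Greg", mi="G", lname="Baker")
tables["people"].insert(fname="Alice", lname="Example")  # "mi" left unfilled
```

Trying `tables["people"].insert(fname="Bob", phone="555-0100")` raises `KeyError`, because “phone” was never declared for that table.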

Summary: There’s a lot of similarity here. Things aren’t as different as I thought. In fact, the big common thread is certainly less-structured data (compared to the relational style of foreign keys and rigid data definition). Of course, I haven’t gotten into how you can actually query this data, but that’s a whole other thing.

Let’s see if I can summarize this (with Haskell-ish type notation, since that’s fresh in my head).

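import Data.Map (Map)

type Key = String
type Data = String
type Timestamp = Integer -- assumption: any ordered timestamp type would do
type MemcacheDB = Map Key Data
type CassandraRecord = Map Key (Data, Timestamp)
type CassandraDB = Map Key CassandraRecord

data JSONValue = JString String | JNumber Double | JObject JSON -- … other JSON value types omitted
type JSON = Map Key JSONValue
type DocumentDB = [JSON] -- MongoDB, CouchDB

type Schema = [Key]
type BigTable = (Schema, [Map Key Data]) -- where only keys from Schema are allowed in the map
type BigTableDB = Map Key BigTable -- key here is table name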

The documentation for these projects is generally somewhere between poor and non-existent: there are a lot of claims of speed and efficiency and how they are totally faster than MySQL. What’s in short supply are examples/descriptions of how to actually get things done. (For example, somewhere in my searching, I saw the phrase “for examples of usage, see the unit tests.”)

That’s a good start. Hopefully I can get back to this and say something else useful on the topic.

Computer Woes

June 10th, 2010, 10:32 am UTC by Greg

My computer at home has been locking up occasionally for the last few weeks. This has been happening since my upgrade to Ubuntu 10.04/Lucid, but I suspect this is a coincidence. (1) The lockups are hard: even the SysRq magic doesn’t do anything, so I deduce that the problem is in the kernel or below. (2) I haven’t seen any reports of the new Linux kernels being flaky. (3) I tried an upgrade from the i386 to amd64 (32-bit to 64-bit) system which I had been meaning to do anyway: no change even with a significantly different kernel.

Thus, I am of the opinion that I have a hardware problem.

As a computer scientist, I don’t enjoy hardware problems, so I’m thinking about buying my way out of them. (Also, my current system is mostly 3 years old, so it’s not a crazy time to upgrade.) My current thinking:

For about $700, that would leave me with the same case and power supply (an Antec Sonata II, 450W), my video card (nVidia 7600GT, but I don’t game so who cares), my Hauppauge PCI TV tuner, and my recently-upgraded hard disks.

So, the questions for the crowd: Does my “it’s hardware” assessment sound right? Is it likely that the processor/mobo/RAM swap will fix my problems? Any other suggestions for hardware purchases?

How to not attend a lecture

May 28th, 2010, 12:06 am UTC by Greg

I teach at a university. That comes with certain parameters: most of my students are in their late teens or early twenties, the average student is reasonably bright but occasionally unmotivated, and I don’t really have any way to compel students to come to lectures.

I do my best to give interesting, informative, and entertaining lectures. I’m successful enough that most students come most of the time, and that’s awesome.

Sometimes students don’t come to lecture. They don’t need a good reason, and they don’t have to tell me about it. I’m okay with that too: part of being at university is being responsible about that kind of thing and I’m happy to assume that whatever reason they have is a good one.

But what really annoys me is when students feel the need to email me, tell me the stupid reason they didn’t come to lecture, and then ask me to tell them what I covered.

I already spent an hour (or three hours) of my time giving the lecture and they had an opportunity to attend. I put a great deal of time and effort into explaining the material in the best way I can and pointing out the things that I think are important. I did all of this because I think I can actually do a decent job of getting material across in the lecture format and I think the material I’m talking about is important.

These emails leave me with two choices: (1) reduce a carefully-prepared lecture to a pointless list of topics, thus implying that I might as well have read them the textbook, or (2) spend another hour repeating the lecture in email form. Neither one of those is very attractive, but there’s also the third option that I have started to avail myself of: telling the students to shove off.

I’ll say here what I said to my CMPT 165 class last semester: if you miss a lecture, you ask a friend in the class for their notes. If you don’t have a friend in the class, ask the person sitting beside you; if at all possible, try to do this when you are sitting beside someone who you find attractive and offer to buy them coffee in return.

Seriously… do I have to explain everything?

cf. entitlement generation.

CMPT 383: for real this time

April 19th, 2010, 6:09 pm UTC by Greg

I have mentioned here before that I was planning to teach CMPT 383. It ended up being a no-go this semester because of a very productive capstone project team (more on that later).

But, I’m on-deck to teach it in the summer. The class is full; the waiting list is full; must be time to plan a course. After much soul-searching, I have decided there will be three main topics in the course and they will be covered in this order:

Functional programming (and Haskell)

This will be most students’ first introduction to a non-imperative programming paradigm (and associated language). Every little while I think this won’t take long, then I remember the list of things that have to be introduced to get anywhere with Haskell: being really good at recursion, list comprehensions, lazy evaluation, type inference, higher-order functions, and other stuff to be discovered as I try to teach the language.

From my perspective, there are two reasons to be talking about functional programming. First, it’s finding some relevance, probably because people want to parallelize things (e.g. CouchDB). Second, there are important lessons from functional programming that can be transferred to OO programming.

“Language Features”

This section contains the big concepts of the course: type systems (static/dynamic, strong/weak, type coercion, duck-typing, late/early binding, …), interpreted vs compiled, pointers vs references, memory management, reflection, runtime environments, first-class functions, objects, exceptions, mutable/immutable objects, ….

The basic question here is: what are the real differences between the programming languages you have to choose from? How might they affect your choice of language for a project?

Logic programming (and Prolog)

As much as I am aware that Prolog is pretty much confined to old school AI researchers, I still think there’s some value in being exposed to logic programming. It should be possible to translate back to the OO world the idea of expressing a problem as a series of constraints and then looking to satisfy those constraints.
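As a rough illustration of “state the constraints, then look for something that satisfies them” outside Prolog, here is a small Python sketch. It is a brute-force generate-and-test search, not Prolog’s unification and backtracking, and the `solve` function and the colouring puzzle are made up for the example.

```python
from itertools import product

def solve(variables, domain, constraints):
    """Yield every assignment of domain values to variables that satisfies all constraints."""
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(check(assignment) for check in constraints):
            yield assignment

# Hypothetical puzzle: regions A-B and B-C share a border, so each
# neighbouring pair must get different colours.
constraints = [
    lambda a: a["A"] != a["B"],
    lambda a: a["B"] != a["C"],
]
solutions = list(solve(["A", "B", "C"], ["red", "green"], constraints))
```

The problem description lives entirely in the `constraints` list; the search code never needs to know what the puzzle is about.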

To be fair, this is the chunk I am most unsure of. Part of the reason it comes last is that if anything should fall off the end of a full semester, it’s this.

The exact balance of the topics remains to be seen. I’ll guess 5 weeks, 5 weeks, 3 weeks.

As for getting marks, I am as far as:

  • Lab exercises: weekly hour-or-two chances to practice the concrete skills.
  • Assignments: like… two of them? One Haskell, one Prolog?
  • Project: pick a somewhat obscure language from a list I provide. Explore it by writing a report and some programs with it.
  • A midterm and a final exam.

No idea what I’ll ask on the exams. Maybe Warren has some old ones I can look at.

So there it is. Unless I change my mind.

The History of HTML

March 19th, 2010, 8:46 am UTC by Greg

After a simple query from a colleague about the differences between HTML versions, I wrote this. I thought I might as well post it. Everything was from-memory, so there may be some minor errors.

HTML 1 never existed (it was the informal “standard” that the first documentation implied).

HTML 2 was a really minimal initial description of the language. The language was simple because the initial goals were simple. The browser makers made many de facto extensions to this by implementing random stuff.

HTML 3 was an abandoned attempt to standardize everything and the kitchen sink. HTML 3.2 was a really ugly standard that was basically “here’s what browsers accept today.”

Which brings us to modern history…

HTML 4 was an attempt to clean up the language: get rid of the visual stuff and make HTML a semantic markup language again. It included the transitional version (with most of the old ugly stuff) and strict version (as things should be).

HTML 4.01 was a minor change: missed errors and typos.

XHTML 1.0 is HTML 4.01 but with XML syntax: closing empty tags with the slash, everything lowercase, attribute values quoted, etc.
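One way to see the practical difference is to feed both styles to an XML parser. The sketch below uses Python’s standard `xml.etree.ElementTree`; the helper name and the two sample fragments are made up for illustration. Note that an XML parser only checks well-formedness: the lowercase rule comes from XHTML itself, so it is not something this check catches.

```python
import xml.etree.ElementTree as ET

def is_well_formed_xml(markup):
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

# HTML 4 style: unquoted attribute value and an unclosed empty tag.
html_style = '<P CLASS=note>First line<BR>second line</P>'

# XHTML style: lowercase names, quoted attribute value, empty tag closed with a slash.
xhtml_style = '<p class="note">First line<br />second line</p>'

html_ok = is_well_formed_xml(html_style)
xhtml_ok = is_well_formed_xml(xhtml_style)
```

Here `html_ok` comes out `False` and `xhtml_ok` comes out `True`: the first fragment is acceptable HTML 4 but is not XML at all.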

XHTML 1.1 contains some minor changes, but was abandoned in a practical sense because nobody saw any point to the change. XHTML 2.0 was another very ambitious change (non-backwards compatible changes to the language) that was abandoned.

HTML 5 is in the process of being standardized now. If you ask me, there are two camps driving it. One thinks “the web is about more than just simple web pages now: applications and interactivity rule the day” and the other thinks “closing our tags is too hard; I don’t understand what a doctype is: make it easier. Dur.”

As a result, there are some things I like and some things I don’t. It is showing signs of something that will actually be completed and used (unlike HTML 3 and XHTML 2).

Most people don’t know that the HTML 5 standard includes an XHTML version as well. It will be perfectly legal to write HTML 5 with the XML syntax and call it “XHTML 5”.

Addendum: The moral of the story is that I have no intention of teaching HTML 5 anywhere until the standards process is done. For 165 I also need real browser support: no JS/DOM hack to get IE to work, and some defaults in the system stylesheet to let the thing display reasonably without any CSS applied. Even then I will probably teach XHTML 5 because I think it promotes the right habits.
