Thoughts after LSM2241 Lecture 2

I just gave the lecture this morning, and haven’t yet reviewed student feedback, but I’m expecting it to be worse than the first week. Using other people’s slides is almost always a mistake, especially when the slides are so different, stylistically, from my own. In the end, I wound up with 120 slides for 90 minutes of lecture time, which is just not reasonable for how I lecture.

Further, the content was a bit unbalanced for the course. I should have noticed this when I was updating the slides. In keeping with the previous term, the PDB was used as an example database; that would be OK except that we’ll spend some of the last portion of the course discussing structure, so why spend time on the PDB now? Using the PDB as an example now precluded discussion of other equally relevant databases.

I also kept some discussion of database APIs, all of which could have been dropped since the students won’t really have an opportunity to use them in this course.

The outcome was pretty predictable: I rushed through the end of the lecture, and skipped sections that should never have been in the slide deck in the first place. Overall, it was not the lecture I wanted to give.

LSM2241 Lecture 2: A temporarily Org-mode-less lecture

My feedback from students after the first week of LSM2241 (Introductory Bioinformatics) was generally quite positive. Many commented on the well-organized slides (where Org mode is my secret weapon) and provided useful suggestions. So when I sat down to make my new slides, I was pretty excited, but as it turns out I may have a tough time adhering to some of their input.

The second week of LSM2241 is devoted to bioinformatics databases. There’s a high level of variability in student background: some students have not had any introduction to databases in general, so I have to cover that too. Compounding matters, this is also one of the lectures I didn’t give last term. I of course looked through slides from previous years for ideas, and also found presentations from other universities. Some clear themes emerged.

  • In previous years, lecturers had covered a lot of material in a short period.
  • Screen shots, screen shots, screen shots. Wow, were there a lot of screen shots. This was true not just in previous years of LSM2241, but in other courses covering the topic.
  • It seemed really easy to inadvertently include a number of out-of-date or inaccessible databases.
  • It seemed helpful to use a few related queries to drive most of the examples across different databases.

While I wanted to start my slides from scratch, I got behind in my preparations, and eventually realized that my best bet was to use the previous years’ material as a starting point and (uggh) use PowerPoint. My revisions are pretty extensive, but the basic structure is quite similar, which leaves a huge number of slides, about three times what I’d like to target! Ironically, part of the feedback from lecture 1 was to speak more slowly. This may pose a challenge….

Next time: update R graphs and deal with screenshots

I did make a few plots for the lecture using R, and have started putting data into Google Docs for charts I’ll have to update in future years. This should make for clean Org babel R source code blocks, like

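# Published Google Docs spreadsheet, exported as CSV
dblink <- "https://docs.google.com/spreadsheet/pub?key=0Amd94LRhVxVWdElNYVdHblVLRjZKR1lwaFFFZHVyWUE&single=true&gid=0&output=csv"
library(RCurl)
library(ggplot2)
# Fetch the sheet over HTTPS, then parse the CSV text
myCsv <- getURL(dblink)
dbs <- read.csv(textConnection(myCsv))
names(dbs) <- c("Year", "Databases")
# Treat years as discrete categories so each one gets its own bar
dbs[[1]] <- as.factor(dbs[[1]])
# geom_col() plots the counts as given; theme() and element_text() replace
# the opts() and theme_text() calls that newer ggplot2 releases removed
p <- ggplot(dbs, aes(x = Year, y = Databases)) + geom_col(fill = "gray")
p + theme_bw() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1.2, size = 16),
        axis.title.x = element_text(size = 16),
        axis.title.y = element_text(size = 16, angle = 90),
        axis.text.y = element_text(size = 16)) +
  ylab("Databases in NAR Database issue")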

For screen shots, I’m looking at webkit2png. When it comes time to revise this lecture and put it into Org, I should be able to generate all the screen shots from Org babel code blocks. Why go through the trouble? For one thing, if I decide to change the example, I can regenerate the slide deck appropriately without manually cutting screen shots. If database interfaces change, I can see the results in the screen shots and revise just those sections. I only wish I had thought of it in time to make the slides that way this term.
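As a rough illustration of regenerating screen shots from code, here is a minimal Python sketch that builds one webkit2png command per example page. The page list, the function names, and the output directory are all hypothetical, and the flags (-F, -W, -D, -o) are what I recall from webkit2png's help text, so they should be checked against `webkit2png --help` before relying on them.

```python
import shlex

# Hypothetical examples: slide-friendly names mapped to the pages to capture.
PAGES = {
    "pdb-home": "https://www.rcsb.org/",
    "ncbi-home": "https://www.ncbi.nlm.nih.gov/",
}


def screenshot_command(name, url, out_dir="figures", width=1024):
    """Build the argument list for one webkit2png call.

    Assumed flags: -F (full-size image only), -W (browser width),
    -D (output directory), -o (output file name prefix).
    """
    return ["webkit2png", "-F", "-W", str(width), "-D", out_dir, "-o", name, url]


def screenshot_commands(pages, **kwargs):
    """Return one command per page, sorted by name so reruns are stable."""
    return [screenshot_command(name, url, **kwargs)
            for name, url in sorted(pages.items())]


if __name__ == "__main__":
    # Print the commands; replace print with subprocess.run(cmd, check=True)
    # to actually take the screen shots.
    for cmd in screenshot_commands(PAGES):
        print(shlex.join(cmd))
```

Changing an example then means editing one entry in the page list and rerunning the block, rather than recapturing images by hand.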

Using Org mode for course development and presentations

For the last two semesters I’ve been teaching part of LSM2241, Introductory Bioinformatics, at NUS. This is the first serious exposure to bioinformatics for students in the Life Sciences at NUS, so it’s a great opportunity to help ~160 students appreciate the increasingly central role bioinformatics has in the practice of biology.

This coming term I’m giving all the lectures. I don’t consider myself a great lecturer – on the contrary, I consider this an opportunity to practice – but I think second-year undergraduates don’t benefit much from team-taught lecture courses, so one lecturer of my quality is better than four lecturers of varying quality.

The workhorse of my course planning and preparation is Emacs Org mode. I use it for planning my own work, for making presentations and handouts (via Beamer and LaTeX export), for preparing exams, and for tracking my own development as a teacher.

Last term I found some huge benefits of Beamer over PowerPoint or Keynote. For example, when we were discussing dynamic programming algorithms for sequence alignment, it was pretty straightforward to write an alignment program that emitted TikZ diagrams animating the steps of filling in an alignment matrix. I ended up using this twice in the lecture: first as a worked example in the slide copies the students received, and second during the lecture itself, so we could walk through the whole classroom and perform an alignment via student participation.

The animation would have been unthinkable in PowerPoint, since it added the equivalent of ~80 slides to the deck. With Beamer I could (i) make the animation, (ii) distribute a slide deck with the filled matrix from the animation, thus satisfying the demands of today’s students to have slides ahead of time, while still minimizing excessive printing, and (iii) generate a second animation from an unseen alignment problem for the class to work through together during the lecture. Doing it this year will be as simple as changing the input to the program and regenerating the slides.
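The program itself isn't shown here, but the idea can be sketched in a few lines of Python. This is only an illustration under assumptions: it uses Needleman-Wunsch global alignment with made-up scores (match +1, mismatch -1, gap -1), hypothetical function names, and Beamer's \only overlays inside a single frame rather than whatever layout the real slides used.

```python
def fill_matrix(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment scores, filled row by row."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):          # first column: a aligned against gaps
        score[i][0] = i * gap
    for j in range(1, cols):          # first row: b aligned against gaps
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score


def tikz_animation(a, b, **scoring):
    """Return a Beamer frame whose interior cells appear one overlay at a time."""
    score = fill_matrix(a, b, **scoring)
    lines = [r"\begin{frame}{Filling the alignment matrix}", r"\begin{tikzpicture}"]
    # Sequence labels: b along the top, a down the left side.
    for j, ch in enumerate(b, start=1):
        lines.append(rf"\node at ({j},1) {{{ch}}};")
    for i, ch in enumerate(a, start=1):
        lines.append(rf"\node at (-1,{-i}) {{{ch}}};")
    step = 1
    for i, row in enumerate(score):
        for j, value in enumerate(row):
            node = rf"\node at ({j},{-i}) {{${value}$}};"
            if i == 0 or j == 0:
                lines.append(node)    # border cells are visible from the start
            else:
                step += 1             # each interior cell gets its own overlay
                lines.append(rf"\only<{step}->{{{node}}}")
    lines += [r"\end{tikzpicture}", r"\end{frame}"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(tikz_animation("GATTACA", "GCATGCU"))
```

Swapping in a different pair of sequences produces a new animation, and compiling with Beamer's handout class option collapses the overlays into a single slide showing the filled matrix.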

This is all done via Org-babel, the miraculous multi-lingual literate programming environment supported by Org mode.

But before getting into that level of detail, I’ll mention one tweak I use in Org-beamer export. When exporting to LaTeX or HTML, Org mode knows to do the right thing for figures and tables. So, for example,

#+CAPTION: This is a caption
[[file:/media/foo.png]]

will create a figure with a caption when exported to any supported format, including LaTeX and HTML.

However, the LaTeX export uses \caption{}, which automatically adds a Figure 1 label to the caption of the first figure in the document, and likewise for tables. In normal LaTeX documents, that’s the right thing to do, but in Beamer slides numbered figures and tables aren’t needed.

But I still want captions! To fix this, I make sure to include the caption package in the header with

#+LATEX_HEADER: \usepackage[justification=centering]{caption}

Then I add a hook to convert all my \caption to \caption* as the last step in my Org Beamer exporter.

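(defun latex-buffer-caption-to-caption* ()
  ;; Only touch the buffer when the current export is a Beamer export
  (when org-beamer-export-is-beamer-p
    (save-excursion
      (goto-char (point-min))
      ;; Match \caption followed by [ or { and insert * between them
      (while (re-search-forward "\\(\\\\caption\\)\\([[{]\\)" nil t)
        (replace-match "\\1*\\2")))))

;; Append so the rewrite runs after any other final-hook functions
(add-hook 'org-export-latex-final-hook
          'latex-buffer-caption-to-caption* 'append)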

The org-export-latex-final-hook runs its functions right before the generated LaTeX buffer is saved, and org-beamer-export-is-beamer-p restricts the behavior to Beamer export.