Online NLP Course

Stanford University is offering a new online course on Natural Language Processing taught by two of the greatest lights in this field, Christopher Manning and Dan Jurafsky. This course is another in the Stanford experiment in massive online teaching. It will include video lectures (a transcription is promised later), assignments, exams, etc. It begins in February with a couple of lectures per week. One can get a certificate of completion, but no Stanford University credit, of course, since the course is … free! Highly recommended. You won’t get a better education in this subject than from these two professors!

We’re back!

At the end of August, Tropical Storm Irene “kissed” the Groves Center servers, crashing our main web server permanently. Through the generosity of one of our donors, we were able to purchase two new hot Dells. Now we have virtualization, lots of memory and storage. So we decided to make a change we had been thinking about: moving to a new Linux distribution, from Gentoo to Ubuntu. Ubuntu is Debian-based, and I’ve always been deeply impressed with Debian’s stability and rational choices for software configuration. But a teaching gig in Europe, the Society of Biblical Literature annual meetings (along with the preparation needed for these events), as well as the unbelievable learning curve of becoming an employer for a small business, all conspired to delay the reconstruction of the server. Oh, and then my desktop’s hard drive failed. So I’m writing this from a hot new laptop (quad-core i7), using Ubuntu desktop in a virtual machine. That means I have Windows 7 and Linux at the click of a mouse, instead of having to reboot into the other OS all the time. Marvelous. But there went more time purchasing and configuring the new machine (with all the learning curve that implies!).

Don’t let anyone tell you that restoring from backups is pain free! Unless you are just restoring a disk drive mirror image onto a new drive, restoring is non-trivial. Changing Linux distros means the location of everything can change. All prerequisites and dependencies have to be installed, and configuration files modified to reflect the new path names of directories and files. Even after the software is “up”, there are glitches because, for example, the restoration of the MySQL database that sits behind and supports all the web services is not perfect. We’re experiencing “incorrect key file” errors for Bugzilla.

But we’re mostly there, as you can see. A good start to 2012.

Reference and electronic file management

If you’re a 21st-century academic like me, you collect a lot of references, bibliography and electronic versions of articles and books. My hard drive has a hierarchy of folders for PDFs, and so forth, but it is hard to manage them. I keep multiplying folders according to subject, and that just doesn’t work very well.

There are two solutions to this problem that I have discovered so far: Zotero and SciPlore.

Zotero

Zotero is a browser plugin for Firefox. This is a great advantage for browsing: find something while surfing the web and a couple of clicks later it is bookmarked. You can add bibliographic information and annotations to the document entry.

To summarize the features that are important to me:

  • bibliography and citation management
  • able to annotate entries
  • able to search entries
  • able to handle web documents
  • able to attach links to PDFs
  • handles BibTeX format

There are many other features, of course. See the Zotero website for further details.

SciPlore

SciPlore is built on top of FreeMind, a mind-mapping program. It adds bibliographic and citation management to FreeMind’s graphical representation of ideas. One could describe it as a “graphical Zotero.”

I have not (yet) done a complete feature comparison, but at first glance, the only thing Zotero can do that SciPlore cannot is capture web documents, and even extract references from some kinds of web documents. Both handle BibTeX and PDF links. SciPlore permits one to arrange the information in a visual way; Zotero uses lists.

Conclusion

I have only just discovered SciPlore (check out the introductory video on the website), whereas I’ve been using Zotero for more than a year now as a URL link manager. I have not, until now, felt the need to use Zotero’s bibliography management features. I am very happy with BibDesk on the Mac and JabRef everywhere else. The common element among all three is BibTeX. Before I discovered SciPlore to play with, I was planning on using Zotero more extensively to handle my ever-expanding list of PDFs. I’m a firm believer in “the right tool for the right job.” Every software package has its strengths and weaknesses. I try to use a program for its strengths and abandon it for a better tool when it is weak. The key to making this practice work is to insist that my software store and manipulate data in standard formats. In the case of bibliography managers, that format is BibTeX.
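To make the point about standard formats concrete, here is a minimal sketch in Python of why a plain-text format like BibTeX is easy to move between tools. The sample entry, its citation key, and the PDF path are made up for illustration, and the use of a `file` field for the PDF link is an assumption (different managers record PDF links in different ways). The tiny parser only handles simple brace-delimited values; it is not a full BibTeX parser.

```python
import re

# A made-up entry for illustration; the key, title, and file path are hypothetical.
SAMPLE = """
@article{doe2011example,
  author  = {Doe, Jane},
  title   = {An Example Article},
  journal = {Journal of Examples},
  year    = {2011},
  file    = {articles/doe2011example.pdf}
}
"""

# Matches the entry header, e.g. "@article{doe2011example,"
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
# Matches "name = {value}" where the value contains no nested braces
FIELD_RE = re.compile(r"(\w+)\s*=\s*\{([^{}]*)\}")


def parse_entry(text):
    """Parse one simple BibTeX entry into a dict of lowercase field names."""
    header = ENTRY_RE.search(text)
    if header is None:
        raise ValueError("no BibTeX entry found")
    entry = {"type": header.group(1).lower(), "key": header.group(2)}
    # Only look at the text after the header so the key is not re-scanned.
    for name, value in FIELD_RE.findall(text[header.end():]):
        entry[name.lower()] = value.strip()
    return entry


entry = parse_entry(SAMPLE)
print(entry["key"], entry["year"], entry["file"])
```

Because the data is just text with a well-known structure, a few lines in any language can read it, which is what makes it practical to switch bibliography managers without losing anything.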

We’ll see how these programs will adjust to my workflow.