Mercurial, Eclipse, Hard Links

Posted July 30, 2008 by namelythehedgehog
Categories: Computing

For one of my major projects, I use Mercurial for version control and Eclipse with PyDev as my primary development environment. I frequently clone a development trunk to create new development branches (usually one per feature set I’m working on at a time). I had a little hiccup of paranoia about what it means to clone a repository, since at least some of the newly cloned files are not physical copies but hard links, and I didn’t know what Eclipse does with hard links when it saves a file.

A quick look at the output of hg clone --help reveals the following:

  • Running hg clone copies the tracked files and hard links the metadata (.hg directory).
  • hg clone --pull copies everything and hard links nothing.
  • cp -al copies nothing and hard links everything, which means your editor had better break hard links or else changes will be saved in two repositories at once.

In other words, an invocation of hg clone followed by editing some tracked files will always do the right thing, regardless of what your editor does: the tracked files in the new working directory are real copies, and only the metadata under .hg is hard linked. Since I always clone repositories using hg clone (disk space isn’t too tight at the moment), it was never an issue, even using a rather complex Java IDE.

That got me wondering, though, what would have happened if I cloned using cp -al. It turns out that by default,

  • emacs breaks hard links.  Can be overridden by adding (setq backup-by-copying-when-linked t) to one’s “.emacs” file.
  • vim preserves hard links.  Can be overridden with set bkc=no in .vimrc or at the vim command line.
  • Eclipse preserves hard links.  This was tested empirically rather than looked up in documentation, and there isn’t an easily-located setting to change this.
  • OS X TextEdit breaks hard links.  Also tested, and I didn’t even look for a way to change it.
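
The difference between "preserving" and "breaking" a hard link comes down to how the editor saves: writing into the existing file keeps the link, while writing a new file and renaming it over the old name breaks it. Here is a minimal Python sketch of the two strategies, assuming a filesystem that supports hard links. The function and file names are made up for illustration, and this is not what any of the editors above literally do internally.

```python
import os
import tempfile


def save_in_place(path, text):
    """Overwrite the existing inode, as a link-preserving save does."""
    with open(path, "w") as f:
        f.write(text)


def save_by_rename(path, text):
    """Write a new file and rename it over the old name, breaking the link."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(text)
    os.replace(tmp, path)


def demo():
    """Return (link kept by in-place save, link broken by rename, original's text)."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.txt")
        clone = os.path.join(d, "clone.txt")
        with open(original, "w") as f:
            f.write("v1\n")
        os.link(original, clone)  # what cp -al does for every file

        # In-place save: both names still point at the same inode.
        save_in_place(clone, "v2\n")
        kept = os.path.samefile(original, clone)

        # Rename-style save: the clone now has its own inode.
        save_by_rename(clone, "v3\n")
        broken = not os.path.samefile(original, clone)

        with open(original) as f:
            original_text = f.read()
    return kept, broken, original_text
```

The same check works for testing a real editor empirically: hard link a file, save it from the editor, then compare the two names with os.path.samefile (or compare inode numbers with ls -li).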

The moral of the story:  when in doubt, hg clone.

Schema management with SQLAlchemy

Posted July 25, 2008 by namelythehedgehog
Categories: Computing

At work I’m developing a database system for tracking quantum mechanics calculations. It’s been done before, but a) this focuses more on job tracking than job submission and analysis, b) it’s really more about the command-line side of things with a web-based search, c) it’s more about some home-grown dynamics calculations than general-purpose “QM for the masses”, and d) it’s what a professor hired me to do, so I don’t ask questions.

The core of the system is Python-based, with database handling care of SQLAlchemy and a web interface using Pylons. Pylons is a mixed blessing. It’s relatively lean and eminently customizable, but it has the feel of a large number of developing (i.e. beta) projects glued together with chewing gum. They’re very high-quality beta, and the chewing gum is well-placed. For the most part, it’s better than inventing or re-inventing a web framework. My main gripe is that the documentation is scattered everywhere, and the bulk of Pylons-specific documentation is not a reference work, but a set of FAQs and cookbook recipes. That said, it’s the least intrusive framework I investigated, and I’m sticking with it for now. If I have spare time, perhaps I’ll contribute some development effort or some documentation.

The question of the day relates not to Pylons but to SQLAlchemy. I’m working on a feature enhancement which will require some minor database restructuring, which raises the broader issue of schema migration. So far, there have only been beta deployments of this software, so hand-coded migration scripts have been much handier than anything else. As soon as the package is released to the project director’s collaborators, migration will need to be a more automatic process.

So, just having gone public on this blog, and not knowing if there will be any readership, I pose the question: what are people’s experiences with schema migration? I’m poking around sqlalchemy-migrate. It seems well-thought-out, but I have two concerns:

  • The version management is not exactly lightweight. I currently have a one-row entry in the DB for schema version. I’d rather not have to add another table to the DB, or another entire tree in the source code.
  • The fewer dependent packages I have to make scientific end-users install, the less trouble support will be. It’d be great not to have to include another package in the installation instructions, and it’s one less thing that can break.

So, as I said…comments on experiences with schema migration would be helpful.
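
For reference, the lightweight end of the spectrum (a one-row version entry plus an ordered list of upgrade steps) could look something like the following. This is only a sketch under stated assumptions: it uses the standard-library sqlite3 module rather than SQLAlchemy so that it is self-contained, the schema_version table name and the jobs table are hypothetical, and it is not how sqlalchemy-migrate works.

```python
import sqlite3

# Hypothetical migrations: entry i upgrades the schema from version i to i + 1.
MIGRATIONS = [
    ["CREATE TABLE jobs (id INTEGER PRIMARY KEY, name TEXT)"],
    ["ALTER TABLE jobs ADD COLUMN status TEXT"],
]


def current_version(conn):
    """Read the one-row version table, creating it at version 0 if absent."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT version FROM schema_version").fetchone()
    if row is None:
        conn.execute("INSERT INTO schema_version (version) VALUES (0)")
        return 0
    return row[0]


def upgrade(conn):
    """Apply every pending migration in order and return the final version."""
    version = current_version(conn)
    for target in range(version + 1, len(MIGRATIONS) + 1):
        for statement in MIGRATIONS[target - 1]:
            conn.execute(statement)
        # Record the new version and commit once per step.
        conn.execute("UPDATE schema_version SET version = ?", (target,))
        conn.commit()
    return current_version(conn)
```

Calling upgrade(sqlite3.connect("jobs.db")) on a fresh or out-of-date database would bring it to the latest version, and calling it again is a no-op. What a sketch like this does not give you is downgrades or any help generating the ALTER statements, which is where a dedicated migration package earns its keep.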

Un-quarantine downloaded files on OS X

Posted July 22, 2008 by namelythehedgehog
Categories: Computing

Though I agree in principle with the idea of marking downloaded files as hazardous, it can be quite annoying when, say, extracting piles and piles of W3C documentation for future reference — especially when opening some index.html pops up a dialog box about a downloaded application. The “downloaded application” marker OS X puts on downloaded files is in fact an extended attribute.

The attribute in question is “com.apple.quarantine”, as shown below:

$ ls -l@
total 1560
drwxr-xr-x  25 mzwier  staff     850 Jul 22 18:04 html40
-rw-r--r--@  1 mzwier  staff  369830 Jul 22 18:00 html40.tgz
	com.apple.quarantine	    42

The tool to manage extended attribute data is, logically, “xattr”. xattr has no man page, but its help option is informative enough (this is taken directly from xattr --help, reprinted for reference and discussion):

$ xattr --help
usage: xattr [-l] file [file ...]
       xattr -p [-l] attr_name file [file ...]
       xattr -w attr_name attr_value file [file ...]
       xattr -d attr_name file [file ...]

The first form lists the names of all xattrs on the given file(s).
The second form (-p) prints the value of the xattr attr_name.
The third form (-w) sets the value of the xattr attr_name to attr_value.
The fourth form (-d) deletes the xattr attr_name.

options:
  -h: print this help
  -l: print long format (attr_name: attr_value)

So, to lift the quarantine on a specific file, the proper move is

xattr -d com.apple.quarantine FILE

To lift the quarantine on a whole directory tree (say, of documentation), the move is

find DIRNAME -print0 | xargs -0 xattr -d com.apple.quarantine

piggybacking on standard tricks with find and xargs.
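
The same tree walk can be sketched in Python for anyone who would rather script it. The function names here are made up for illustration; the only external command used is the xattr -d form shown in the help text above. The sketch assumes xattr may report an error for files that do not carry the attribute, so failures are ignored rather than treated as fatal.

```python
import os
import subprocess

QUARANTINE = "com.apple.quarantine"


def tree_paths(root):
    """Yield root and everything beneath it, like `find root`."""
    yield root
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            yield os.path.join(dirpath, name)


def unquarantine_command(path):
    """Build the xattr invocation that removes the quarantine attribute."""
    return ["xattr", "-d", QUARANTINE, path]


def unquarantine_tree(root):
    """Run xattr -d on every path under root (only meaningful on OS X)."""
    for path in tree_paths(root):
        # check=False and discarded stderr: files without the attribute
        # are simply skipped instead of stopping the walk.
        subprocess.run(
            unquarantine_command(path),
            check=False,
            stderr=subprocess.DEVNULL,
        )
```

Usage would be unquarantine_tree("html40") for the documentation tree in the listing above.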

Automatic Backup of USB Drive

Posted July 21, 2008 by namelythehedgehog
Categories: Computing

I’ve started to keep some regularly-updated critical information on my USB thumb drive. Apple’s Time Machine doesn’t seem to be willing to back up external disks (or at least VFAT-formatted external disks). I figured I’d piggyback on Time Machine’s hourly backups by using rsync to copy the thumb drive periodically to my MacBook’s hard drive. OS X 10.5 still supports cron, but I keep running across references to launchd. After some googling, I came up with the following script:

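#!/bin/bash
# Back up the thumb drive only if it is currently mounted.
# (-a already implies -r, so a separate -r is not needed.)
test -e /Volumes/MCZ &&
  rsync -av --delete /Volumes/MCZ "$HOME/Metabackup/" >& "$HOME/Metabackup/MCZ.log"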

and the following plist describing the new “service” I wanted launchd to handle:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
        <key>KeepAlive</key>
        <false/>
        <key>Label</key>
        <string>znet.backups.mczthumb</string>
        <key>ProgramArguments</key>
        <array>
                <string>/bin/bash</string>
                <string>/Users/mzwier/bin/backup-thumb.sh</string>
        </array>
        <key>RunAtLoad</key>
        <true/>
        <key>StartCalendarInterval</key>
        <dict>
                <key>Minute</key>
                <integer>52</integer>
        </dict>
</dict>
</plist>

Apart from RunAtLoad, which also runs the job once when launchd loads it, the schedule is equivalent to the cron line

52 * * * * /bin/bash /Users/mzwier/bin/backup-thumb.sh

I saved the plist in ~/Library/LaunchAgents (as znet.backups.mczthumb) and loaded it from launchctl’s interactive prompt:

launchd% load /Users/mzwier/Library/LaunchAgents/znet.backups.mczthumb

Worked on the first try. It may be useful to perform an update every time the drive is mounted, but that’s a challenge for another day.
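
One possible route to the on-mount update is launchd’s StartOnMount key, which asks launchd to start the job whenever any filesystem is mounted. I have not tested this, so treat it as an assumption; the backup script already checks that /Volumes/MCZ exists, so firing on unrelated mounts should be harmless. The sketch below uses Python’s plistlib to generate the same job definition with that one key added; the JOB and render names are just for illustration.

```python
import plistlib

# Same job as the plist above, plus the untested StartOnMount key.
JOB = {
    "Label": "znet.backups.mczthumb",
    "ProgramArguments": ["/bin/bash", "/Users/mzwier/bin/backup-thumb.sh"],
    "KeepAlive": False,
    "RunAtLoad": True,
    "StartCalendarInterval": {"Minute": 52},
    # Assumption: run the job every time a filesystem is mounted.
    "StartOnMount": True,
}


def render(job):
    """Serialize the job dictionary as an XML property list (bytes)."""
    return plistlib.dumps(job)
```

Writing render(JOB) to the file in ~/Library/LaunchAgents and reloading it with launchctl would replace the hand-written XML.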