2011-12-16

Bad Practices and Better Alternatives to Them in Humanities Computing

Generally speaking, researchers in the humanities tend to be less computer-savvy than their colleagues in the natural and social sciences. There are even those who still speak of their low level of computer literacy as if they were proud of it. However, not only is there no contradiction between a high level of computer literacy and being an excellent scholar, but a high level of computer literacy is also becoming an important prerequisite for excelling in research. Unfortunately, many researchers in the humanities, including linguists, still follow bad practices or are stuck with them, mostly for lack of knowledge of better alternatives. Every time I see these scholars, I cannot help feeling sorry that they are wasting their precious time and are often unaware of this very fact. The following is a list, in no particular order, of what I consider the most common bad (i.e., inefficient) practices in (humanities) computing, along with possible better (i.e., more efficient) alternatives.

  • Bad practice 1: To use a word processor as if it were a typewriter.
    • What is wrong: The problem should be self-evident. We are already in the 21st century, not in the Stone Age. ;-)
    • Better alternative: Learn at least how to use "styles" and how to prepare and update tables of contents and indices automatically. And if you are still stuck with such bloatware as Word, consider migrating to a better open source alternative such as LibreOffice Writer.
  • Bad practice 2: To use a word processor for every possible kind of computing involving text documents.
    • What is wrong: A word processor combines manipulation of texts and their physical rendition, but for processing textual data the latter is not only unnecessary but also slows processing down. See, e.g., "Word Processors: Stupid and Inefficient" for further details.
    • Better alternative: Use a text editor for processing textual data when their physical rendition is irrelevant. My recommendations are, first, EditPad Pro (Windows; commercial) and, second, EditPad Lite (Windows; free); both support RTL scripts and the bidirectional algorithm.
  • Bad practice 3: To use a word processor or a spreadsheet program as a database program.
    • What is wrong: Neither a word processor nor a spreadsheet program is meant to be a database program, and both are limited and unbearably slow in data retrieval.
    • Better alternative 1: Again, use a text editor that supports regular expressions if you are dealing with non-structured textual data. My recommendations are the same as above. Also use a grep tool. My recommendations are, first, PowerGREP (Windows; commercial) and, second, AstroGrep (Windows; free).
    • Better alternative 2: Use a CSV editor if you are dealing with tabular textual data. My recommendations are CSV Easy (Windows; commercial) and uniCSVed (Windows; free).
    • Better alternative 3: Use a database management system if you are dealing with more complex data. My recommendation for managing linguistic data is Fieldworks Language Explorer.
  • Bad practice 4: To send documents in proprietary formats, most notoriously in Word format, indiscriminately to everyone without his or her prior consent.
    • What is wrong: Not everyone uses, or wants to use, the (mostly commercial) tools that proprietary formats require.
    • Better alternative: Use one of the open document formats as described in my flowchart.
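To give a feel for what a grep tool with regular expressions does (see better alternative 1 under bad practice 3), here is a minimal sketch in Python. It is not how PowerGREP or AstroGrep work internally; the function name `grep`, the `*.txt` file pattern, and the UTF-8 encoding are assumptions made only for this illustration.

```python
import re
from pathlib import Path


def grep(pattern, root, glob="*.txt", encoding="utf-8"):
    """Yield (path, line_number, line) for each line that matches pattern.

    Hypothetical helper for illustration: it searches every file under
    `root` whose name matches `glob` and assumes the files are plain text.
    """
    regex = re.compile(pattern)
    for path in sorted(Path(root).rglob(glob)):
        # errors="replace" keeps the search going if a file is not valid UTF-8
        with path.open(encoding=encoding, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    yield path, number, line.rstrip("\n")


if __name__ == "__main__":
    # Example: list every line in ./corpus that contains a four-digit year.
    for path, number, line in grep(r"\b\d{4}\b", "corpus"):
        print(f"{path}:{number}: {line}")
```

Because the files are read as plain text line by line, a search like this stays fast even over a large collection of documents, which is exactly what a word processor or a spreadsheet program cannot offer.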