Data Is Like Wheat
by chris on Apr.10, 2012, under general
danah boyd directs us to this essay on the grammar of data. Excerpts:
The word of which ‘data’ is purportedly the plural has simply disappeared; this means two things. Firstly, passively, it creates a linguistic space into which ‘data’ can drop – there is no ambiguity in using ‘data’ in a singular sense. Secondly, and more importantly, if ‘datum’ has effectively disappeared, it tells us that ‘data’ cannot be simply its plural; unanchored, it has moved away from this simply derived meaning, to a distinct and independent meaning of its own. It has accordingly accreted usage rules of its own, unencumbered by any latin past.
‘Data’ no longer means just one (damn) datum after another. Twentieth-century ‘data’ refers to a mass of raw information, which we measure rather than count, and this is as true now as it was when the word made its 1646 debut. This universal perception of data as measured rather than counted puts the word firmly and unambiguously in the same grammatical category as ‘coal’, ‘wheat’ and ‘ore’, which is that of the mass, or aggregate, noun. As such, it is always and unavoidably grammatically singular. We would never ask ‘how many wheat do you have?’ or say that ‘the ore are in the train’ if we wished to be thought a competent speaker of english; in the same way, and to the same extent, we may not ask ‘how many data do you have?’ or say ‘the data are in the file’ without committing a grammatical error.
I now have to unlearn using ‘data’ as a plural and go back to using it the way I intuitively learned it: as an aggregate singular noun.