Stan Dubinsky, the linguist, shared with me this piece, which drew me in because of the headline, “Take My Metadata.” And not because I pictured Henny Youngman saying that and adding “please.” (For the record, I’ve always thought that was a terrible joke.) As you know, the headline reflects my attitude regarding what the NSA has been doing the last few years.
But when I read it, I appreciated it most for the definition it provided of metadata:
As a preliminary shot, one could say that in any domain where data has to be recorded, there has to be a scheme of some kind for recording it, and any description of that scheme–data about the way the data is organized–can be called metadata.
Examples will help. The metadata for a book will be the information on the reverse of the title page or in the library-catalog entry: title, author, year, publisher, place of publication, ISBN, and so on (you could add all sorts of other facts). Book metadata is what Google Books has been struggling to correct since its spectacular metadata errors started being publicized by people like Geoff Nunberg and Mark Liberman on Language Log (see here and here and here, for example).
For a phone call the typical metadata would be calling number, called number, time of connection, length of call, and so on. And for an email or a text message, sender’s address, recipient’s address, sending machine, recipient’s machine, date, subject line, and so on.
Crucially, “Call me Ishmael” is not part of the metadata for Moby Dick; that’s the first three words of the content. And “Hi, honey. Are you still at the office?” (or “Lou? Tell Enzo the hit is going down tonight”) is not part of the metadata for a phone call.
It is not clear to me whether Senator Rand Paul truly believes that important freedoms are being stripped away from Americans by the actions of the National Security Agency, which up until midnight on Sunday, May 31, was systematically recording telephone-call metadata for large numbers of mostly innocent Americans. The alternative would be that he is being disingenuous: He simply thinks his status as a possible Republican presidential nominee will be enhanced if he argues against the trustworthiness of government agencies.
But it would demean him to assume that. I think it is more charitable to assume he truly believes what he says. Though that means attributing to him what I take to be a rather stupid belief….
Yes, the examples do help. Thanks.
Oh, and as to the critical issue that the writer addressed parenthetically just before that passage: “(And let me warn the purists up front that in this post I am going to be treating data not as the plural of the Latin word datum, but as an English singular noncount noun like air, fun, furniture, info…)”
Normally, I would harrumph. But when one is speaking of data in the aggregate, as a massive amount of something rather than a number of things (which is definitely what we’re talking about where the NSA is concerned), perhaps I see his point, and grudgingly decline to protest.
And… I’ll even admit that when most people say “data,” they are speaking of it similarly. But Sarah T. Kinney, my Latin teacher at Bennettsville High School, would rise up and haunt me forever were I to forget for a moment that, by all the gods on Olympus, “data” is a plural, second-declension, neuter noun.