Tensions over [the paper] may really boil down to something simple: The need for better communication between disciplines that previously had little to do with each other. As new data models allow mathematicians and physicists to make their own contributions about language, scientific journals need to make sure that their work is on a firm footing by involving linguists in the review process. That way, culturomics can benefit from an older kind of scholarship -- namely, what linguists already know about how humans shape words and words shape humans.
Beyond pointing out that linguists and other non-physicists do already apply sophisticated mathematical models to language -- there are several entire fields that already do this work, such as computational linguistics and natural language processing -- I respectfully suggest that involving linguists at the review stage is way too late. If the goal is to improve the quality of the science, bringing in linguists to point out that a project is wrong-headed after the project is already completed doesn't really do anyone much good. I guess it's good not to publish something that is wrong, but it would be even better to publish something that is right. For that, you need to make sure you are doing the right project to begin with.
Shi shi shi shi shi shi, shi shi, shi shi shi shi. Shi shi shi shi shi shi shi shi shi, shi shi shi shi shi, shi shi, shi shi shi shi shi. Shi shi shi shi shi, shi shi shi, shi shi shi shi shi shi. Shi shi shi shi shi shi, shi shi shi. Shi shi shi, shi shi shi shi shi shi. Shi shi shi, shi shi shi shi shi shi shi shi. Shi shi shi shi shi shi shi shi shi shi shi shi shi. Shi shi shi shi.
As written, this is incomprehensible. Only if you write it in characters
A poet named Shi lived in a stone house and liked to eat lion flesh and he vowed to eat ten of them. He used to go to the market in search of lions and one day chanced to see ten of them there. Shi killed the lions with arrows and picked up their bodies carrying them back to his stone house. His house was dripping with water so he requested that his servants proceed to dry it. Then he began to try to eat the bodies of the ten lions. It was only then he realized that these were in fact ten lions made of stone. Try to explain the riddle.
Problems with this argument
We show that major aspects of kin classification follow directly from two general principles: Categories tend to be simple, which minimizes cognitive load, and to be informative, which maximizes communicative efficiency ... The principles of simplicity and informativeness trade off against each other... A system with a single category that includes all possible relatives would be simple but uninformative because this category does not help to pick out specific relatives. A system with a different name for each relative would be complex but highly informative because it picks out individual relatives perfectly.
That seems intuitively reasonable, but these are computational folk, so they formalized this with math. The details are in the paper, but roughly: They formalize the notion of complexity by using minimum description length in a representational language based on primitives like FEMALE and PARENT. The descriptions of the various terms in English and Northern Paiute are shown in parts C and D of the figure above. Communicativeness is formalized by measuring how ambiguous each term is (how many people it could potentially refer to).
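To make the trade-off concrete, here is a minimal sketch in Python. It is not the paper's actual model: complexity is approximated by simply counting the primitives in each term's description (a crude stand-in for minimum description length), and ambiguity is the average number of relatives a term could pick out. The four-person family, the three naming systems, and the primitives KIN and SIBLING are all made up for illustration.

```python
# Toy illustration of the simplicity/informativeness trade-off.
# Assumptions (not from the paper): a four-person family, three
# hypothetical naming systems, and primitive counting as a rough
# proxy for minimum description length.

RELATIVES = ["mother", "father", "sister", "brother"]

# Each system maps a term to (description in primitives, relatives it covers).
SINGLE = {
    "kin": (("KIN",), {"mother", "father", "sister", "brother"}),
}
BY_GENERATION = {
    "parent": (("PARENT",), {"mother", "father"}),
    "sibling": (("SIBLING",), {"sister", "brother"}),
}
FULL = {
    "mother": (("FEMALE", "PARENT"), {"mother"}),
    "father": (("MALE", "PARENT"), {"father"}),
    "sister": (("FEMALE", "SIBLING"), {"sister"}),
    "brother": (("MALE", "SIBLING"), {"brother"}),
}


def complexity(system):
    """Total number of primitives needed to describe every term."""
    return sum(len(description) for description, _ in system.values())


def ambiguity(system, relatives=RELATIVES):
    """Average number of relatives that the term for each relative could mean."""
    total = 0
    for relative in relatives:
        for _, covered in system.values():
            if relative in covered:
                total += len(covered)
                break
    return total / len(relatives)


for name, system in [("single", SINGLE), ("by generation", BY_GENERATION), ("full", FULL)]:
    print(f"{name}: complexity={complexity(system)}, ambiguity={ambiguity(system)}")
```

In this toy setup the single-category system is the simplest but the most ambiguous, the fully specified system is the reverse, and the two-term system sits in between, which is the shape of the trade-off the authors describe.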
I aimed my gun into the living room. (p. 109)
I cannot by any means convince Dictate to print this. It prefers to convert "my gun" to "my God". For example, on my third try, it wrote:
I aim to my God into the living room.
Dictate offers a number of alternatives in case its initial transcription is incorrect. Right now, it is suggesting, as an alternative to "aim to my God":
aimed to my God
aim to my God and
aim to my god
aim to my gun
aimed to my God and
aim to my garden
aimed to my god
aimed to my gun
aim to my guide
aim to my God in
aimed to my God in
Perhaps Nuance has a religious bent, but I suspect that this is a simple N-gram error. Like many natural language processing systems, Nuance figures out what word you are saying in part by reference to the surrounding words. So in general, it thinks that common bigrams (2-word sequences) are more likely than uncommon bigrams.
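Here is a rough sketch of how a bigram preference like this could arise. It is only an illustration of the general idea, not a description of how Dictate actually works: the tiny corpus is made up, the add-one smoothing is my choice, and a real recognizer would combine a score like this with acoustic evidence.

```python
import math
from collections import Counter

# Made-up toy corpus, for illustration only. A real system would be
# trained on vastly more text.
CORPUS = "oh my god . my god it is late . thank my god . he dropped my gun".split()

unigrams = Counter(CORPUS)
bigrams = Counter(zip(CORPUS, CORPUS[1:]))
VOCAB_SIZE = len(unigrams)


def bigram_logprob(prev, word):
    """Log P(word | prev) with add-one smoothing, so unseen pairs are not impossible."""
    return math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + VOCAB_SIZE))


def score(phrase):
    """Sum of bigram log-probabilities over adjacent word pairs in the phrase."""
    words = phrase.lower().split()
    return sum(bigram_logprob(p, w) for p, w in zip(words, words[1:]))


candidates = ["aimed my gun", "aimed my god"]
best = max(candidates, key=score)
print(best)
```

Because "my god" occurs more often than "my gun" in the toy corpus, the model ranks "aimed my god" higher even though the two candidates sound nearly alike. That is the kind of bias I suspect is at work here.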
(1) Give me that.
(2) *Give me this.
The reason that (2) is weird -- by convention, an asterisk marks a bad sentence -- is that the word "this" suggests that whatever is being requested is close to the speaker. Consider also:
(3) Jane came home.
(4) Jane went home.
If we were currently at Jane's home, it would be more natural to say (3) than (4). Of course, we could say (4), but we would be shifting our perspective, treating wherever Jane was as the reference point, rather than where we are now (this is particularly common in story-telling).
(5) The lunchroom door slowly opened and two men walked in.
(6) Two men slowly opened the lunchroom door and walked in.
These sentences describe the same event, but place the reader in a very different position. As Talmy points out, when reading (5), you get the sense that you are in the lunchroom, whereas in (6), you get the sense that you are outside of the lunchroom ... either that, or the door to the lunchroom is transparent glass.
(7) There are some houses in the valley.
(8) There is a house every now and then through the valley.
Sentence (7) implies a static point of view, far from the houses, allowing you to see all the houses at once (Talmy calls this "stationary distal perspective point with global scope of attention"), whereas (8) gives the sense of moving through the valley and among the houses, with only a few within view at any given time ("moving proximal perspective point with local scope of attention").