Intelligent Nihilism

The latest issue of Cognitive Science, which is rapidly becoming one of my favorite journals, carries an interesting and informative debate on the nature of language, thought, cognition and learning between John Hummel at the University of Illinois at Urbana-Champaign and Michael Ramscar at Stanford University. This exchange of papers highlights what I think is the current empirical standstill between two very different world-views.

Hummel takes up the cause of "traditional" models, on which thought and language are deeply symbolic and involve algebraic rules. Ramscar defends more "recent" alternative models built on associative learning -- essentially an update of the story that was traditional before the symbolic models.

Limitations of Relational Systems

The key to Hummel's argument, I think, is his focus on explicitly relational systems:
John can love Mary, or be taller than Mary, or be the father of Mary, or all of the above. The vocabulary of relations in a symbol system is open-ended ... and relations can take other relations as arguments (e.g., Mary knows John loves her). More importantly, not only can John love Mary, but Sally can love Mary, too, and in both cases it is the very same "love" relation ... The Mary that John loves can be the very same Mary that is loved by Sally. This capacity for dynamic recombination is at the heart of a symbolic representation and is not enjoyed by nonsymbolic representations.
That is, language has many predicates (e.g., verbs) that seem to allow arbitrary arguments. So talking about the meaning of love is really talking about the meaning of X loves Y: X has a particular type of emotional attachment to Y. You're allowed to fill in "X" and "Y" more or less how you want, which is what makes them symbols.
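
To make "dynamic recombination" concrete, here is a minimal sketch (my illustration, not from either paper) of a symbolic relation whose argument slots can be filled and recombined freely:

```python
from collections import namedtuple

# A relation is a predicate symbol plus ordered argument slots; any symbol can
# fill a slot, and a filled relation is itself a symbol usable as an argument.
Rel = namedtuple("Rel", ["predicate", "args"])

def rel(predicate, *args):
    return Rel(predicate, args)

john, mary, sally = "John", "Mary", "Sally"

loves_jm = rel("loves", john, mary)    # John loves Mary
loves_sm = rel("loves", sally, mary)   # Sally loves Mary -- the very same "loves"
knows = rel("knows", mary, loves_jm)   # Mary knows John loves her (relation as argument)

# Same predicate symbol, same argument symbols, freely recombined:
assert loves_jm.predicate == loves_sm.predicate
assert loves_jm.args[1] == loves_sm.args[1]
```

The point is that the identical "loves" symbol and the identical "Mary" symbol can be reused in any combination, and a filled relation can itself serve as an argument to another relation.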

Hummel argues that language is even more symbolic than that: not only do we need symbols to refer to arguments (John, Mary, Sally), but we need symbols to refer to predicates as well. We can talk about love, which is itself a relation between two arguments. Similarly, we can talk about friendship, which is an abstract relation. This is a little slippery if you're new to the study of logic, but doing this requires a second-order logic, in which quantifiers can range over predicates and not just individuals.
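
Roughly, the formal difference is what the quantifiers are allowed to range over. A standard textbook illustration (mine, not an example from either paper):

```latex
% First-order: variables range over individuals only
\forall x\,\big(\mathrm{person}(x) \rightarrow \mathrm{loves}(\mathrm{John}, x)\big)

% Second-order: variables may range over relations themselves,
% which is what lets us talk about love or friendship as such
\exists R\,\big(R(\mathrm{John},\mathrm{Mary}) \wedge R(\mathrm{Sally},\mathrm{Mary})\big)
```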

Where Hummel wants to go with this is that associationist theories, like Ramscar's, can't represent second-order logical systems (and probably can't even handle the kinds of first-order systems we might want). Intuitively, this is because associationist theories represent similarities between objects (or at least how often both occur together), and it's not clear how they would represent dissimilarities, much less represent the concept of dissimilarity:
John can be taller than Mary, a beer bottle taller than a beer can, and an apartment building is taller than a house. But in what sense, other than being taller than something, is John like a beer bottle or an apartment building? Making matters worse, Mary is taller than the beer bottle and the house is taller than John. Precisely because of their promiscuity, relational concepts defy learning in terms of simple associative co-occurrences.
It's not clear in these quotes, but there's a lot of math to back this stuff up: second-order logic systems are extremely powerful and can do lots of useful stuff. Less powerful computational systems simply can't do as much.
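
As a toy illustration of the worry (my sketch, not from either paper): a bare co-occurrence count is symmetric, so it cannot even encode the direction of a relation like taller than, let alone represent the relation as an abstract object:

```python
from collections import Counter

# Tally how often two items appear together in observed "scenes".
cooc = Counter()
scenes = [("John", "Mary"),   # John taller than Mary
          ("Mary", "John")]   # Mary taller than John (the reverse)
for a, b in scenes:
    cooc[frozenset((a, b))] += 1   # co-occurrence ignores order

# Both orderings collapse onto the same unordered pair:
# the count is blind to who is taller than whom.
assert cooc[frozenset(("John", "Mary"))] == 2
```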


The Response

Ramscar's response is not so much to deny the mathematical truths Hummel is proposing. Yes, associationist models can't capture all that symbolic systems can do, but language is not a symbolic system:
We think that mapping natural language expressions onto the promiscuous relations Hummel describes is harder than he does. Far harder: We think you cannot do it.
Ramscar identifies a couple of old problems: one is polysemy, the fact that words have multiple meanings (John can both love Mary and love a good argument, but probably not in the same way). Fair enough -- nobody has a fully working explanation of polysemy.

The other problem is the way in which the symbols themselves are defined. You might define DOG in terms of ANIMAL, PET, FOUR-LEGGED, etc. Then those symbols also have to be defined in terms of other symbols (e.g., FOUR-LEGGED has to be defined in terms of FOUR and LEG). Ramscar calls this the turtles-all-the-way-down argument.

This is fair in the sense that nobody has fully worked out a symbolic system that explains all of language and thought. It's unfair in that he doesn't have all the details of his theory worked out, either, and his model is every bit as turtles-all-the-way-down. Specifically, concepts are defined in terms of cooccurrences of features (a dog is a unique pattern of co-occurring tails, canine teeth, etc.). Either those features are themselves symbols, or they are always patterns of co-occurring features (tail = co-occurrence of fur, flexibility, cylindrical shape, etc.), which are themselves patterns of other feature co-occurrences, etc. (It's also unfair in that he's criticizing a very old symbolic theory; there are newer, possibly better ones around, too.)

Implicit in his argument is the following: anything that symbolic systems can do that associationist systems can't is something that humans can't do either. He doesn't address this directly, but presumably this means that we don't represent abstract concepts such as taller than or friendship, or, if we do, it's via a method very different from formal logic (what that would be is left unspecified).

It's A Matter of Style

Here's what I think is going on: symbolic computational systems are extremely powerful and can do lots of fancy things (like second-order logics). If human brains instantiate symbolic systems, that would explain very nicely lots of the fancy things we can do. However, we don't really have any sense of how neurons could instantiate symbols, or even if it's possible. So if you believe in symbolic computation, you're basically betting that neurons can do more than it seems.

Associationist systems face the opposite problem: we know a lot about associative learning in neurons, so this seems like an architecture that could be instantiated in the brain. The problem is that associative learning is an extremely underpowered learning system. So if you like associationist systems, you're betting that humans can't actually do many of the things (some of us) think humans can do.
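
The kind of error-driven associative update meant here is typified by the Rescorla-Wagner rule; here is a minimal sketch (mine, with made-up cue names) of how association strengths get nudged by prediction error:

```python
# Rescorla-Wagner: on each trial, every present cue's association with an
# outcome is adjusted in proportion to the prediction error.
def rw_update(V, cues, outcome_present, alpha=0.1, lam=1.0):
    prediction = sum(V.get(c, 0.0) for c in cues)
    error = (lam if outcome_present else 0.0) - prediction
    for c in cues:
        V[c] = V.get(c, 0.0) + alpha * error

V = {}
for _ in range(100):                      # "furry" + "barks" predict "dog"
    rw_update(V, {"furry", "barks"}, outcome_present=True)

# The two cues come to share the predictive value (about 0.5 each);
# their total approaches lambda but never exceeds it.
assert 0.45 < V["furry"] < 0.55
assert abs(V["furry"] + V["barks"] - 1.0) < 0.01
```

The rule is simple and neurally plausible, which is exactly its attraction; the question in this exchange is whether anything built on it scales up to relational, second-order structure.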

Over at Child's Play, Dye claimed that the argument in favor of Universal Grammar was a form of Intelligent Design: we don't know how that could be learned/evolve, so it must be innate/created. I'll return the favor by labeling Ramscar's argument Intelligent Nihilism: we don't know how the brain could give rise to a particular type of behavior, so humans must not be capable of it.

The point I want to make is that we don't have the data to choose between these options. You do have to work within a framework if you want to do research, though, and so you pick the framework that strikes you as most plausible. Personally, I like symbolic systems.

----------
John E. Hummel (2010). Symbolic versus associative learning. Cognitive Science, 34, 958-965.

Michael Ramscar (2010). Computing machinery and understanding. Cognitive Science, 34, 966-971.

photos: Anirudh Koul (jumping), wwarby (turtles), kaptain kobold (Darwin)

22 comments:

Neuroskeptic said...

Good post. I haven't read the papers, but isn't Ramscar's argument less "turtles-all-the-way-down" than Hummel's?

For Ramscar, presumably, concepts are associations of features, which are themselves associations of more basic ones, but there is a "bottom", which is basic sense data.

We know for example that red light activates some retinal cells and blue light activates others, and these get transmitted to the visual cortex; so a "red" concept is one which, however indirectly, is neurally associated with those cells.

On the other hand it's not clear that a symbolic account can make the same reference to basic sense data, because the "red" pathway doesn't encode the symbol "red", it responds to red. When you look at a red thing you actually see red, you don't just think "red".

Unmesh said...

Great summary. I think this paper is on my reading list now. One possible explanation for how neurons can give rise to symbol systems is Barsalou's Perceptual Symbol Systems (Behavioral and Brain Sciences, 22, 577-609). His idea, if I remember right, revolves around the notion of simulations as giving rise to symbols and symbol systems. Of course, his notion of a symbol system is a little different because in addition to the properties of compositionality and productivity, perceptual symbol systems also have a non-symbolic component, namely the underlying neuronal structure that supports the simulation.

GamesWithWords said...

@Neuroskeptic -- I'm not sure what Hummel's theory of concepts is, because he doesn't say. Ramscar attributes the Classical Theory to him, which goes back to Locke/Hobbes. On their account, concepts also cash out ultimately in terms of sense data. Whether that can be made to work is open for debate.

I'm not sure that the problem you identify is any more or less problematic for Ramscar or Hummel. I think these questions start to get slippery.

I'd point out again that there are numerous competing theories of concepts, such as Susan Carey's Theory Theory (concepts are defined in terms of the role they play in intuitive theories) or Jerry Fodor's Atomism (concepts are related to one another, but do not decompose into one another). On such accounts, there's no decomposition, and no turtles. Whether they work any better is again very much open to debate. (I personally like Atomism, though not the version that requires all concepts to be innate.)

Bob Carpenter said...

Very nostalgic. This is the same debate the logical positivists were having with Wittgenstein and Quine (and others) in the 1930s and 1940s. Everyone who's interested in this issue and has never read Quine's 1950 landmark paper "Two Dogmas of Empiricism" should get right to it. (When I taught philosophy of language at CMU, my co-teacher, an actual philosopher, not a linguistic semanticist like me, called it the most important paper in philosophy of the 20th century.)

If we step back 10 feet from theoretical concerns, I think it's pretty clear that language is both (a) highly structured symbolically, and (b) highly associative and flexible and metaphorical in terms of meaning. There's absolutely no reason to think these things are mutually exclusive. Or that humans have perfect understanding or share the exact same meanings for words in some Platonic/Fregean sense (pun intended).

What we need is some way to put the symbolic and associative theories together. (I hear from some friends still in the field that there are some efforts to combine "vector-based semantics" with "Montague-type semantics", but vectors used like in IR are just a data structure, not a theory of associative meaning.)

Just to clarify, everything that's computable on a Turing machine can be expressed in first-order logic. Second-order logic's even more complex, and there's a whole hierarchy of more expressive logics on top of that.

Dan said...

Did you even read the original article Hummel was responding to?

The learning theory described in that paper explicitly does not "represent similarities between objects (or at least how often both occur together)." Learning according to the Ramscar et al Cognitive Science article is information theoretic, and it is based on discrimination.

There is even a section in that paper that goes through this. It includes the following sentence:

"In word learning, from this perspective, 'similarities' between objects do not need to be discovered; on the contrary, things appear similar because learning has not yet discriminated them."

Ramscar et al go through the math of why information and coding theory constrains how symbols work, and show that people's learning respects these constraints. They also cash this out in terms that are explicitly not associationist.

Why don't you read the paper, rather than trying to intuit its comments from Hummel?

GamesWithWords said...

@Dan: You mean the paper immediately preceding the Hummel paper in the journal? Yes, I read it, and I recognize the quote. The Ramscar et al. paper is interesting and has many interesting components that I didn't discuss in my post. It's 49 pages. My post was 1100 words.

If you can explain how your concern is relevant to the topic of my post, I'd be interested in thinking through it.

In any case, though, I think my characterization is correct. The quote you're referring to seems to be about whether the model has to "discover" similarity. It's true that it doesn't have to. Partly that's because a lot of the work is hard-coded into the model (that is, the model is "born" knowing which types of shapes count as the same category and which types are different categories). And it's still the case that the model is going to label two things with the same label if and only if they are similar in a very intuitive way (modulo the training set). Maybe there is something very subtle going on here that I don't get, but even if there is, I don't think it's relevant to the topic of my post.

Melodye said...

This is an embarrassing misreading of the paper. It's absolutely clear from this that you didn't read the original paper that the reply is based on ("The Effects of Feature-Label-Order and Symbolic Learning").

For example, this is flat-out wrong: "Specifically, concepts are defined in terms of cooccurrences of features." Rescorla-Wagner is not a co-occurrence model. It's a discrimination network. We are at pains in FLO to make the difference clear.

"The problem is that associative learning is an extremely underpowered learning system. So if you like associationist systems, you're betting that humans can't actually do many of the things (some of us) think humans can do."

Here again -- if you read the paper, you'd be hard pressed to say this.

This post isn't so much an argument as a "I didn't read the original paper and think I can make an argument out of thin air." Isn't this the same problem that goes on when people bow down to Chomsky without actually reading him?

Melodye said...

And re: this -- "Intelligent Nihilism: we don't know how the brain could give rise to a particular type of behavior, so humans must not be capable of it."

Are you for real?! The entire paper is about how we can use general learning mechanisms as a powerful means of exploring human linguistic behavior. And if you've read any of our other papers, you'll realize that we can use these same mechanisms (and this same model) to account for a truly diverse range of linguistic phenomena. We even use predictions the model makes to rapidly and dramatically improve children's color and number learning!

For what it's worth -- hostile and lazy readers are precisely why we have so much trouble publishing in this field. This is exactly the kind of thing we face in review, and it is frankly *galling.*

Melodye said...

Regarding the comment on similarity, made above -- if you had read FLO, you would know there is an entire section (S. 11, p. 939) devoted to discussing the differences between similarity and discrimination based models.

Here's a brief quote:

"First, 'learning to see the similarities between things' is not an important aspect of category learning from this perspective. The most important aspect of the learning process in the simulations and the experiments we conducted was discrimination, and the reason that LF-training was generally poorer than FL-training was that items that were not discriminated were treated equivalently (or similarly). In word learning, from this perspective, 'similarities' between objects do not need to be discovered; on the contrary, things appear similar because learning has not yet discriminated them."

It goes on...

GamesWithWords said...

@Melodye: I ask again: so what? I'm sure this difference is very important for explaining certain behaviors of the model, but does it allow your model to represent second-order logics? Hummel says no. Ramscar's response does not disagree.

But let's see if I understand the difference between learning based on similarity and learning based on distinction. On a similarity-based model, as I understand it, if I hear a new word (wug) applied to a blue squiggle, then I should expect wug to be applied to other things that are similarly blue and have a similar shape. The more wugs I see, the better sense I have of how much variation is allowed.

On a discrimination-based approach, as I understand you to be saying above, I would assume, at this point, that "wug" applies similarly well to blue squiggles, red squares, and Grandma, assuming I don't yet have words for red squares or Grandma. I need to learn additional words in order to have any way of restricting "wug" such that it doesn't apply to everything animal, vegetable or mineral.

In reading your paper, it seems like your model actually does something quite different. It already knows that "blue" and "squiggle" are kinds, and restricts "wug" at the very least to things that are either blue or squiggles. It can, through learning other words, restrict "wug" further.

So that makes it look a bit like a hybrid, right? Similarity isn't learned, but much of the important work of similarity is built in.

Interestingly, there are some other things built in. Your model doesn't consider every possible relevant feature of the object to which the word "wug" is being applied (e.g., "blue things on Thursdays"). In fact, since there are an infinite number of relevant features, I assume that your model could never learn the "correct" set of features even given infinite time (this is Quine's paradox) without you a priori restricting the options.

Again, correct me if I'm wrong, but it thus seems that with your model you can either (a) build in the kinds of innate constraints Nativists typically argue for (which is what you currently are doing), or (b) scale up the model such that it represents all infinite possible features and resign yourself to the fact that people will have considerable uncertainty about word meaning and also differ considerably from person to person in terms of what they think a word means. That's actually the conclusion Quine came to in Word and Object, so it's not a straw man.

Dan said...

In your post, you say, "associationist theories represent similarities between objects (or at least how often both occur together)."

This hasn't been true of "associationist" theories since ~1970 (since then, formal associationist theories have explained learning in terms of discrimination networks, or information theory, both of which work by essentially doing the opposite of what you describe).

Your characterization also isn't true of the theory Ramscar is defending (or of the math in the Ramscar et al paper). Given that you recognize the quote, why did you make up a theory that is the opposite of the one Ramscar is defending, and then attribute it to him in this post?

There's a deeper point to all this though: you have made a number of claims in your posts about what "associationist models" can't do. It's clear from the way you characterize them that you have very little idea how formal learning models work. Your point about the model being "'born' knowing which types of shapes count as the same category and which types are different categories" is flat out wrong. A discrimination model begins by assuming the whole input set is in the same category. It doesn't need for the input set to be pre-sliced into wugs and squiggles, as you imply; far from it. Because it learns to discriminate based on the informativity of its input, increasing the input increases learning power. The model doesn't need to be "born" knowing anything other than its learning rule and its architecture.

Here's another example: "It's not clear in these quotes, but there's a lot of math to back this stuff up: second-order logic systems are extremely powerful and can do lots of useful stuff. Less powerful computational systems simply can't do as much." What does this mean, exactly? There are lots of things that some formalisms can do that others can't, and vice versa. Personally, I wouldn't give a second-order logic model of perceptual category learning the time of day. So if I was interested in perceptual category learning, why would I think that second-order logic models were more powerful? (As an aside - did you know that Ramscar used to teach formal logic modeling in AI at Edinburgh?)

Respectfully, wouldn't you be better off trying to learn more about modeling, and how it works, rather than keep repeating claims about models that you don't really understand?

GamesWithWords said...

Quoting Dan:

"Your point about the model being "'born' knowing which types of shapes count as the same category and which types are different categories" is flat out wrong. A discrimination model begins by assuming the whole input set is in the same category."

I think we're talking about different things. In Ramscar et al, the input set has been pre-categorized, and the vast majority of features that could in principle be tracked have been excluded. For instance, the model on page 90 knows that there are two types of shapes (shape #1 and shape #2) and that these types are distinct prior to any learning happening. And as far as I can tell, the performance of the model crucially depends on that fact.

Without checking, I'd bet money that that's true of every learning model that has ever been implemented. To allow for every one of the infinite number of features to be tracked would require that the computer implementing your simulation has infinite memory. There's no way around that.

And the reason you might care about more powerful computational systems is you might be interested in learning more than just nouns. As Hummel describes in detail.

Dan said...

As with modeling, so with Philosophy.

"[the] model doesn't consider every possible relevant feature of the object to which the word "wug" is being applied (e.g., "blue things on Thursdays"). In fact, since there are an infinite number of relevant features, I assume that [the] model could never learn the "correct" set of features even given infinite time (this is Quine's paradox) without you a priori restricting the options. "

A discrimination network only discriminates between hypotheses given a context that demands it. So, to take Quine's Gavagai example, the degree to which two networks form the same representation is a function of their histories. Which is exactly what Quine argued as well. He didn't think it a paradox. He took the Wittgensteinian view that there was no "correct" set of features. He also took the view that understanding was necessarily probabilistic (that biological critters like us didn't have a divine right to perfectly understand each other).

You write as if all this is unreasonable. As far as I'm aware, no one has ever solved the problems Quine and Wittgenstein posed about reference. Or figured out what the "correct" sets of features are. (Who got to decide them, anyway, Plato? Jerry Katz?)

People in the Cognitive Sciences just ignore these problems and blithely spout out "realist" theories of meaning and reference as if there weren't deep problems with these ideas (or as if some vague claptrap about "theories about theories" somehow magically fixed them). Is it any wonder we're at an "empirical standstill" (your words)?

The prevailing attitude in the Cognitive Sciences appears to be 'the model is right dammit -- let's don't ask, don't tell about the problems, and hopefully no one will notice....' "Intelligent Nihilism?" just about sums it up.

GamesWithWords said...

"You write as if all this is unreasonable."

No, I didn't. I said that you were welcome to accept that outcome (read above), but you'd need a very different simulation from the one(s) in Ramscar et al. And that's why I assumed you'd find the indeterminacy of translation unreasonable.

I'm well aware of what Quine's take-home was, as you know since your quote comes from the very same comment in which I discussed Quine's views.

And if you implemented the true Quinean problem in the FLO model (which you couldn't, given memory limitations), it's very much an open question whether the model's predictions would match actual behavior.

GamesWithWords said...

BTW "Intelligent Nihilism" was meant tongue-in-cheek, not as a slur. Sorry if it got anyone riled up. I thought it was fair game since Dye already called all Nativists Creationists, and that was, as far as I can tell, actually meant as a slur.

Dan said...

The models in the Ramscar et al paper are illustrations. They are meant to help people understand what is going on, not simulate every detail of mental processing. That's how models work. The important question is, "what features of the model are necessary to its performance?" None of the features you are fixating on are. As Ramscar notes in his paper, the simulations come out the same if the inputs are vectors of pixel values from a camera.

The "infinite features" problem is something that applies to inductive models. Discrimination models don't work that way. A discrimination network does not consider an infinite space of possibilities. It uses an input representation (how many dimensions could that take in the average human brain?) to try to predict a learners' environment. That environment is not an infinite set of features either. It can start out n=1, and the learner can use discrimination to break it down.

GamesWithWords said...

@Dan: I saw the note about pixels in the paper. Does that really help?

The model, as described, only learns about features that are currently in the training set. The paper is very clear that it learns nothing about features that have not yet been trained, correct? So you display an image and label it a wug. Then display the same image in a different location -- the pixel vector is completely different, so the model knows nothing about it, right? I'm pretty sure a kid in the same situation would still label the image a "wug," but you can run the experiment if you like.

That is, unless you're doing something really tricky I don't understand with the vector(s).

In your second paragraph, I think you're arguing my point for me. Obviously, the human brain cannot entertain every logically possible hypothesis, since that would require the brain to be infinite. So there are limitations built in.

Melodye said...

There's far more to be said here, but it looks like Dan is picking up the slack for me, and I want to save an extended response for a post of my own.

Two things, quickly --

1. The learning theory you attribute to us is the opposite of what we (quite carefully) put forth in FLO. What's more, the theory that you made up is one that no one has thought even vaguely plausible for nearly fifty years.

2. You also mischaracterize my argument here -- I didn't call Nativists creationists. I said that one of the strong arguments made in favor of linguistic nativism shares similar logic with arguments from intelligent design. My post wasn't intended as (and couldn't be) a full takedown of nativism. It was meant to inspire people (particularly scholars) to go back and reread the original texts and question the foundational logic on which their research programs rest. That's a very different thing than tarring an entire theoretical camp. What's more, I wouldn't describe myself as 'non-nativist' -- no one can be a pure 'empiricist' when it comes to the mind (I don't even know what that would mean). On the other hand, I would describe myself as someone who doesn't buy into the kind of nativism that Chomsky espoused; I don't think there is such a thing as a generative grammar or a store of innate concepts. But this is not the same thing at all.

Dexter Edge said...

The link to the Hummel article is broken.

GamesWithWords said...

@Melodye: "The learning theory you attribute to us is the opposite of what we (quite carefully) put forth in FLO."

You still haven't said in what relevant way I've mischaracterized your learning theory. The key to the argument here was that your model cannot handle second-order (or even first-order?) logics, which Dan has agreed is the case.

There was also some debate about whether or not the models in Ramscar et al. have conceptual primitives built in. Dan eventually agreed that they do, but says that could be fixed in a more complete model. Maybe, but an assertion isn't evidence.

"It was meant to inspire people (particularly scholars) to go back and reread the original texts and question the foundational logic on which their research programs rest."

So that's an interesting point. The model in Ramscar et al. 2010 is a model of how we learn concrete nouns and the names of pre-encoded features. I suspect that everybody agrees that such a model can learn these things, assuming either the learning problem is sufficiently simplified and/or the concepts are built in (as was done in your paper).

But the arguments in favor of generative grammars were never about noun-learning. Neither was Hummel's article, and neither was my post. So it's quite irrelevant to this discussion whether the model can learn concrete nouns. It is relevant whether it can learn verbs, modals, function words, and grammar.

I may be completely wrong and your model or similar models may be able to handle all these things with ease. Show me how.

Dan said...

Let's say this very slowly. The Ramscar et al paper describes a model. An abstraction made to capture an essential part of an explanation -- formally. I didn't see the part in the paper where the authors claimed it was a full simulation of Homo sapiens cognition. A representation was chosen for use in that model to help readers understand what was happening. Nothing that the model does relies on that representation.

So no, I do not and did not agree that the model has "conceptual primitives". You made that up.

Nor did I agree with anything you said about logic. (Your questions are ill posed.) You made that up too.

You also deleted a comment where I pointed this out. Why? Is this how you think scientific debate should be conducted? Making up things that suit you, and censoring anything that doesn't?

For what it is worth, psychologically, we can assume an input vector very much richer than the one described in the Ramscar et al paper (actually, we have no way of even thinking about just how many coding units must be available to a real learner). We can also assume an architecture, such as eyes, and dedicated circuits that learn to control eyes, and track motion (let's call this intelligent nativism -- assume that what is innate are things that we can specify, like learning mechanisms, neural architecture etc., rather than a syntax with no concrete computational features and hotline to a stack of Platonic forms)… Indeed, we don't need to make this up -- people actually build systems that solve these problems in AI, and they don't use second order logic to do it. With training, discrimination networks can be pretty tolerant of noisy input, unlike logic systems.

Given your criticism of Melodye's take on nativist arguments, it's instructive to see how you are conducting your side of this discussion. Ramscar et al put forward a model based on a learning rule for which there is a *lot* of neurological evidence, and excellent animal models. They formally applied it to make detailed predictions of human data, and to successfully solve a problem that dates back to Darwin. First you totally mischaracterize their model (seriously mischaracterize -- it relies on neither similarity nor frequency of co-occurrence, the mechanisms you mention in your post, which means you'd get an F for that in class), and then, as I explain the principles by which all formal models of learning work to you, all you can do is nitpick ignorantly and say, "yeah, well you haven't explained everything…" or "what about this logical possibility?"

Learning is like science. Science isn't about logical possibilities, it is about the law of large numbers (seriously, if your doctor prescribes you medicine, would you decline to take it on the grounds that the people who invented the medicine never took the time to rule out the logical possibility that your soul has been infected by evil spirits?).

You seem to think that the incompleteness of a well grounded theory that has considerable coverage is evidence that anything that you choose to make up and call "innate" is equally valid. You are, of course, welcome to think this -- but let's be clear, your logic is not just similar to that of an intelligent design argument: it is identical.

GamesWithWords said...

@Dan: Chill out. I didn't delete anything. I actually thought you deleted the comment. Blogger marked your comment as spam.

The model has conceptual primitives whether you agreed or not. It has a primitive for blue, a primitive for red, and it treats them differently prior to any learning. I took your comment that it would work just the same "if the inputs are vectors of pixel values from a camera" as agreeing with my characterization. But whether you agreed won't change the model.

As I already discussed at length, the model would work the same if you used pixels *and* restricted the learning problem to objects that always appear in the same place and under the same lighting. As soon as you take that away, you need a different model. You say you can make that work, but it's up to you to demonstrate it.

I'd like to point out that this post was about a paper in which Hummel argues that such models cannot handle relational terms such as verbs. Ramscar in his response never addressed that problem, and neither have you or Melodye. Do you have a response?