Field of Science

Class Notes: Verb Islands

This is one of several posts based on readings and discussion from a graduate seminar on language acquisition that I have been attending.

Modern syntactic theory is complex. When I diagrammed sentences in middle school, it looked something like this diagram from A Walk in the WoRds blog:



That seemed complicated at the time but is child's play compared with what one finds in many syntax papers. For instance, here is a tree from Beatrice Santorini & Anthony Kroch's syntax textbook:



Here's another one from Christopher Davis's online class notes at UMass-Amherst:



Something that pops out about these trees is the various symbols that don't seem to do anything. For instance, saying "smelly dog" involves an adjective and a noun makes sense, but in Davis's tree above, there are extra symbols such as A'. The Santorini & Kroch tree makes considerable use of "inflectional phrases," which weren't a part of the sentence diagrams I learned in school.

Is grammar really so complex?

For a number of years now, Michael Tomasello has been arguing that perhaps linguistic structure is not nearly so complex and does not require nearly so many abstract components (like inflectional phrases), particularly in the case of child speech. A lot of the abstraction in linguistic theory is meant to explain how you know what constructions a given word appears in. For instance, compare the sentences below:

(1) Sarah rolled the orange.
(2) The orange rolled.
(3) Sarah ate the orange.
(4) *The orange ate.

The fourth sentence should sound ungrammatical (which is what the asterisk means). The question is: how do you know that the verb roll can be used in this way but the verb eat cannot? Theories make use of abstract grammatical structure to explain these and other generalizations (the abstractions in the sentence trees at the top of this post are motivated by other considerations, but the idea is the same).

Tomasello's point is that in fact young children and even adults typically do not use words in a wide variety of constructions -- therefore, you don't actually need such abstract linguistic theories. This is a useful push-back, and the claim has generated a lot of research. It is interesting, however, that Tomasello is presenting a theory that explains what people do say, but he is arguing against theories that explain what people can say, which turn out to be quite different things. Although nobody is likely to say the sentence in (5), we all know that it is grammatical.

(5) We all shall have told the story to the Martians.

A complete theory needs to explain that phenomenon, too.

More on spoken language

Part of Tomasello's argument is that an abstract grammar would predict more variability in the constructions people (particularly children) actually use than is seen in real life. Charles Yang has recently argued that this is not the case (see Who's Afraid of George Kingsley Zipf). In fact, people are very repetitive. Moreover, even if a given word can be used in many constructions, there may be no reason to use all those constructions. Sentence (5) was an example of this: the verb tell can be used in the first-person plural future perfect (we all shall have told), but it's hard to imagine many circumstances in which one would need to.

Despite the complex math in the paper, Yang's manuscript is more than worth the read.

Invented Languages

Those who haven't already seen it might be interested in my article about the role of invented languages in science at ScientificAmerican.com.

Results: The Best and Worst Puns

The Puntastic experiment continues to chug along. 1,376 participants have contributed 59,474 ratings of nearly 2,000 different puns. Currently, the most popular pun is

"To some, marriage is a word; to others, a sentence."

Every participant who has rated that so far has given it the maximum 5 stars. 

The second most popular is:

"The frustrated cannibal threw up his hands."

By far the least popular one is:

"People adorned with Bogus Deuterium Ingots aroused suspicion. Most people said they didn't trust anyone with BDIs."

I'm curious whether this is because people really hate this pun, or because they simply didn't get it. I actually think it's kind of funny.

I'm still collecting data, so if you haven't voted for your favorite puns yet, there is still time.

Web Experiment Tutorial: Chapter 10, Recruiting Participants

Several years ago, I wrote a tutorial for my previous lab on how to create Web-based experiments in Flash. This is the final chapter in the original tutorial.


10. Recruiting participants online


1.     Overview

So now you have an experiment implemented on the Web. All you need are participants. Where do you get them?

If you need only very small numbers of subjects (50-100), this part is easy. If you want larger numbers of subjects, or if you want to run several experiments under the same URL (so as to prevent the same subject from participating in multiple versions of the experiment), this may be the most challenging part of Web-based experiments.

There are several methods you can use. I recommend using all of them. Each will be discussed below in turn, but briefly: you can list the experiments on experiment portal pages, you can recruit from within your own social network, you can buy ads, you can promote the experiments in online forums, you can blog, you can swap links with other researchers, and you can get media attention.

Media attention, if you can get it, is far more valuable than all those other methods combined.

2.     Experiment portal pages

There are several web sites that list online experiments. By far the one that has provided the most subjects to vacognition is:


The second-most useful is:


Others, much less useful, include:



Another place you can list is:


In the first 3 weeks of May, 2007, vacognition (my previous site) received 251 hits from psych.hanover, 67 from genpsylab-wexlist, 15 from onlinepsychresearch, and only 2 from language-experiments.

Here are some other lists I have not used, which may or may not be useful:



3.     Your social network

Your own friends and family are the most likely to be convinced to do your experiments. Some of them may pass along the URL to their friends and family. Every time I have sent out requests to my friends and family, I have gotten about 40-50 participants in various studies.

You can also use Web-based social networking. For instance, I have an account on Facebook.com. My page lists vacognition. A friend of mine created a Facebook “group” called “Harvard Studied My Brain.” We invited all our friends to join (about 200), and anybody on Facebook could in theory join if they found the group in a search. 35 people did join, and many more have clicked on the link.

To make the group more enticing, I created a “certificate of membership,” which members can download. Generally, it is good to think about why anybody would want to participate. What can you do to make it more fun?

Other social networking sites include Stumbleupon.com, Reddit.com and Digg.com. “vacognition” has accounts for all of those. Every time a new webpage mentions your website, it is a good idea to “vote” for that website on Stumbleupon, Reddit and Digg. This increases the likelihood people will surf to that page, and then to your page. However, these services only have an effect if a fairly large number of people vote for the site, and the traffic may or may not be high-quality. At one point, a number of people voted for vacognition on StumbleUpon. In the space of an hour, we suddenly got 150 visitors. However, most did not participate in any experiments, and the traffic died down within 90 minutes. This is likely due to the fact that users of StumbleUpon are randomly sent to the website. In contrast, users of Digg or Reddit know what sort of website they are going to and are more likely to actually be interested. Visitors we have gotten through Digg have been highly likely to participate in an experiment.

You can also add a link to your website as part of your email signature. Ask your labmates to do so as well. Ask your friends to link to your website from their websites.

4.     Purchasing ads

You can also purchase ads. One obvious place to put ads is Google. I have never tried this.

I did, however, buy ads on Facebook. For $5/day, Facebook promised to display my ad to at least 10,000 people on the Oberlin network. For another $5/day, it was displayed to another 10,000 people in the Harvard network.

I bought $20 worth of ads as an experiment. Vacognition got about 80 hits. That’s 4 hits per dollar. This is not very impressive, but it may be worth it. Also, my ads may not have been very good. (Keep in mind that a hit to the website does not mean that the person completed or even started an experiment!)
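
The arithmetic above can be sketched in a few lines of Python. The $20 and 80 hits come from my ad trial; the completion rate is a made-up number (an assumption, purely for illustration), since the real question is what a completed participant costs, not what a hit costs.

```python
# Figures from the Facebook ad trial described above.
ad_spend_dollars = 20
site_hits = 80

hits_per_dollar = site_hits / ad_spend_dollars
cost_per_hit = ad_spend_dollars / site_hits


def cost_per_participant(spend, hits, completion_rate):
    """Cost of one completed experiment, given the share of hits that finish."""
    completed = hits * completion_rate
    if completed == 0:
        return float("inf")  # nobody finished, so the cost is unbounded
    return spend / completed


# ASSUMPTION: a 25% completion rate is invented for illustration only.
example_completion_rate = 0.25

print(f"{hits_per_dollar:.1f} hits per dollar, ${cost_per_hit:.2f} per hit")
per_participant = cost_per_participant(ad_spend_dollars, site_hits, example_completion_rate)
print(f"${per_participant:.2f} per completed participant")
```

Plug in your own completion rate once you have one; that number is what tells you whether the ads are worth buying.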

5.     Online forums

Another way to recruit participants is to mention the website or a particular experiment in an online forum. Here, it is particularly important to make the post relevant to the forum discussion. Otherwise, you are spamming and may (not unfairly) receive hate mail.

There are many psychology or science forums. It may be perfectly fair to write a post called “Please help me finish this experiment.” Another option is to write about the topic you are studying (“Visual short-term memory is very limited. We are trying to find out exactly how limited. Please do this experiment.”). You can also be very oblique about it. Post something interesting about your area of research, and just mention your website (“It turns out 1/100 people have prosopagnosia from a young age, not as a result of stroke. This is something we’ve found through our online experiments at www.faceblind.org. In fact, blah blah blah.”)

You can also pick targeted forums. If you are studying reading, you can post on dyslexia or reading education forums. (“My colleagues and I are trying to better understand reading. The results may eventually help us better understand how to teach reading to children. We need volunteers for our short, 3-4 minute experiments. I thought that participants in this forum might be particularly interested…”)

Because I use Flash for my experiments, I have also posted on forums dealing with Flash programming (“You may be interested in this other application of Flash technology…”). Also, sometimes I have a question about Flash, and I post the question, with a link.

There are also website creation forums where you are encouraged to showcase your website.

The best success I have had with this method is when my experiment has been set up as a type of quiz. My visual short-term memory experiment gives people a score at the end. So I posted the experiment on several forums where people advertise quizzes (“How good is your short-term memory for what you see?”). Normally, a forum post generates only a small amount of traffic (0-10 hits), but these posts on quiz forums produced as many as 100 each.

Vacognition has accounts on many, many message forums.

6.     Swapping links

Reciprocal advertising is an easy-to-use but very limited strategy. Vacognition has a “links” page, where we link to other websites, mostly other Web-based experiments. In return, those websites link to ours.

This serves two purposes. First, visitors to those other sites may click on the link and come to our website. This is extremely rare.

Second, the more links there are to a website, the better its “page rank” – that is, the higher it appears in the list of search results. Swapping links improves your page rank, and thus you are easier to find through Google, etc. My data suggest that visitors that come via Google tend to be low-quality visitors – that is, they tend not to participate in experiments. However, a few do, and it doesn’t hurt.

Usually I arrange these link swaps by emailing the webmasters of websites that I think may be interested. Most do not respond, but some do.

7.     Media attention

By far the most effective method is media attention. Extremely successful online labs (like faceblind.org or the Moral Sense Test) get a lot of mainstream media attention, and they also get huge numbers of participants.

Media attention is hard to orchestrate. Ideally, your research will be so interesting that reporters will come to you. However, you can contact reporters yourself. The university can put out a press release. I got a fair amount of media attention after Georgetown wrote a press release about a paper I wrote.

In the end, though, you have to have work that is interesting to reporters and the public (see “The most important thing,” below).

8.     Blogging

Bloggers are more approachable members of the media. Bloggers of many shades and stripes may be interested in showcasing your experiments. And they are much more likely to respond to an email. Some blogs produce disappointing traffic. I guest-blogged for The New Scientist, whose blog gets a thousand hits a day, but I only got a few dozen hits out of it. However, Skepchick blogged (without my contacting her) about one of my experiments, and I got about 300 hits.

You can also write your own blogs. This will be of minimal help if you don’t attract a following, but even a blog with little following and only one new post every month or so can generate some traffic. The links from the blog to your website can also help your page rank (see “Swapping links”).



9.     Email list

We maintain an email list. On several parts of the website, visitors are encouraged to join a Google Groups email list, which now has over 100 members. The list is emailed when new experiments are posted or results have been posted, although I try to keep this to a minimum. If you overuse an email list, people tend not to read the messages and/or withdraw from the list.

Setting up a Google Groups email list is simple, and it can be set up so that anyone can join. Vacognition’s list can be found here:


10.  The most important thing

When recruiting participants, you should always keep in mind one question:

Why would anybody want to participate in this experiment?

Participants are expending resources (time, energy, and sometimes money) in order to participate. What is the product that you are selling them?

This is particularly important when trying to generate media attention – whether newspapers or bloggers. You may get your brother-in-law to blog about your experiment as a favor (mine did), but most bloggers aren’t going to write about something if they don’t find it interesting. Make it interesting. Testmybrain.org is a great example of a site that is fun, and not surprisingly it gets tons of traffic.

However, this issue is important even when using online experiment lists. Anyone who visits an online experiment list is already interested in doing online experiments. However, these lists post many experiments. No visitor is going to do all of them. So how do they choose which one(s) to do? Presumably, this is partly a function of how interesting the experiment looks. Compare:

“This experiment investigates the role of proactive interference in estimates of visual short-term memory capacity.”

with

“How much of what you see can you remember? Probably less than you thought. Take this 5 minute quiz to see how many visual objects you can remember. Typical scores are between 1 and 3 objects.”

Which experiment sounds more interesting? They are the same experiment.

You will want to craft your pitch to your audience. If you are posting on a forum for vision scientists, the 2nd description above may come across as patronizing. However, if you are posting to a forum about online quizzes and games, the 1st description will probably get you banned from the forum for spamming.

The design of the website itself also matters. An ugly, unprofessional-looking website will turn away visitors. Many participants are participating because they are interested in science. Make sure they learn something. Post results. Have pages that discuss the research topics. Make sure the debriefing is informative. Many participants find seeing their own results very motivating, so if possible, try to incorporate that into the experiment.

You can also experiment in your advertising. Try different pitches. See which work the best. Modify the website and see if the number of visitors who actually participate in experiments increases or decreases.
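
Here is a minimal sketch of how you might keep score when trying different pitches. The pitch labels and all of the counts are hypothetical placeholders (assumptions, not real data); you would fill them in from your own traffic logs and your own participant counts.

```python
def participation_rate(visitors, participants):
    """Share of visitors who went on to take part in an experiment."""
    if visitors == 0:
        return 0.0
    return participants / visitors


# ASSUMPTION: pitch labels and counts are hypothetical placeholders.
pitch_results = {
    "technical description": {"visitors": 200, "participants": 10},
    "quiz-style description": {"visitors": 200, "participants": 50},
}


def rank_pitches(results):
    """Return (pitch, rate) pairs sorted from highest to lowest rate."""
    rates = {
        name: participation_rate(counts["visitors"], counts["participants"])
        for name, counts in results.items()
    }
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


for name, rate in rank_pitches(pitch_results):
    print(f"{name}: {rate:.0%}")
```

The point of dividing by visitors is that raw hit counts can mislead: a pitch that brings in fewer visitors may still bring in more participants.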

11. Where does GamesWithWords.org get its traffic?

I currently use Google Analytics to track my web traffic. Here is what it shows for the top 10 referrers from Dec 1, 2009 through April 21, 2010:

As you can see, the biggest chunk of traffic comes from people simply typing in the name of the site. Word of mouth seems to do a great deal. One thing to consider, however, is also the average time on site and the bounce rate. By these measures, the direct traffic is better than those who come via Google.

I should note that this traffic is dwarfed by weeks in which I get media attention. I can easily get several thousand visits per day when the site is mentioned in a prominent news source (which has not happened in the last few months, unfortunately). Notice also that while there are many other sources of traffic beyond the top 10 listed here, all the rest combined only contributed 1,385 visits.

Web Experiment Tutorial: Chapter 9, Finishing Up

Several years ago, I wrote a tutorial for my previous lab on how to create Web-based experiments in Flash. I am currently posting that tutorial chapter by chapter.

9. Finishing Up


We still haven’t finished putting the experiment online. You will recall that clicking the link in Consent.html opens two files. It opens test_in_progress.html. It also opens test_popup.php and centers it in the middle of the screen.

The only thing to be interested in here is the first command:




$link = "VSTM.swf";

Make sure this is set to the name of your file. Notice the extension .swf. This is the compiled form of a Flash file. We’ll make it momentarily. This is what will actually be displayed.

2. Making .swf files.

Each time you “test” a .fla file in Flash, it compiles the file into a .swf file in the same directory. However, it is best to go to File->Publish to create the .swf file. It will also create an .html file (which simply runs the .swf file) and an .swd file, which is used by the Flash debugger. You can use this VSTM.fla file, though hopefully that is identical to the one you made.

Take that .swf file and copy it into your VSTM directory on your website. Now you can browse to your index.html file and run your experiment directly. Everything should work.

3. The exit button

Add a button to the “finish” frame of your Flash file. Change the label to “Exit”. Add this code to the button:

on (click){
     getURL("javascript:updateParent('http://URL/VSTM/done.html'); javaScript:self.close()");
}

Change “URL” so that it matches the actual path. Now, when subjects finish the experiment, they can click on that button. It will close the window and change the test_in_progress.html browser window to done.html.

VSTM.fla incorporates this.

This does not work when you are testing your experiment from your hard drive. To see this work, you must be running the experiment through the Web.

4. Instructions and debriefing.

You will want to add some debriefing information. You can do this as part of done.html, or you can do it within VSTM.fla. An advantage of doing it within Flash is that you can modify the debriefing depending on the subject’s scores. For instance, you could calculate VSTM capacity in this experiment for this subject and display it in the debriefing.

You will also want to add instructions. This should be fairly straightforward at this point.




As usual, please leave any questions in the comments section.

Web Experiment Tutorial: Chapter 8, Additional MySQL

Several years ago, I wrote a tutorial for my previous lab on how to create Web-based experiments in Flash. I am currently posting that tutorial chapter by chapter.

There are a few other useful things you can do in MySQL.

1. Selecting certain rows from a table.

Maybe you want to know how many people have completed your experiment. Your experiment has 6 trials, the last of which is called trial “5”. The easiest way to find out, then, is to use the following command:

mysql> select * from VSTM where trial = 5;
+-------------+-------------+----------------+----------+-------+---------+----------+---------+-------+------------+----------+---------------+------------+
| subject_age | subject_sex | subject_vision | initials | trial | correct | stimulus | matches | probe | date       | time     | ip            | subject_id |
+-------------+-------------+----------------+----------+-------+---------+----------+---------+-------+------------+----------+---------------+------------+
|           2 | male        | no             | jkh      |     5 |       1 |        3 |       1 |     3 | 2007-06-05 | 14:57:33 | NULL          |       NULL |
|          26 | male        | yes            | jkh      |     5 |       1 |        3 |       0 |     2 | 2007-06-05 | 16:01:40 | 140.247.95.39 |          4 |
+-------------+-------------+----------------+----------+-------+---------+----------+---------+-------+------------+----------+---------------+------------+
2 rows in set (0.01 sec)


Two rows were found, so two subjects completed trial #5.

You can use more complicated where statements:

mysql> select * from VSTM where trial = 5 and correct=1;
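
If all you need is the number of rows rather than the rows themselves, COUNT(*) will do the counting for you. Here is a minimal sketch in Python. It uses the built-in sqlite3 module as a stand-in for MySQL (an assumption, made only so the example runs anywhere), with a cut-down VSTM table and made-up rows. The two select statements should work unchanged at the mysql> prompt.

```python
import sqlite3

# ASSUMPTION: an in-memory SQLite table stands in for the MySQL table VSTM,
# reduced to three of its columns; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE VSTM (subject_id INTEGER, trial INTEGER, correct INTEGER)")
conn.executemany(
    "INSERT INTO VSTM VALUES (?, ?, ?)",
    [(1, 4, 1), (1, 5, 1), (2, 5, 0), (3, 2, 1)],
)

# Number of rows recorded for the final trial (trial 5).
finished = conn.execute(
    "SELECT COUNT(*) FROM VSTM WHERE trial = 5"
).fetchone()[0]

# Same condition, restricted to correct answers.
finished_correct = conn.execute(
    "SELECT COUNT(*) FROM VSTM WHERE trial = 5 AND correct = 1"
).fetchone()[0]

print(finished, finished_correct)
```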


2. Deleting rows from a table.

To delete all rows from the table VSTM, type:

mysql> delete from VSTM;

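To delete only certain rows, add a where clause. For example, to delete the rows that have no subject number (note that NULL is matched with "is null", not with "="):

mysql> delete from VSTM where subject_id is null;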
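
Missing values deserve a little care here. In SQL, a comparison such as subject_id = NULL is never true, so it matches no rows; rows with a missing subject_id are matched with IS NULL instead. The sketch below shows the difference. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, so that it can be run anywhere), and the rows are invented.

```python
import sqlite3

# ASSUMPTION: SQLite stands in for MySQL; the table is cut down to two columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE VSTM (initials TEXT, subject_id INTEGER)")
conn.executemany(
    "INSERT INTO VSTM VALUES (?, ?)",
    [("jkh", None), ("jkh", 4), ("abc", None)],  # None is stored as NULL
)


def count_rows(where_clause):
    """Count the rows of VSTM that satisfy a fixed, trusted WHERE clause."""
    query = f"SELECT COUNT(*) FROM VSTM WHERE {where_clause}"
    return conn.execute(query).fetchone()[0]


matched_by_equals = count_rows("subject_id = NULL")    # never true in SQL
matched_by_is_null = count_rows("subject_id IS NULL")  # matches missing values

# Delete only the rows without a subject number, then count what is left.
conn.execute("DELETE FROM VSTM WHERE subject_id IS NULL")
remaining = count_rows("1 = 1")

print(matched_by_equals, matched_by_is_null, remaining)
```

It is worth running the select version of a where clause first and checking the count before you run the delete.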


3. Copying a table.

I often run multiple versions of an experiment under the same name. Subjects do not know that the experiment has changed. I do this when I don’t want people to participate in more than one version of the same experiment.

I want the data from each version to go into separate tables. For instance, suppose I’ve finished collecting data from the first version of the VSTM experiment and I want to start a second version.

mysql> create table VSTMver1 like VSTM;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into VSTMver1 select * from VSTM;
Query OK, 12 rows affected (0.00 sec)
Records: 12  Duplicates: 0  Warnings: 0

mysql> delete from VSTM;
Query OK, 12 rows affected (0.00 sec)


I now have a table called VSTMver1 with all the data that has been collected so far. The table VSTM is now empty. (Note that the subject numbers will not start over. To do that, you would have to reset VSTM_id_incrementor. I’m not sure of any easy way of doing that other than deleting VSTM_id_incrementor and creating it again. Just deleting all its rows doesn’t work.)
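
Since the last step throws data away, it is worth checking the copy before deleting anything. Below is a minimal sketch of that copy, check, then clear sequence. It is written against Python's built-in sqlite3 module as a stand-in for MySQL (an assumption), and because SQLite has no "create table ... like", it copies with CREATE TABLE ... AS SELECT instead of the two MySQL commands above. The function name archive_and_clear and the rows are hypothetical.

```python
import sqlite3

# ASSUMPTION: SQLite stands in for MySQL; the table and rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE VSTM (subject_id INTEGER, trial INTEGER)")
conn.executemany("INSERT INTO VSTM VALUES (?, ?)", [(1, 0), (1, 1), (2, 0)])


def row_count(conn, table):
    """Number of rows in a table (table name must come from trusted code)."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def archive_and_clear(conn, source, archive):
    """Copy every row of source into a new table, verify, then empty source."""
    conn.execute(f"CREATE TABLE {archive} AS SELECT * FROM {source}")
    original = row_count(conn, source)
    copied = row_count(conn, archive)
    if copied != original:
        # Stop before deleting anything if the copy is incomplete.
        raise RuntimeError("archive is incomplete; source left untouched")
    conn.execute(f"DELETE FROM {source}")
    conn.commit()
    return copied


moved = archive_and_clear(conn, "VSTM", "VSTMver1")
print(moved, row_count(conn, "VSTM"), row_count(conn, "VSTMver1"))
```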



As usual, please leave comments if you have any questions.

Video Test -- new and improved

The new experiment that I mentioned recently had a persistent bug. I finally managed to track it down yesterday and fix it. So my apologies to anybody who wasn't able to complete the experiment due to the bug (feel free to email me to ask about how it would have ended and what it was about). For those of you who haven't taken it yet, I think this is one of the most enjoyable experiments in the bunch, so I highly recommend it to everybody.

Max Planck entering South Korea

Germany's Max Planck Institute is starting a partnership with an institution in South Korea. This comes on the heels of another joint institution in Shanghai. Max Planck already has other full-fledged institutes in Europe outside Germany proper.

I'm a big fan of the Max Planck institutes, in that I think there is a place for relatively small, focused research institutes outside of academia, and I'm happy to see that they continue to expand. I hope some day they consider opening Max Planck Boston -- preferably focused on language acquisition, since an institute dealing with particle physics wouldn't be as useful to me.

Web Experiment Tutorial: Chapter 7, MySQL

Several years ago, I wrote a tutorial for my previous lab on how to create Web-based experiments in Flash. I am currently posting that tutorial chapter by chapter.


MySQL is a very popular and free database program. We will use it to store your data.

1. What is MySQL?

The pertinent question is: What is a database? An in-depth discussion of databases is beyond the scope of this manual. For our purposes, you can think of a database as an extremely complicated collection of spreadsheets. It is much, much more than that, especially since MySQL implements relational databases. However, we aren’t going to use any of that functionality.

Each database contains tables. For our purposes, you can think of each table as a spreadsheet. In fact, you can link the data between tables in a database, but you are unlikely to need this capability.

Each table contains rows and columns, just like a spreadsheet. For all intents and purposes, you can have as many as you want. The columns are named. Rows are not.

2. Opening MySQL.

If you have a web interface for your MySQL server, the section below won't be relevant. The web interface is sufficiently simple that I won't describe it here. 

Open a terminal. Use SSH to log onto your web server. Here’s what it looks like for me:

Last login: Tue Jun  5 13:48:36 on ttyp2
Welcome to Darwin!
dhcp-0000059136-59-ff:~ josh$ ssh vcognit@research.wjh.harvard.edu
Password:
Last login: Tue Jun  5 13:48:17 2007 from dhcp-0000059136
Sun Microsystems Inc.   SunOS 5.9       Generic May 2002
research ~>

Now, I change directories to the MySQL bin folder. If MySQL is in your path, this won’t be necessary. Open MySQL as follows:

research /opt/csw/mysql4/bin> ./mysql -u USERNAME -p DATABASENAME
Enter password:

You will need to type in your username and database name as appropriate. Your database name should have been given to you by your database administrator. If you are your database administrator, you are going to have to figure this out for yourself.

Once you’ve logged on, you will see the following prompt:

Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 39203 to server version: 4.1.9-log

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql>

3. Creating a table.

Although I now have a web interface for MySQL, I continue to use SQL scripts to create and modify tables, though there are ways of doing some of this through the web interface. Again, if you can do it the way described below, you should have little difficulty with the web interface.

Now, we are going to create a table to hold the data from our experiment. Type in the following:

mysql> CREATE TABLE VSTM (
    -> subject_age INT(3),
    -> subject_sex VARCHAR(6),
    -> subject_vision VARCHAR(3),
    -> initials VARCHAR(5),
    -> trial INT(3),
    -> correct INT(1),
    -> stimulus INT(2),
    -> match INT(1),
    -> probe INT(1),
    -> date DATE,
    -> time TIME,
    -> ip VARCHAR(25)
    -> );

If you are using a web interface, you won't need the "->" codes. That is something that appears automatically in the terminal to mark that the new line is part of the same command.

The first line names the table “VSTM”. The following lines create columns. Note that the names of each column must be EXACTLY the same as the names of the variables in scriptVars. Otherwise, MySQL can’t figure out which data belongs to which column and nothing will get written. If MySQL isn’t recording your data, 95% of the time this is because you have a typo or mismatch in your column and variable names. “date”, “time” and “ip” all come from the PHP file. Again, the names must match.
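
Since name mismatches are the usual culprit, a quick side-by-side check can save some head-scratching. This is a minimal sketch in Python: the column list is the one from the table above, while the list of submitted names (including the deliberate typo "subject_vison") and the function name compare_names are hypothetical, made up for illustration.

```python
def compare_names(table_columns, submitted_names):
    """Report names that appear on only one side (comparison is exact)."""
    columns = set(table_columns)
    submitted = set(submitted_names)
    return {
        "missing_from_table": sorted(submitted - columns),
        "never_submitted": sorted(columns - submitted),
    }


# Column names from the CREATE TABLE command above.
table_columns = [
    "subject_age", "subject_sex", "subject_vision", "initials", "trial",
    "correct", "stimulus", "match", "probe", "date", "time", "ip",
]

# ASSUMPTION: a hypothetical list of names sent by Flash and PHP,
# with one deliberate typo.
submitted_names = [
    "subject_age", "subject_sex", "subject_vison", "initials", "trial",
    "correct", "stimulus", "match", "probe", "date", "time", "ip",
]

report = compare_names(table_columns, submitted_names)
print(report)
```

Any name that shows up in either list of the report is a candidate for the mismatch.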

INT and VARCHAR are data types. INT means the column contains integers. VARCHAR means it contains letters and/or numbers. The number in parentheses is the size of the column: for VARCHAR it is the maximum number of characters, while for INT it is only a display width and does not limit the values that can be stored. “match” is always a 0 or a 1, so a width of 1 is enough. “subject_sex” is either 4 letters (“male”) or 6 letters (“female”), so the column size is set to 6. You can always have bigger columns than are necessary. DATE and TIME are also data types with rather obvious properties.

Unfortunately this code won’t work. You’ll get the following response:

ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'match INT(1),
probe INT(1),
date DATE,
time TIME,
ip VARCHAR(25)
)' at line 9

MySQL error messages aren’t very useful. By trial and error, you would figure out that the problem is the name “match”. MATCH is a reserved word in MySQL, so you can’t use it as a column name without quoting it. The simplest fix is to change the name here AND in the Flash file. Change both to “matches”. This has been done in Part6.fla.
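
One way to avoid the trial and error is to check your column names against the reserved words before creating the table. The sketch below does this in Python with a small, hand-picked subset of MySQL's reserved words (an assumption: the real list in the MySQL manual is much longer, and the function name reserved_columns is made up).

```python
# ASSUMPTION: a small hand-picked subset of MySQL reserved words,
# not the full list from the MySQL manual.
RESERVED_SUBSET = {
    "MATCH", "SELECT", "TABLE", "ORDER", "GROUP",
    "KEY", "INDEX", "WHERE", "FROM",
}


def reserved_columns(column_names):
    """Return the column names that collide with the reserved-word subset."""
    return [name for name in column_names if name.upper() in RESERVED_SUBSET]


# Column names from the first CREATE TABLE attempt.
columns = [
    "subject_age", "subject_sex", "subject_vision", "initials", "trial",
    "correct", "stimulus", "match", "probe", "date", "time", "ip",
]

flagged = reserved_columns(columns)
print(flagged)
```

An empty result only means there is no collision with this subset, so the manual's full list is still the place to look if MySQL complains.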

Try again:

mysql> CREATE TABLE VSTM (
    -> subject_age INT(3),
    -> subject_sex VARCHAR(10),
    -> subject_vision VARCHAR(3),
    -> initials VARCHAR(5),
    -> trial INT(3),
    -> correct INT(1),
    -> stimulus INT(2),
    -> matches INT(1),
    -> probe INT(1),
    -> date DATE,
    -> time TIME,
    -> ip VARCHAR(25)
    -> );
Query OK, 0 rows affected (0.01 sec)

This time, MySQL says that the table has been successfully created.

4. Deleting a table.

If you want to remove a table, use the drop command:

mysql> DROP TABLE VSTM;

5. Try it out.

Make sure that submit_vars.php is at the URL cited in your Flash file. Run your experiment once through. If you have done everything correctly, data will be recorded in your MySQL database. Now, how do you get to it? 

If you run into trouble, use the Flash file Part6.fla. If there are still problems, then there are probably problems with your table or your PHP file. In the PHP file, make sure that the username, password, host name, database name and table name are all correct. In MySQL, make sure you have the right number of columns – 12 – and that they are all named the right thing. You can double check this with the following command:

mysql> describe VSTM;
+----------------+-------------+------+-----+---------+-------+
| Field          | Type        | Null | Key | Default | Extra |
+----------------+-------------+------+-----+---------+-------+
| subject_age    | int(3)      | YES  |     | NULL    |       |
| subject_sex    | varchar(10) | YES  |     | NULL    |       |
| subject_vision | char(3)     | YES  |     | NULL    |       |
| initials       | varchar(5)  | YES  |     | NULL    |       |
| trial          | int(3)      | YES  |     | NULL    |       |
| correct        | int(1)      | YES  |     | NULL    |       |
| stimulus       | int(2)      | YES  |     | NULL    |       |
| matches        | int(1)      | YES  |     | NULL    |       |
| probe          | int(1)      | YES  |     | NULL    |       |
| date           | date        | YES  |     | NULL    |       |
| time           | time        | YES  |     | NULL    |       |
| ip             | varchar(25) | YES  |     | NULL    |       |
+----------------+-------------+------+-----+---------+-------+
12 rows in set (0.00 sec)

mysql>

If yours looks different, fix it. You can either use the DROP command to remove the table and start over, or you can modify the columns directly. Here is an example:

mysql> alter table VSTM drop column ip;
Query OK, 6 rows affected (0.01 sec)
Records: 6  Duplicates: 0  Warnings: 0

mysql> alter table VSTM add column ip VARCHAR(25);
Query OK, 6 rows affected (0.02 sec)
Records: 6  Duplicates: 0  Warnings: 0

There are also commands to simply rename a column, etc. Google “MySQL alter table” and you should be able to find good explanations of your options.

6. Viewing your data.

You have now run your experiment at least once. Hopefully MySQL now contains data. How do you see it?

mysql> select * from VSTM;
+-------------+-------------+----------------+----------+-------+---------+----------+---------+-------+------------+----------+------+
| subject_age | subject_sex | subject_vision | initials | trial | correct | stimulus | matches | probe | date       | time     | ip   |
+-------------+-------------+----------------+----------+-------+---------+----------+---------+-------+------------+----------+------+
|           2 | male        | no             | jkh      |     0 |       1 |        3 |       1 |     3 | 2007-06-05 | 14:57:06 | NULL |
|           2 | male        | no             | jkh      |     1 |       1 |        2 |       0 |     3 | 2007-06-05 | 14:57:22 | NULL |
|           2 | male        | no             | jkh      |     2 |       1 |        3 |       1 |     3 | 2007-06-05 | 14:57:25 | NULL |
|           2 | male        | no             | jkh      |     3 |       1 |        1 |       0 |     2 | 2007-06-05 | 14:57:27 | NULL |
|           2 | male        | no             | jkh      |     4 |       1 |        1 |       0 |     2 | 2007-06-05 | 14:57:30 | NULL |
|           2 | male        | no             | jkh      |     5 |       1 |        3 |       1 |     3 | 2007-06-05 | 14:57:33 | NULL |
+-------------+-------------+----------------+----------+-------+---------+----------+---------+-------+------------+----------+------+
6 rows in set (0.00 sec)


Everything looks good. The IP address is NULL because you ran it off of your desktop. When you run it from the Web, this should change.

Notice that if several people ran this experiment, you would have to use a combination of initials, age and sex to tell them apart. If the same person did the experiment more than once, there would be no easy way to tell those sessions apart. What we want is a subject number.

7. Creating a subject number.

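Add the following code to the Initialize frame:

// Ask the server for the next subject number.
var getID = new LoadVars();
getID.onLoad = function(success) {
    // The PHP script replies with "&id=N"; LoadVars makes N available as this.id.
    id = this.id;
};
getID.load("http://URL/get_next_id.php");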

Open get_next_id.php:

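<?php
// Connect to the MySQL server and choose the database.
$db = mysql_connect ('HOST', 'USERNAME', 'PASSWORD');
mysql_select_db ('DATABASE');

// Insert a blank row; the auto-increment column assigns the next id.
$query1 = "INSERT INTO VSTM_id_incrementor VALUES (NULL);";
mysql_query($query1);

// Retrieve the id that was just assigned.
$id = mysql_insert_id ($db);

mysql_close();

// Send the id back in the name=value form that LoadVars reads.
echo "&id=" . $id;

?>

Again, change the two connection lines (mysql_connect and mysql_select_db) to match your host name, username, password and database name. Notice that this file inserts a blank row into the table VSTM_id_incrementor. We need to create this table: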

mysql> CREATE TABLE VSTM_id_incrementor ( id INT NOT NULL AUTO_INCREMENT PRIMARY KEY);
Query OK, 0 rows affected (0.01 sec)

This creates the table VSTM_id_incrementor with a single column – id – that automatically increments. That is, each time a row is added, the value of that column in that row will be one greater than in the previous row. Try this:

mysql> select * from VSTM_id_incrementor;
Empty set (0.00 sec)

mysql> insert into VSTM_id_incrementor VALUES ('NULL');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> select * from VSTM_id_incrementor;
+----+
| id |
+----+
|  1 |
+----+
1 row in set (0.00 sec)

mysql> insert into VSTM_id_incrementor VALUES ('NULL');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into VSTM_id_incrementor VALUES ('NULL');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> select * from VSTM_id_incrementor;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
+----+
3 rows in set (0.00 sec)

8. Recording the subject number.

Now, each time the Flash file runs, it will insert a blank row into VSTM_id_incrementor and retrieve the new subject id. Next, we need to save that id along with the rest of the data. This takes only one small change to the Flash file.

Add:

scriptVars.subject_id = id;

to prepareResults. You can see this in Part7.fla.

You also need to add a column called “subject_id” to the table VSTM:

mysql> alter table VSTM add column subject_id INT(4);
Query OK, 6 rows affected (0.01 sec)
Records: 6  Duplicates: 0  Warnings: 0

Make sure that the get_next_id.php file in your website has been updated. Now, run your experiment again.

Afterwards, you should be able to see something like this:

mysql> select * from VSTM;
+-------------+-------------+----------------+----------+-------+---------+----------+---------+-------+------------+----------+---------------+------------+
| subject_age | subject_sex | subject_vision | initials | trial | correct | stimulus | matches | probe | date       | time     | ip            | subject_id |
+-------------+-------------+----------------+----------+-------+---------+----------+---------+-------+------------+----------+---------------+------------+
|           2 | male        | no             | jkh      |     0 |       1 |        3 |       1 |     3 | 2007-06-05 | 14:57:06 | NULL          |       NULL |
|           2 | male        | no             | jkh      |     1 |       1 |        2 |       0 |     3 | 2007-06-05 | 14:57:22 | NULL          |       NULL |
|           2 | male        | no             | jkh      |     2 |       1 |        3 |       1 |     3 | 2007-06-05 | 14:57:25 | NULL          |       NULL |
|           2 | male        | no             | jkh      |     3 |       1 |        1 |       0 |     2 | 2007-06-05 | 14:57:27 | NULL          |       NULL |
|           2 | male        | no             | jkh      |     4 |       1 |        1 |       0 |     2 | 2007-06-05 | 14:57:30 | NULL          |       NULL |
|           2 | male        | no             | jkh      |     5 |       1 |        3 |       1 |     3 | 2007-06-05 | 14:57:33 | NULL          |       NULL |
|          26 | male        | yes            | jkh      |     0 |       1 |        3 |       0 |     1 | 2007-06-05 | 16:01:26 | 140.247.95.39 |          4 |
|          26 | male        | yes            | jkh      |     1 |       1 |        2 |       1 |     2 | 2007-06-05 | 16:01:28 | 140.247.95.39 |          4 |
|          26 | male        | yes            | jkh      |     2 |       1 |        2 |       0 |     3 | 2007-06-05 | 16:01:32 | 140.247.95.39 |          4 |
|          26 | male        | yes            | jkh      |     3 |       1 |        3 |       1 |     3 | 2007-06-05 | 16:01:34 | 140.247.95.39 |          4 |
|          26 | male        | yes            | jkh      |     4 |       1 |        1 |       1 |     1 | 2007-06-05 | 16:01:37 | 140.247.95.39 |          4 |
|          26 | male        | yes            | jkh      |     5 |       1 |        3 |       0 |     2 | 2007-06-05 | 16:01:40 | 140.247.95.39 |          4 |
+-------------+-------------+----------------+----------+-------+---------+----------+---------+-------+------------+----------+---------------+------------+
12 rows in set (0.00 sec)

Notice that a subject ID is now recorded.

One problem with this method is that occasionally the same subject number will be given to two different subjects who start at roughly the same time. You can usually distinguish them by demographic information: they will have different ages. However, this is not ideal, and I've been experimenting with other options.
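
If you want to check an exported data set for this problem, the demographic comparison described above can be automated. The following is only a rough sketch in Python, not part of the original Flash/PHP setup: the function name shared_ids and the sample rows are made up for illustration, and it assumes each row is available as a dictionary keyed by the column names of the VSTM table.

```python
from collections import defaultdict

def shared_ids(rows):
    """Return subject_ids that occur with more than one demographic profile."""
    profiles = defaultdict(set)
    for row in rows:
        if row["subject_id"] is None:
            continue  # rows recorded before subject numbers existed
        profiles[row["subject_id"]].add(
            (row["subject_age"], row["subject_sex"], row["initials"])
        )
    return sorted(sid for sid, seen in profiles.items() if len(seen) > 1)

# Hypothetical rows: subject 5 appears with two different profiles.
rows = [
    {"subject_id": 4, "subject_age": 26, "subject_sex": "male", "initials": "jkh"},
    {"subject_id": 4, "subject_age": 26, "subject_sex": "male", "initials": "jkh"},
    {"subject_id": 5, "subject_age": 31, "subject_sex": "female", "initials": "abc"},
    {"subject_id": 5, "subject_age": 19, "subject_sex": "male", "initials": "xyz"},
]
print(shared_ids(rows))
```

This only flags suspicious subject numbers. Two subjects who happen to share age, sex and initials would still slip through, so it does not solve the underlying problem.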

9. Exporting data.

You can now view your data, but you can’t do very much with it. To export data to a text file, you need to be in the SSH shell (you can’t do it directly from within MySQL). Type the following:

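./mysql -u USERNAME -pPASSWORD DATABASE -e "SELECT * from VSTM" > /PATH/VSTM_data.txt

This will record the output from “SELECT * from VSTM” to the file VSTM_data.txt in the location /PATH/. Type the command with plain hyphens and straight quotation marks, and note that there is no space between -p and the password. DATABASE is the name of your database. You will need to have write privileges for that directory.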
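
When its output is redirected to a file like this, the mysql client writes one row per line with the columns separated by tabs, a header line first, and missing values written as the text NULL. If you would rather read the file in a script than in a spreadsheet, here is a minimal sketch in Python. The function name read_export and the sample text are invented for illustration; a real file would have all of the columns of your table.

```python
import csv
import io

# Hypothetical excerpt of an exported file (only some columns shown).
SAMPLE = (
    "subject_age\tsubject_sex\ttrial\tcorrect\tip\tsubject_id\n"
    "2\tmale\t0\t1\tNULL\tNULL\n"
    "26\tmale\t0\t1\t140.247.95.39\t4\n"
)

def read_export(handle):
    """Read tab-separated rows into dictionaries, turning "NULL" into None."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        {name: (None if value == "NULL" else value) for name, value in row.items()}
        for row in reader
    ]

rows = read_export(io.StringIO(SAMPLE))
print(len(rows), rows[1]["subject_id"])
```

To read the real export, pass an opened file instead of the sample, for example open("/PATH/VSTM_data.txt", newline=""). All values come back as text, so convert columns such as trial to numbers before doing arithmetic on them.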

10. Screening subjects.

You would probably like to analyze data only from subjects who completed the task. Unfortunately, not all of them will. You can cull the incomplete ones by hand, but there is a fairly simple method using Microsoft Excel. In Excel, open Results.txt from the Resources folder. These data are from a VSTM experiment I actually ran. To protect privacy, not all of the columns are included.

You will notice that 75 subjects completed the 0th trial, but only 70 completed the 7th and final trial. We want to eliminate all data from subjects that did not complete the 7th trial. Sort the rows first by trial and second by subject_id so that it looks like this:

subject_id  trial  correct  stimtype  exemplar  probe_match  distractor  date     time
601         7      1        5         1         1            -1          4/23/07  13:16:13
602         7      1        5         1         1            -1          4/23/07  17:58:00
604         7      1        7         1         1            -1          4/23/07  22:16:30
606         7      1        8         1         0            -1          4/24/07  11:59:07
609         7      0        4         1         0            -1          4/24/07  15:08:09
610         7      0        3         1         0            -1          4/24/07  15:16:06
601         6      1        8         2         1            -1          4/23/07  13:16:06
602         6      1        7         2         0            -1          4/23/07  17:57:53
604         6      1        8         2         1            -1          4/23/07  22:16:23
606         6      1        10        2         1            -1          4/24/07  11:59:00
609         6      0        9         2         0            -1          4/24/07  15:08:01
610         6      1        2         2         1            -1          4/24/07  15:15:59
601         5      0        10        1         1            -1          4/23/07  13:15:59
602         5      0        2         1         0            -1          4/23/07  17:57:46
604         5      1        4         1         0            -1          4/23/07  22:16:15
606         5      0        11        1         0            -1          4/24/07  11:58:52

Highlight the 70 subject IDs that completed trial 7. Copy these. Go to Excel->Preferences->Custom Lists. Paste the subject IDs into the “List entries:” form and then click “Add”. Then click “OK”. Select all rows and columns. Click Data->Sort. Choose sort by subject_id, ascending, and click “Options”. Under “Sort Order”, instead of “normal”, choose the new sort order that you just created. Click “OK” twice.

Now the spreadsheet should be sorted according to your custom sort order. Now scroll to the bottom of the file. After subject 721, you will see several rows with subjects 607, 651, 663, 682 and 683. These are the subjects who did not complete the experiment – all neatly separated from the rest of the data.
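
If you prefer a script to the Excel custom sort, the same screening can be done in a few lines. This is only a sketch under some assumptions: the data are tab-delimited with a header row containing subject_id and trial, the final trial is number 7 as in the example above, and the function name split_by_completion and the sample rows are made up for illustration.

```python
import csv
import io

FINAL_TRIAL = 7  # assumption: trials are numbered 0 through 7

# Hypothetical tab-delimited data: subject 601 finished, subject 607 did not.
SAMPLE = (
    "subject_id\ttrial\tcorrect\n"
    "601\t0\t1\n"
    "601\t7\t1\n"
    "607\t0\t1\n"
    "607\t3\t0\n"
)

def split_by_completion(rows, final_trial=FINAL_TRIAL):
    """Separate rows of subjects who reached the final trial from all other rows."""
    completed = {r["subject_id"] for r in rows if int(r["trial"]) == final_trial}
    kept = [r for r in rows if r["subject_id"] in completed]
    dropped = [r for r in rows if r["subject_id"] not in completed]
    return kept, dropped

rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter="\t"))
kept, dropped = split_by_completion(rows)
print(sorted({r["subject_id"] for r in kept}))
print(sorted({r["subject_id"] for r in dropped}))
```

As with the Excel method, a subject counts as finished if a row for the final trial exists. It does not check that every earlier trial was also recorded.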

As usual, please leave any questions in the comments.