Field of Science

Web Experiment Tutorial: Chapter 7, MySQL

Several years ago, I wrote a tutorial for my previous lab on how to create Web-based experiments in Flash. I am currently posting that tutorial chapter by chapter.


MySQL is a very popular and free database program. We will use it to store your data.

1. What is MySQL?

The pertinent question is: What is a database? An in depth discussion of databases is beyond the scope of this manual. For our purposes, you can think of a database as an extremely complicated collection of spreadsheets. It is in fact much, much more – especially since MySQL in fact implements relational databases. However, we aren’t going to use any of that functionality.

Each data base contains tables. For our purposes, you can think of each table as a spreadsheet. In fact, you can link the data between tables in a database, but you are unlikely to need this capability.

Each table contains rows and columns, just like a spreadsheet. For all intents and purposes, you can have as many as you want. The columns are named. Rows are not.

2. Opening MySQL.

If you have a web interface for your MySQL server, the section below won't be relevant. The web interface is sufficiently simple that I won't describe it here. 

Open a terminal. Use SSH to log onto your web server. Here’s what it looks like for me:

Last login: Tue Jun  5 13:48:36 on ttyp2
Welcome to Darwin!
dhcp-0000059136-59-ff:~ josh$ ssh vcognit@research.wjh.harvard.edu
Password:
Last login: Tue Jun  5 13:48:17 2007 from dhcp-0000059136
Sun Microsystems Inc.   SunOS 5.9       Generic May 2002
research ~>

Now, I change directories to the MySQL bin folder. If MySQL is in your path, this won’t be necessary. Open MySQL as follows:

research /opt/csw/mysql4/bin> ./mysql -u USERNAME -p DATABASENAME
Enter password:

You will need to type in your username and database name as appropriate. Your database name should have been given to you by your database administrator. If you are your database administrator, you are going to have to figure this out for yourself.

Once you’ve logged on, you will see the following prompt:

Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 39203 to server version: 4.1.9-log

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql>

3. Creating a table.

Although I now have a web interface for MySQL, I continue to use SQL script to create and modify tables, though there are ways of doing some of this through the web interface. Again, if you can do it the way described below, you should have little difficulty with the web interface. 

Now, we are going to create a table to hold the data from our experiment. Type in the following:

mysql> CREATE TABLE VSTM (
    -> subject_age INT(3),
    -> subject_sex VARCHAR(6),
    -> subject_vision VARCHAR(3),
    -> initials VARCHAR(5),
    -> trial INT(3),
    -> correct INT(1),
    -> stimulus INT(2),
    -> match INT(1),
    -> probe INT(1),
    -> date DATE,
    -> time TIME,
    -> ip VARCHAR(25)
    -> );

If you are using a web interface, you won't need the "->" codes. That is something that appears automatically in the terminal to mark that the new line is part of the same command.

The first line names the table “VSTM”. The following lines create columns. Note that the names of each column must be EXACTLY the same as the names of the variables in scriptVars. Otherwise, MySQL can’t figure out which data belongs to which column and nothing will get written. If MySQL isn’t recording your data, 95% of the time this is because you have a type-o or mismatch in your column and variable names. “date”, “time” and “ip” all come from the PHP file. Again, the names must match.

INT and VARCHAR are data types. INT means the column contains integers. VARCHAR means in contains letters and/or numbers. The number in parentheses is the size of the column. “match” is always a 0 or a 1, so it only requires a column 1 character wide. “subject_sex” is either 4 letters (“male”) or 6 letters (“female”), so the column size is set to 6. You can always have bigger columns than are necessary. DATE and TIME are also data types with rather obvious properties.

Unfortunately this code won’t work. You’ll get the following response:

ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'match INT(1),
probe INT(1),
date DATE,
time TIME,
ip VARCHAR(25)
)' at line 9

MySQL error messages aren’t very useful. By trial and error, you would figure out that the problem is the name “match”. “Match” is a restricted name in MySQL, and you can’t give that name to a column. So we are going to have to change that name here AND in the Flash file. Change both to “matches”. This has been done in Part6.fla.

Try again:

mysql> CREATE TABLE VSTM (
    -> subject_age INT(3),
    -> subject_sex VARCHAR(10),
    -> subject_vision VARCHAR(3),
    -> initials VARCHAR(5),
    -> trial INT(3),
    -> correct INT(1),
    -> stimulus INT(2),
    -> matches INT(1),
    -> probe INT(1),
    -> date DATE,
    -> time TIME,
    -> ip VARCHAR(25)
    -> );
Query OK, 0 rows affected (0.01 sec)

This time, MySQL says that the table has been successfully created.

4. Deleting a table.

If you want to remove a table, use the drop command:

mysql> DROP TABLE VSTM;

5. Try it out.

Make sure that submit_vars.php is at the URL cited in your Flash file. Run your experiment once through. If you have done everything correctly, data will be recorded in your MySQL database. Now, how do you get to it? 

If you run into trouble, use the Flash file Part6.fla. If there are still problems, then there are probably problems with your table or your PHP file. In the PHP file, make sure that the username, password, host name, database name and table name are all correct. In MySQL, make sure you have the right number of columns – 12 – and that they are all named the right thing. You can double check this with the following command:

mysql> describe VSTM;
+----------------+-------------+------+-----+---------+-------+
| Field          | Type        | Null | Key | Default | Extra |
+----------------+-------------+------+-----+---------+-------+
| subject_age    | int(3)      | YES  |     | NULL    |       |
| subject_sex    | varchar(10) | YES  |     | NULL    |       |
| subject_vision | char(3)     | YES  |     | NULL    |       |
| initials       | varchar(5)  | YES  |     | NULL    |       |
| trial          | int(3)      | YES  |     | NULL    |       |
| correct        | int(1)      | YES  |     | NULL    |       |
| stimulus       | int(2)      | YES  |     | NULL    |       |
| matches        | int(1)      | YES  |     | NULL    |       |
| probe          | int(1)      | YES  |     | NULL    |       |
| date           | date        | YES  |     | NULL    |       |
| time           | time        | YES  |     | NULL    |       |
| ip             | varchar(25) | YES  |     | NULL    |       |
+----------------+-------------+------+-----+---------+-------+
12 rows in set (0.00 sec)

mysql>

If yours looks different, fix it. You can either use the DROP command to remove the table and start over, or you can modify the columns directly. Here is an example:

mysql> alter table VSTM drop column ip;
Query OK, 6 rows affected (0.01 sec)
Records: 6  Duplicates: 0  Warnings: 0

mysql> alter table VSTM add column ip VARCHAR(25);
Query OK, 6 rows affected (0.02 sec)
Records: 6  Duplicates: 0  Warnings: 0

There are also commands to simply rename a column, etc. Google “MySQL alter table” and you should be able to find good explanations of your options.

6. Viewing your data.

You have now run your experiment at least once. Hopefully MySQL now contains data. How do you see it?

mysql> select * from VSTM;
+-------------+-------------+----------------+----------+-------+---------+----------+---------+-------+------------+----------+------+
| subject_age | subject_sex | subject_vision | initials | trial | correct | stimulus | matches | probe | date       | time     | ip   |
+-------------+-------------+----------------+----------+-------+---------+----------+---------+-------+------------+----------+------+
|           2 | male        | no             | jkh      |     0 |       1 |        3 |       1 |     3 | 2007-06-05 | 14:57:06 | NULL |
|           2 | male        | no             | jkh      |     1 |       1 |        2 |       0 |     3 | 2007-06-05 | 14:57:22 | NULL |
|           2 | male        | no             | jkh      |     2 |       1 |        3 |       1 |     3 | 2007-06-05 | 14:57:25 | NULL |
|           2 | male        | no             | jkh      |     3 |       1 |        1 |       0 |     2 | 2007-06-05 | 14:57:27 | NULL |
|           2 | male        | no             | jkh      |     4 |       1 |        1 |       0 |     2 | 2007-06-05 | 14:57:30 | NULL |
|           2 | male        | no             | jkh      |     5 |       1 |        3 |       1 |     3 | 2007-06-05 | 14:57:33 | NULL |
+-------------+-------------+----------------+----------+-------+---------+----------+---------+-------+------------+----------+------+
6 rows in set (0.00 sec)


Everything looks good. The IP address is NULL because you ran it off of your desktop. When you run it from the Web, this should change.

Notice that if several people ran this experiment, you would have to use a combination of initials age and sex to tell them apart. If the same person did the experiment more than once, there would be no easy way to code for that. What we want is a subject number.

7. Creating a subject number.

Add the following code to the Intialize frame:

var getID = new LoadVars();
getID.onLoad = function(success) {
          id = this.id;
};
getID.load("http://URL/get_next_id.php");

Open get_next_id.php:


$db = mysql_connect ('HOST', 'USERNAME', 'PASSWORD');
mysql_select_db ('DATABASE');

$query1 = "INSERT INTO VSTM_id_incrementor VALUES (NULL);";
mysql_query($query1);

$id = mysql_insert_id ($db);

mysql_close();

echo "&id=" . $id;

?>

Again, change the first two lines to match your host name, username, password and database name. Notice that this file inserts a blank row into the table VSTM_id_incrementor. We need to create this table:

mysql> CREATE TABLE VSTM_id_incrementor ( id INT NOT NULL AUTO_INCREMENT PRIMARY KEY);
Query OK, 0 rows affected (0.01 sec)

This creates the table VSTM_id_incrementor with a single column – id – that automatically increments. That is, each time a row is added, the value of that column in that row will be one greater than in the previous row. Try this:

mysql> select * from VSTM_id_incrementor;
Empty set (0.00 sec)

mysql> insert into VSTM_id_incrementor VALUES ('NULL');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> select * from VSTM_id_incrementor;
+----+
| id |
+----+
|  1 |
+----+
1 row in set (0.00 sec)

mysql> insert into VSTM_id_incrementor VALUES ('NULL');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into VSTM_id_incrementor VALUES ('NULL');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> select * from VSTM_id_incrementor;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
+----+
3 rows in set (0.00 sec)

8. Recording the subject number.

Now, each time the Flash file runs, it will insert a blank row into VSTM_id_incrementor and retrieve the new subject id. Now we need to record that when data is recorded. There is only one small change to the Flash file.

Add:

scriptVars.subject_id = id;

to prepareResults. You can see this in Part7.fla.

You need to also add a column called “subject_id” to the table VSTM:

mysql> alter table VSTM add column subject_id INT(4);
Query OK, 6 rows affected (0.01 sec)
Records: 6  Duplicates: 0  Warnings: 0

Make sure that the get_next_id.php file in your website has been updated. Now, run your experiment again.

Afterwards, you should be able to see something like this:

select * from VSTM;
+-------------+-------------+----------------+----------+-------+---------+----------+---------+-------+------------+----------+---------------+------------+
| subject_age | subject_sex | subject_vision | initials | trial | correct | stimulus | matches | probe | date       | time     | ip            | subject_id |
+-------------+-------------+----------------+----------+-------+---------+----------+---------+-------+------------+----------+---------------+------------+
|           2 | male        | no             | jkh      |     0 |       1 |        3 |       1 |     3 | 2007-06-05 | 14:57:06 | NULL          |       NULL |
|           2 | male        | no             | jkh      |     1 |       1 |        2 |       0 |     3 | 2007-06-05 | 14:57:22 | NULL          |       NULL |
|           2 | male        | no             | jkh      |     2 |       1 |        3 |       1 |     3 | 2007-06-05 | 14:57:25 | NULL          |       NULL |
|           2 | male        | no             | jkh      |     3 |       1 |        1 |       0 |     2 | 2007-06-05 | 14:57:27 | NULL          |       NULL |
|           2 | male        | no             | jkh      |     4 |       1 |        1 |       0 |     2 | 2007-06-05 | 14:57:30 | NULL          |       NULL |
|           2 | male        | no             | jkh      |     5 |       1 |        3 |       1 |     3 | 2007-06-05 | 14:57:33 | NULL          |       NULL |
|          26 | male        | yes            | jkh      |     0 |       1 |        3 |       0 |     1 | 2007-06-05 | 16:01:26 | 140.247.95.39 |          4 |
|          26 | male        | yes            | jkh      |     1 |       1 |        2 |       1 |     2 | 2007-06-05 | 16:01:28 | 140.247.95.39 |          4 |
|          26 | male        | yes            | jkh      |     2 |       1 |        2 |       0 |     3 | 2007-06-05 | 16:01:32 | 140.247.95.39 |          4 |
|          26 | male        | yes            | jkh      |     3 |       1 |        3 |       1 |     3 | 2007-06-05 | 16:01:34 | 140.247.95.39 |          4 |
|          26 | male        | yes            | jkh      |     4 |       1 |        1 |       1 |     1 | 2007-06-05 | 16:01:37 | 140.247.95.39 |          4 |
|          26 | male        | yes            | jkh      |     5 |       1 |        3 |       0 |     2 | 2007-06-05 | 16:01:40 | 140.247.95.39 |          4 |
+-------------+-------------+----------------+----------+-------+---------+----------+---------+-------+------------+----------+---------------+------------+
12 rows in set (0.00 sec)

Notice that a subject ID is now recorded.

One problem with this method is that occasionally the same subject number will be given to two different subjects who start at roughly the same time. You can usually distinguish them by demographic information: they will have different ages. However, this is not ideal, and I've been experimenting with other options.

9. Exporting data

You can now view your data, but you can’t do very much with it. To export data to a text file, you need to be in the SSH shell (you can’t do it directly from within MySQL). Type the following:

./mysql –u USERNAME –p PASSWORD –e “SELECT * from VSTM” > /PATH/VSTM_data.txt

This will record the output from “SELECT * from VSTM” to the file VSTM_data.txt in the location /PATH/. You will need to have write privileges for that directory.

10. Screening subjects.

You would probably like to only analyze data from subjects that completed the task. Unfortunately, not all will complete the task. You can cull them by hand, but there is a fairly simple method using Microsoft Excel. Open Results.txt from the Resources folder in Microsoft Excel. These data are from a VSTM experiment I actually ran. Not all the columns are included in order to protect privacy.

You will notice that 75 subjects completed the 0th trial, but only 70 completed the 7th and final trial. We want to eliminate all data from subjects that did not complete the 7th trial. Sort the rows first by trial and second by subject_id so that it looks like this:
subject_id
trial
correct
stimtype
exemplar
probe_match
distractor
date
time
601
7
1
5
1
1
-1
4/23/07
13:16:13
602
7
1
5
1
1
-1
4/23/07
17:58:00
604
7
1
7
1
1
-1
4/23/07
22:16:30
606
7
1
8
1
0
-1
4/24/07
11:59:07
609
7
0
4
1
0
-1
4/24/07
15:08:09
610
7
0
3
1
0
-1
4/24/07
15:16:06
601
6
1
8
2
1
-1
4/23/07
13:16:06
602
6
1
7
2
0
-1
4/23/07
17:57:53
604
6
1
8
2
1
-1
4/23/07
22:16:23
606
6
1
10
2
1
-1
4/24/07
11:59:00
609
6
0
9
2
0
-1
4/24/07
15:08:01
610
6
1
2
2
1
-1
4/24/07
15:15:59
601
5
0
10
1
1
-1
4/23/07
13:15:59
602
5
0
2
1
0
-1
4/23/07
17:57:46
604
5
1
4
1
0
-1
4/23/07
22:16:15
606
5
0
11
1
0
-1
4/24/07
11:58:52

Highlight the 70 subject id’s that completed trial 7. Copy these. Go to Excel->Preferences->Custom Lists. Past the subject IDs into the “List entries:” form and then click “Add”. Then click “OK”. Select all rows and columns. Click Data->Sort. Choose sort by subject_id, ascending, and click “options”. Under “Sort Order” instead of “normal” choose the new sort order that you just created. Select “OK” twice.

Now the spreadsheet should be sorted according to your custom sort order. Now scroll to the bottom of the file. After subject 721, you will see several rows with subjects 607, 651, 663, 682 and 683. These are the subjects who did not complete the experiment – all neatly separated from the rest of the data.

As usual, please leave any question in the comments.

No comments: