What do we see first: The animal or the dog?

The topic of ‘Object Recognition’ is a highly interesting, yet intense one to learn about. The processes that allow us to see and recognise objects are extremely complex and involve many steps in a minimal amount of time. As you can probably imagine, writing a short blog on the topic isn’t the easiest thing to do, but I’ll give it my best shot.

Michèle Fabre-Thorpe from The Brain and Cognition Research Centre in Toulouse (CerCo) recently presented a seminar on the temporal dynamics of object recognition or, in terms that we can understand: what do we see first, the animal or the dog? Even I was slightly confused by this question, because a dog is an animal, right? To understand how this all works, we must first understand that we as humans subconsciously process images to a certain level of detail.

For example, take the photo to the left (a portrait of a rough collie). When this is flashed in front of us briefly, we process it at four levels. First we recognise it as an animal; this is the superordinate level. Then as a dog, the basic level. Next we determine that it is a collie, the subordinate level. Finally we recognise it as our collie Bella (hypothetically, it is our pet), the unique level. All this is done extremely quickly and subconsciously. However, as you can see, each level provides more detail and therefore takes longer to process than the preceding level [1].

This all seems reasonable in theory, but with some more thought a problem begins to brew: by default we automatically categorise to the basic level, not the superordinate. For example, we say ‘watch out, the cat might scratch you’; we don’t say ‘watch out, the animal might scratch you’. This is where Michèle’s research question comes in. What do we really see first, the animal (superordinate level) or the dog (basic level)?

The first test involved flashing two images side by side for 400 ms (this time yields the most stable results, and it has previously been determined that the presentation time of images has no effect on the ability to categorise them [2]). The subjects were asked to determine which image contained an animal. The next test flashed two images side by side and asked the subject to determine which was a bird. For both of these tests the opposing image was a non-animal.

The results revealed that it took around 50 ms longer to identify an image at the basic level than at the superordinate level, although the extra time came with higher accuracy. Why animals, you ask? Animals are the most commonly used images in visual perception tests, as humans are able to respond to whether an image contains an animal or not in around 250 ms, making it one of the fastest visual detection tests to run [3].

The next tests involved flashing one image for 400 ms, then another for 400 ms (a similar example is on the right). Subjects were told to ignore the first image and were asked whether the second was a bird or not. The trick with this test is that categorisation is an automatic process. When shown two images in quick succession and told to ignore the first, the first is still categorised. This creates interference when the subject is asked to identify the second image, and the interference is amplified if the objects shown are of the same level (e.g. superordinate, basic).

Take the three examples below. If both are birds, the reaction time (RT) and error rate are the smallest, as the process does not have to change between images.

Animate Animate
Animal Animal
Bird Bird

If the first is a plane and the second a bird, RT and error both increase, as the subject has to differentiate between superordinate categories.

Inanimate Animate
Vehicle Animal
Plane Bird

If the first is a dog and the second a bird, RT and error are the largest, as the subject has to discriminate between basic categories, which is the most demanding of the three.

Animate Animate
Animal Animal
Dog Bird
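For readers who like to see things spelled out, the three pairings above can be written down as a small Python sketch. This is only an illustration of the category structure, not anything taken from the seminar: the level names in `LEVELS` (in particular `animacy` for the top row) and the helper `differing_levels` are labels I have made up for the example, and the code makes no attempt to predict reaction times.

```python
# Hypothetical labels for the three rows in the examples above,
# ordered from the most general level to the most specific.
LEVELS = ("animacy", "superordinate", "basic")

# Each image is described by one label per level.
BIRD = ("animate", "animal", "bird")
PLANE = ("inanimate", "vehicle", "plane")
DOG = ("animate", "animal", "dog")


def differing_levels(first, second):
    """Return the names of the levels at which two images have different labels."""
    return [
        level
        for level, a, b in zip(LEVELS, first, second)
        if a != b
    ]


# Compare each first image against the second image (always a bird).
for name, first_image in (("bird", BIRD), ("plane", PLANE), ("dog", DOG)):
    print(f"{name} then bird differ at: {differing_levels(first_image, BIRD)}")
```

The bird/bird pair differs at no level, the plane/bird pair differs at every level starting from the top, and the dog/bird pair differs only at the basic level, which is the pairing described above as the most demanding.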

So what does all this mean? Firstly, the animal is always seen before the dog. That said, it is important to remember that we subconsciously process everything to at least the basic level, so what we end up perceiving is the dog rather than the animal. The longer processing time at each level means that, as more details are gathered about the image, our accuracy in determining what the image shows increases. Then there is the idea of automatic categorisation, which gives a wonderful insight into how extremely powerful our brains and visual systems are.

I believe the most interesting point brought out by Michèle’s work is the interference that arises when two images of the same level are flashed in quick succession. Attempting to understand this ‘lag’ (if you can call it that) could lead to interesting research on how much time is needed to avoid the interference. Doing so could potentially lead to the discovery of the minimum time it takes to fully process an image.

1. Macé, M. et al. 2009. The Time-Course of Visual Categorizations: You Spot the Animal Faster than the Bird. PLoS ONE, 4(6): e5927.
2. Poncet, M. and Fabre-Thorpe, M. 2014. Stimulus duration and diversity do not reverse the advantage for superordinate-level representations: the animal is seen before the bird. European Journal of Neuroscience, 39: 1508-1516.
3. Crouzet, C. et al. 2012. Animal Detection Precedes Access to Scene Category. PLoS ONE, 7(12): e51471.

