For more than 100 years, comparative psychologists have sought to understand the evolution of human intelligence. New paradigms for studying cognitive processes in animals (in particular symbol use and memory) have, for the first time, allowed psychologists and neuroscientists to compare higher thought processes in animals and human beings. New imaging approaches have also facilitated exploration of the neural basis of behavior in both animals and humans. Questions concerning the nature of animal and human cognition have defined the themes of this seminar, whose members include specialists in cognition, ethology, philosophy, and neuroscience.
The deep roots of language development and evolution (joint meeting with 681)
D. Kimbrough Oller, University of Memphis
Language emerges in empirically discernible steps, the first of which is recognizable in the vocal activity of infants, who produce massive numbers of speech-like vocalizations (“protophones”) from the first day of life. Based on all-day recordings, we estimate they produce ~3500 protophones/day, 4-5/minute every waking hour, even in the first two months. More than 90% of protophones are directed to no one, often occurring when infants are alone. Even when parents are instructed to interact with infants, the majority of protophones are not directed to the parents, and deaf infants produce protophones similarly.
Endogenous, exploratory vocalization is a prerequisite to vocal language, because without the ability and inclination to produce sounds flexibly, with no connection to immediate utilitarian needs, one could not begin down a path of development/evolution that could lead to language. Why? Because every act of language, every syllable, every word, every sentence must be accessible at any point in time: people must be able to produce each string of speech for a great range of illocutionary intentions or for no social purpose at all. No other ape produces exploratory vocalizations; their repertoires instead serve immediate functions (expressing fear, aggression, affiliation, submission, etc.). Hominin infants are altricial, requiring years of protection and provisioning; consequently, there has long been a special selection pressure for them to signal their fitness to potential caregivers. We propose that this pressure selected hominin vocal inclinations/capacities that formed a necessary foundation for other developments that led to language.
Faculty House, Columbia University / Zoom 4:00 PM
Bigger data about smaller people: Studying language learning at scale (Joint meeting with 681)
Every typically developing child learns to talk, but children vary tremendously in how and when they do so. What predicts this variability, and what is consistent across children and across learners of different languages? In this talk, I’ll describe our efforts to create predictive models of early language learning as a way of formalizing hypotheses in this space. This goal has led us to create open data resources like Wordbank, childes-db, and Peekbank that capture data from tens of thousands of children learning dozens of different languages.
Faculty House, Columbia University 4:00 PM
Using an associative processing framework in scene understanding to predict behavior and fMRI signal in the brain
Visual scenes are rich and complex stimuli that we typically understand at a glance. In many cases, the behavioral and neural mechanisms underlying scene processing are discussed in contrast to the processing of other stimulus categories (e.g., objects, faces). In this talk, I connect the processing of a scene to other types of visual processing (what is a scene without any objects?) as well as to other cognitive domains across the human mind. Using fMRI, I demonstrate that regions of the brain functionally defined as scene-selective are also, critically, sensitive to the objects embedded in the scene, thus blurring the lines of categorical selectivity. Moreover, the functional context in which we interact with a scene can modulate the extent to which scene-selective regions respond to scene stimuli. I will use fMRI and behavioral data to demonstrate how the functions of scene-selective regions of the brain can be explained through an associative processing framework. By using associative processing as a mechanism by which we understand scenes, we can predict both behavioral results within vision and how different domains of cognition may interact with scene processing. For example, we demonstrate how one’s mindset can affect their visual processing, or how the associations of a scene can affect face perception or scene memory.
Classifying people into categories is a fundamental means by which we make sense of the social world. People can be categorized along countless dimensions, but young children develop beliefs that some ways of grouping people, more than others, reflect fundamental, objective, and meaningful ways of carving up the world. How do beliefs that some particular differences reflect fundamentally distinct kinds develop? This talk will present experimental research revealing how subtle linguistic cues both reflect and elicit representations of social kinds and thus can facilitate their spread across generations and communities. I will illustrate these processes drawing on a series of in-person and online laboratory studies examining the mechanisms underlying the transmission of these beliefs from speaker to listener, as well as on a large field experiment in the New York City Public Schools and on ongoing longitudinal and cross-cultural work testing how these processes unfold in children’s daily lives.
Young children have sophisticated representations of their visual and linguistic environment. Where do these representations come from? How much knowledge arises through generic learning mechanisms applied to sensory data, and how much requires more substantive (possibly innate) inductive biases? We examine these questions by training neural networks solely on longitudinal data collected from a single child (Sullivan et al., 2020), consisting of egocentric video and audio streams. Our principal findings are as follows: 1) Based on visual-only training, neural networks can acquire high-level visual features that are broadly useful across categorization and segmentation tasks. 2) Based on language-only training, networks can acquire meaningful clusters of words and sentence-level syntactic sensitivity. 3) Based on paired visual and language training, networks can acquire word-referent mappings from tens of noisy examples and align their multi-modal conceptual systems. Taken together, our results show how sophisticated visual and linguistic representations can arise through data-driven learning applied to one child’s first-person experience.
Psychologists and computer scientists have very different views of the mind. Psychologists tell us that humans are error-prone, using simple heuristics that result in systematic biases. Computer scientists view human intelligence as aspirational, trying to capture it in artificial intelligence systems. How can we reconcile these two perspectives? In this talk, I will argue that we can do so by reconsidering how we think about rational action. Psychologists have long used the standard of rationality from economics, which focuses on choosing the best action without considering the computational difficulty of that choice. By using a standard of rationality inspired by computer science, in which the quality of the outcome trades off with the amount of computation involved, we obtain new models of human behavior that can help us understand the cognitive strategies that people adopt. I will present examples of this approach in the context of human decision-making and planning, including complex planning problems such as the game of chess.
Faculty House, Columbia University 4:00 PM
Connecting Performance Changes on Visual Tasks to Neural Mechanisms using Convolutional Neural Networks
Behavioral studies have demonstrated that certain conditions reliably enhance performance on challenging visual tasks. These include extended image presentation time and the valid cueing of attention. Here, I will show how convolutional neural networks can be used as a model of the visual system that connects neural activity changes to performance changes. Specifically, I will discuss how different anatomical forms of recurrence can account for the dynamics of degraded object recognition. I will then show how experimentally observed neural activity changes associated with feature attention lead to observed performance changes on detection tasks. I will also speak briefly about ongoing work studying the connection between attention and learning. Taken together, this work has implications for how we identify the neural mechanisms and architectures important for behavior.