Concreteness ratings for 40 thousand generally known English word lemmas

Concreteness ratings for about 40,000 English word lemmas were collected through a crowdsourcing study involving over 4,000 participants. The dataset covers 37,058 English words and 2,896 two-word expressions, each rated on a 5-point scale from abstract to concrete. It contains only lemmas known by at least 85% of the raters, so it can also serve as a reference list of generally known English lemmas for future research.

Concreteness is the degree to which a word refers to a perceptible entity. It is central to psycholinguistic and memory research, where it influences word recognition, working memory, long-term memory storage, bilingual processing, and affective connotations. Existing concreteness ratings have limitations, such as an overemphasis on visual experiences and a lack of action-related experiences. The study therefore had three aims: to obtain concreteness ratings for a larger sample of words, to base the ratings on all types of experience, and to define a reference list of English lemmas for future studies. To this end, ratings were collected through Amazon Mechanical Turk for a list of 63,039 English lemmas, including words not previously included in large word frequency lists. Participants were instructed to rate the words based on experiences involving all senses and motor responses.

The new ratings correlated highly with the existing MRC ratings (r = .919), which supports their reliability and validity. Although the instructions emphasized all sensory experiences, the ratings largely reflected visual and haptic experiences, as in existing concreteness norms. The new ratings also correlated well with perceptual strength ratings, suggesting that the two measures carry similar information despite differences in instructions. The study highlights both the importance of considering multiple sensory experiences and the limitations of existing word frequency lists. The data are available in an Excel file that includes the word ratings, frequency counts, and other relevant information.
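The two quantitative steps mentioned above, keeping only lemmas known by at least 85% of raters and correlating the new ratings with an existing norm, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the field names (`lemma`, `conc_mean`, `percent_known`, `mrc_conc`) are hypothetical and do not reflect the real file's column names, the rows are invented toy values, and the reference ratings are assumed to be already rescaled to the same 5-point range.

```python
from math import sqrt

# Invented toy rows for illustration only; field names are hypothetical
# and are NOT the column names of the published Excel file.
ratings = [
    {"lemma": "apple",   "conc_mean": 5.0, "percent_known": 1.00, "mrc_conc": 4.9},
    {"lemma": "bridge",  "conc_mean": 4.8, "percent_known": 0.97, "mrc_conc": 4.7},
    {"lemma": "idea",    "conc_mean": 1.6, "percent_known": 1.00, "mrc_conc": 1.9},
    {"lemma": "justice", "conc_mean": 1.5, "percent_known": 0.96, "mrc_conc": 1.4},
    {"lemma": "zyzzyva", "conc_mean": 3.0, "percent_known": 0.12, "mrc_conc": 3.1},
]

KNOWN_THRESHOLD = 0.85  # "known by at least 85% of the raters"


def filter_known(rows, threshold=KNOWN_THRESHOLD):
    """Keep only rows whose proportion of raters knowing the lemma meets the threshold."""
    return [row for row in rows if row["percent_known"] >= threshold]


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / sqrt(ss_x * ss_y)


if __name__ == "__main__":
    known = filter_known(ratings)
    r = pearson_r(
        [row["conc_mean"] for row in known],
        [row["mrc_conc"] for row in known],
    )
    print(f"{len(known)} of {len(ratings)} lemmas kept; r = {r:.3f}")
```

With the real data, the rows would be read from the published Excel file rather than typed in, and the correlation would be computed only over lemmas present in both rating sets.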

Marc Brysbaert, Amy Beth Warriner, Victor Kuperman