- Developing a comprehensive, empirically based research framework for classroom-based assessment
by Hill, K., McNamara, T.
Posted on 29 Dec, 2011
This paper presents a comprehensive framework for researching classroom-based assessment (CBA) processes, and is based on a detailed empirical study of two Australian school classrooms where students aged 11 to 13 were studying Indonesian as a foreign language. The framework can be considered innovative in several respects. It goes beyond the scope of earlier models in addressing a number of gaps in previous research, including consideration of the epistemological bases for observed assessment practices and a specific learner and learning focus. Moreover, by adopting the broadest possible definition of CBA, the framework allows for the inclusion of a diverse range of data, including the more intuitive forms of teacher decision-making found in CBA (Torrance & Pryor, 1998). Finally, in contrast to previous studies, the research motivating the development of the framework took place in a school-based foreign language setting. We anticipate that the framework will be of interest to both researchers and classroom practitioners.
- Methodological and theoretical issues in the adaptation of sign language tests: An example from the adaptation of a test to German Sign Language
by Haug, T.
Posted on 29 Dec, 2011
Despite the current need for reliable and valid test instruments in different countries in order to monitor the sign language acquisition of deaf children, very few tests are commercially available that offer strong evidence for their psychometric properties. This mirrors the current state of affairs for many sign languages, where very little research is available. No previous empirical study has focused explicitly on the linguistic, methodological, and theoretical issues involved in the process of adapting a test from a source sign language to a target sign language. Problems during the adaptation process can arise from linguistic differences between the source and the target language and differences in the source and the target cultures. Both are important aspects that need to be considered in the adaptation of a sign language test from a source to a target language. This study proposes a model for sign language test adaptation, based on the adaptation of the British Sign Language Receptive Skills Test to German Sign Language. The model includes different methodological steps, with a particular focus on construct validation.
- Investigating the validity of an integrated listening-speaking task: A discourse-based analysis of test takers' oral performances
by Frost, K., Elder, C., Wigglesworth, G.
Posted on 29 Nov, 2011
Performance on integrated tasks requires candidates to engage skills and strategies beyond language proficiency alone, in ways that can be difficult to define and measure for testing purposes. While it has been widely recognized that stimulus materials impact test performance, our understanding of the way in which test takers make use of these materials in their responses, particularly in the context of listening-speaking tasks, remains predominantly intuitive. Recent studies have highlighted the problems associated with content-related aspects of task fulfilment on integrated tasks, but little attempt has been made to operationalize the way in which content from the input material is integrated into speaking performances. Using discourse data from a trial administration of a pilot for an Oxford English language test, this paper investigates how test takers integrate stimulus materials into their speaking performances on an integrated listening-then-speaking summary task, whether these behaviours are reflected in the relevant rating scale and, by implication, whether the test scores assigned according to this scale reflect real differences in the quality of oral performances. An innovative discourse analytic approach was developed to analyse content-related aspects of performance in order to determine if such aspects represent an appropriate measure of the speaking ability construct. Results showed that the measures devised, such as the number of key points included from the input text, and the accuracy with which information was reproduced or reformulated, effectively distinguished participants according to their level of speaking proficiency. The study’s findings support the use of this particular task-type and the appropriateness of the associated rating scale as a measure of speaking proficiency, as well as the utility of the devised discourse-based measures for the validation of integrated tasks in other assessment contexts.
- Linguistic competences of learners of Dutch as a second language at the B1 and B2 levels of speaking proficiency of the Common European Framework of Reference for Languages (CEFR)
by Hulstijn, J. H., Schoonen, R., de Jong, N. H., Steinel, M. P., Florijn, A.
Posted on 29 Nov, 2011
This study examines the associations between the speaking proficiency of 181 adult learners of Dutch as a second language and their linguistic competences. Performance in eight speaking tasks was rated on a scale of communicative adequacy. After extrapolation of these ratings to the Overall Oral Production scale of the Common European Framework of Reference for Languages (CEFR) (Council of Europe, 2001), 80 and 30 participants (on average per speaking task) were found to be, respectively, at the B1 and B2 levels of this scale. The following linguistic competences were tapped with non-communicative tasks: productive vocabulary knowledge, productive knowledge of grammar, speed of lexical retrieval, speed of articulation, speed of sentence building, and pronunciation skills. Discriminant analyses showed that all linguistic competences, except speed of articulation, discriminated participants at the two levels of oral production. Subsequent comparisons showed that the distance between B1ers and B2ers was smaller in knowledge of high-frequency words than in knowledge of medium- and low-frequency words. Extrapolation from scores on the vocabulary test yielded estimations of productive vocabularies of, on average, 4000 and 7000 words for B1ers and B2ers, respectively. The grammar test assessed grammatical knowledge in 10 domains. B2ers were found to outperform B1ers on all parts of the test. Thus, the differences in lexical and grammatical knowledge of B1ers and B2ers appear to be a matter of degree, rather than a matter of category or domain. The paper ends with a research agenda for a linguistic underpinning of the CEFR.
- Predicting the proficiency level of language learners using lexical indices
by Crossley, S. A., Salsbury, T., McNamara, D. S.
Posted on 29 Nov, 2011
This study explores how second language (L2) texts written by learners at various proficiency levels can be classified using computational indices that characterize lexical competence. For this study, 100 writing samples taken from 100 L2 learners were analyzed using lexical indices reported by the computational tool Coh-Metrix. The L2 writing samples were categorized into beginning, intermediate, and advanced groupings based on the TOEFL and ACT ESL Compass scores of the writer. A discriminant function analysis was used to predict the level categorization of the texts using lexical indices related to breadth of lexical knowledge (word frequency, lexical diversity), depth of lexical knowledge (hypernymy, polysemy, semantic co-referentiality, and word meaningfulness), and access to core lexical items (word concreteness, familiarity, and imagability). The strongest predictors of an individual’s proficiency level were word imagability, word frequency, lexical diversity, and word familiarity. In total, the indices correctly classified 70% of the texts based on proficiency level in both a training and a test set. The authors argue for the applicability of a statistical model as a method to investigate lexical competence across language levels, as a method to assess L2 lexical development, and as a method to classify L2 proficiency.
- Accent, listening assessment and the potential for a shared-L1 advantage: A DIF perspective
by Harding, L.
Posted on 17 Nov, 2011
This paper reports on an investigation of the potential for a shared-L1 advantage on an academic English listening test featuring speakers with L2 accents. Two hundred and twelve second-language listeners (including 70 Mandarin Chinese L1 listeners and 60 Japanese L1 listeners) completed three versions of the University Test of English as a Second Language (UTESL) listening sub-test which featured an Australian English-accented speaker, a Japanese-accented speaker and a Mandarin Chinese-accented speaker. Differential item functioning (DIF) analyses were conducted on data from the tests which featured L2-accented speakers using two methods of DIF detection – the standardization procedure and the Mantel-Haenszel procedure – with candidates matched for ability on the test featuring the Australian English-accented speaker. Findings showed that Japanese L1 listeners were advantaged on a small number of items on the test featuring the Japanese-accented speaker, but these were balanced by items which favoured non-Japanese L1 listeners. By contrast, Mandarin Chinese L1 listeners were clearly advantaged across several items on the test featuring a Mandarin Chinese L1 speaker. The implications of these findings for claims of bias are discussed with reference to the role of speaker accent in the listening construct.
- The contribution of test-takers' speech content to scores on an English oral proficiency test
by Sato, T.
Posted on 17 Nov, 2011
The content that test-takers attempt to convey is not always included in the construct definition of general English oral proficiency tests, although some English-for-academic-purposes (EAP) speaking tests and most writing tests tend to place great emphasis on the evaluation of the content or ideas in the performance. This study investigated the relative contribution of linguistic criteria and the elaboration of speech content to scores on a test of speaking proficiency. A speaking test was designed and administered to Japanese undergraduates to determine what criteria English teachers associate with general oral proficiency. Nine raters were recruited to rate 30 students’ monologues on three topics, using intuitive judgments of oral proficiency (referred to as Overall communicative effectiveness). Following this, they assigned scores to the monologues using five criteria: Grammatical accuracy, Fluency, Vocabulary range, Pronunciation, and Content elaboration/development. The raters were also asked to provide open-ended written comments on the factors contributing to their intuitive judgments. Statistical analyses of the scores – Rasch measurement, multiple regression, and multivariate generalizability (G) theory analysis – revealed that Content elaboration/development made a substantive contribution to the intuitive judgments and composite score. The present study enriches our understanding of general oral proficiency and the construct definition of proficiency tests.
- Development and validation of Extract the Base: An English Derivational Morphology Test for third through fifth grade monolingual students and Spanish-speaking English language learners
by Goodwin, A. P., Huggins, A. C., Carlo, M., Malabonga, V., Kenyon, D., Louguit, M., August, D.
Posted on 17 Nov, 2011
This study describes the development and validation of the Extract the Base test (ETB), which assesses derivational morphological awareness. Scores on this test were validated for 580 monolingual students and 373 Spanish-speaking English language learners (ELLs) in third through fifth grade. As part of the validation of the internal structure, which involved using the Generalized Partial Credit Model for tests with polytomous items, items on this test were shown to provide information about students of different abilities and also discriminate amongst such heterogeneous students. As part of the validation of the test’s relationship to criterion, items were shown to correlate with measures of word identification, reading comprehension, and vocabulary. Differences in performances for fluent English students and ELLs, students of varied home language environments, and different grade levels were noted. Additionally, the task was validated using a dichotomous scoring system to provide reliability and validity information using this alternate scoring method.