A
- Abedi, J. 2004. The No Child Left Behind Act and English Language Learners: Assessment and Accountability Issues
Educational Researcher 33, 1, 4 - 14.
- Amrein, A. L., Berliner, D. C. & Rideau, S. 2010.
Cheating in the first, second, and third degree: Educators' responses to high-stakes testing.
Education Policy Analysis Archives, 18, 14.
- Amrein-Beardsley, A. L. and Berliner, D. C. 2002.
High Stakes Testing, Uncertainty, and Student Learning.
Education Policy Analysis Archives, 10, 18.
- Anonymous Evaluation and Assessment Primer
Vanderbilt University
- Anonymous, 2009. Computer-based and paper-pencil test comparability.
Pearson Education: Test, Measurement and Research Services Bulletin 9
- Assessment Reform Group. (1999). Beyond the Black Box.
- Assessment Reform Group. (2002). Testing, Motivation and Learning. Cambridge: University of Cambridge Faculty of Education.
- Atkinson, T. and Davies, G. 2000.
Computer Aided Assessment and Language Learning. ICLT4LT.
- Atkinson, R. C. and Geiser, S. 2010. Reflections on a century of College Admissions Tests. Educational Researcher 38, 9, 665 - 667.
- Au, W. 2007.
High Stakes Testing and Curricular Control.
Educational Researcher, 36, 5.
B
C
- Camilli, G. 1996.
Standard Errors in Educational Assessment: A Policy Analysis Perspective
Education Policy Analysis Archives 4, 4.
- Canagarajah, S. 2006.
Changing Communicative Needs, Revised Assessment Objectives: Testing English as an International Language
Language Assessment Quarterly, 3, 3, 229 - 242.
- Canale, M. and Swain, M. 1980.
Theoretical Bases of Communicative Approaches to Second Language Teaching and Testing.
Applied Linguistics 1, 1, 1 - 47.
- Carrell, P. L. 2007.
Notetaking strategies and their relationship to performance on listening comprehension and communicative assessment tasks.
TOEFL Monograph No. MS-35. Princeton, NJ: Educational Testing Service.
- Carrell, P. L. , Dunkel, P. A. and Mollaun, P. 2002.
The effects of notetaking, lecture length, and topic on the listening component of TOEFL 2000.
TOEFL Monograph No. MS-23. Princeton, NJ: Educational Testing Service.
- Celik, M. 1999.
Testing Some Suprasegmental Features of English Speech. The Internet TESL Journal, 5, 8.
- Chalhoub-Deville, M. 2001.
Language Testing and Technology: Past and Future
Language Learning and Technology, Vol 5, No. 2, May 2001, 95 - 98.
- Chalhoub-Deville, M. and Fulcher, G. 2003.
The Oral Proficiency Interview: A Research Agenda
Foreign Language Annals, 36, 4, 498 - 506.
- Chapman, M. 2003.
TOEIC: Tried but Undertested.
JALT Testing and Evaluation SIG Newsletter, 7, 3, 2 - 5.
- Cimbricz, S. 2002.
State-mandated testing and teachers' beliefs and practice.
Education Policy Analysis Archives 10, 2.
- Cohen, A. D. 2001.
Second Language Assessment.
In Celce-Murcia, M. (Ed.) Teaching English as a second or foreign language. 3rd edition. Boston: Heinle & Heinle/Thomson Learning, 515 - 534.
- Cohen, A. D. 2007.
The Coming of Age for Research on Test-Taking Strategies.
In Fox, M., Wesche, D., Bayliss, L., Cheng, C., Turner, C., and Doe, C. (Eds.) Language Testing Reconsidered. Ottawa: Ottawa University Press, 89 - 111.
- Cohen, A. D., & Upton, T. A. 2006.
Strategies in responding to new TOEFL reading tasks.
TOEFL Monograph No. MS-33. Princeton, NJ: Educational Testing Service.
- Comber, J. 1998. Are Test Preparation Programs Really Effective? Evaluating an IELTS Preparation Course?
Unpublished MA dissertation, University of Surrey.
- Committee on Assessment and Evaluation in Education. 2005.
The Knowledge Base for Assessment and Evaluation in Education.
Israel Academy of Sciences and Humanities; Ministry of Education, Culture and Sport;
Rothschild Foundation (Yad Hanadiv).
- Coniam, D. and Falvey, P. 1999.
Assessor training in a high-stakes test of speaking: The Hong Kong English language benchmarking initiative.
Melbourne Papers in Language Testing 8, 2.
- Coombe, C. 2002.
Self-assessment in language testing: Reliability and validity issues.
Karen's Linguistics Issues.
- Cronbach, L. J. and Meehl, P. E. 1955.
Construct Validity in Psychological Tests
Psychological Bulletin, 52, 281 - 302.
- Cumming, A. 1994.
Does Language Assessment Facilitate Recent Immigrants' Participation in Canadian Society?
TESL Canada Journal, 11, 2, 117 - 133.
- Cumming, A., Grant, L., Mulcahy-Ernt, P., & Powers, D. E. 2005.
A teacher-verification study of speaking and writing prototype tasks for a new TOEFL Test.
TOEFL Monograph No. MS-26. Princeton, NJ: Educational Testing Service.
- Cumming, A., Kantor, R., Baba, K., Eouanzoui, K., Erdosy, U., & James, M. 2006.
Analysis of discourse features and verification of scoring levels for independent and integrated prototype written tasks for the new TOEFL.
TOEFL Monograph No. MS-30. Princeton, NJ: Educational Testing Service.
- Cunningham, C. R. 2002.
The TOEIC test and communicative competence: Do test score gains correlate
with increased competence? A preliminary study. University of Birmingham,
UK: MA dissertation.
D
- Davidson, F. and Fulcher, G. 2007.
Flexibility is proof of a good 'framework'.
Guardian Weekly, 17th November.
- Davies, A. 1984.
Computer Assisted Language Testing.
CALICO Journal 1, 5.
- Davies, A. 1997.
The education (and training) of language testers. Melbourne Papers in Language Testing 6, 1.
- de Jong, H.A.L. 1990.
Standardization in Language Testing. AILA Review 7.
This is the complete text of the edited volume, and contains the following papers:
- Guest-editor's Preface
John H. A. L. DE JONG 3-5
- Language Testing in Research and Education: The Need for Standards
Peter J. M. GROOT 6-23
- The Cambridge-TOEFL Comparability Study : An example of the Cross-National Comparison of Language Tests
Fred DAVIDSON & Lyle BACHMAN 24-45
- The Australian Second Language Proficiency Ratings (ASLPR)
David E. INGRAM 46-61
- Cross-National Standards: A Dutch-Swedish Collaborative Effort in National Standardized Testing
John H.A.L. DE JONG & Mats OSCARSON 62-78
- The Hebrew Speaking Test: An Example of International Cooperation in Test Development and Validation
Elana SHOHAMY & Charles W. STANSFIELD 79-90
- EUROCERT: An International Standard for Certification of Language Proficiency
Alex OLDE KALTER & Paul VOSSEN 91-105
- Response to Alex Olde Kalter and Paul Vossen
John READ 106-107
- Dikli, A. 2006.
An Overview of Automated Scoring of Essays. Journal of Technology, Learning, and Assessment, 5, 1.
- Dooey, P. 1999.
An investigation into the predictive validity of the IELTS Test as an indicator of future academic success.
In K. Martin, N. Stanley and N. Davison (Eds), Teaching in the Disciplines/ Learning in Context, 114-118.
Proceedings of the 8th Annual Teaching Learning Forum, The University of Western Australia, Perth.
- Dorans, N. J. 2008.
The practice of comparing scores on different tests. R&D Connections 6. Princeton, NJ: Educational Testing Service.
- Dunkel, P. A. 1997.
Computer-Adaptive Testing of Listening Comprehension: A Blueprint for CAT Development
The Language Teacher Online, 21, 10.
- Dunkel, P. A. 1999.
Considerations in developing or using
second /foreign language proficiency computer-adaptive tests
Language Learning & Technology 2, 2, 77-93
- Dunkin, M. J. 1997.
Assessing Teachers' Effectiveness. Issues in Educational Research, 7(1), 1997, 37-51.
- Dymoke, S. (no date).
Assessing Your Pupils' Poetry. Poetry Class Website Resources.
E
- Educational Testing Service.
ETS Fairness Review & ETS Standards for Quality and Fairness.
- Elder, C. (1998).
What counts as bias in language testing?
Melbourne Papers in Language Testing 7, 1.
- Embretson, S. 1983.
Construct Validity: Construct Representation Versus Nomothetic Span. Psychological Bulletin, 93, 1, 179 - 197.
- Emmerich, W., Enright, M. K., Rock, D. A. and Tucker, C. 1991.
The Development, Investigation, and Evaluation of New Item Types for the GRE Analytical Measure.
Educational Testing Service, Princeton NJ, ETS Research Report 91-16.
- Ennis, R. H. 1999.
Test Reliability: A Practical Exemplification of
Ordinary Language Philosophy. Philosophy of Education
F
- Feast, V. 2002.
The Impact of IELTS scores on performance at university.
International Education Journal, 3, 4, 70 - 85.
- Frary, R. B. 1996.
Hints for Designing Effective Questionnaires. Practical Assessment, Research and Evaluation, Vol. 11.
- Frary, R. B. 1995.
More Multiple Choice Item Writing Do's and Don'ts. ERIC/AE Digest Series EDO-TM-95-4.
- Frary, R. B. 2002.
A Brief Guide to Questionnaire Development.
- Fox, J. and Courchene, R. (2005). "The Canadian Language Benchmarks (CLB): A Critical Appraisal." Contact 31, 2, 7 - 28.
- Fulcher, G. (1987). "Tests of Oral Performance: the need for data-based criteria." English Language Teaching Journal 41, 4, 287 - 291.
- Fulcher, G. (1996). "Invalidating validity claims for the ACTFL Oral Rating Scale." System 24, 2, 163 - 172.
- Fulcher, G. (1996). "Does thick description lead to smart tests? A data-based approach to rating scale construction." Language Testing 13, 2, 208 - 238.
- Fulcher, G. (1998). "Widdowson's model of communicative competence
and the testing of reading: An exploratory study." System 26, 3, 281 - 302.
- Fulcher, G. 1999.
Ethics in Language Testing. TAE SIG Newsletter - Special Conference Issue, Volume 1, No. 1.
- Fulcher, G. (1999). "Assessment in English for Academic Purposes: Putting content validity in its place."
Applied Linguistics 20, 2, 221 - 236.
- Fulcher, G. (2000). "Computers in language testing." In Brett P. and Motteram, G. (Eds.) A special interest in computers: Learning and teaching with information and communications technologies. Manchester: IATEFL publications, 93 - 107. Reprinted with the kind permission of IATEFL.
- Fulcher, G. 2001.
Machines get clever at testing. Education Guardian, 17 May.
- Fulcher, G. 2003.
Few ills cured by setting scores. Education Guardian, 17 April.
- Fulcher, G. 2004.
Are Europe's tests being built on an 'unsafe' framework? Education Guardian, 18 March.
Read the response from Brian North
- Fulcher, G. (2004). "Deluded by artifices? The Common European Framework and harmonization." Language Assessment Quarterly, 1, 4, 253 - 266.
- Fulcher, G. 2008. "Testing Times Ahead?"
Liaison Magazine, Issue 1: July, 20 - 24.
Published by the UK Subject Centre for Languages, Linguistics and Area Studies, University of Southampton.
- Fulcher, G. 2009. Test use and political philosophy.
Annual Review of Applied Linguistics, 29, 3 - 20.
- Fulcher, G. and Bamford, R. (1996). "I didn't get the grade I need. Where's my solicitor?" System 24, 4, 437 - 448.
- Fulcher, G. and Davidson, F. (2008).
"Tests in Life and Learning: A Deathly Dialogue."
Educational Philosophy and Theory, 40, 3, 407 - 417.
G
- Gebril, A. and Plakans, L. 2009.
Investigating source use, discourse features, and process in integrated writing tasks.
Spaan Fellow Working Papers in Second or Foreign Language Assessment 7, 47 - 84.
- Geisinger, K. F. and Carlson, J. F. 1995.
Testing Students with Disabilities
ERIC Digest.
- Gibson, E. J., Brewer, P. W., Dholakia, A., Vouk, M. A., and Bitzer, D. L. 1995.
A Comparative Analysis of Web-Based Testing and Evaluation Systems
North Carolina University.
- Gilfert, S. 1996. A Review of TOEIC. The Internet TESL Journal 11, 8.
- Ginther, A. 2001.
Effects of the presence and absence of visuals on performance on TOEFL CBT listening-comprehension stimuli
TOEFL Research Report 66, Princeton, N.J.: Educational Testing Service.
- Glass, G. V. 1978.
Standards and criteria. Journal of Educational Measurement 15, 4, 237 - 261.
- Goh, C. and Aryadoust, S. V. 2010.
Investigating the Construct Validity of the MELAB Listening Test through the Rasch Analysis and Correlated Uniqueness Modeling.
Spaan Fellow Working Papers in Second or Foreign Language Assessment 8, 31 - 68.
- Gorsuch, G. J. and Cox, T. 2000.
Something Old, Something New, Something Borrowed, Something....: Piloting a Computer Mediated Version of the Michigan Listening Comprehension Test
TESL-EJ 4, 4.
- Grant, S. G. 2000.
Teachers and Tests: Exploring Teachers' Perceptions of Changes in the New York State Testing Program.
Education Policy Analysis Archives, 8, 14.
- Godwin-Jones, B. 2001.
Emerging Tools: Language Testing Tools and Technologies.
Language Learning and Technology, Vol 5, No. 2, May 2001, 8 - 12.
- Gorin, J. S. 2007.
Reconsidering Issues in Validity Theory. Educational Researcher 36, 8, 456 - 462.
- Grabowski, K. C. 2007.
Reconsidering the measurement of pragmatic knowledge using a reciprocal written task format. Teachers College, Columbia University Working Papers in TESOL and Applied Linguistics, 7, 1.
- Gruba, P. A. 1999.
The role of digital video media in second language listening comprehension. University of Melbourne: Unpublished PhD thesis.
H
- Haji pour Nezhad, G. R. 2002.
Item complexity and Judgment Revisited.
Unpublished PhD Thesis, Tehran University.
- Haji pour Nezhad, G. R. 2002.
Reading complexity judgments, Episode 1.
JALT Testing and Evaluation SIG Newsletter, 5, 3, 2 - 5.
- Haji pour Nezhad, G. R. 2002.
Reading complexity judgments, Episode 2.
JALT Testing and Evaluation SIG Newsletter, 6, 1, 2 - 5.
- Haji pour Nezhad, G. R. 2002.
Reading complexity judgments, Episode 3.
JALT Testing and Evaluation SIG Newsletter, 6, 2, 2 - 5.
- Hamilton, L. S., Klein, S. P., and Lorie, W. No Date.
Using Web-Based Testing for Large-Scale Assessment
Rand Education.
- Hansen, E. G., Forer, D. C., & Lee, M. J. 2004.
Toward accessible computer-based tests: Prototypes for visual and other disabilities.
TOEFL Research Report RR-78. Princeton, NJ: Educational Testing Service.
- Harding, L. 2008.
Accent and academic listening assessment: A study of test-taker perceptions.
Melbourne Papers in Language Testing 13, 1.
- Harlen, W. H. and Crick, R. D. 2002.
A Systematic Review of the impact of summative assessment and tests on students'
motivation for learning.
London: Institute of Education, Evidence for Policy and Practice Information
and Co-ordinating Centre.
- Hawkey, R. and Barker, F. (2004). Developing a common scale for the assessment of writing. Assessing Writing 9, 122 - 159.
- Hong, W-P, 2008.
Does high-stakes testing increase cultural capital among low-income and racial minority students?
Education Policy Analysis Archives, 16, 6.
- Nguyen, T. N. H. 2007.
Effects of test preparation on test performance - the case of the IELTS and TOEFL iBT Listening Tests.
Melbourne Papers in Language Testing 12, 1.
- Huitt, B., Hummel, J. and Kaeck, D. 1995. Assessment, Measurement, Evaluation and Research. Valdosta State University.
- Hutchison, D. and Benton, T. 2009.
Parallel Universes and Parallel Measures: Estimating the Reliability of Test Results.
London: OFQUAL and the National Foundation for Educational Research.
I
J
- Jacobsen, M., Kremer, R., and Flores, R. 1999.
WebCT in Computer Science. New Currents in Teaching and Learning, 6, 3.
- Jamieson, J., Jones, S., Kirsch, I., Mosenthal, P., Taylor, C. 2000.
TOEFL 2000 Framework: A Working Paper
Educational Testing Service, Princeton NJ.
- Jia, Y., and Zhang, W. 2007.
Evaluating the construct validity of an EFL test for PhD candidates: A quantitative analysis of two versions
Shiken, 11, 1, 2 - 16.
- Joint Committee on Testing Practices. 2004.
Code of Fair Testing Practices in Education.
American Psychological Association.
K
- Kane, M. 2010.
Errors of Measurement, Theory, and Public Policy.
12th Annual William H. Angoff Memorial Lecture. Princeton, NJ: Educational Testing Service.
- Kang, O. 2008.
Ratings of L2 oral performance in English: Relative impact of rater characteristics and acoustic measures of accentedness.
Spaan Fellow Working Papers in Second or Foreign Language Assessment 6, 181 - 205.
- Karavas, E., and Delieza, X. 2009.
On-site observation of KPG oral examiners: Implications for oral examiner training and evaluation.
Journal of Applied Language Studies 3, 1, 51 - 77.
- Kehoe, J. 1995.
Basic Item Analysis for Multiple-Choice Tests.
ERIC Digest.
- Kehoe, J. 1995.
Writing Multiple Choice Test Items.
ERIC Digest.
- Kenworthy, R. 2006.
Timed versus At-home Assessment Tests: Does Time Affect the Quality of Second Language Learners' Written Compositions?
TESL-EJ 10, 1.
- Kenyon, D. M. and Malabonga, V. 2001.
Comparing examinee attitudes toward computer-assisted and other oral proficiency assessments.
Language Learning and Technology, Vol 5, No. 2, May 2001, 60 - 83.
- Kim, H. J. and Shin, H. W. 2006.
A reading and writing placement test: Design, evaluation, and analysis. Teachers College, Columbia University Working Papers in TESOL and Applied Linguistics, 6, 2.
- Kirsch, I., Jamieson, J., Taylor, C., and Eignor, D. 1998.
Computer Familiarity Among TOEFL Examinees
TOEFL Research Report 59, Educational Testing Service,
Princeton NJ.
- Kitao, S. K. and Kitao, K. 1996.
Testing Communicative Competence. The Internet TESL Journal, 2, 5.
- Kitao, S. K. and Kitao, K. 1996.
Testing Grammar. The Internet TESL Journal, 2, 6.
- Kluitmann, S. (2008).
Testing English as a Foreign Language. Two EFL-Tests used in Germany. Philologische Fakultat, Albert-Ludwigs-Universitat Freiburg.
- Kitao, S. K. and Kitao, K. 1996.
Testing Listening. The Internet TESL Journal, 2, 7.
- Knoch, U. 2008.
Collaborating with ESP Stakeholders in Rating Scale Validation: The case of the ICAO Rating Scale.
Spaan Fellow Working Papers in Second or Foreign Language Assessment 7, 21 - 46.
- Knoch, U. 2009.
The assessment of academic style in EAP writing: The case of the rating scale.
Melbourne Papers in Language Testing 13, 1.
- Koizumi, R. 2006.
Relationships Between Productive Vocabulary Knowledge and Speaking Performance of Japanese Learners of English at the Novice Level. Unpublished PhD thesis, University of Tsukuba, Japan.
- Koretz, D., Russell, M., Shin, C. D., Horn, C. and Shasby, K. 2002.
Testing and diversity in postsecondary education: The case of California.
Education Policy Analysis Archives, 10, 1.
- Kunnan, A. J. 1998.
An introduction to structural equation modelling for language assessment research. Language Testing 15, 3, 295 - 332.
- Kunnan, A. J. 2005.
Language assessment from a wider context. In Hinkel, E. (Ed.) Handbook of research in second language teaching and learning, 779 - 794.
- Kyllonen, P. C. 2005.
The case for noncognitive assessments. R&D Connections 3. Princeton, NJ: Educational Testing Service.
L
- Laborda, J. G. 2007.
From Fulcher to PLEVALEX: Issues in Interface design, validity and reliability in Internet based Language Testing. CALL-EJ Online 9, 1.
- Laborda, J. G. 2007.
On the Net: Introducing Standardized EFL/ESL Exams. Language Learning & Technology 11, 2, 3 - 9.
- Lane, S. 1999.
Validity Evidence for Assessments. Reidy Interactive Lecture Series.
- Lazaraton, A. and Wagner, S. (1996).
The Revised TSE test: Discourse Analysis of Native Speaker and Nonnative Speaker Data. Research Report 96-10. Princeton NJ: Educational Testing Service.
- Lee, Y-W. 2005.
Dependability of Scores for a New ESL Speaking Test: Evaluating Prototype Tasks. TOEFL Monograph Series MS-28. Princeton, NJ: Educational Testing Service.
- Lee, Y.-W., Breland, H., & Muraki, E. 2004.
Comparability of TOEFL CBT writing prompts for different native language groups.
TOEFL Research Report RR-77. Princeton, NJ: Educational Testing Service.
- Lewkowicz, J. A. 2000.
Authenticity in language testing: some outstanding questions. Language Testing 17, 1, 43 - 64.
- Lightstone, K. and Smith, S. M. 2009.
Student Choice between Computer and Traditional Paper-and-Pencil University Tests: What Predicts Preference and Performance?
Revue internationale des technologies en pedagogie universitaire / International Journal of Technologies in Higher Education, vol. 6, 1, 2009, p. 30-45.
- Lim, G. S. 2010.
Investigating Prompt Effects in Writing Performance Assessment.
Spaan Fellow Working Papers in Second or Foreign Language Assessment 8, 95 - 116.
- Linn, R. L. 2003.
Performance Standards: Utility for Different Uses of Assessments.
Education Policy Analysis Archives, 11, 31.
- Linn, R. L. 2010. Comments on Atkinson and Geiser: Considerations for College Admissions Tests. Educational Researcher 38, 9, 677 - 679.
- Linn, R. L., Baker, E. L. and Dunbar, S. B. 1991.
Complex, Performance-Based Assessment: Expectations and Validation Criteria. CSE Technical Report 331.
- Livingston, S. A. 2009.
Constructed-response test questions: Why we use them; how we score them. R&D Connections 11. Princeton, NJ: Educational Testing Service.
- Livingston, S. A. and Zieky, M. J. 1982.
Passing Scores: A Manual for Setting Standards of Performance on Educational and Occupational Tests.
Princeton, NJ: Educational Testing Service.
- Liu, O. L. 2009.
Measuring learning outcomes in higher education. R&D Connections 10. Princeton, NJ: Educational Testing Service.
- Loevinger, J. 1957.
Objective tests as instruments of psychological theory. Psychological Reports 3, 635 - 694. Southern Universities Press, Monograph Supplement 9.
- Loulou, D. 1995.
Making the A: How To Study for Tests.
ERIC/AE Digest Series EDO-TM-95-10
- Low, G. No date.
Communicative Testing as an Optimistic Activity.
Manuscript from the Language Centre, University of Hong Kong.
M
- Malone, M. 2000.
Simulated Oral Proficiency Interviews: Recent Developments. ERIC Digest.
- May, L. 2006.
An examination of rater orientations on a paired candidate discussion task through stimulated verbal recall.
Melbourne Papers in Language Testing 11, 1.
- McAulay, A. 2002.
Peer and Self-evaluation in Spoken Tests: Tools and Methods. The Internet TESL Journal, September.
- McLean, L., Myers, M., Smillie, C., and Vaillancourt, D. 1997.
Qualitative Research Methods: An essay review
Education Policy Analysis Archives, 5, 13.
- McClellan, C. 2010.
Constructed-Response Scoring - Doing it Right. R&D Connections 13. Princeton, NJ: Educational Testing Service.
- Mehrens, A. A. No Date.
Preparing Students to Take Standardized Achievement Tests
ERIC Digest.
- Messerklinger, J. 1997.
Evaluating Oral Ability. The Language Teacher Online, 21, 11.
- Mills, A., Swain, L. and Weschler, R. 1996.
The Implementation of a First Year English Placement System. The Internet TESL Journal, 2, 11.
- Milton, J. 2006.
French as a Foreign Language and the Common European Framework of Reference for Languages.
Proceedings from the Crossing Frontiers: Languages and International Dimension
conference, Cardiff University, 6 - 7 July.
- Mislevy, R. J., Behrens, J. T., Bennett, R. E., Demark, S. F., Frezzo, D. C., Levy, R., Robinson, D. H., Rutstein, D. W., Shute, V. J., Stanley, K. & Fielding, I. W. 2010.
On the roles of external knowledge representations in assessment design. Journal of Technology, Learning, and Assessment 8, 2.
- Monaghan, W. 2006.
The facts about subscores. R&D Connections 4. Princeton, NJ: Educational Testing Service.
- Monaghan, W. and Bridgeman, B. 2005.
E-rater as a quality control on human scores. R&D Connections 2. Princeton, NJ: Educational Testing Service.
- Moritoshi, P. 2001.
The Test of English for International Communication (TOEIC): necessity, proficiency levels,
test score utilization and accuracy. University of Birmingham, UK: MA assignment.
- Moritoshi, P. 2002.
Validation of the Test of English Conversation Proficiency.
University of Birmingham: MA dissertation.
- Moodie, I. 2008.
Using Pair Work Exams for Testing in the ESL/EFL Conversation Classes.
Internet TESL Journal XIV, 8.
- Mueller, J. 2003.
Authentic Assessment Toolbox. North Central College, Naperville, IL.
N
- Newfields, T. 2005.
TOEIC Washback Effects on Teachers: A Pilot Study at One University Faculty.
Toyo University Keizai Ronshu, 31, 1, 83 - 106.
- Nichols, S. L. and Glass, G. V. 2006.
High-Stakes Testing and Student Achievement: Does Accountability Pressure Increase Student Learning?
Education Policy Analysis Archives, 14, 1.
- North, B. 2004.
'Europe's framework promotes language discussion, not directives'. Education Guardian, 15 April.
A reply to Glenn Fulcher
- Norris, J. M. 2001.
Concerns with computerized adaptive oral proficiency assessment.
Language Learning and Technology, Vol 5, No. 2, May 2001, 99 - 105.
O
- Ohkubo, N. 2009.
Validating the integrated writing task of the TOEFL internet-based test (iBT): Linguistic Analysis of test takers' use of input material.
Melbourne Papers in Language Testing 14, 1.
- O'Loughlin, K. 2006.
Learning about second language assessment: Insights from a postgraduate student on-line subject forum.
University of Sydney Papers in TESOL 1, 71 - 85.
- O'Loughlin, K. 2009.
Does it measure up? Benchmarking the written examination of a university English pathway program.
Melbourne Papers in Language Testing 14, 1.
- O'Sullivan, B. 2007.
Testing Speaking in Larger Classes.
Humanising Language Teaching 9, 4.
- O'Sullivan, B., Weir, C. J., and Saville, N. 2002.
Using observation checklists to validate speaking-test tasks. Language Testing 19, 1, 33 - 56.
P
- Papajohn, D. 2006.
Standard setting for next generation TOEFL Academic Speaking Test (TAST): Reflections on the ETS Panel of International Teaching Assistant Developers.
TESL-EJ 10, 1.
- Park, T. 2004.
An investigation of an ESL placement test of writing using Many-facet Rasch Measurement
Teachers College, Columbia University Papers in TESOL and Applied Linguistics 4, 1.
- Peirce, B. N., and Stewart, G. 1997.
The Development of the Canadian Language Benchmarks Assessment. TESL Canada Journal 14, 2, 17 - 31.
- Penfield, R. D. (2010).
Test-based grade retention: Does it stand up to professional standards for fair and appropriate test use? Educational Researcher, 39, 2, 110 - 119.
- Pierce, L. V. and O'Malley, J. M. 1992.
Performance and Portfolio Assessment for Language Minority Students.
NCBE Program Information Guide Series Number 9.
- Phakiti, A. 2006.
Modeling cognitive and metacognitive strategies and their relationship to EFL reading test performance.
Melbourne Papers in Language Testing 11, 1.
- Poole, G. 2003.
Assessing Japan's Institutional Entrance Requirements.
Asian EFL Journal 5, 1.
- Poonpon, K. 2010.
Expanding a Second Language Speaking Rating Scale for Instructional and Assessment Purposes.
Spaan Fellow Working Papers in Second or Foreign Language Assessment 8, 69 - 94.
- Powers, D. E. 2010.
The case for a comprehensive, four-skills assessment of English-language proficiency. R&D Connections 14. Princeton, NJ: Educational Testing Service.
- Praphal, K. 1990.
The relevance of language testing research in the planning of language programmes.
Thailand: Chulalongkorn University.
Q
R
- Ranali, J. M. 2002.
Comparing scoring procedures on a cloze test.
University of Birmingham, UK: MA assignment.
- Reed, D. J. and Cohen A. D. 2001.
Revisiting raters and ratings in oral language assessment.
In Elder, C. et al. (Eds.) Experimenting with uncertainty: Essays in honour of Alan Davies. Cambridge, UK: Cambridge University Press, 82 - 96.
- Robb, T. N. & Ercanbrack, J. 1999.
A Study of the Effect of Direct Test Preparation on the TOEIC Scores of Japanese University Students.
TESL-EJ, 3, 4.
- Roever, C. 2001.
Web based language testing.
Language Learning and Technology, Vol 5, No. 2, May 2001, 84 - 94.
- Roever, C. and Powers, D. E. 2005.
Effects of language administration on a self-assessment of language skills.
TOEFL Monograph No. MS-27. Princeton, NJ: Educational Testing Service.
- Rosenfeld, M., Leung, S., & Oltman, P. K. 2001.
The reading, writing, speaking, and listening tasks important for academic success at the undergraduate and graduate levels.
TOEFL Monograph No. MS-21. Princeton, NJ: Educational Testing Service.
- Rosenshine, B. 2003.
High Stakes Testing: Another analysis.
Education Policy Analysis Archives, 11, 24.
- Ross, J. A. 2006.
The Reliability, Validity, and Utility of Self-Assessment.
Practical Assessment, Research and Evaluation, 11, 10.
- Rudner, L. 1994.
Questions to ask when evaluating tests.
ERIC Clearinghouse on Assessment and Evaluation.
- Rudner, L. 1998.
An Online, Interactive, Computer Adaptive Test Tutorial.
ERIC Clearinghouse on Assessment and Evaluation.
- Rudner, L. 2001.
Reliability. ERIC Clearinghouse on Assessment and Evaluation.
- Rudner, L. 2006.
An evaluation of IntelliMetric Essay Scoring System. Journal of Technology, Learning, and Assessment 4, 4.
- Russell, M. 1999.
Testing On Computers: A Follow-up Study Comparing Performance On Computer and On Paper.
Education Policy Analysis Archives, 7, 20.
- Russell, M. and Haney, W. 1997.
Testing Writing on Computers: An Experiment Comparing Student Performance on Tests Conducted via Computer and via Paper-and-Pencil.
Education Policy Analysis Archives, 5, 3.
- Russell, M. and Haney, W. 2000.
Bridging the Gap between Testing and Technology in Schools.
Education Policy Analysis Archives, 8, 19.
S
- Sanders, W. and Horn, S. P. 1995.
Educational Assessment Reassessed: The Usefulness of Standardized and Alternative Measures of Student Achievement as Indicators for the Assessment of Educational Outcomes.
Education Policy Analysis Archives, 3, 6.
- Sarle, Warren S. 1995.
Measurement theory: Frequently asked questions. From the Disseminations of the International Statistical Applications Institute, 4th edition, Wichita: ACG Press, 61-66.
Also available at: ftp://ftp.sas.com/pub/neural/measurement.html
- Sasaki, M., and Hirose, K. 1996.
Explanatory Variables for EFL Students' Expository Writing. Language Learning 46, 1, 137 - 174.
- Sawaki, Y. 2001.
Comparability of Conventional and Computerized Tests of Reading in a Second Language.
Language Learning and Technology, Vol 5, No. 2, May 2001, 38 - 59.
- Sawaki, Y. and Nissan, S. 2009.
Criterion-related validity of the TOEFL iBT Listening Section. TOEFL Research Report 09-02. Princeton, NJ: Educational Testing Service.
- Scharber, C., Dexter, A. and Riedel, E. 2008.
Students' Experiences with an Automated Essay Scorer. The Journal of Technology, Learning, and Assessment.
- Shaw, S. and Falvey, P. 2008.
The IELTS Writing Assessment Revision Project: Towards a revised rating scale. Cambridge: University of Cambridge ESOL Examinations Research Report 1.
- Sireci, S. G. 2007.
On Validity Theory. Educational Researcher 36, 8, 477 - 481.
- Skehan, P. 1990.
Communicative Language Testing. Journal of TESOL France 10, 1, 115 - 127.
- Sokolik, M. and Duber, J. 2002.
Grow Your Own: Online Placement Testing. TESL-EJ, 6, 1.
- Stansfield, C. W. 1992. ACTFL Speaking Proficiency Guidelines. Washington D.C.: ERIC Clearinghouse on Languages and Linguistics.
- Stansfield, C. W. 1996. Content Assessment in the Native Language. Washington D.C.: ERIC Clearinghouse on Languages and Linguistics.
- Stansfield, C. W. & Kenyon, D. 1996. Simulated Oral Proficiency Interviews: An Update. Washington D.C.: ERIC Clearinghouse on Languages and Linguistics.
- State of Illinois. 1995.
Assessment Handbook. A Guide for Developing Assessment Programs in Illinois Schools
Springfield, IL: Illinois State Board of Education.
- Stricker, L. J. 2002.
The Performance of Native Speakers of English and ESL Speakers on the Computer-Based TOEFL
and the GRE General Test.
Princeton NJ: Educational Testing Service, TOEFL Research Report 69.
- Swain, M., Huang, L-S, Barkaoui, K., Brooks, L., and Lapkin, S. 2009.
The Speaking Section of the TOEFL iBT: Test-takers' Reported Strategic Behaviors.
Princeton NJ: Educational Testing Service, TOEFL iBT Research Report 09-30.
T
- Tannenbaum, J. 1996. Practical Ideas On Alternative Assessment For ESL Students. Washington D.C.: ERIC Clearinghouse on Languages and Linguistics.
- Tannenbaum, R. J. and Wylie, E. C. 2008. Linking English language test scores onto the Common European Framework of Reference: An application of standard setting methodology. TOEFL iBT Report iBT-06. Princeton, N.J: Educational Testing Service.
- Tasdemir, M., Tasdemir, A., and Yildirim, K. 2009.
Influence of Portfolio Evaluation in Cooperative Learning on Student Success.
Journal of Theory and Practice in Education, 5, 1, 53 - 66.
- Taylor, C. S. and Nolan, S. B. 1996.
What does the psychometrician's classroom look like? Reframing assessment concepts in the context of learning.
Education Policy Analysis Archives, 14, 7.
- Taylor, C., Jamieson, J., Eignor, D., & Kirsch, I. 1998.
The relationship between computer familiarity and performance on computer-based TOEFL test tasks.
TOEFL Research Report RR-61. Princeton, NJ: Educational Testing Service.
- Templer, B. 2004. High-Stakes Testing at High Fees: Notes and Queries on the International English Proficiency Assessment Market.
Journal for Critical Education Policy Studies, 2, 1.
- Thompson, G. 2009.
Reevaluating the Test Specifications for an Oral Proficiency Test? The Journal of Kanda University of International Studies 21, 233 - 260.
- Tsang, S. L., Katz, A. and Stack, J. 2008.
Achievement Testing for English Language Learners, Ready or Not?
Education Policy Analysis Archives, 16, 1.
- Tuzi, F. 1997. Using Microsoft Word to Generate Computerized Tests. The Internet TESL Journal, 3, 11.
U
V
W
- Wagner, A. No Date.
Don't Messick around with Test Validity until you know what you're doing.
- Wagner, E. 2002.
Video listening tests: A pilot study.
Teachers College, Columbia University Working Papers in TESOL and Applied Linguistics, 2, 1.
- Wagner, E. 2007.
Are They Watching? Test-Taker Viewing Behavior During an L2 Video Listening Test.
Language Learning and Technology, 11, 1.
- Walker, M. E. 2007.
Is test score reliability necessary? R&D Connections 5. Princeton, NJ: Educational Testing Service.
- Wall, D., & Horak, T. 2006.
The impact of changes in the TOEFL examination on teaching and learning in central and eastern Europe. Phase 1: The baseline study.
TOEFL Monograph No. MS-34. Princeton, NJ: Educational Testing Service.
- Wall, D., & Horak, T. 2008.
The impact of changes in the TOEFL examination on teaching and learning in central and eastern Europe. Phase 2: Coping with change.
TOEFL iBT Report No. iBT-05. Princeton, NJ: Educational Testing Service.
- Wang, J., and Brown, M. S. 2007.
Automated Essay Scoring Versus Human Scoring: A Comparative Study. Journal of Technology, Learning, and Assessment, 6, 2.
- Wendler, C. and Powers, D. 2009.
What does it mean to repurpose a test? R&D Connections 9. Princeton, NJ: Educational Testing Service.
- Wilson, N. 1998.
Educational Standards and the Problem of Error.
Education Policy Analysis Archives, 6, 10.
- Wolfe, E. W., Matthews, S., and Vickers, D. 2010.
The effectiveness and efficiency of distributed online, regional online, and regional face-to-face training for writing assessment raters. Journal of Technology, Learning, and Assessment 10, 1.
- Wolfe, E. W. and Manalo, J. R. 2004.
Composition Medium Comparability in a Direct Assessment of Non-native English Speakers.
Language Learning and Technology, 8, 1, 52 - 65.
- Wright, P. W. D. and Wright, P. D. 2004.
Understanding Tests and Measurements for the Parent and Advocate.
LDOnline.
- Wylie, E.
An overview of the International Second Language Proficiency Ratings (ISLPR).
Australia: Griffith University Centre for Applied Linguistics and Languages.
X
Y
- Yen, D. A. and Kuzma, J. No date.
Higher IELTS score, higher academic performance? The validity of IELTS in predicting the academic performance of Chinese students. Mimeo: University of Worcester.
- Yoff, L. 1997. An overview of ACTFL proficiency interviews: A test of speaking ability. JALT Testing and Evaluation SIG Newsletter, 1, 2, 3 - 9.
- Young, J. W. 2008.
Ensuring valid content tests for English language learners. R&D Connections 8. Princeton, NJ: Educational Testing Service.
- Young, J. W. and King, T. C. 2008. Testing Accommodations for English Language Learners: A Review of State and District Policies. New York: College Board.
- Yu, E. 2006. A Comparative Study of the Effects of a Computerized English Oral Proficiency Test Format and a Conventional Speak Test Format. Unpublished PhD Thesis: Ohio State University.
Z
- Zechner, K. and Xi, X. 2008.
Towards automatic scoring of a test of spoken language with heterogeneous task types. Proceedings of the Third ACL Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics, Columbus, Ohio, 98 - 106.
- Zimmerman, D. W. and Zumbo, B. D. 2009.
Hazards in choosing between pooled and separate variances t tests. Psicologica 30, 371 - 390.
- Zumbo, B. D. 2009.
Validity as Contextualized and Pragmatic Explanation, and Its Implications for Validation Practice.
In Robert W. Lissitz (Ed.) The Concept of Validity: Revisions, New Directions and Applications (pp. 65-82). Charlotte, NC: IAP - Information Age Publishing, Inc.