Because the ETOE™ oral performance examination requires human scoring, candidates do not receive preliminary results upon completing the exam. Official results are sent by email approximately six to eight (6-8) weeks after the last date of the corresponding testing window (that is, when all testing is done, not the date of the candidate’s own exam).
The ETOE™ oral performance examination consists of 23 audio-recorded candidate responses, 22* of which are scored by human raters.
Raters score the examination by applying Behaviorally Anchored Rating Scales, which were developed and validated by CCHI’s Subject Matter Experts under the guidance of a nationally recognized psychometrician.
Scoring each of the item types (“activities”) on the ETOE™ examination requires a specific rubric, and each rubric is composed of three to five of the following scales: Quality of Speech, Task Completion, Accuracy and Cohesion/Coherence, Lexical Content, and Grammar.
For example, the item types Memory Capacity, Restate the Meaning, Equivalence of Meaning, and Reading Comprehension are evaluated using all five scales. “Listening Comprehension” is evaluated using four scales: Quality of Speech, Accuracy and Cohesion/Coherence, Lexical Content, and Grammar. “Shadowing” is evaluated using three scales: Quality of Speech, Task Completion, and Accuracy and Cohesion/Coherence.
All scales have equal weight and are applied independently. The main criterion across all scales is how accurately candidates maintain the meaning of the original speech/text (aka “prompt”).
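The assignment of scales to item types described above can be written down as a simple lookup table. This is only an illustrative sketch: the scale names and item types come from the text, while the names `ALL_SCALES`, `RUBRICS`, and `scales_for`, and the order in which the scales are listed, are assumptions made for the example.

```python
# Scale names and item-type assignments come from the text above.
# The names ALL_SCALES, RUBRICS, and scales_for are hypothetical.
ALL_SCALES = (
    "Quality of Speech",
    "Task Completion",
    "Accuracy and Cohesion/Coherence",
    "Lexical Content",
    "Grammar",
)

RUBRICS = {
    "Memory Capacity": ALL_SCALES,
    "Restate the Meaning": ALL_SCALES,
    "Equivalence of Meaning": ALL_SCALES,
    "Reading Comprehension": ALL_SCALES,
    "Listening Comprehension": (
        "Quality of Speech",
        "Accuracy and Cohesion/Coherence",
        "Lexical Content",
        "Grammar",
    ),
    "Shadowing": (
        "Quality of Speech",
        "Task Completion",
        "Accuracy and Cohesion/Coherence",
    ),
}


def scales_for(item_type):
    """Return the scales applied to one item type."""
    return RUBRICS[item_type]


# Print how many scales each item type is rated on.
for item_type, scales in RUBRICS.items():
    print(f"{item_type}: {len(scales)} scales")
```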
The brief description below is provided as general guidance. Raters receive extensive, continuous training on how to apply these scales, and their performance is rigorously monitored to ensure the validity and reliability of the scoring.
Raters do not know candidate identities when scoring examinations. Each oral response is scored by two raters independently. Raters do not score the entire exam of one candidate; they score individual responses. Additionally, if two raters disagree by one point on a particular score for a particular response, that response is then scored by a third rater. Raters do not know whether a candidate passes or fails the exam because they do not score a whole exam and have no access to other raters’ scores or the final score.
Total scores for each of the exam’s subdomains (“activities”) are weighted according to CCHI’s proprietary formula based on the ETOE™ exam specifications. The passing score (passing standard) is determined by teams of Subject Matter Experts (SMEs) and the CCHI Commissioners through a standard setting process (see its detailed explanation below). The raw score is then scaled (via a mathematical formula) to a range of 300 to 600, with the passing score set at 450. Since different forms of the test may differ slightly in difficulty, a statistical procedure called equating is used to ensure that the passing score of 450 is comparable from form to form (see explanation of the equating procedure below).
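To make the idea of scaling more concrete, here is a minimal sketch of one possible way to map raw scores onto a 300 to 600 range so that the raw passing point always lands on 450. CCHI's actual formula is proprietary and is not given here; the piecewise-linear mapping, the function name `scale_score`, and all raw numbers below are assumptions chosen only for illustration.

```python
def scale_score(raw, raw_cut, raw_max, lo=300, cut=450, hi=600):
    """Piecewise-linear scaling (an assumed method, not CCHI's formula).

    Maps raw 0 -> lo, raw_cut -> cut, and raw_max -> hi.
    """
    if not 0 < raw_cut < raw_max:
        raise ValueError("raw_cut must lie strictly between 0 and raw_max")
    if not 0 <= raw <= raw_max:
        raise ValueError("raw must lie between 0 and raw_max")
    if raw <= raw_cut:
        # Below or at the passing point: stretch [0, raw_cut] onto [lo, cut].
        return lo + (cut - lo) * raw / raw_cut
    # Above the passing point: stretch (raw_cut, raw_max] onto (cut, hi].
    return cut + (hi - cut) * (raw - raw_cut) / (raw_max - raw_cut)


# Two hypothetical forms with different raw passing points (60 and 63 out of
# 100): a candidate exactly at the passing point gets 450 on either form.
print(scale_score(60, raw_cut=60, raw_max=100))
print(scale_score(63, raw_cut=63, raw_max=100))
```

Under this assumed mapping, a candidate who scores exactly at the raw passing point of a form receives 450 regardless of what that raw passing point is.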
The Score Report, in addition to the overall test score, indicates how candidates scored on the exam subdomains (Listening Comprehension, Shadowing, Memory Capacity, Restate the Meaning, Equivalence of Meaning, and Reading Comprehension) to help identify strengths and weaknesses for future study.
Keep in mind that the Score Report states two separate things: the overall test score, and how well you did in specific parts of the test. There is no relationship between the percentages reported for the parts of the test (subdomains) and the overall scaled score.
We report the percentage correct for six subdomains: Listening Comprehension, Shadowing, Memory Capacity, Restate the Meaning, Equivalence of Meaning, and Reading Comprehension. The percentage correct for a part of the test (subdomain, e.g., Memory Capacity) is computed as the portion of the points that you earned relative to the number of points it is possible to earn in that part. For example, if the maximum number of points that it is possible to earn in a part of the test is 72 and you earned 51 points, the percentage on your score report would be 71% (51 ÷ 72 ≈ 70.8%, rounded) out of 100% possible in that subdomain.
Your total score is not the average of your performance in subdomains. The total score is based on the full examination. There is no pass or fail status associated with an individual content area (subdomain). The percentages reported for subdomains are intended only as a guide and should be interpreted cautiously due to the small number of items included in each content area. If you failed an exam, improving your score requires practicing and improving in all types of activities. For more information on the domains, see the Test Content Outline.
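The percentage calculation in the example above can be reproduced with a few lines of code. This is a small sketch; the function name `subdomain_percentage` is hypothetical, and rounding to the nearest whole percent is an assumption inferred from the 51-out-of-72 example.

```python
def subdomain_percentage(points_earned, points_possible):
    """Percentage of available points earned, rounded to a whole percent."""
    if points_possible <= 0:
        raise ValueError("points_possible must be positive")
    if not 0 <= points_earned <= points_possible:
        raise ValueError("points_earned must be between 0 and points_possible")
    return round(100 * points_earned / points_possible)


# The example from the text: 51 points earned out of 72 possible.
print(subdomain_percentage(51, 72))  # 71
```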
Explanation of Standard Setting
To establish the passing score for the ETOE exam, CCHI uses the Extended Modified Angoff method, which has an established history of determining credible passing standards for credentialing examinations, together with the Beuk Relative-Absolute Compromise method.
The Extended Modified Angoff method involves two basic elements: conceptualization of a minimally competent candidate and the estimation, as assigned by SMEs, of the average score a minimally competent candidate would receive on each item. A minimally competent candidate is described as an individual who would be able to demonstrate just enough knowledge and skills to pass the examination. In general, such a candidate has enough interpreting skills to practice safely and competently, but does not demonstrate the skill level to be considered an expert.
SMEs provide ratings for each test item, estimating the score a minimally competent candidate would get on the item. Then they compare their ratings with empirical data collected during the pilot phase for each item and discuss their ratings as a group, with the goal of reaching as close a consensus as possible. The SMEs’ ratings are then averaged, and this “provisional cut score” is further reviewed and validated.
To establish an operational cut score, SMEs are also asked to make a specific prediction about the test as a whole. This prediction is then used to adjust the panel-recommended rating; this adjustment is known as the Beuk Relative-Absolute Compromise method.
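The averaging step described above can be sketched as follows. This is an illustration only, not CCHI's actual procedure: it assumes that each SME's per-item ratings are averaged across SMEs and that the item averages are then summed into a provisional raw cut score. The function name `provisional_cut_score` and the ratings are hypothetical.

```python
from statistics import mean


def provisional_cut_score(ratings):
    """Average SME ratings per item, then sum the item averages.

    ratings[s][i] is SME s's estimate of the score a minimally competent
    candidate would receive on item i.
    """
    n_items = len(ratings[0])
    if any(len(sme_ratings) != n_items for sme_ratings in ratings):
        raise ValueError("every SME must rate the same number of items")
    item_means = [
        mean(sme_ratings[i] for sme_ratings in ratings) for i in range(n_items)
    ]
    return sum(item_means)


# Three hypothetical SMEs rating three hypothetical items.
ratings = [
    [3, 2, 4],
    [2, 2, 3],
    [4, 2, 5],
]
print(provisional_cut_score(ratings))  # item means 3, 2, 4 -> 9
```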
For more information about the standard setting methods, see:
Explanation of Equating
Following best testing practices, CCHI administers several versions of the same exam (called test forms) to candidates. One of the reasons for this is to be fair both to candidates who take the exam for the first time and to those who retake it. Ideally, each candidate should receive a version of the exam that is new to them.
Different test forms may be of slightly different difficulty, because of the natural variations in the language of the test items (e.g., dialogs). And, again, it is important to be fair to all candidates regardless of which form they took. To achieve this fairness, the test forms undergo a procedure called equating.
Equating is a mathematical calculation that ensures that the test forms have their passing points at the same level of candidate performance, i.e., that the forms are “equal” and “fair.” Test forms are equated to the “standard.” The “standard” is the form that the SMEs used to establish the passing score, and all subsequently developed forms are equated to it. Let’s say the standard is Form 1, and Forms 2 and 3 are equated to Form 1. Forms 2 and 3 will have different raw passing points because of this equating, but they will then be scaled to represent the same passing score of 450 points. As a result of equating, a slightly easier form will require the candidate to earn more points on the test items (called “raw scores”) to pass the exam, and a slightly more difficult form will allow the candidate to pass with fewer points.
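As a rough illustration of how equating can shift the raw passing point between forms, here is a sketch of linear equating, a common textbook technique. The text does not say which equating method CCHI's psychometricians use, so this is an assumption, as are the function name `equated_cut`, the premise that both forms were taken by comparable groups of candidates, and all of the scores shown.

```python
from statistics import mean, pstdev


def equated_cut(standard_scores, new_scores, standard_cut):
    """Raw passing point on a new form equivalent to standard_cut (linear equating).

    Places the cut at the same number of standard deviations from the mean
    on the new form as it sits on the standard form.
    """
    z = (standard_cut - mean(standard_scores)) / pstdev(standard_scores)
    return mean(new_scores) + z * pstdev(new_scores)


# Hypothetical raw scores: Form 2 is slightly easier than Form 1 (the
# standard), so comparable candidates score 5 points higher on it.
form1 = [50, 60, 70, 80, 90]
form2 = [55, 65, 75, 85, 95]

# A raw passing point of 65 on Form 1 corresponds to about 70 on Form 2.
print(equated_cut(form1, form2, standard_cut=65))
```

In this made-up case the easier form ends up with a higher raw passing point, which matches the direction of adjustment described above; both raw passing points would then be reported as the same scaled score of 450.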
Equating calculations are done by psychometricians and then reviewed and approved by CCHI.
As an analogy, if a second-grade mathematics test included both addition and multiplication problems, you might expect the addition problems to be easier and the multiplication problems to be harder. Let’s say Class 1 has an exam with 75 addition questions and 25 multiplication questions, whereas Class 2 has an exam with 65 addition questions and 35 multiplication questions. Then, to be fair to both classes, the final grades on the two exams would have to be mathematically adjusted. Let’s say each addition question is worth 1 point, and each multiplication question is worth 4 points. Now imagine these 4 students:
To conclude, the percent scoring should be seen more as an indication that you did better in one domain than another on that particular test. You cannot compare between tests because they have a mix of test items with differing difficulties and, therefore, different weights for the final overall exam score.
When CCHI applies for accrediting and re-accrediting its exams, the equating procedures are submitted for review to the accrediting body. Accreditation is a form of final review and confirmation that the accredited exam meets all the requirements to be fair and reliable.
* The last response, consisting of the audio response in the candidate’s Language Other Than English, is not scored. It is collected by CCHI and will be used in a required continuing education activity for CoreCHI-P certificants. More information about this usage will be available once the first CoreCHI-P credentials are awarded.