According to the Oxford dictionary, to evaluate means to "decide on the value or quality" of something. A metric can be defined as a method for assessing translation quality. Translation evaluation has traditionally been based on error detection (Conde, 2011). Evaluation is also an effort to measure the value or quality of an activity, program, or project by comparing its purpose with its process.
There are two kinds of evaluation of the output quality of machine translation: automatic and manual. Automatic machine translation evaluation is a means of scoring the output of a machine translation system with respect to a small corpus of reference translations. Examples of automatic evaluation methods are BLEU, NEVA, WAFT, Word Accuracy, and Meteor.
The other kind is manual evaluation, of which the SAE J2450 standard is an example. SAE J2450, issued by SAE in 2001, aims to evaluate the output of both human and machine translation.
The procedure of the SAE J2450 metric, as written in SAE's publication, consists of five actions, summarized as follows:
a) Mark the location of the error in the target text with a circle.
b) Indicate the primary category of the error.
c) Indicate the sub-classification of the error as either "serious" or "minor".
d) Look up the numeric value of the error.
e) Compute the normalized score. (SAE, 2001)
Table 2.1 Error Categories, Classifications, and Weights (SAE, 2001)
Error Category                            Serious   Minor
Wrong Term (WT)                              5        2
Syntactic Error (SE)                         4        2
Omission (OM)                                4        2
Word Structure or Agreement Error (SA)       4        2
Misspelling (SP)                             3        1
Punctuation Error (PE)                       2        1
Miscellaneous Error (ME)                     3        1
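Once each error has been located, categorized, and classified, the normalized score is the sum of the corresponding weights divided by the size of the text, assuming here normalization by the source word count; a lower score therefore indicates a better translation. The following minimal Python sketch illustrates this computation using the weights in Table 2.1; the function and variable names are illustrative only and are not defined by the standard.

# Sketch of SAE J2450 scoring; names are illustrative, not part of the standard.
WEIGHTS = {
    "WT": {"serious": 5, "minor": 2},  # Wrong Term
    "SE": {"serious": 4, "minor": 2},  # Syntactic Error
    "OM": {"serious": 4, "minor": 2},  # Omission
    "SA": {"serious": 4, "minor": 2},  # Word Structure or Agreement Error
    "SP": {"serious": 3, "minor": 1},  # Misspelling
    "PE": {"serious": 2, "minor": 1},  # Punctuation Error
    "ME": {"serious": 3, "minor": 1},  # Miscellaneous Error
}

def normalized_score(errors, source_word_count):
    # Sum the weight of each (category, classification) error and
    # normalize by the number of words in the source text.
    total = sum(WEIGHTS[cat][cls] for cat, cls in errors)
    return total / source_word_count

# Example: one serious Wrong Term and one minor Omission in a
# 50-word source text -> (5 + 2) / 50 = 0.14
print(normalized_score([("WT", "serious"), ("OM", "minor")], 50))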
The categories selected are explained as follows:
a) Wrong Term (WT)
"Term" here refers to a single word, a multi-word phrase, an abbreviation, an acronym, a number, or a proper name.
b) Syntactic Error (SE)
SE covers errors related to grammar and structure, whether at the sentence or the phrase level.
c) Omission (OM)
OM counts words that are omitted from the target text.
d) Word Structure or Agreement Error (SA)
SA refers to mistakes in the morphological form of a word, including case, gender, suffixes, prefixes, infixes, and other inflections.
e) Misspelling (SP)
SP includes misspellings and violations of the target language's writing conventions. For example, the Swedish term for "health sciences" is "vårdvetenskap": even though vård means health and vetenskap means sciences, the two words must be written as a single compound.
f) Punctuation Error (PE)
PE counts errors against the punctuation rules of the target language.
g) Miscellaneous Error (ME)
ME covers errors that do not quite fit into the other categories. Examples include literal translations of idioms, culturally offensive words, and extra words with no meaning related to the text.
Two important meta-rules must be considered when applying this metric: when the category of an error is ambiguous, always choose the earliest category listed, and when in doubt about the sub-classification, always choose serious over minor.
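To make these meta-rules concrete, the short Python sketch below (again with illustrative, non-standard names) resolves an error that could fall into several categories by taking the earliest category in the order of Table 2.1, and defaults an uncertain sub-classification to serious.

# Sketch of the J2450 meta-rules; names are illustrative.
CATEGORY_ORDER = ["WT", "SE", "OM", "SA", "SP", "PE", "ME"]

def resolve(candidate_categories, classification=None):
    # Choose the earliest applicable category (Table 2.1 order);
    # if the sub-classification is in doubt, choose serious over minor.
    category = min(candidate_categories, key=CATEGORY_ORDER.index)
    return category, classification or "serious"

# An error that could be a Wrong Term or a Misspelling, severity unclear:
print(resolve({"SP", "WT"}))  # -> ('WT', 'serious')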
References:
Conde, Tomás. (2011). Translation evaluation on the surface of texts: a preliminary analysis. The Journal of Specialised Translation, 15: 69–86.
SAE. (2001). Surface Vehicle Recommended Practice (J2450). Accessed August 1, 2016, from APEX: http://www.apex-translations.com/documents/sae_j2450.pdf