Untuk mengukur sejauh mana kemampuan dan pemahaman peserta didik terhadap materi yang telah diberikan selama satu semester, pendidik memberikan sebuah tes kepada peserta didik di akhir semester. Hasil tes tersebut memberikan gambaran kepada pendidik tentang seberapa jauh tujuan pembelajaran telah tercapai. Sebuah tes dapat dikatakan baik atau layak apabila dapat dipertanggungjawabkan kesahihan, ketafsiran, kelayakan, kebergunaan, dan keterpercayaannya, serta efektivitas butir soalnya, yang meliputi tingkat kesulitan dan daya beda yang baik, karena tidak semua butir soal sesuai dengan kriteria tersebut. Rumusan masalah dalam penelitian ini adalah bagaimanakah tingkat kesulitan butir soal, daya beda, dan efektivitas distraktor jawaban dari soal Ujian Akhir Semester (UAS) Ganjil bahasa Jerman kelas X SMAN 12 Surabaya TP 2019/2020. Penelitian yang saya lakukan di SMAN 12 Surabaya ini merupakan penelitian kuantitatif yang menganalisis tingkat kesulitan, daya beda, dan efektivitas distraktor dengan menggunakan rumus IF (Indeks Facility) dan ID (Indeks Discrimination). Dari hasil penelitian diketahui bahwa dari 50 butir soal pilihan ganda UAS Ganjil bahasa Jerman kelas X SMAN 12 Surabaya, 21 butir soal memenuhi syarat indeks tingkat kesulitan (ITK) dan 13 butir soal memenuhi syarat indeks daya beda (IDB). Hasil akhir perhitungan kedua indeks tersebut menunjukkan bahwa hanya 12 butir soal yang dapat dinyatakan layak. Analisis tingkat kesulitan membagi butir soal ke dalam tiga golongan, yaitu mudah, sedang, dan sulit: 3 soal tergolong sulit, 7 soal tergolong sedang, dan 40 soal tergolong mudah. Namun, dari 40 soal yang tergolong mudah, 29 soal tergolong terlalu mudah sehingga dinyatakan tidak berfungsi.
Hasil analisis indeks daya beda menunjukkan 37 butir soal tidak layak. Dari soal yang tidak layak tersebut, 13 butir soal memiliki indeks daya beda negatif, yaitu kelompok rendah lebih banyak menjawab benar daripada kelompok tinggi, sehingga 13 soal tersebut tidak dapat direvisi dan sebaiknya dibuang serta diganti dengan soal yang layak. Sisa soal yang dinyatakan tidak layak dapat diperbaiki atau direvisi. Hasil analisis distraktor terhadap 250 opsi jawaban dan pengecoh menunjukkan 133 opsi berfungsi baik dan 117 opsi tidak berfungsi baik.
Kata kunci: analisis butir soal, tingkat kesulitan, daya beda, distraktor.
To measure how well students have mastered and understood the material taught over one semester, educators administer a test at the end of the semester; its results indicate how far the learning objectives have been achieved. A test can be considered good or fit for use if its validity, interpretability, appropriateness, usefulness, and reliability can be accounted for, along with the effectiveness of its items, which includes an appropriate level of difficulty and discriminating power; not every item meets these criteria. The research question of this study is: what are the item difficulty level, discriminating power, and distractor effectiveness of the grade X German odd-semester final examination (Ujian Akhir Semester, UAS) at SMAN 12 Surabaya in the 2019/2020 academic year? This study, which I conducted at SMAN 12 Surabaya, is quantitative research that analyzes the difficulty level, discriminating power, and distractor effectiveness using the IF (Index of Facility) and ID (Index of Discrimination) formulas. The results show that, of the 50 multiple-choice items in the grade X German odd-semester final examination at SMAN 12 Surabaya, 21 items meet the difficulty index (ITK) requirement and 13 items meet the discrimination index (IDB) requirement. Taking both indices together, only 12 items can be declared fit for use. The difficulty analysis sorted the items into three categories, namely easy, medium, and difficult: 3 items are difficult, 7 are medium, and 40 are easy. Of the 40 easy items, however, 29 are too easy and are therefore considered non-functioning.
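As an illustration of the two indices named above, the following is a minimal sketch, not the study's actual procedure. It assumes the common definitions: IF as the proportion of test takers who answer an item correctly, and ID as the difference in correct answers between a high-scoring and a low-scoring group, divided by the group size. The function names, the 27% group split, and the sample scores are all hypothetical; the abstract does not state how the groups were formed or which thresholds were applied.

```python
from typing import List


def facility_index(item_scores: List[int]) -> float:
    """IF: share of test takers who answered the item correctly.

    item_scores holds 1 (correct) or 0 (incorrect) per test taker.
    """
    return sum(item_scores) / len(item_scores)


def discrimination_index(
    item_scores: List[int],
    total_scores: List[float],
    group_fraction: float = 0.27,  # assumption: a common convention, not from the study
) -> float:
    """ID: (correct in high group - correct in low group) / group size.

    Test takers are ranked by total test score; the top and bottom
    group_fraction of them form the high and low groups.
    """
    ranked = sorted(
        range(len(total_scores)), key=lambda i: total_scores[i], reverse=True
    )
    n = max(1, round(len(ranked) * group_fraction))
    high, low = ranked[:n], ranked[-n:]
    correct_high = sum(item_scores[i] for i in high)
    correct_low = sum(item_scores[i] for i in low)
    return (correct_high - correct_low) / n


# Made-up data for ten test takers (not taken from the study).
totals = [48, 45, 44, 40, 38, 35, 30, 28, 25, 20]
good_item = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]  # mostly strong students answer correctly
bad_item = [0, 0, 0, 1, 1, 0, 1, 1, 1, 1]   # mostly weak students answer correctly

print(facility_index(good_item), discrimination_index(good_item, totals))
print(facility_index(bad_item), discrimination_index(bad_item, totals))
```

In this sketch the second item yields a negative ID, which corresponds to the case described in the abstract where the low group answers correctly more often than the high group.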
For the discrimination index, 37 items are unfit. Of these, 13 items have a negative discrimination index, meaning that more test takers in the low group than in the high group answered correctly; these 13 items cannot be revised and should be discarded and replaced with suitable items. The remaining unfit items can be revised so that they become fit for use. The distractor analysis covered the 250 answer options (keys and distractors) of the 50 items: 133 options functioned well, and the remaining 117 did not, because no test taker chose them. The options that did not function well can be revised.
Keywords: item analysis, level of difficulty, discriminating power, distractor
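The distractor check described above can be sketched as a simple tally of how often each option was chosen. This is only an illustration under stated assumptions: the criterion used here (an option fails if no test taker chose it) follows the wording of this abstract, the five option labels A to E are inferred from 250 options across 50 items, and the function names and sample responses are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def option_counts(responses: List[str], options: str = "ABCDE") -> Dict[str, int]:
    """Count how many test takers chose each option of a single item."""
    counts = Counter(responses)
    return {opt: counts.get(opt, 0) for opt in options}


def unchosen_options(responses: List[str], options: str = "ABCDE") -> List[str]:
    """Return the options that no test taker selected.

    Assumption: an option counts as non-functioning when nobody chose it.
    """
    return [opt for opt, n in option_counts(responses, options).items() if n == 0]


# Made-up responses of ten test takers to one item (not taken from the study).
responses = list("AABACADAAB")

print(option_counts(responses))
print(unchosen_options(responses))
```

Other criteria exist, for example requiring that each distractor attract a minimum share of test takers; the threshold would need to be adjusted to match whatever rule the analysis actually applies.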