Abstract
Evaluating Artificial Intelligence (AI) and data science models is crucial to ensure their reliability, fairness, and applicability in real-world scenarios. This paper highlights best practices for model evaluation, emphasizing the importance of selecting appropriate metrics aligned with business or research goals. Key considerations include using robust validation strategies (e.g., cross-validation), monitoring for overfitting, and ensuring data splits preserve class distributions. Fairness, interpretability, and reproducibility are essential, particularly in high-stakes domains like healthcare or finance. Additionally, evaluating models across multiple datasets or demographic subgroups helps uncover biases and improve generalizability. Adopting standardized reporting practices and open-source benchmarks further strengthens the evaluation process. By adhering to these practices, practitioners can build more trustworthy and effective AI systems.
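Two of the practices named in the abstract, data splits that preserve class distributions and evaluation across subgroups, can be illustrated with a minimal sketch. It is not taken from the paper: the helper names `stratified_folds` and `accuracy_by_group` and all toy data are hypothetical, the split does no shuffling, and only the Python standard library is assumed.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Assign each sample index to one of k folds, class by class.

    Dealing the indices of every class round-robin keeps each fold's
    class proportions close to those of the full dataset.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def accuracy_by_group(y_true, y_pred, groups):
    """Return accuracy computed separately for each subgroup."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


# Hypothetical toy data: 8 negatives and 4 positives (a 2:1 imbalance).
labels = [0] * 8 + [1] * 4
folds = stratified_folds(labels, k=4)
fold_positive_counts = [sum(labels[i] for i in fold) for fold in folds]

# Hypothetical predictions and subgroup tags, for illustration only.
y_true = [1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1]
groups = ["a", "a", "a", "b", "b", "b"]
per_group = accuracy_by_group(y_true, y_pred, groups)
```

In this toy case every fold receives exactly one positive sample, and the per-group breakdown exposes a gap between subgroups "a" and "b" that a single overall accuracy figure would hide. In practice an established library routine with shuffling would likely be preferable to this hand-rolled split.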
| Original language | English |
|---|---|
| Title | INFORMATIK 2025 : The Wide Open - Offenheit von Source bis Science, 16.-19.September 2025 Potsdam |
| Editors | Ulrike Lucke, Stefan Stieglitz, Falk Uebernickel, Anna-Lena Lamprecht, Maike Klein |
| Number of pages | 9 |
| Place of publication | Bonn |
| Publisher | Gesellschaft für Informatik e.V. |
| Publication date | 2025 |
| Pages | 1211-1219 |
| DOIs | |
| Publication status | Published - 2025 |
Bibliographical note
Publisher Copyright: © 2025 Gesellschaft für Informatik (GI). All rights reserved.
ASJC Scopus subject areas
- Computer Science Applications
Fingerprint
Explore the research topics of "Best Practices in AI and Data Science Models Evaluation". Together they form a unique fingerprint.