Analyzing and Mitigating Dataset Artifacts in Natural Language Inference Models Using ELECTRA

Himanshu Joshi

doi:10.54660/IJMRGE.2024.5.6.1279-1286

Author(s): Himanshu Joshi

Published: 2024

Volume: 5 | Issue: 6 | Pages: 1279-1286

Subject: Engineering

Country: United States

DOI: https://doi.org/10.54660/IJMRGE.2024.5.6.1279-1286

License: CC BY 4.0

Abstract

This paper investigates the challenges that dataset artifacts pose for Natural Language Inference (NLI) models, focusing on ELECTRA, a state-of-the-art transformer model. Artifacts such as hypothesis-only biases, lexical overlap issues, and frequent label imbalances impair model generalization and lead to erroneous predictions. We propose and evaluate a range of mitigation strategies, including adversarial training, data augmentation, instance weighting, and artifact-aware regularization. Experimental results demonstrate up to a 6% improvement in robustness and generalization, providing insights for building artifact-resistant NLP models.
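The abstract names instance weighting as one mitigation strategy without giving details. As a rough illustration only, and not the article's actual implementation, the sketch below shows one common form of the idea: examples that a hypothesis-only model already classifies confidently are down-weighted in the training loss. The function names, the weighting rule (one minus the bias model's probability for the gold label), and the toy probabilities are all assumptions made for this example.

```python
import math

def instance_weights(bias_probs, gold_labels):
    """Assumed weighting rule: weight = 1 - p_bias(gold label).

    bias_probs: list of dicts mapping label -> probability, produced by a
                hypothetical hypothesis-only classifier.
    gold_labels: list of gold label strings, one per example.
    """
    return [1.0 - probs[gold] for probs, gold in zip(bias_probs, gold_labels)]

def weighted_cross_entropy(model_probs, gold_labels, weights):
    """Weighted mean negative log-likelihood of the gold labels."""
    total = sum(weights)
    if total == 0:
        return 0.0
    loss = sum(
        -w * math.log(probs[gold])
        for probs, gold, w in zip(model_probs, gold_labels, weights)
    )
    return loss / total

# Toy data: the first example is easy for the hypothesis-only model,
# so it contributes little to the weighted loss.
bias_probs = [
    {"entailment": 0.9, "neutral": 0.05, "contradiction": 0.05},
    {"entailment": 0.2, "neutral": 0.3, "contradiction": 0.5},
]
model_probs = [
    {"entailment": 0.5, "neutral": 0.25, "contradiction": 0.25},
    {"entailment": 0.25, "neutral": 0.5, "contradiction": 0.25},
]
gold_labels = ["entailment", "neutral"]

weights = instance_weights(bias_probs, gold_labels)
loss = weighted_cross_entropy(model_probs, gold_labels, weights)
```

In a real training loop the same weights would typically multiply a per-example loss computed by the framework in use; the plain-Python version here only makes the arithmetic explicit.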

How to Cite This Article

Himanshu Joshi (2024). Analyzing and Mitigating Dataset Artifacts in Natural Language Inference Models Using ELECTRA. International Journal of Multidisciplinary Research and Growth Evaluation (IJMRGE), 5(6), 1279-1286. DOI: https://doi.org/10.54660/IJMRGE.2024.5.6.1279-1286

Publication Information

Journal: International Journal of Multidisciplinary Research and Growth Evaluation (IJMRGE)

Publisher: Anfo Publication House

ISSN: 2582-7138 (Online)

Frequency: Bimonthly

Language: English

Open Access: Yes - This article is distributed under the terms of the Creative Commons Attribution 4.0 International License
