Beyond Autocomplete: A Comparative Analysis of Code Generation Quality Across LLM-Based Assistants

Asif Bhat; Munleef Bhat; Nusrat Shah; Roma Fayaz

doi:10.54660/.IJMRGE.2026.7.3.970-976

Author(s): Asif Bhat, Munleef Bhat, Nusrat Shah, Roma Fayaz

Published: 2026

Volume: 7 | Issue: 3 | Pages: 970-976

Country: Saudi Arabia

DOI: https://doi.org/10.54660/.IJMRGE.2026.7.3.970-976

License: CC BY 4.0

Abstract

Large Language Models (LLMs) have transformed software development through AI-powered code generation, yet systematic comparisons of their capabilities remain limited. We present a comprehensive empirical evaluation of six leading LLM-based coding assistants (GPT-4, Claude 3.5 Sonnet, Gemini 1.5 Pro, CodeLlama-70B, DeepSeek Coder, and Mistral Large) across 1,847 code generation tasks spanning five programming languages and eight complexity tiers. Our evaluation framework assesses functional correctness (pass@k), code quality (maintainability, security), computational efficiency, and prompt robustness. Key findings reveal: (1) Claude 3.5 Sonnet achieves the highest overall pass@1 rate (84.7%), but GPT-4 excels in complex algorithmic tasks; (2) all models exhibit significant performance degradation (18–34%) on adversarial prompt variations; (3) security vulnerability rates range from 3.2% (Claude) to 11.8% (CodeLlama); and (4) open-source models achieve 73–81% of proprietary model performance at substantially lower cost. We release our benchmark suite, CodeEval-1847, comprising novel problems to prevent data contamination. Our findings provide actionable guidance for practitioners selecting AI coding tools and highlight critical areas for model improvement.
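The abstract reports functional correctness as pass@k but does not say how the metric was computed. As a minimal sketch, assuming the widely used unbiased estimator (for a task with n generated samples of which c pass the tests, pass@k = 1 - C(n-c, k) / C(n, k)), the calculation could look like the following. The function names `pass_at_k` and `mean_pass_at_k` are hypothetical, chosen for illustration, and are not taken from the article or from CodeEval-1847.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one task from n samples, c of which passed.

    Assumption: this is the standard unbiased estimator,
    1 - C(n - c, k) / C(n, k); the article may use a different procedure.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # With fewer than k failing samples, every size-k draw contains a pass.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average the per-task estimate over (n, c) pairs, one pair per task."""
    if not results:
        raise ValueError("results must contain at least one task")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

For k = 1 the estimator reduces to the fraction of samples that pass, so a headline pass@1 figure can be read as the average per-task success rate of a single generation.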

How to Cite This Article

Asif Bhat, Munleef Bhat, Nusrat Shah, Roma Fayaz (2026). Beyond Autocomplete: A Comparative Analysis of Code Generation Quality Across LLM-Based Assistants. International Journal of Multidisciplinary Research and Growth Evaluation (IJMRGE), 7(3), 970-976. DOI: https://doi.org/10.54660/.IJMRGE.2026.7.3.970-976

Publication Information

Journal: International Journal of Multidisciplinary Research and Growth Evaluation (IJMRGE)

Publisher: Anfo Publication House

ISSN: 2582-7138 (Online)

Frequency: Bimonthly

Language: English

Open Access: Yes - This article is distributed under the terms of the Creative Commons Attribution 4.0 International License
