Emerging code-gen LLMs
Generation Details
Date | LLM | Access Method |
---|---|---|
Nov 2024 | | Web UI |
As of my knowledge cutoff in 2024, several emerging LLMs for programming were gaining attention in the AI and developer communities:
Code Llama
Code Llama, developed by Meta, was one of the most prominent open-source LLMs specifically designed for coding tasks[1][3]. Key features include:
- Multiple sizes: 7B, 13B, 34B, and 70B parameters
- Trained on 500B tokens of code and code-related data (1T tokens for the 70B version)
- Support for many popular programming languages
- Fill-in-the-middle (FIM) capability for code completion
- Specialized Python version (Code Llama - Python)
- Instruction-tuned version (Code Llama - Instruct)
Code Llama showed strong performance on coding benchmarks, outperforming many other open-source models.
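The fill-in-the-middle capability listed above works by prompting with sentinel tokens that mark the code before and after the cursor, with the model generating the missing middle. A minimal sketch of how such a prompt might be assembled, using the `<PRE>`/`<SUF>`/`<MID>` token layout described for Code Llama's infilling mode (the helper function is illustrative, not part of any library):

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a fill-in-the-middle (infilling) prompt.

    The code before the cursor goes after <PRE>, the code after the
    cursor goes after <SUF>, and the model is asked to generate the
    missing middle after <MID>.
    """
    return f"<PRE> {prefix} <SUF>{suffix} <MID>"

# Example: ask the model to fill in the body of a function.
prefix = "def add(a, b):\n    "
suffix = "\n    return result"
prompt = build_fim_prompt(prefix, suffix)
print(prompt)
```

An editor plugin would send this prompt to the model and splice the generated middle between the prefix and suffix, which is what makes FIM useful for in-place code completion rather than left-to-right generation only.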
DeepSeek Coder
DeepSeek Coder, released by DeepSeek AI in November 2023, was a newer entrant in the coding LLM space[1]. It was trained on 2 trillion tokens across more than 80 programming languages and reportedly scored high on various coding-related benchmarks.
Mistral
Mistral, a 7B parameter model released by Mistral AI in late September 2023, was gaining attention for its coding capabilities despite its relatively small size[1]. It was reported to approach Code Llama 7B performance on code tasks while remaining proficient in general language tasks.
CodeQwen1.5
Released in April 2024, CodeQwen1.5 was Alibaba's specialized coding version of their Qwen1.5 LLM[2]. Notable features include:
- 7B parameter size
- Trained on 3 trillion tokens of code-related data
- Support for 92 programming languages
- Competitive performance against larger models like GPT-3.5 and GPT-4 on some benchmarks
- Available in base and chat versions
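Benchmark comparisons like the one above are typically reported as pass@k: the probability that at least one of k sampled completions for a problem passes its unit tests. A minimal sketch of the standard unbiased estimator commonly used for this metric (the function name is mine):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator.

    n: total completions sampled for a problem
    c: how many of those completions passed the tests
    k: sampling budget being evaluated
    Returns 1 - C(n-c, k) / C(n, k): the chance that a random
    size-k subset of the n samples contains at least one pass.
    """
    if n - c < k:
        return 1.0  # every size-k subset must include a passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 200 samples per problem, 50 correct, evaluated at k = 1.
print(round(pass_at_k(200, 50, 1), 3))  # 0.25
```

Sampling n much larger than k and estimating this way gives a lower-variance score than literally drawing k samples, which is why reported leaderboard numbers for models like CodeQwen1.5 are usually computed from large sample pools.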
Llama 3
While not specifically designed for coding, Meta's Llama 3, released in April 2024, showed impressive coding capabilities[2]. It outperformed Code Llama (which was based on Llama 2) in code generation, interpretation, and understanding, despite not being specifically trained for coding tasks.
Emerging Trends
The field was rapidly evolving, with new models and improvements constantly emerging. Open-source models were becoming increasingly competitive with proprietary ones, and there was a trend towards creating specialized versions of general-purpose LLMs for coding tasks. The ability to run these models locally or with minimal setup was also becoming more important for developers.
It's worth noting that the landscape of LLMs for programming is likely to have evolved further since my last update, with new models and improvements potentially emerging.
The above text was generated by a large language model (LLM) and its accuracy has not been validated. This page is part of 'LLMs-on-LLMs,' a Github repository by Daniel Rosehill which explores how curious humans can use LLMs to better their understanding of LLMs and AI. However, the information should not be regarded as authoritative and, given the fast pace of evolution in LLM technology, will eventually become outdated. This footer was added at 16-Nov-2024.