Meta AI has released a new version of its code generation model, Code Llama 70B. The new model, one of the largest open-source AI models for code generation, is a significant improvement over its predecessor, offering faster and more accurate results.
Code Llama 70B has been trained on 500 billion tokens of code and code-related data, and has a large context window of 100,000 tokens, allowing it to process and generate longer and more complex code in a variety of languages, including C++, Python, PHP, and Java.
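For developers who want to try the model, the weights can be loaded with standard open-source tooling. The sketch below uses the Hugging Face transformers library and assumes the checkpoint is published under the codellama/CodeLlama-70b-hf identifier (the exact repository name and access requirements should be verified against Meta's release page); a 70-billion-parameter model also needs substantial GPU memory or quantization in practice.

```python
# A minimal sketch, assuming the weights are published on Hugging Face under
# "codellama/CodeLlama-70b-hf" (check Meta's release page for exact names and
# access requirements).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "codellama/CodeLlama-70b-hf"  # assumed repository name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Ask the model to complete a Python function from a short code prefix.
prompt = 'def fibonacci(n):\n    """Return the n-th Fibonacci number."""\n'
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```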
Based on Llama 2, one of the world's largest general-purpose large language models (LLMs), Code Llama 70B has been fine-tuned for code generation and relies on the transformer's self-attention mechanism to better capture relationships and dependencies within code.
Another highlight of the new model is CodeLlama-70B-Instruct, a variant optimized to understand natural language instructions and generate code accordingly.
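Unlike the base model, the instruct variant takes plain-English requests rather than raw code prefixes. Below is a minimal sketch of that difference, loading the model the same way as above but assuming the instruction-tuned checkpoint is published as codellama/CodeLlama-70b-Instruct-hf and ships with a chat template (both are assumptions to verify against the release).

```python
# Same loading pattern as above, pointing at the assumed instruct checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

instruct_id = "codellama/CodeLlama-70b-Instruct-hf"  # assumed repository name

tokenizer = AutoTokenizer.from_pretrained(instruct_id)
model = AutoModelForCausalLM.from_pretrained(instruct_id, device_map="auto")

# A natural-language instruction instead of a code prefix to complete.
messages = [{
    "role": "user",
    "content": "Write a Python function that checks whether a string is a palindrome.",
}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```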
Meta CEO Mark Zuckerberg said: “The ability to code has also proven important for AI models to process information in other domains more rigorously and logically. I am proud of the progress here and look forward to including these advancements in Llama 3 and future models.”
Code Llama 70B is available for free download under the same license as Llama 2 and previous Code Llama models, allowing both researchers and commercial users to use and modify it.
Uphill battle

Despite the improvements, Meta faces the tough challenge of winning over developers who currently use GitHub Copilot, the leading AI coding assistant built by GitHub and OpenAI. Many developers are also wary of Meta and its data collection practices, and many are not fans of AI-generated code in the first place: it can require serious debugging and leaves non-programmers shipping code they are happy to use but do not understand, leading to problems down the road.