Generative AI has enormous potential to revolutionize business, create new opportunities and make employees more efficient in the way they work. According to McKinsey, more than a quarter of company leaders say generative AI is an agenda item at the board level, while 79 percent of respondents have already used generative AI.
These technologies are already disrupting the software industry: IDC found that 40 percent of IT executives think generative AI "will allow us to create much more innovative software," while GBK Collective estimates that 78 percent of companies expect to use AI for software development within the next three to five years. Around half of video game companies already use generative AI in their work processes, according to research from the Game Developers Conference.
All of these signs show that the use of generative AI is increasing. However, the number of developers with the right skills to build generative AI-powered applications is limited. For companies that want to build and operate their own generative AI services, rather than consuming a service from a provider, integration will be essential to make more effective use of company data.
Head of Developer Relations at DataStax.
Where are the gaps?
So what are the challenges around generative AI? The first is how to prepare data for generative AI systems. The second is how to integrate these systems and develop software around generative AI capabilities.
For many companies, generative AI is inextricably linked to large language models (LLMs) and services like ChatGPT. These tools take text input, translate it into a semantic query that the service can understand, and then provide responses based on their training data. For simple queries, a ChatGPT response may be adequate. But for companies, this level of general knowledge is not enough.
To solve this problem, techniques such as Retrieval Augmented Generation (RAG) are needed. RAG covers how companies can take their data, make it available for queries, and then deliver that information to the LLM for inclusion in its responses. This data can exist in multiple formats, from company knowledge bases or product catalogs to text in PDF files or other documents. Data must be collected and converted into vectors: numerical representations that preserve the information and its semantic relationships.
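To make the idea of "text as vectors" concrete, here is a deliberately simplified sketch. Real systems use a trained embedding model (such as a hosted embedding API or an open-source sentence encoder); the tiny term-frequency vectorizer below, with its made-up five-word vocabulary, only illustrates the principle of mapping text to a fixed-length numeric representation that can later be compared.

```python
import math
from collections import Counter

# Toy vocabulary -- purely illustrative; a real embedding model learns a
# dense representation over its whole training corpus, not a word list.
VOCAB = ["price", "product", "return", "shipping", "warranty"]

def embed(text: str) -> list[float]:
    """Map text to a fixed-length numeric vector (toy term frequencies)."""
    counts = Counter(text.lower().split())
    vec = [float(counts[w]) for w in VOCAB]
    # L2-normalise so that a dot product between vectors equals cosine similarity.
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

vector = embed("What is your shipping price?")
```

Every input, however long, becomes a vector of the same length, which is what allows chunks and queries to be compared numerically later on.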
This relies on a process called chunking: dividing text into discrete units that can then be represented as vectors. There are several possible approaches here, from looking at individual words to sentences or paragraphs. The smaller your chunks, the more vectors you generate, with the extra storage and cost that implies; conversely, the larger each chunk, the less precisely a search can pinpoint the relevant data. Chunking is still a very new area and best practices are still developing, so you may need to experiment with your approach to get the best results.
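One common starting point for experimentation is a fixed-size window with overlap, so that context cut at a chunk boundary is partially repeated in the next chunk. The sketch below uses character windows and illustrative sizes; production splitters often work on sentences or tokens instead.

```python
def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into overlapping character windows.

    The overlap repeats the tail of each chunk at the head of the next,
    which helps preserve meaning that a hard cut would destroy.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "Retrieval Augmented Generation lets an LLM answer with your own data."
chunks = chunk_text(doc, chunk_size=30, overlap=5)
```

Tuning `chunk_size` and `overlap` is exactly the experimentation the trade-off above describes: smaller windows mean more vectors to store and search, larger ones dilute what each vector represents.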
Once your data is chunked and converted into vectors, you'll need to make it available as part of your generative AI system. When a user request arrives, it is converted into a vector that can then be used to perform a search on your data. By comparing the user's search request with your company's vector data, you can find the best semantic matches. These matches can then be shared with your LLM and used to provide context when the LLM creates the response for the user.
RAG has two main benefits: first, it allows you to provide information to your LLM service for processing without adding that data to the model itself, where it could surface in other responses. This means you can use generative AI with sensitive data, as RAG allows you to maintain control over how that data is used. Second, you can also provide more time-sensitive data in your responses: you can keep updating the data in your vector database to keep it as fresh as possible, and then deliver it whenever a relevant request arrives.
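The reason the data stays under your control is that retrieved chunks are only ever inserted into the prompt for a single request; they never change the model's weights. A minimal sketch of that prompt-assembly step (the wording of the template is an assumption, not a standard):

```python
def build_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Assemble retrieved context and the user's question into one prompt.

    The context travels inside this one request only -- it is never
    added to the model itself, which is what keeps sensitive or
    frequently-updated data under the company's control.
    """
    context = "\n".join(f"- {chunk}" for chunk in retrieved_chunks)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
    )

prompt = build_prompt(
    "How long do I have to return an item?",
    ["Returns are accepted within 30 days of delivery."],
)
```

Refreshing the vector database therefore refreshes what the LLM can say, without any retraining.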
Implementing RAG is a potential challenge as it relies on multiple systems that are currently very new and developing rapidly. The number of developers who are familiar with all the technology involved (chunking, vector embeddings, LLMs, and the like) is still relatively small, and there is a lot of demand for those skills. Therefore, making it easier for more developers to work with RAG and generative AI will help everyone.
This is where challenges can arise for developers. Generative AI is most associated with Python, the programming language data scientists typically use when creating data pipelines. However, Python is only third on the list of most popular languages according to Stack Overflow research for 2023. Expanding support to other languages such as JavaScript (the most popular programming language) will allow more developers to participate in creating generative AI applications or integrating them with other systems.
Abstracting AI with APIs
One approach that can make this process easier is to support the APIs that developers want to work with. By providing APIs in the languages developers already use most, companies can help them get to grips with generative AI more quickly and efficiently.
This also helps solve another of the biggest problems for developers around generative AI: how to get all the constituent parts to work together effectively. Generative AI applications will cover a wide range of use cases, from extending current customer service bots or search functions to more autonomous agents that can take over entire work processes or customer requests. Each of these use cases will involve multiple components working together to fulfill a request.
This integration work will be a significant overhead if it cannot be abstracted using APIs. Every connection between system components must be managed, updated, and modified as more functionality is requested or new elements are added to the AI application. By using standardized APIs, this work becomes easier for developers to manage over time. It also opens up generative AI to more developers, as they can work with components through APIs as services rather than having to create and run their own instances for vector data, data integration, or chunking. Developers can also choose the LLM they want to work with and switch if they find a better alternative, rather than being tied to a specific LLM.
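The "switch LLMs without being tied down" point is, in practice, a matter of programming against an interface rather than a specific provider's client library. The sketch below uses invented class and method names to illustrate the pattern; it is not any real library's API.

```python
from abc import ABC, abstractmethod

class LLMClient(ABC):
    """Minimal interface the rest of the application depends on."""

    @abstractmethod
    def complete(self, prompt: str) -> str:
        """Return the model's completion for the given prompt."""

class EchoLLM(LLMClient):
    """Stand-in provider for testing; a real implementation would call a
    hosted model's API behind the same `complete` method."""

    def complete(self, prompt: str) -> str:
        return f"[echo] {prompt}"

def answer(question: str, llm: LLMClient) -> str:
    # Application code sees only the interface, so swapping providers is
    # a one-line change wherever the client is constructed.
    return llm.complete(question)

result = answer("What is RAG?", EchoLLM())
```

Swapping in a different provider then means writing one new `LLMClient` subclass, while chunking, retrieval, and prompt-assembly code stay untouched.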
This also makes it easier to integrate generative AI systems into front-end development frameworks like React and Vercel. Empowering developers to implement generative AI in their apps and websites combines front-end design and delivery with back-end infrastructure, so simplifying the stack will be essential to getting more developers on board. The full stack of retrieval augmented generation technologies, or RAGStack, will need to be made simpler if companies are going to use generative AI in their businesses.
This article was produced as part of TechRadar Pro's Expert Insights channel, where we feature the best and brightest minds in today's tech industry. The views expressed here are those of the author and are not necessarily those of TechRadar Pro or Future plc.