The competition to access data for LLM generative AIs is already escalating

At the start of this year, PAC posted a detailed blog on separating the hype from the reality of ChatGPT. The global popularisation of large language model (LLM) generative AIs like ChatGPT has led to the technology industry equivalent of an “arms race” to see who can develop these types of solutions into a profitable and scalable business channel the fastest, whilst limiting any reputational brand damage that can come from their unique AI responses. A lot has happened in the past two months, with a range of announcements from OpenAI, Microsoft, Google, and many other technology companies integrating with LLM AI services. Notably, the prior PAC blog on this topic, written before these recent developments, ended with the following statement:

“Over the coming years, PAC believes some governments and organisations will become wary of what and how much information is publicly available to ChatGPT, and similar technologies, at no cost and no advantage to the source that created the data. This could drive both islands of private data and forms of competition that ultimately limit how sophisticated a technology like ChatGPT can become.”

Whilst PAC still believes this statement will hold over the coming years, Microsoft appears to have reached a similar view and has reportedly changed its stance regarding access to Bing search data. On March 22, 2023, Bloomberg reported that Microsoft had told two unnamed companies, which use Bing to power their search engines, that it would restrict them from accessing Bing’s search data altogether if they continued to use it with their AI products and services. At the time of writing, neither company has been named, nor has Microsoft made any formal statement to confirm or deny the report. However, PAC considers this, if true, a fascinating development in light of the prior blog’s statement quoted above. As the use and integration of LLM generative AIs rapidly mature and proliferate across software and solutions, PAC’s expectation that islands of private data will drive new forms of competition looks increasingly likely to be realised.

This reported move has raised concerns in the tech industry, as it could affect the development and performance of AI chatbots that rely on Bing search data and other similar sources. AI chatbots are becoming increasingly popular with organisations in customer service, e-commerce, and healthcare. These systems rely on large amounts of data to function properly, and search engine data is often used to improve their accuracy and effectiveness. PAC understands that Microsoft’s Bing search engine is one of the largest data sources for AI chatbots, and many companies rely on it to train and improve their chatbot systems. The reported but unverified move to cut off access to Bing search data for rival AI chatbots is likely driven by a desire to maintain a competitive advantage in the burgeoning AI chatbot market. By limiting access to this data, Microsoft may be trying to ensure that its own chatbot services keep their edge. However, this move could have serious implications for companies relying on AI chatbots to interact with customers. Without access to Bing search data, these chatbots may be less accurate and effective, leading to a decline in customer satisfaction and loyalty and, ultimately, harming a company’s bottom line.

First and foremost, organisations that rely on Bing search data for their chatbots should explore alternative data sources. While Bing is a popular search engine, other search engines and data sources can be used to train and improve chatbots. Organisations should begin evaluating these alternatives and determining which will be the most effective for their use case. Another option is for organisations to develop their own proprietary data sets. This can be a more time-consuming and costly approach, but it can also provide more tailored data specific to the business’s needs. By developing their own data sets, organisations can ensure that their chatbots are trained on data relevant to their specific industry or market. Beyond alternative data sources, organisations should consider reaching out to Microsoft to discuss how Bing search data could be integrated, as a service, into various forms of generative AI. It’s important to approach these discussions with an open mind and a willingness to collaborate in finding a mutually beneficial solution.

Ultimately, this reported situation underscores the importance of data ownership and access. As AI technologies evolve, organisations must proactively identify alternative data sources and explore new ways to collect and analyse data. By doing so, they can ensure that their chatbots remain effective and competitive in a rapidly evolving market. Furthermore, the emerging restrictions on access to LLM generative AIs and search engine data may already signal a shift towards increased competition and less collaboration among tech companies in the AI space. This could make it more difficult for companies to access the data and resources they need to develop and improve their AI systems, which could slow down innovation and progress in the field.
