Stability AI recently announced StableLM, its first-ever large language model (LLM) and a direct competitor to ChatGPT and the GPT-n series developed by OpenAI in cooperation with Microsoft. It is just one of several alternatives to OpenAI's technology to enter the market of late.
It's worth looking at StableLM and the other base-level LLMs now available. Even if you're not currently evaluating or buying, having an overview of the AI options is critical for business decisions moving forward, especially if and when your company needs to build the technology into its products or operations.
Stability AI first released its famous text-to-image model Stable Diffusion in August 2022, with version 2.0 following only months later. Stable Diffusion also powers popular generative image products like Lensa, Wonder, and NightCafe. The company has since developed models for many different modalities, including image, audio, video, and 3D. StableLM is its first foray into AI-generated text, a competitive field currently occupied by OpenAI, Microsoft, Stanford University, Google, Meta, and others.
Risks of Proprietary Base-Level Models
Unlike these other players, however, Stability AI takes a more open approach. StableLM is open-sourced under the CC BY-SA 4.0 license, whereas base-level models such as OpenAI's DALL-E are not. This means you are free to examine the StableLM model in depth, share it, use it anywhere, and make your own modifications, as long as you credit the creators and release any derivative works under the same license.
Stability AI encourages developers to examine the StableLM base model and how it was trained. OpenAI, despite its name, has not offered similar transparency.
Other companies, including Meta and Google, build AI into their products with little to no explanation of the underlying models or their quality. The ramifications for early adopters of generative AI are significant: if you build a proprietary base-level model like OpenAI's into your product stack, you take on the risk that the model won't behave as you intended, that OpenAI discontinues the service, or that the service simply doesn't meet your specific needs. Using an open-source model like StableLM, on the other hand, gives you more control and the technical insight to verify that the technology is doing what it should.
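To make "more control" concrete, here is a minimal sketch of pulling open weights down and running them locally with the Hugging Face transformers library. The checkpoint name is an assumption based on Stability AI's published StableLM alpha releases; verify it against their repository before relying on it.

```python
# A minimal sketch of running an open base model locally with the
# Hugging Face transformers library (pip install transformers torch).
# The checkpoint name below is assumed from Stability AI's alpha
# releases; confirm it against their published checkpoints.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "stabilityai/stablelm-base-alpha-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Because the weights live on your own hardware, you can inspect,
# fine-tune, or keep serving this model regardless of any vendor's roadmap.
inputs = tokenizer("The key risks of proprietary AI models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```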
Base-Level Model Provider Options
Stability AI is not the only alternative, either. Many other organizations are building base-level models for images, text, 3D, and more. Menlo Park-based Together recently released RedPajama, another open-source model, developed in partnership with Ontocord AI, Stanford CRFM, ETH DS3Lab, Hazy Research, and the MILA Québec AI Institute.
Cohere is another LLM provider worth checking out; it recently partnered with Oracle to deliver full-stack, enterprise-grade AI software.
California-based Hippocratic AI recently announced a $50 million seed round co-led by Andreessen Horowitz and General Catalyst to build LLMs for the healthcare industry and its specific needs around safety, privacy, and reliability.
When I was on location at the HIMSS conference in late April, I noticed that such safety concerns were the major roadblock to AI adoption in the industry. Hippocratic AI, for example, engaged healthcare professionals to guide and train its LLM by rating its responses, a process called reinforcement learning from human feedback (RLHF); a toy sketch of that feedback loop follows below. Ultimately, the work of companies like Hippocratic AI enables healthcare institutions to use AI technology safely to save lives. It also highlights a trend toward industry-specific base-level models, one we will likely see more of.
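For readers unfamiliar with RLHF, the core mechanism is simple to sketch: human raters compare pairs of model responses, and a reward model is trained so that preferred responses score higher. The self-contained toy example below uses a Bradley-Terry-style preference loss; it is purely illustrative and not Hippocratic AI's actual pipeline.

```python
# A toy illustration of the reward-modeling step in RLHF. Human raters
# compare two candidate responses; the reward model is then trained so
# the preferred response outscores the rejected one, via a
# Bradley-Terry-style loss. Illustrative only.
import torch
import torch.nn.functional as F

def preference_loss(score_preferred: torch.Tensor,
                    score_rejected: torch.Tensor) -> torch.Tensor:
    # Minimized when the rated-better response scores above the rated-worse one.
    return -F.logsigmoid(score_preferred - score_rejected).mean()

# Pretend reward-model scores for three (preferred, rejected) response pairs,
# where "preferred" means a clinician rated that response higher.
preferred = torch.tensor([1.2, 0.4, 2.0])
rejected = torch.tensor([0.3, 0.9, 1.1])
print(preference_loss(preferred, rejected))  # shrinks as preferences are learned
```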
Anthropic is another base-level model provider, one that places a heavy emphasis on embedding cutting-edge AI safety research directly into its AI assistant, Claude, which you can deploy quite simply with an API.
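Here is a minimal sketch of that API call using Anthropic's Python SDK (pip install anthropic); the model identifier is an assumption, so check Anthropic's documentation for the models currently on offer.

```python
# A minimal sketch of calling Claude through Anthropic's Python SDK.
# The client reads ANTHROPIC_API_KEY from the environment.
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-3-haiku-20240307",  # assumed identifier; substitute a current model
    max_tokens=256,
    messages=[
        {"role": "user", "content": "In two sentences, what is AI safety research?"}
    ],
)
print(message.content[0].text)
```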
According to its website, Anthropic is pursuing research around scaling human supervision of model performance, mechanistic interpretability (i.e., explainability), process-oriented learning, testing for dangerous failure modes, predicting societal impacts and necessary regulation, and understanding how AI systems generalize. In a world where profit-driven corporations are deploying AI systems at lightning speed with little research done to evaluate those systems' impact on society at large, Anthropic's approach is a breath of fresh air.
Parallels Between AI and the Early Internet
Although I wasn't around at the time, I imagine the current state of base-level AI models resembles the early days of the Internet and the "protocol wars" that took place. Before the world settled on the TCP/IP-based tech stack in the late '80s, there was a long list of alternative protocols, including the Xerox Network System (XNS) and the Open Systems Interconnection (OSI) suite developed by the International Organization for Standardization.
Competition at that stage was critical to building an Internet that benefited everyone rather than a select few. Over time, commercial pressure favored TCP/IP, which was promoted by DARPA and the NSF, and the OSI effort faded. The world of AI right now looks very similar.
At the time, it was unclear how the Internet would evolve, what standards it would be built on, and what future applications beyond email (the first killer app) it would support. Choosing the right protocol was also a process of balancing application-specific technology against general use, just as organizations today face the choice between OpenAI's general-use LLMs and Hippocratic AI's healthcare-specific models.
But the future is yet untold. Look beyond the most visible providers and the fanciest models. You might just find the perfect solution.