Elon Musk’s xAI releases Grok source and weights, taunting OpenAI


An AI-generated image released by xAI during the open-weights release of Grok-1.

On Sunday, Elon Musk’s AI company xAI released the base model weights and network architecture of Grok-1, a large language model designed to compete with the models that power OpenAI’s ChatGPT. The open-weights release via GitHub and BitTorrent comes as Musk continues to criticize (and sue) rival OpenAI for not releasing its AI models in an open manner.

Announced in November, Grok is an AI assistant similar to ChatGPT that is available to X Premium+ subscribers, who pay $16 a month to the social media platform formerly known as Twitter. At its heart is a mixture-of-experts LLM called “Grok-1,” clocking in at 314 billion parameters. For reference, GPT-3 included 175 billion parameters. Parameter count is a rough measure of an AI model’s complexity, reflecting its potential for generating more useful responses.

xAI is releasing the base model of Grok-1, which is not fine-tuned for a specific task, so it is likely not the same model that X uses to power its Grok AI assistant. “This is the raw base model checkpoint from the Grok-1 pre-training phase, which concluded in October 2023,” writes xAI on its release page. “This means that the model is not fine-tuned for any specific application, such as dialogue,” which means it’s not necessarily shipping as a chatbot. However, it will do next-token prediction, meaning it will complete a sentence (or other text prompt) with its estimation of the most relevant string of text.
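To illustrate what a raw base model does, here is a toy next-token-prediction loop in Python. The hard-coded “model” below is a stand-in for illustration only, not Grok-1’s actual inference code:

    # Toy illustration of next-token prediction: a real LLM scores every
    # vocabulary token and repeatedly appends the most likely one.
    # TOY_MODEL is a hard-coded stand-in, not Grok-1.
    TOY_MODEL = {
        "the": {"sky": 0.6, "cat": 0.4},
        "sky": {"is": 0.9, "was": 0.1},
        "is": {"blue": 0.7, "clear": 0.3},
    }

    def next_token(context):
        # Score candidates given the last token; pick the likeliest (greedy).
        scores = TOY_MODEL.get(context.split()[-1])
        return max(scores, key=scores.get) if scores else None

    def complete(prompt, max_new_tokens=5):
        tokens = prompt.lower().split()
        for _ in range(max_new_tokens):
            tok = next_token(" ".join(tokens))
            if tok is None:
                break
            tokens.append(tok)
        return " ".join(tokens)

    print(complete("the"))  # -> "the sky is blue"

A base model simply continues text this way; it does not answer questions or follow instructions the way a fine-tuned chatbot does.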

“It’s not an instruction-tuned model,” says AI researcher Simon Willison, who spoke with Ars via text message. “Which means there’s substantial further work needed to get it to the point where it can operate in a conversational context. It will be interesting to see if anyone from outside xAI with the skills and compute capacity puts that work in.”

Musk originally announced that Grok would be released as “open source” (more on that terminology below) in a tweet posted last Monday. The announcement came after Musk sued OpenAI and its executives, accusing them of prioritizing profits over open AI model releases. Musk was a co-founder of OpenAI but is no longer associated with the company, yet he regularly goads OpenAI to release its models as open source or open weights, as many believe the company’s name suggests it should do.

On March 5, OpenAI responded to Musk’s allegations by revealing old emails that appeared to suggest Musk was once OK with OpenAI’s shift to a for-profit business model through a subsidiary. OpenAI also said the “open” in its name means that its resulting products would be available for everyone’s benefit rather than reflecting an open-source approach. That same day, Musk tweeted (split across two tweets), “Change your name to ClosedAI and I will drop the lawsuit.” His announcement that Grok would be released openly came five days later.

Grok-1: A hefty model

So Grok-1 is out, but can anyone run it? xAI has released the base model weights and network architecture under the Apache 2.0 license. The inference code is available for download on GitHub, and the weights can be obtained via a torrent link listed on the GitHub page.

With a weights checkpoint size of 296GB, only datacenter-class inference hardware is likely to have the RAM and processing power necessary to load the entire model at once. (As a comparison, the largest Llama 2 weights file, a 16-bit precision 70B model, is around 140GB in size.)
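As a rough back-of-the-envelope check on those figures, here is the arithmetic (a sketch that assumes every parameter is stored densely and ignores file-format overhead):

    # Approximate weights-file size: parameter count x bytes per parameter.
    GIB = 1024 ** 3

    def weights_size_gib(num_params, bytes_per_param):
        return num_params * bytes_per_param / GIB

    print(round(weights_size_gib(70e9, 2)))   # Llama 2 70B, 16-bit: ~130 GiB
    print(round(weights_size_gib(314e9, 2)))  # Grok-1, 16-bit: ~585 GiB
    print(round(weights_size_gib(314e9, 1)))  # Grok-1, 1 byte/param: ~292 GiB

The last figure lands near the 296GB checkpoint size, which suggests the released weights are stored at roughly one byte per parameter, though that is an inference from the math rather than something xAI has spelled out.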

So far, we have not seen anyone get it running locally, but we have heard reports that people are working on a quantized version that would reduce its size so it could be run on consumer GPU hardware (doing so would also dramatically reduce its processing capability, however).
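Quantization shrinks a model by storing each weight in fewer bits. Here is a minimal sketch of the idea (symmetric int8 quantization with NumPy; not the method of any particular community project):

    import numpy as np

    def quantize_int8(weights):
        # Map the largest magnitude to +/-127, then round to 8-bit integers.
        scale = float(np.abs(weights).max()) / 127.0
        return np.round(weights / scale).astype(np.int8), scale

    def dequantize(q, scale):
        # Approximate reconstruction; the rounding error is the quality cost.
        return q.astype(np.float32) * scale

    w = np.random.randn(4096).astype(np.float32)
    q, scale = quantize_int8(w)
    print(w.nbytes, "->", q.nbytes)  # 16384 -> 4096 bytes (4x smaller)
    print(float(np.abs(w - dequantize(q, scale)).max()))  # small rounding error

The storage savings are why a quantized Grok-1 might fit on consumer GPUs, and the accumulated rounding error is why its output quality would suffer.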

Willison confirmed our suspicions, saying, “It’s hard to evaluate [Grok-1] right now because it’s so big: a [massive] torrent file, and then you need a whole rack of expensive GPUs to run it. There may be community-produced quantized versions in the next few weeks that are a more practical size, but if it’s not at least quality-competitive with Mixtral, it’s hard to get too excited about it.”

Appropriately, xAI is not calling Grok-1’s GitHub debut an “open-source” release because that term has a specific meaning in software, and the industry has not yet settled on a term for AI model releases that ship code and weights with restrictions (like Meta’s Llama 2) or that ship code and weights without also releasing training data, which means the training process of the AI model cannot be replicated by others. So we typically call these releases “source available” or “open weights” instead.

“The most interesting thing about it is that it has an Apache 2 license,” says Willison. “Not one of the not-quite-OSI-compatible licenses used for models like Llama 2, and that it’s one of the largest open-weights models anyone has released so far.”
