Exabits has demonstrated its capability to train large language models (LLMs), partnering with MyShell to dramatically reduce training costs from billions to under $100,000.

JetMoE-8B was trained for less than $0.1 million, yet it outperforms Meta AI's LLaMA2-7B (multi-billion-dollar compute cost).

 

MyShell: Achieving LLaMA2 performance with the $100,000 JetMoE model, which is inspired by the sparse activation architecture of ModuleFormer, marks a notable milestone in machine learning. JetMoE-8B has 8 billion parameters arranged in 24 blocks, each housing two MoE layers: a Mixture of Attention Heads and a Mixture of MLP Experts. Each layer activates only 2 of its 8 experts per input token, applying the Sparse Mixture of Experts (SMoE) framework so that only a fraction of the model's parameters is used for any given token.
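The "2 out of 8 experts per token" selection can be illustrated with a small sketch. This is not JetMoE's actual code: the function names, the toy scalar experts, and the choice to renormalize the two selected router scores with a softmax are assumptions made for illustration, following a common top-k routing pattern.

```python
import math
import random

NUM_EXPERTS = 8   # experts per MoE layer, as stated in the text
TOP_K = 2         # experts activated per token, as stated in the text


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, top_k=TOP_K):
    """Pick the top_k experts for one token and renormalize their weights.

    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    Renormalizing over the selected experts only is an assumption here.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:top_k]
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_layer(token, experts, router_logits):
    """Weighted sum of the outputs of the selected experts only."""
    output = 0.0
    for index, weight in route_token(router_logits):
        output += weight * experts[index](token)
    return output


# Toy experts: each one just scales a scalar input (hypothetical stand-ins
# for real MLP or attention-head experts).
experts = [lambda x, scale=i + 1: scale * x for i in range(NUM_EXPERTS)]

rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = route_token(logits)
result = moe_layer(1.0, experts, logits)
print(selected, result)
```

The six experts that are not selected are never evaluated for this token, which is where the compute savings of a sparse layer come from.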

 

Because only 2.2 billion of JetMoE-8B's parameters are active for any given token, training costs dropped significantly while performance remained robust. The model's effectiveness is illustrated in the subsequent figure: JetMoE-8B achieved state-of-the-art results in five categories on eight evaluation benchmarks, outperforming competitors like LLaMA-13B, LLaMA2-7B, and DeepseekMoE-16B.
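As a rough back-of-the-envelope check on the figures quoted above, the share of parameters active per token can be compared with the share of experts selected per layer. The constants come from the text (which rounds the total to 8 billion); the variable names are illustrative.

```python
TOTAL_PARAMS = 8.0e9     # total parameters, as rounded in the text
ACTIVE_PARAMS = 2.2e9    # parameters active per token, from the text
EXPERTS_PER_LAYER = 8    # experts in each MoE layer, from the text
EXPERTS_ACTIVE = 2       # experts selected per token, from the text

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
routing_fraction = EXPERTS_ACTIVE / EXPERTS_PER_LAYER

print(f"active share of parameters: {active_fraction:.1%}")
print(f"experts selected per layer: {routing_fraction:.1%}")
```

The active share comes out slightly above the 2-in-8 routing ratio. One plausible reading, not confirmed by the text, is that some parameters (such as embeddings) are used for every token regardless of routing.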

On the MT-Bench benchmark, JetMoE-8B scored 6.681, surpassing larger models such as the 13-billion-parameter versions of LLaMA2 and Vicuna.

 

What powers this architecture is Exabits' contribution: an accelerated and stabilized cluster of 12 H100 GPU nodes (96 GPUs). Exabits' platform played a pivotal role in training the JetMoE model, delivering stable, highly available, and robust performance at a fraction of the cost of "big compute." This pairing of JetMoE's design with Exabits' GPU infrastructure shows how effective it can be to combine advanced model architectures with Exabits' cloud compute platform.

 

Breaking the Myth: Decentralized GPU Platform for LLM Training

Exabits has countered the skepticism that decentralized GPU platforms are unsuitable for LLM training. With a sophisticated technical stack, efficient middleware, and a robust supply chain of computational resources, Exabits has demonstrated that LLM training and inference on such a platform are not only possible but also efficient and highly cost-effective.

Exabits, a decentralized cloud compute platform, overcomes the limitations of standard decentralized platforms by serving as the infrastructure base layer of AI computing and offering a full-stack solution. It does this by aggregating, accelerating, and stabilizing consumer-grade GPUs so that their performance comes close to parity with enterprise-grade GPUs. This approach taps into a vast yet largely idle reserve of consumer GPUs, easing the GPU shortage. In addition, Exabits' extensive experience in the data center sector provides access to sought-after enterprise-grade H100 and A100 GPUs, and soon B200s, further advancing the democratization of AI development. Partnerships with major projects in decentralized cloud compute have helped Exabits seed and establish a widespread, interconnected decentralized compute network. This super-network has the potential to compete with the giants of centralized, traditional cloud compute, making AI accessible to anyone who wants to build in the space.

 

The Future of LLM Training with Exabits

Exabits is not just a technological platform; it is a beacon for the future of LLM training, embodying affordability, accessibility, and environmental consciousness. The success of JetMoE-8B underlines the feasibility of this platform in executing high-end model training, paving the way for more sustainable and inclusive advancements in AI research and development.

In conclusion, Exabits stands as a revolutionary force in the AI domain, challenging big compute and proving that cloud compute platforms in the web3 space can indeed support real LLM training efficiently and cost-effectively. This not only opens up new avenues for AI research and application but also sets a new standard in the computational economy, heralding a new era of innovation and collaboration in the field of web3 and artificial intelligence.

Media contact

Contact: Roy Evans

Company Name: ExaBITs Network LTD.

Phone: +1 650 642 8104

Website: https://www.exabits.ai

Email: contact@exabits.ai

 

Contact Person: Zengyi Qin

Company Name: MyShell

Website: https://myshell.ai

Email: charles@myshell.ai

Disclaimer: The information provided in this press release is not a solicitation for investment, or intended as investment advice, financial advice, or trading advice. It is strongly recommended that you practice due diligence (including consultation with a professional financial advisor) before investing in or trading securities and cryptocurrency.
