Training AI models: The search for low-cost AI development

Image credit: iStock

Artificial intelligence models are notoriously expensive to build and train, putting them out of reach for most researchers and users.
    • Author: Quantumrun Foresight
    • March 21, 2023

    Deep learning (DL) has proven to be a competent solution to several challenges in artificial intelligence (AI) development. However, DL is also becoming more expensive. Operating deep neural networks requires substantial processing resources, particularly during pre-training. Worse, this energy-intensive process leaves a large carbon footprint, which damages the ESG ratings of commercialized AI research.

    Training AI models context

    Pre-training is now the most popular approach to building large-scale neural networks, and it has shown great success in computer vision (CV) and natural language processing (NLP). However, developing huge DL models has become extremely costly. For example, training OpenAI's Generative Pre-trained Transformer 3 (GPT-3), which has 175 billion parameters and requires enormous server clusters with top-of-the-line graphics cards, cost an estimated USD 12 million. A powerful server and hundreds of gigabytes of video random access memory (VRAM) are also needed just to run the model.
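    As a rough illustration of how such cost estimates arise, the back-of-the-envelope arithmetic below multiplies cluster size, training time, and cloud GPU pricing. All three input figures are hypothetical placeholders, not OpenAI's actual configuration or rates:

        # Back-of-the-envelope training cost estimate. All figures are
        # hypothetical placeholders, not OpenAI's actual setup or pricing.
        num_gpus = 1024          # assumed cluster size
        training_days = 30       # assumed wall-clock training time
        usd_per_gpu_hour = 15.0  # assumed cloud rate for a high-end GPU

        gpu_hours = num_gpus * training_days * 24
        cost = gpu_hours * usd_per_gpu_hour
        print(f"{gpu_hours:,} GPU-hours -> ~USD {cost:,.0f}")
        # 737,280 GPU-hours -> ~USD 11,059,200 (on the order of USD 12 million)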

    While major tech companies may be able to afford such training costs, they become prohibitive for smaller startups and research organizations. Three factors drive this expense.

    1. Extensive computation costs: training can take several weeks on thousands of graphics processing units (GPUs).

    2. Fine-tuned models require massive storage, usually taking up hundreds of gigabytes (GB). Furthermore, a separate model copy often needs to be stored for each task (see the storage sketch after this list).

    3. Training large models requires precisely provisioned computational power and hardware; otherwise, results may fall short.
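    To make the storage point concrete, the sketch below estimates the disk footprint of keeping one fully fine-tuned copy of a GPT-3-scale model per task. The parameter count matches the figure cited above, while the weight precision and number of tasks are assumptions:

        # Illustrative storage arithmetic; precision and task count are assumed.
        params = 175_000_000_000  # GPT-3-scale parameter count
        bytes_per_param = 2       # assumes 16-bit (fp16) weights
        tasks = 5                 # assumes one fully fine-tuned copy per task

        gb_per_copy = params * bytes_per_param / 1e9
        print(f"~{gb_per_copy:.0f} GB per copy, "
              f"~{gb_per_copy * tasks:.0f} GB for {tasks} task-specific models")
        # ~350 GB per copy, ~1750 GB for 5 task-specific models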

    Because of these prohibitive costs, AI research has become increasingly commercialized, with Big Tech companies leading studies in the field. These firms also stand to gain the most from their findings. Meanwhile, research institutions and nonprofits often have to collaborate with these businesses to conduct their own research in the field.

    Disruptive impact

    There is evidence suggesting that neural networks can be "pruned": within a supersized neural network, a smaller subnetwork can achieve the same level of accuracy as the original AI model without significant loss of functionality. For example, in 2020, AI researchers at Swarthmore College and the Los Alamos National Laboratory showed that even though a complex DL model can learn to predict future steps in mathematician John Conway's Game of Life, there is always a smaller neural network that can be taught to do the same thing.

    Researchers discovered that if they discard numerous parameters of a DL model after it has completed the entire training procedure, they can reduce it to 10 percent of its original size and still achieve the same results. Several tech companies are already compressing their AI models to save space on devices like laptops and smartphones. This method not only saves money but also allows the software to run without an Internet connection and return results in real time.
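    A minimal sketch of magnitude-based pruning, using PyTorch's built-in torch.nn.utils.prune utilities, is shown below. The tiny two-layer network is a stand-in for a real trained model, and the 90 percent pruning ratio mirrors the figure above; this illustrates the general technique, not the exact method used in the cited studies:

        import torch
        import torch.nn as nn
        import torch.nn.utils.prune as prune

        # Stand-in for a fully trained network; in practice, pruning is
        # applied only after training has finished.
        model = nn.Sequential(
            nn.Linear(784, 256),
            nn.ReLU(),
            nn.Linear(256, 10),
        )

        # Zero out the 90% of weights with the smallest magnitudes in each
        # linear layer, keeping roughly 10% of the original parameters.
        for module in model.modules():
            if isinstance(module, nn.Linear):
                prune.l1_unstructured(module, name="weight", amount=0.9)
                prune.remove(module, "weight")  # make the pruning permanent

        # Count the surviving (nonzero) parameters.
        total = sum(p.numel() for p in model.parameters())
        nonzero = sum(int((p != 0).sum()) for p in model.parameters())
        print(f"{nonzero:,}/{total:,} parameters remain ({nonzero / total:.0%})")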

    Thanks to small neural networks, there have even been instances where DL ran on devices powered by solar cells or button batteries. However, a limitation of the pruning method is that the model still needs to be fully trained before it can be reduced. Some initial studies have explored neural subsets that can be trained on their own, but their accuracy has not yet matched that of supersized neural networks.

    Implications of training AI models

    Wider implications of training AI models may include: 

    • Increased research into different methods of training neural networks, although progress may be slowed by a lack of funding.
    • Big Tech continuing to fund their AI research labs, resulting in more conflicts of interest.
    • The costs of AI development creating the conditions for monopolies to form, limiting the ability of new AI startups to compete independently with established tech firms. An emerging business scenario may see a handful of large tech firms developing giant proprietary AI models and leasing them to smaller AI firms as a service/utility.
    • Research institutions, nonprofits, and universities being funded by Big Tech to conduct some AI experiments on their behalf. This trend could lead to more brain drain from academia to corporations.
    • Increased pressure on Big Tech to publish and regularly update their AI ethics guidelines, holding them accountable for their research and development projects.
    • Training AI models becoming more expensive as ever-higher computing power is required, leading to more carbon emissions.
    • Some government agencies attempting to regulate the data used to train these giant AI models. In addition, competition regulators may introduce legislation forcing AI models above a certain size to be made accessible to smaller domestic firms to spur SME innovation.

    Questions to consider

    • If you work in the AI sector, how is your organization developing more environmentally sustainable AI models?
    • What are the potential long-term consequences of expensive AI models?
