skip to main content

ThesoTM thesaurus

Content language

Home / Explore / Browse / Vocabularies / ThesoTM thesaurus

... > generically dependent continuant > information content entity > language model > large language model > Megatron-LM

Preferred term

Megatron-LM

Definition(s)

A language model pre-training based on GPU parallelism.

Broader concept(s)

large language model

Bibliographic citation(s)

• Narayanan, D., Shoeybi, M., Casper, J., LeGresley, P., Patwary, M., Korthikanti, V. A., Vainbrand, D., Kashinkunti, P., Bernauer, J., Catanzaro, B., Phanishayee, A., & Zaharia, M. (2021). Efficient large-scale language model training on GPU clusters using megatron-lm. ArXiv:2104.04473 [Cs]. http://arxiv.org/abs/2104.04473
• Shoeybi, M., Patwary, M., Puri, R., LeGresley, P., Casper, J., & Catanzaro, B. (2019). Megatron-LM: Training multi-billion parameter language models using GPU model parallelism. https://arxiv.org/abs/1909.08053v1

base of

BioMegatron

has application field

generic

has for input language

English

has repository

https://github.com/NVIDIA/Megatron-LM

documentation URL

https://github.com/NVIDIA/Megatron-LM

is encoded in

Python

is executed in

word embedding

has for license

Apache License Version 2.0

In other languages

Megatron-LM

French

URI

http://data.loterre.fr/ark:/67375/LTK-DCJM3LC1-6

Download this concept:

RDF/XML TURTLE JSON-LD Last modified 6/20/24