0xFunky|May 29, 2025 17:09
Although I don't know what @stayloudio has to do with this paper, seeing it gives me a real sense of familiarity. This 2017 paper from Google was the first to propose abandoning RNNs and CNNs entirely and handling language tasks through the attention mechanism alone, kicking off the Transformer era. To this day it remains the cornerstone of LLMs (large language models). The core idea of the Transformer is that the key to language lies in context: the meaning of a word depends on its relationships with other words. The attention mechanism lets the model look at the entire sentence while processing each word and identify which other words it should focus on, no longer reading word by word, but capturing the global focus.

The area of AI I know best is NLP (natural language processing). I took part in many NLP competitions on Kaggle and won some medals. The most impressive model back then was Google's BERT, which specialized in semantic understanding; BERT-variant models swept almost every benchmark. OpenAI released GPT-2 around the same time, focused on generation, but its stability and accuracy weren't yet on par with BERT, so it hadn't entered the mainstream. Then GPT-3 (2020) arrived with 175 billion parameters, and its powerful generation and few-shot ability completely shook the industry. From that moment on, LLMs became the new kings, the BERT family gradually faded out, and the GPT architecture evolved all the way to ChatGPT, Claude, and Gemini, opening today's great AI era. All of it started with that one paper. The Transformer architecture hasn't changed, but the world has long since changed. And we're still here, in the middle of attention.
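The "look at the whole sentence, weight every word" idea described above is scaled dot-product attention, the core operation of the paper. Here is a minimal NumPy sketch (my own illustration, not code from the post; the function name and toy data are made up for the example):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    # Scores: how strongly each query token relates to each key token.
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax turns scores into weights that sum to 1 for each token.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output vector is a weighted mix of ALL value vectors,
    # so every token sees the whole sentence at once.
    return weights @ V, weights

# Toy example: 3 tokens, 4-dimensional embeddings (self-attention: Q = K = V).
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out, w = scaled_dot_product_attention(x, x, x)
print(out.shape)       # (3, 4): one context-mixed vector per token
print(w.sum(axis=-1))  # each row of attention weights sums to ~1.0
```

Each row of `w` shows which other tokens a given token attends to, which is exactly the "global focus" the post describes, as opposed to an RNN reading word by word.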
===== Supplementary model knowledge
• Transformer (2017): the first pure-attention architecture in history, launching the era of language models that don't rely on RNNs. It can read a complete sentence at once and decide which words to focus on, greatly improving efficiency and comprehension.
• BERT (2018, Google): an "understanding model" based on the Transformer encoder, specializing in sentiment analysis, question answering, and text classification. Like a language-comprehension expert, it is superb at reading-style tests.
• GPT (from 2018, OpenAI): a "generative model" based on the Transformer decoder, skilled at writing stories, dialogue, and completing sentences, a master of language creation. GPT-3 is also the poster child for few-shot learning.

《Attention Is All You Need》, a classic.