If one can view transformers as Markov chains, then limiting results for Markov chains apply to transformers. Cosma noted, for example, that “[t]here are finite-state probabilistic languages which cannot be exactly represented by finite-order Markov chains.” There are also arguments that finite-state, finite-order Markov processes cannot model human language in principle (Debowski). For Cosma and… Continue reading Transformers as Markov processes
Category: ML concepts
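As a rough illustration of the view sketched in the excerpt above (not taken from the post itself): an autoregressive model whose next-token distribution depends only on a fixed window of the last k tokens defines an order-k Markov chain over tokens. The Python sketch below uses a hypothetical stand-in `toy_lm` for such a fixed-context model; the window size `K`, the vocabulary, and the way the distribution is produced are all illustrative assumptions.

```python
# Illustrative sketch only: a model that conditions on at most the last K tokens
# is an order-K Markov chain over tokens. "toy_lm" is a hypothetical stand-in
# for a transformer with a fixed context window; nothing here comes from the post.
import random

K = 2                      # assumed context window
VOCAB = ["a", "b", "<eos>"]

def toy_lm(context):
    """Return a next-token distribution that depends only on the last K tokens."""
    last_k = tuple(context[-K:])            # truncation: the Markov step
    # Any deterministic function of last_k yields valid conditional probabilities;
    # here we simply derive a distribution from the truncated context.
    rng = random.Random(hash(last_k) % (2**32))
    weights = [rng.random() for _ in VOCAB]
    total = sum(weights)
    return {tok: w / total for tok, w in zip(VOCAB, weights)}

# Two histories with the same K-token suffix get identical next-token
# distributions -- exactly the defining property of an order-K Markov chain.
h1 = ["a", "b", "a", "b"]
h2 = ["b", "b", "a", "b"]   # different history, same last-2 suffix
assert toy_lm(h1) == toy_lm(h2)
print(toy_lm(h1))
```

The converse direction is where the quoted limitation bites: a fixed-order chain of this kind cannot exactly reproduce every finite-state probabilistic language, which is the point the excerpt attributes to Cosma.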
Do LLMs really train themselves?
Recently, Holger Lyre presented his paper “Understanding AI”: Semantic Grounding in Large Language Models in our group seminar. While I generally remain skeptical about his claims of semantic grounding (perhaps the occasion for a separate post), here I want to address a misunderstanding in his paper about what he calls “self-learning”, “self-supervised learning” or… Continue reading Do LLMs really train themselves?