Froth on the Daydream (FOD) – our weekly summary of over 150 AI newsletters. We connect the dots and cut through the froth, bringing you a comprehensive picture of the ever-evolving AI landscape. Stay tuned for clarity amidst surrealism and experimentation.

The recent surge in the sophistication of Large Language Models (LLMs) — both proprietary and open-source — presents a paradox of potential and perplexity. These systems, characterized by their remarkable natural language processing capabilities, have propelled us into a new era of technological marvels. Yet, they also bring many challenges, chiefly in the realm of trustworthiness.

Last week’s paper, “TrustLLM: Trustworthiness in Large Language Models” — a joint work of almost 70 researchers — underscores the multifaceted nature of trustworthiness in LLMs. It highlights how these models, while excelling in tasks like stereotype rejection and natural language inference, still grapple with issues of truthfulness, safety, fairness, and privacy. These findings echo the complexities of ensuring AI that is both effective and ethically sound.

The paper also poses a question: “To what extent can we genuinely trust LLMs?”

But can we genuinely trust LLMs? We can’t.

Much better would be to adopt the principle of ‘trust, but verify.’ This approach, reminiscent of Cold War-era diplomacy, is increasingly relevant in the digital age, especially with advancements in AI. It suggests a balanced strategy: embracing the utility and potential of these models while stringently scrutinizing their mechanisms and outcomes.

When working with LLMs, you can trust your own expertise in verifying the work that an LLM automates or accelerates for you. But you can’t trust the model blindly. I even think that, along with the new role of the AI engineer, we should have a new in-house position of AI Verifier, akin to a fact-checker at a media publication.
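In practice, ‘trust, but verify’ can be as mechanical as wrapping every LLM call in a deterministic check before its output is accepted. Here is a minimal sketch of that loop in Python; the call_llm client and the invoice-extraction task are hypothetical placeholders, not something prescribed by the TrustLLM paper:

```python
import re


def call_llm(prompt: str) -> str:
    """Hypothetical LLM client -- swap in your provider's actual API call."""
    raise NotImplementedError


def verify_line_items(answer: str, expected_total: float) -> bool:
    """Deterministic check: do the extracted amounts add up to the known invoice total?"""
    amounts = [float(m) for m in re.findall(r"\d+(?:\.\d+)?", answer)]
    return abs(sum(amounts) - expected_total) < 0.01


def extract_with_verification(invoice_text: str, expected_total: float, retries: int = 2) -> str:
    """Accept the model's answer only after it passes an independent check."""
    prompt = f"List every line-item amount in this invoice, one per line:\n{invoice_text}"
    for _ in range(retries + 1):
        answer = call_llm(prompt)
        if verify_line_items(answer, expected_total):
            return answer  # trusted only because the verification passed
    raise ValueError("LLM output failed verification; escalate to a human reviewer.")
```

The same pattern scales up to the ‘AI Verifier’ role: humans own the checks, the model owns the drafting.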

The other news from last week ‘complements’ the insights from the paper. Anthropic’s research reveals a startling possibility: deceptive ‘sleeper agents’ within LLMs. The paper studies threat models in which models are secretly trained, or spontaneously learn, to behave safely during training but unsafely in deployment. This discovery of hidden, hazardous capabilities, capable of evading standard safety protocols, exposes a critical vulnerability.

Meanwhile, the nuanced shift in OpenAI’s policy, discreetly lifting the prohibition on military applications, adds another layer to the debate. This move, aligning with the U.S. Defense Department’s stance, prompts a critical examination of the ethical and safety implications of AI in high-stakes domains like defense and intelligence. Here, the trustworthiness of the people who build LLMs also comes into question.

On a more commercial and, so to speak, physical note, the launch of the Rabbit R1, a standalone AI device, exemplifies the rapid integration of AI in consumer technology. Its innovative use of a Large Action Model (LAM)* signals a shift towards more intuitive, seamless interactions between humans and AI-powered devices. However, it also raises concerns about the trustworthiness and security of such pervasive AI integration in everyday life.

*Many publications mistakenly attribute the coining of LAM to the Rabbit R1 team when, in fact, it was Salesforce Chief Scientist Silvio Savarese who coined the term in June 2023 in his blog post “Towards Actionable Generative AI”. Trust, but verify ;)

Adding to the global perspective: in its “Global Risks Report 2024,” the World Economic Forum identified AI-generated misinformation and disinformation, along with the resultant societal polarization, as the most significant of its top 10 risks for the next two years, ahead of concerns such as climate change, war, and economic instability.

As we navigate this era of groundbreaking AI advancements, the “trust, but verify” principle remains a beacon. We need to balance the excitement of AI’s potential with rigorous, ongoing scrutiny of its trustworthiness, safety, and ethical implications.

The freshest research papers, categorized for your convenience

Efficient Model Architectures

Benchmark and Evaluation

Attention Mechanisms and Model Efficiency

Enhancing Model Performance

Machine Translation and Cross-Lingual Applications

Efficient Model Inference

In other newsletters

  1. Is this what will replace Transformers? Long-Context Retrieval Models with Monarch Mixer · Hazy Research (stanford.edu)
  2. If you are interested in reportage from CES — Hardcore Software is the one to go to (a very long one!)
  3. A wonderful overview of Self-Attention, Multi-Head Attention, Cross-Attention, and Causal-Attention in LLMs by Sebastian Raschka
  4. 12 techniques to reduce your LLM API bill and launch blazingly fast products by AI Tidbits
  5. DPO praise by Andrew Ng — a very interesting read
