Much like the SaaS explosion of the 2000s, which caught many off guard before spawning an ecosystem of trillion-dollar companies, vertical AI agents are poised to unlock immense value. Not only do they streamline repetitive workflows, but they also promise to replace entire teams, cutting operational costs and enabling companies to scale faster and more...
Author: Martin Treiber
Tag-Based Prompting for Better Prompting Performance
Large Language Models (LLMs) have amazed us with their ability to generate human-quality text, translate languages, and answer complex questions. But what happens when you need them to tackle something outside their general knowledge base – like predicting the properties of a protein or translating a highly structured technical document? That's where tag-based prompting comes...
In-Context Scheming in Frontier Language Models
Researchers at Apollo Research have investigated the ability of large language models (LLMs) to engage in "scheming", that is, covertly pursuing misaligned goals. The research evaluated several leading LLMs across various scenarios designed to incentivise deceptive behaviour, finding that these models can strategically deceive, manipulate, and even attempt to subvert oversight mechanisms to achieve their objectives. The study...
The Long Context
In "You Exist In The Long Context," Steven Johnson explores the advancements in large language models (LLMs), particularly the significant impact of long context windows. Johnson illustrates this progress by creating an interactive game based on his book, showcasing the LLM's ability to handle complex narratives and maintain factual accuracy. He draws a parallel between...
The Model Context Protocol
Anthropic's Model Context Protocol (MCP) is an open-source standard for connecting AI assistants to various data sources. MCP employs a client-server architecture, enabling two-way communication between AI applications (clients) and data providers (servers) via different transports like stdio and HTTP with SSE. The protocol facilitates access to resources, tools, and prompts, enhancing AI response relevance...
Anthropic’s Enhanced Writing Styles
Anthropic's Claude AI has been updated with a "styles" feature, allowing users to customise the AI's communication style by pre-selecting formal, concise, or explanatory modes, or by uploading custom examples. This personalisation approach differentiates Claude from competitors like ChatGPT and Gemini, which maintain a single conversational style. Anthropic highlights its commitment to data privacy, stating...
TinyTroupe: Simulating Human Behaviour with AI
Microsoft has released TinyTroupe, an open-source Python library that uses large language models to simulate human behaviour in virtual environments. This allows for testing digital advertising and software, and for generating synthetic data for machine learning. The library enables the simulation of multiple AI agents ("TinyPersons") with individual personalities interacting within a simulated world ("TinyWorld"), facilitating virtual...
Five Useful and Fun NotebookLM Hacks
Google’s NotebookLM has taken the tech world by storm over the past few months. By simply uploading your sources, NotebookLM becomes an instant expert—grounding its responses in your material and offering powerful ways to transform information. Plus, since it’s your notebook, your personal data remains entirely private and isn’t used to train the AI. One...
Claude 3.5 Computer Use: The AI That Sees and Controls Your Computer
Imagine an AI that can navigate your computer just like you do, using only its "eyes" to understand and interact with the screen. That's exactly what Claude 3.5 Computer Use aims to achieve. It can tackle various tasks, from browsing the web to conquering challenges in video games, all without relying on traditional methods like...
From Boom to Bust: Is Generative AI Killing Freelance Work?
Generative AI is transforming industries, and the online freelance market is no exception. Recent research explores the immediate effects of tools like ChatGPT and AI-driven image generators on freelance job opportunities. By analyzing job postings on a major freelancing platform, the study reveals a striking decline in demand for roles in writing, software development, and...