Interesting Stuff - Week 09, 2025

Posted by nielsb on Sunday, March 2, 2025

AI is pushing boundaries once again! This week, we explore a multi-agent system automating PC tasks, OpenAI’s reasoning models vs GPT workflows, and a next-gen research AI outshining Google.

Plus, we dive into LLM customisation, AI’s evolving role in engineering, and the arrival of GPT-4.5. Finally, the Call for Speakers for Data & AI Community Day Durban: AI @ Ignite++ is live—registrations open today, so don’t miss out! 🚀

Podcast

If you would rather listen to the summary:

Click on the link above to listen to the podcast. Oh, the direct link to the episode is here.

Generative AI

  • PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC. This research paper introduces PC-Agent, a novel hierarchical multi-agent framework designed to automate complex tasks on personal computers. The framework addresses the challenges of the PC’s complex interactive environment and intricate workflows using an Active Perception Module (APM) to improve screenshot content perception. PC-Agent employs a hierarchical architecture with specialized agents (Manager, Progress, Decision, and Reflection) operating at different levels (Instruction-Subtask-Action) to manage instruction decomposition, track progress, make decisions, and provide error feedback. The paper also introduces PC-Eval, a new benchmark for evaluating PC agent performance on real-world complex instructions, demonstrating that PC-Agent significantly improves task completion rates over existing methods.
  • Reasoning best practices. This OpenAI guide covers OpenAI’s reasoning models (like o1 and o3-mini) and how they compare to the company’s GPT models (such as GPT-4o). It highlights that reasoning models excel at complex tasks like strategising, planning, and decision-making under ambiguous conditions, particularly in fields requiring human expertise. In contrast, GPT models are better suited for efficiently executing well-defined tasks. The guide emphasises that the best AI workflows often combine both model types, using reasoning models for high-level planning and GPT models for rapid task completion. Finally, it offers practical advice on when to use reasoning models and how to prompt them effectively, and it details successful use cases observed among OpenAI customers.
  • These experts were stunned by OpenAI Deep Research. This blog post discusses OpenAI’s Deep Research, a new product built upon their powerful o3 reasoning model, capable of extended thinking and web searching to answer complex questions. Testers preferred Deep Research over Google’s equivalent, praising its depth and quality of responses, often comparing it to work done by experienced professionals. The article highlights Deep Research’s iterative search process, which mimics human research by refining searches based on gathered information, surpassing traditional Retrieval Augmented Generation (RAG) systems that struggle with complex queries. This advancement indicates significant progress in AI capabilities and suggests potential for further improvement through self-play, where models learn from their own reasoning processes.
  • 6 Common LLM Customization Strategies Briefly Explained. This Towards Data Science post briefly explains six strategies for adapting large language models (LLMs) to specific needs. It gives readers a concise overview of the options for modifying and fine-tuning LLMs to improve performance in niche applications. Ultimately, it serves as an introductory resource on the practical customisation of these powerful AI models.
  • AI is not the engineer we wanted, but it might be the one we need. This blog post argues that while AI isn’t currently capable of fully replacing software engineers, it has the potential to significantly disrupt the development of B2B SaaS applications and internal tools. The author contends that the modern software landscape, characterised by the widespread use of reusable components and SaaS APIs, has unknowingly created an environment where AI can bridge the gap between business logic and code implementation. Instead of viewing AI as a replacement for engineers, the author suggests seeing it as a tool that enables non-technical business experts to create applications directly, similar to how Heroku simplified cloud computing by enabling easier application development. The post suggests that generative AI is arriving at the opportune moment to capitalise on this existing ecosystem and infrastructure.
  • Introducing GPT-4.5. This blog post introduces GPT-4.5, OpenAI’s latest and most advanced chat model, released as a research preview to Pro users and developers. Its primary advancement lies in scaling unsupervised learning, enhancing its knowledge base and intuition for improved accuracy and reduced hallucinations. The model is trained for better human collaboration, demonstrating greater emotional intelligence (“EQ”), a better grasp of nuance, and more creativity, particularly in writing and design applications. While reasoning capabilities are still developing, GPT-4.5 showcases improved performance on academic benchmarks. It is intended to be a foundation for future models, with OpenAI inviting user feedback to guide its continued development and API availability.
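To make the PC-Agent item above a bit more concrete, here is a toy Python sketch of the Instruction-Subtask-Action idea with the four roles the summary names (Manager, Progress, Decision, Reflection). It is only an illustration under heavy assumptions: the function names, the " then " splitting rule, the retry logic and the fake environment are all made up for this post and are not taken from the paper, which uses LLM-based agents and screenshot perception rather than string rules.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Progress:
    """Stand-in for the Progress agent: records finished subtasks."""
    done: List[str] = field(default_factory=list)

    def update(self, subtask: str) -> None:
        self.done.append(subtask)


def manager(instruction: str) -> List[str]:
    """Stand-in for the Manager agent: instruction -> subtasks.

    Hypothetical rule: subtasks are separated by the word ' then '.
    """
    return [part.strip() for part in instruction.split(" then ") if part.strip()]


def decision(subtask: str, feedback: str) -> str:
    """Stand-in for the Decision agent: subtask -> next action.

    With no feedback it tries the subtask directly; with feedback from
    a failed attempt it switches to a (hypothetical) retry action.
    """
    return f"retry:{subtask}" if feedback else f"do:{subtask}"


def reflection(action: str, succeeded: bool) -> str:
    """Stand-in for the Reflection agent: turn a failure into feedback."""
    return "" if succeeded else f"'{action}' had no effect"


def run(instruction: str,
        execute: Callable[[str], bool],
        max_attempts: int = 3) -> Progress:
    """Drive the loop: decompose, act, reflect, track progress."""
    progress = Progress()
    for subtask in manager(instruction):
        feedback = ""
        for _ in range(max_attempts):
            action = decision(subtask, feedback)
            succeeded = execute(action)
            feedback = reflection(action, succeeded)
            if succeeded:
                progress.update(subtask)
                break
    return progress


def flaky_execute(action: str) -> bool:
    """Fake environment (assumption): only retry actions succeed."""
    return action.startswith("retry:")


if __name__ == "__main__":
    result = run("open the spreadsheet then save the file", flaky_execute)
    print(result.done)
```

In this toy run, each subtask fails once, the reflection feedback changes the next decision, and the second attempt succeeds. That feedback loop between acting and reflecting is the part of the design the summary highlights.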

WIND (What Is Niels Doing)

Last week, we officially opened the Call for Speakers (CfS) for Data & AI Community Day Durban: Season of AI - AI @ Ignite++, which takes place on March 22, and the response has been incredible! We’ve already received some fascinating session submissions covering cutting-edge AI, AI Agents, database development, ML, and more. However, the CfS is still open, and we haven’t filled all speaker slots yet. If you have an exciting topic to share, whether it’s on AI, AI Agents, LLMs, real-time intelligence, database development or any other innovative tech, submit your session now! We’d love to see more fresh faces and unique perspectives on stage.

🔥 Submit Your Session Here

And for those eagerly waiting—registrations open later today (Sunday, March 2)! If past events are anything to go by, tickets will disappear fast 🏃💨, so don’t wait too long to grab yours. Join us for a full day of insightful talks, networking, and hands-on learning with some of the best minds in AI and data.

🎟 Register for the Event

📅 Event Details

We can’t wait to see you there! 🚀

~ Finally

That’s all for this week. I hope you find this information valuable. Please share your thoughts and ideas on this post or ping me if you have suggestions for future topics. Your input is highly valued and can help shape the direction of our discussions.

