07-24-Daily AI News Daily
AI News Daily 2025/7/24
AI Daily | Updated at 8 AM | All-Network Data Aggregation | Exploring Frontier Science | Industry’s Free Voice | Open Source Innovation Power | AI and Humanity’s Future | Visit Web Version
AI Product Self-Recommendation: GeminiCli2API
Feeling throttled by Google Gemini’s official free API limits? Want to seamlessly integrate Gemini’s power into your favorite third-party apps? Good news: GeminiCli2API is here with the perfect solution!
GeminiCli2API is a clever local proxy that wraps the more lenient Gemini CLI into a standard, OpenAI-compatible API service. This means you can finally break free from the official free API limits, enjoy higher request quotas authorized by your Google account, and go wild with development, testing, and creation, waving goodbye to those annoying “Quota Exceeded” errors!
But wait, there’s more! The real magic of GeminiCli2API lies in its “surgical-level” control over system prompts. This feature is a total game-changer:
- Override: Set a global “golden prompt” to force all connected applications to use it, ensuring absolute uniformity in AI character and output style.
- Append: Discreetly “append” an additional layer of your instructions while retaining the client’s original system prompt, allowing for fine-tuning rules and enhancing capabilities without the client’s awareness.
- Extract & Audit: Easily record all prompts passing through the proxy for analysis, debugging, and optimization, or even for building your own high-quality datasets.
With just a few simple configuration steps, you can connect LobeChat, NextChat, and any other OpenAI-compatible tool to this locally “enhanced” Gemini service. GeminiCli2API isn’t just a proxy; it’s a powerful toolbox in your hands for mastering and taming AI. Go on, give it a try!
AI Content Summary
Kai-Fu Lee launched AI agent "Wanzai," Google released a faster, lower-cost new model.
Kuaishou and Shanghai Jiao Tong University open-sourced multimodal model Orthus, Kunlun Wanwei upgraded its AI music platform.
Frontier research aims to break through large model context limits and enhance AI's long-range reasoning capabilities.
In terms of industry dynamics, AWS dissolved its AI research institute in Shanghai.
Meanwhile, AI has also sparked data privacy ethical controversies and widespread AI anxiety in the workplace.AI Product and Feature Updates
Big news! Kai-Fu Lee’s company, 01.AI, has officially unveiled its first enterprise-grade AI agent – “Wanzai”. This isn’t just another chatty chatbot; it’s precisely positioned as a “super employee” capable of deep thinking, autonomous planning, and executing complex tasks. By seamlessly connecting with massive internal enterprise knowledge bases and external key services, “Wanzai” aims to transform from a passive “tool that takes orders” into a proactive “decision-maker that delivers results.” Kai-Fu Lee confidently predicts that AI agents are evolving from executing simple workflows (L1) to possessing autonomous planning capabilities (L2), ultimately moving towards a grand blueprint where multiple AIs collaborate to completely reshape enterprise operations (L3). Looks like your future cubicle mate might not be human anymore! This industry transformation is exactly what this edition of AI News is deeply tracking.

Google has unleashed another killer app! They’ve officially released the stable version of Gemini 2.5 Flash-Lite, proudly proclaiming it their fastest and lowest-cost AI model to date – a perfect “peacemaker” between performance and your wallet. This new model not only strikes an incredible golden balance in performance and cost but also natively supports an astonishing 1 million token context length, basically making it a “super chatterbox” with an incredible memory. Even more enticing is its highly competitive pricing strategy, costing just $0.10 per million input tokens, which is undeniably a fierce price war against all competitors. Developers, are you ready for this overwhelming value-for-money storm? Friendly reminder: the old preview version alias will be officially deprecated on August 25th, so make sure to update your code ASAP to avoid service interruptions.

What happens when a short-video giant meets a top-tier university? The answer is Orthus! Kuaishou and Shanghai Jiao Tong University jointly unveiled this brand-new multimodal model called Orthus at the prestigious International Conference on Machine Learning (ICML), generously open-sourcing it for global developers. This newcomer, based on an advanced autoregressive Transformer architecture, not only excels in freely navigating between text and image modalities but also, with astonishing computational efficiency, surpasses predecessors like Chameleon in various mainstream image understanding benchmarks. What’s even more jaw-dropping is that it actually beat the heavyweight model SDXL, specifically designed for image generation, in the text-to-image metric—truly an exceptionally talented cross-domain prodigy! This breakthrough undoubtedly declares to us: the boundaries of multimodal AI are far wider and more expansive than we imagined, and the future possibilities are simply limitless.
The domestic AI music scene is making waves again! Kunlun Wanwei’s AI music creation platform, Mureka, has received a major V7 upgrade, and its overall performance has now surpassed the popular overseas app Suno in several key dimensions, demonstrating impressive technical prowess. The biggest highlight of the new version is its self-developed music “chain-of-thought” technology, “MusiCoT.” This innovative tech allows the AI to “think deeply” about the entire song’s structure, emotion, and melodic progression before it even starts composing, resulting in more coherent melodies and richer, more emotional musical pieces. Users can not only generate songs with simple text descriptions but also upload audio samples to mimic a specific singer’s voice or even generate a rather “down-to-earth” style music video with a single click—talk about maximum entertainment! From this in-depth review - AI News , it’s clear that AI music is steadily progressing from the basic “listenable” stage to the advanced “pleasing to the ear” and emotionally resonant stage. The future music creation ecosystem is set to become much more diverse and exciting because of this.

Struggling to explain abstract concepts like “bubble sort” or “entropy increase” to students or clients? Don’t sweat it, your savior has arrived! A revolutionary AI animation engine called Fogsight has burst onto the scene, with its mission being to cure all sorts of perplexing abstract concepts. Users just need to input a keyword, and Fogsight works its magic, automatically generating a professional educational animation with complete narrative logic, excellent visual effects, and even thoughtfully equipped with bilingual narration. Built on advanced large language models, this powerful tool not only enables one-click intelligent generation but also provides a convenient conversational interface, allowing users to easily fine-tune and modify. What’s even more exciting is that, as part of the renowned WaytoAGI Open Source Project - AI News , it fully supports local deployment, offering educators and content creators worldwide an unprecedented super weapon capable of disrupting traditional creation workflows.

AI Frontier Research
For a long time, research into semantic segmentation for images and videos in the AI field has been like two parallel lines that never meet. Researchers worked in silos, lacking a unified theoretical framework, which undoubtedly hampered the development of general vision technology. Well, guess what? That situation has finally been broken! Researchers from multiple top universities have teamed up to propose the first framework capable of unifying the processing of these two heterogeneous data types: QuadMix. Its core is a highly creative “four-way mixing” mechanism that cleverly constructs rich and diverse intermediate domain representations between the source and target data domains, effectively narrowing the huge differences in cross-domain learning. This research is incredibly significant; it not only successfully unified previously fragmented research paths on a theoretical level but also broke records - AI News in multiple industry-standard benchmarks, laying a solid foundation for building more general and powerful multimodal perception systems in the future.

The limited context window of Large Language Models (LLMs) has always been their persistent “Achilles’ heel” when tackling complex long-range reasoning tasks, severely restricting their deep thinking capabilities. However, a paper titled “Beyond Context Limits: Subconscious Cues for Long-Range Reasoning” AI News brings us a ray of hope. Researchers have proposed the innovative TIM (Thread Inference Model), which mimics how the human brain processes complex information. It cleverly breaks down a big problem into a “reasoning tree” and retains only the most relevant “subconscious cues” in its “working memory.” This smart mechanism enables the model to handle almost infinitely long working memories and complex scenarios requiring multi-step tool calls, performing exceptionally well in mathematics and information retrieval tasks that demand high long-range reasoning. This opens up a highly promising new path to completely solve the “goldfish memory” stubborn problem of LLMs.
Getting AI to draw a picture and “Photoshop” an object into someone’s hand isn’t tough, but making it look like the person is actually “holding,” “lifting,” or “using” that object—that natural sense of interaction—is super hard to achieve. However, a recent study, “HOComp: Interaction-Aware Human-Object Composition” AI News , proposes an incredibly clever solution. This method first leverages powerful Multimodal Large Language Models (MLLMs) to deeply understand the type of interaction between humans and objects, for instance, whether it’s “tightly gripping” or “gently supporting.” Subsequently, it meticulously adjusts the human pose for the most natural interaction effect while simultaneously employing various carefully designed loss functions to ensure high consistency between the added object and the background in appearance. This ultimately elevates the realism and credibility of synthetic images to a whole new level, marking a significant step towards truly lifelike AI content generation.
AI Industry Outlook and Social Impact
Tech giants, in their relentless pursuit of breakthroughs, have once again crashed hard into the boundaries of personal privacy. It recently came to light that Elon Musk’s AI company, xAI, is reportedly collecting facial data from over 200 employees on a massive scale through an internal project called “Skippy” to train its core Grok model. The project’s stated goal is to enable AI to better understand and recognize complex human emotions. While xAI claims all data collection was done with signed employee consent and promises it’s only for internal training, the “permanent” access clause in the agreement has sparked widespread concern and unease among employees regarding privacy security and the potential abuse of portrait rights. This incident not only spawned the controversial virtual personas Ani and Rudi but also once again pushed tech giants’ difficult balancing act between innovation drive and ethical responsibility into the public spotlight. This piece of AI News also reminds us that technological development needs more robust regulations to safeguard it.

The AI wave is sweeping through global workplaces with unstoppable force, also giving rise to some rather comical new forms of “performance art.” A recent survey by Howdy.com reveals that about 16% of U.S. employees frankly admit they “pretend” to use AI at work, solely to meet their superiors’ expectations for technological innovation and cultivate an image of being tech-savvy. Behind this phenomenon lies widespread AI anxiety pervasive in the workplace: over one-fifth of employees feel uneasy using AI but are compelled by invisible pressure to adopt a stance of “embracing” new technologies. What’s even funnier is that another survey reveals the flip side of the coin: nearly half of employees who actually use AI in their work choose to keep it a secret from their bosses, fearing they might be perceived as slacking off or lacking in ability. This ongoing workplace “metamorphosis” profoundly exposes the huge gap between the pace of technological adoption and employees’ skill sets and psychological adaptation.
Here’s some somber AI News: Amazon Web Services (AWS) has officially confirmed the dissolution of its AI Research Institute in Shanghai, which was also AWS’s last overseas research institute globally. Dr. Wang Minjie, the institute’s chief application scientist, expressed deep emotions on his social media, stating he was “lucky to have caught the golden era of foreign enterprise research institutes in China.” Amazon officially responded that this was a “difficult decision,” aimed at streamlining teams and optimizing global resource allocation to enable more concentrated and continuous investment in core innovation areas. However, this move has undoubtedly sparked widespread concern and heated debate in the industry regarding whether foreign enterprises’ R&D strategies in China are fully contracting, seemingly also signaling the quiet end of a golden era where foreign capital led China’s frontier technology exploration.

Top Open Source Projects
moby - AI News (⭐70.1k): Think of it as the ultimate LEGO brick treasury for the containerized world! This collaborative project, initiated and led by Docker Inc., provides a complete set of standardized core components, allowing you to freely assemble and customize complex container-based systems like building with LEGOs. It’s an indispensable cornerstone for building all modern cloud-native applications.
OpenBB - AI News (⭐44.7k): This is a professional-grade investment research terminal aiming to be accessible to everyone. It cleverly integrates massive, complex financial data and expert analysis tools into a completely open-source platform. Its grand vision is to completely break down information barriers and truly democratize investment research.
hyperswitch - AI News (⭐22.3k): This open-source payment “super-switch” is meticulously crafted using the high-performance Rust language. It aims to make enterprise payment processes faster, more reliable, and more affordable than ever before, helping merchants easily integrate with and intelligently manage multiple payment channels, completely saying goodbye to the headache of being “kidnapped” by a single payment gateway.
jj - AI News (⭐17.9k): A new-generation version control system bravely claiming to be simpler and more powerful than Git. It not only achieves full compatibility with Git, allowing for seamless switching, but also offers a user experience far superior to its predecessors and a series of powerful new features. Perhaps this is the next “this is the one” tool for developers worldwide.
ConvertX - AI News (⭐5.9k): Think of this as your personal, all-in-one file conversion “super factory.” This is a fully self-hostable online file converter, powerful enough to support conversions between over 1000 file formats. It lets you easily transform any file format while ensuring absolute data privacy and security.
PakePlus - AI News (⭐4.8k): Witness the magic! This incredible tool can package any website or web project into an ultra-lightweight desktop and mobile application, less than 5MB in size, in just a few minutes. For developers looking to quickly achieve cross-platform product deployment, this is undoubtedly a highly efficient shortcut.
hrms - AI News (⭐3.1k): This is a fully functional open-source Human Resources and Payroll Management System. It provides small and medium-sized enterprises with a comprehensive and powerful HR solution. From detailed employee management to complex payroll distribution, all core HR tasks are kept under control, greatly enhancing management efficiency.
Social Media Shares
A senior engineer shared her deep concerns on Jike - AI News : A intern on her team completely relied on LLMs to write code, leading to a project riddled with bugs, and the intern himself couldn’t explain the core logic behind the code at all. She sharply pointed out that AI should be a powerful tool to aid human deep thinking, not a shortcut to skip fundamental learning processes. Young engineers who rely on models too early and neglect a solid understanding of underlying logic can easily fall into the elusive “vibe coding” trap, which is “really dangerous” for long-term personal career growth.
User wwwgoubuli deeply reviewed ByteDance’s AI programming tool Trae on X - AI News . He believes that while Trae’s performance in the full-lifecycle “solo mode” is only “six of one, half a dozen of the other” compared to other competitors, not yet creating a generational gap, its product interface design is “radical yet exceptionally reasonable.” The overall experience it delivers is unparalleled among similar domestic products. He couldn’t help but exclaim that ByteDance’s product prowess truly lives up to its reputation, powerful enough to inspire awe.
A developer praised Lovart.ai on X - AI News , hailing it as the world’s first true “Design Agent,” far more than just a simple drawing tool. This AI can think independently and fully execute a series of complex design tasks, from brand logo design and building a complete brand visual system to video ad creative and 3D model production. This undoubtedly loudly proclaims that a new AI-driven design era has arrived.
User Li Jigang shared a highly poetic and philosophical Prompt on X - AI News , designed to guide AI to embody a “language alchemist” for meticulously naming new products. The Prompt deeply emphasizes that a good name is “a container capable of holding grand dreams” and should strive for “a triple resonance between sound, form, and meaning.” The high caliber of its wording and the profound depth of its intent make it a rare work of art in the field of Prompt engineering.
If you’re eager for AI-generated images to have an amazing visual texture, then this clever trick shared on X - AI News by user Xiangyang Qiaomu is an absolute must-see. He generously shared a Prompt specifically for Claude that can consistently generate stunning, crystal-clear, light-and-shadow-intertwined 3D frosted glass card effects. Even better, he included a link to detailed instructions and impressive example images, practically holding your hand and teaching you to become an AI art master.

After “Big Tech High-Rankers,” the next status symbol that might make countless people envious could be “Independent Researcher.” User wwwgoubuli observed an interesting phenomenon on X - AI News : Many highly renowned GitHub project authors and academic bigwigs in the community seem to “vanish into thin air” in terms of their publicly published academic papers and active open-source contributions after choosing to join top tech companies like ByteDance or OpenAI. People can then only occasionally catch a glimpse of their latest research updates on these companies’ official blogs or executives’ tweets, sparking deep reflection on the relationship between open innovation and internal corporate R&D.
In the AI era, how should one choose their future professional path? A soon-to-be college freshman posted a request for help on Reddit - AI News , torn between two seemingly traditional majors: life sciences and agriculture. However, his core concern isn’t which major is currently hotter or easier to find a job in, but rather which one can better synergize and co-develop with AI technology in the future, instead of being mercilessly replaced by it. This question highlights Gen Z’s deep thinking and forward-looking planning regarding future technology and societal changes, and this piece of AI News truly gives us food for thought.
A developer excitedly launched an AI photo editor called PHOAI on Reddit - AI News . The coolest thing about this app is that it can directly transform natural language instructions like “turn me into an anime character” into stunning visual effects. Even more critically, all image processing runs efficiently on the user’s device locally, no cloud upload needed. This not only safeguards user privacy but also fully showcases the smooth experience and massive potential brought by on-device AI applications.

Want to systematically learn how to make LLMs “cite sources” and provide substantial answers? Then this new course on Retrieval-Augmented Generation (RAG) - AI News is an absolute must-not-miss. RAG technology significantly improves the factual accuracy of large model answers by intelligently retrieving and injecting relevant information from external knowledge bases before the model generates its response. It also effectively avoids costly and time-consuming model retraining processes, making it a crucial core technology for building production-grade AI applications today.
Listen to the Voice Version of AI Daily
| Xiaoyuzhou 🎙️ | Douyin 📹 |
|---|---|
| Lai Sheng Pub | Self-Media Account |
![]() | ![]() |

