AI Breakfast #18 – Notes (Nov 27, 2025)
Executive Summary
At AI Breakfast #18, our group of founders, educators, and developers discussed topics ranging from language learning and reading with AI to coding with AI assistants and turning complex documents into structured knowledge. Attendees also shared updates on their work and projects, including a German reading coach for kids, an AI speech coach for K–12 classrooms, and a tool that turns reading lists into podcasts.
Member Work and Introductions
The morning started with short introductions and a Thanksgiving-style round of gratitude for specific AI tools. One founder is building a no-code cloud app builder and said the biggest unlock has simply been everyday voice‑to‑text on the phone. Being able to speak instead of type makes it easier to capture ideas, set reminders, and draft text throughout the day.
Another attendee, a consultant and product builder working on spatial interfaces for AR and VR glasses, talked about how real-time translation has changed travel and work. They recently tried smart glasses from a Chinese hardware maker that show live subtitles in the lenses. A microphone array helps the system focus on the main speaker so the subtitles stay readable even in a noisy office.
A data science and finance professional reflected on language learning itself. They now use AI every day to study Chinese and feel it has made learning far more efficient. At the same time, they see a tension: if real-time translation becomes normal, it may reduce the need to learn new languages at all. They also see the same tension with AI‑driven scheduling and productivity tools, which can make life more organized but can also make a person feel like they are taking orders from a machine.
One founder is building an automation platform to help companies run go‑to‑market operations, and as a side project they are creating a reading app for children that gives pronunciation feedback. Another member, an AI engineer preparing to apply for policy programs in Europe and North America, shared how they are shifting from visual to audio learning. They are developing a plugin that turns written materials into spoken word, in part because of eye strain and in part because they want to keep following high‑quality research while on the move.
Several people credited modern coding assistants for changing their careers. A generalist AI builder talked about how the first usable code-completion model felt like magic and made them much more productive. A full‑stack developer shared that they only became confident writing code after trying an AI coding tool at a previous meetup. Since then, AI has become a constant partner: helping with boilerplate, catching mistakes, and even generating user interfaces with services like Vercel v0, which can propose layouts and color schemes from just a few sentences of description.
Later in the session, a founder in the education space showed an AI agent that analyzes student speeches on video. Their company builds agents for schools that assess speaking ability in the classroom, and they are already working with several schools outside China plus early pilots with institutions in China. The tool breaks a speech into many skills and gives detailed feedback that teachers can use alongside their own evaluations.
Language learning and reading with AI
Language learning and reading took up a large part of the conversation. Members described how translation and transcription tools have become part of daily life, from dictating text messages to wearing glasses that subtitle conversations. For some, these tools finally make it possible to work or socialize in languages they do not fully speak. For others, they raise a hard question: if machines can translate in real time, will fewer people invest the effort to truly learn another language?
One parent and founder walked through a reading app they are building for children learning German. The app shows short illustrated stories and listens as the child reads out loud. It scores speed, accuracy, and completeness, and even gives word‑level feedback. Today it uses a pronunciation assessment service from Microsoft that goes beyond normal speech‑to‑text and returns detailed scores. The group discussed how systems like Microsoft Reading Coach and similar tools inspired this project, but the parent wanted something that runs on a phone and fits their family’s routine.
Under the hood, the group compared different ways to get high‑quality pronunciation data. One approach is to use a general speech recognizer like Whisper and then add a separate alignment step using tools such as the Montreal Forced Aligner. Another is to rely on purpose-built services that already score phoneme‑level accuracy. Members noted that each path has trade‑offs between flexibility, cost, and engineering effort.
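As a rough illustration of the word‑level feedback idea, here is a minimal sketch in plain Python. It assumes the recognizer’s transcript is already available as a string (standing in for Whisper output) and simply aligns it against the target text; real pipelines score at the phoneme level, which this does not attempt.

```python
import difflib

def word_level_feedback(target: str, recognized: str):
    """Compare a target sentence with a speech recognizer's transcript
    and flag words the reader likely missed or mispronounced.
    Illustrative only: production systems score phonemes, not words."""
    tgt = target.lower().split()
    rec = recognized.lower().split()
    matcher = difflib.SequenceMatcher(a=tgt, b=rec)
    feedback = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            # Words read correctly (as far as the transcript shows)
            feedback.extend((w, "ok") for w in tgt[i1:i2])
        elif op in ("replace", "delete"):
            # Words altered or skipped in the reading
            feedback.extend((w, "check") for w in tgt[i1:i2])
        # "insert" means extra spoken words; ignored for this score
    accuracy = sum(1 for _, s in feedback if s == "ok") / max(len(tgt), 1)
    return feedback, accuracy
```

For example, `word_level_feedback("der kleine Hund läuft schnell", "der kleine Hund lauft")` would mark the first three words "ok" and flag "läuft" and "schnell" for review, giving an accuracy of 0.6.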
From there, the discussion moved into how people actually read. Several members talked about the difference between reading silently and reading out loud. For young learners, the group felt that speaking text is still important because it forces attention to every sound and trains the muscles involved in pronunciation. Others described speed reading strategies they had taught or learned in the past, including work influenced by educator Tony Buzan’s ideas about mind maps and reading quickly by reducing subvocalization. People who can read without “hearing a voice” in their head often move much faster through text, which led to a wider conversation about how varied internal mental experiences can be.
Coding and building with AI assistants
Another theme was how AI has changed software development. One member described the first time they used an AI coding assistant as a “magical moment” that made programming feel approachable again. Instead of searching documentation for every small question, they could describe the goal in natural language and get working code, then refine it in conversation.
A full‑stack developer shared how this shift lowered the barrier to entry for them personally. They had resisted AI tools at first, worried that relying on them would make them “stupid” or less skilled. After seeing AI complete and explain code during a previous event, they decided to lean in. Now, they use an AI‑powered editor every day, and pair that with UI‑generation tools like Vercel v0 when they need to design their own product interfaces without a dedicated designer.
Members compared different models and tools for coding in China, where connectivity and regulation shape what is actually usable. Some tools that work well outside China are now hard to access through proxies, so people experiment with alternative models and editors that can route traffic more reliably. The group’s overall sense was that understanding code and being able to read and reason about it is becoming more important than writing every line by hand. As one member put it, if you can read code and understand what it does, modern tools let you build almost anything.
OCR and unlocking archives
The group also dug into the harder side of document processing: getting reliable text out of messy PDFs. One member who has worked with OCR for many years described PDFs as “the worst standard possible for data extraction.” Some files contain clean, selectable text, while others are just scanned images. Many important documents also mix text, charts, images, and even LaTeX formulas on the same page, which makes automated processing much harder.
To handle this, one attendee has been using an open source model called dots OCR that was developed in China. Instead of just producing plain text, it segments each page into regions, labels which parts are text, images, or formulas, and returns both the content and the coordinates. They run it on Hugging Face and then wire the results into no‑code workflows using tools like n8n. The model is not fast—it can take close to a minute per page on modest hardware—but for one‑time processing of large archives, accuracy matters more than speed.
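As a sketch of how such layout output can feed a document pipeline, the snippet below turns a list of labeled page regions into ordered markdown. The region schema here (`category`, `bbox`, `content`) is an assumption for illustration, not the actual dots OCR response format.

```python
def regions_to_markdown(regions):
    """Render labeled page regions as markdown, reading top-to-bottom.
    Each region is assumed to look like:
    {"category": "text" | "image" | "formula",
     "bbox": [x0, y0, x1, y1], "content": "..."}"""
    # Sort by vertical position first, then horizontal, to approximate
    # natural reading order on a single-column page.
    ordered = sorted(regions, key=lambda r: (r["bbox"][1], r["bbox"][0]))
    lines = []
    for r in ordered:
        if r["category"] == "text":
            lines.append(r["content"])
        elif r["category"] == "formula":
            # Wrap recognized LaTeX in display-math fences
            lines.append("$$\n" + r["content"] + "\n$$")
        elif r["category"] == "image":
            x0, y0, x1, y1 = r["bbox"]
            # Placeholder link; a real pipeline would crop and save the image
            lines.append(f"![figure at ({x0},{y0})-({x1},{y1})](page_crop.png)")
    return "\n\n".join(lines)
```

A workflow tool like n8n could then take this markdown per page and assemble it into a searchable archive.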
This led to a bigger reflection on hidden knowledge. Many countries still have huge collections of paper archives, including government and historical records that are only accessible in person. Members were excited about the idea of using better OCR and layout models to unlock those materials at scale. They compared it to what happened with protein folding: what used to take one researcher years can now be run across millions of proteins with the right models and compute. In the same way, better OCR and structuring tools might let a small team process entire archives that were once only reachable through slow, individual PhD projects.
Misinformation, social media, and training data
Several members raised concerns about misinformation and the data that large models learn from. One attendee pointed out that social media platforms are already full of low‑quality or deliberately false information, including coordinated campaigns by institutions and bots. When language models scrape that content at scale, it becomes difficult for everyday users to tell which answers are grounded in solid sources and which are repeating noise.
The group talked about how model makers try to clean and filter their training data, but agreed that no process can fully remove bias or error. They compared long‑running human projects like Wikipedia with newer AI‑driven efforts such as Grok and Grokipedia‑style knowledge bases that try to rebuild reference works with more explicit policies. Several people shared stories about trying to “game” Wikipedia years ago, including attempts to sneak in jokes or invented facts and then watching how quickly the volunteer editor community corrected them.
The consensus in the room was that resolving questions of truth—especially on sensitive topics like politics or current conflicts—will always depend on process and transparency more than on any single model. What matters is how sources are chosen, how conflicts are handled, and whether readers can see how an answer was produced. AI can help scale the work of summarizing and cross‑checking sources, but it does not remove the need for judgment.
AI, education, and fundraising across borders
In the second half of the breakfast, the group spent a long time exploring one member’s startup in depth. This founder is building an AI speech coach for K–12 students. Teachers upload or record a video of a student giving a talk, and the system analyzes it along dozens of dimensions. For content, it looks for a clear central idea, use of evidence and examples, depth of analysis, and relevance. For organization, it examines structure and transitions. For language skill, it rates grammar, vocabulary, and overall English proficiency.
The tool also experiments with voice and body‑language analysis. It tracks pace, volume, pitch, and enunciation, and it detects gestures, eye contact, and movement. Today those signals are good enough to highlight strengths and weaknesses, but scoring remains tricky because the “right” style can change by context. For each major area, the app generates an “optimization pathway” that summarizes what the student is doing well and suggests concrete next steps.
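A minimal sketch of how a rubric like this might be represented and rolled up into an “optimization pathway”. Only the area and dimension names come from the discussion; the 1–5 scale, the threshold, and the averaging are assumptions for illustration.

```python
# Hypothetical rubric structure; dimension names taken from the notes.
RUBRIC = {
    "content": ["central idea", "evidence", "depth of analysis", "relevance"],
    "organization": ["structure", "transitions"],
    "language": ["grammar", "vocabulary", "proficiency"],
    "delivery": ["pace", "volume", "pitch", "enunciation", "eye contact"],
}

def optimization_pathway(scores: dict, threshold: float = 3.5):
    """Average each area's dimension scores (assumed 1-5 scale) and
    flag areas that fall below the threshold as next focus points."""
    pathway = {}
    for area, dims in RUBRIC.items():
        vals = [scores[d] for d in dims if d in scores]
        avg = sum(vals) / len(vals)
        pathway[area] = {"score": round(avg, 2), "focus": avg < threshold}
    return pathway
```

A student strong on content but rushing their delivery would then see only the delivery area flagged for focused practice.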
The startup already has paying pilot customers in North America and Southeast Asia and is working to close its first long‑term contracts in China. A key part of the value is customization: when a school’s English department adopts the tool, the team sits with the teachers to adapt the rubric and feedback prompts to their curriculum. They are also building a teacher analytics dashboard and a “speaking portfolio” for students that aggregates their practice over time.
On the business side, the founder described how hard early‑stage fundraising has been across borders. Seed investors in China are cautious about education and early revenue, and many prefer later‑stage or manufacturing‑focused bets. Investors in the US and elsewhere worry about regulatory risk and data rules when they hear that most of the team is based in China. Members suggested that the company might need to tell a bigger global story—serving schools worldwide, not just international schools in China—and that the most convincing path forward is to win a few anchor customers and show real revenue and retention.
Living and building in China’s AI ecosystem
The conversation closed with people reflecting on what it means to live and build in China’s tech ecosystem over the long term. Several attendees have been in the country for more than a decade and described how internet openness, regulation, and investment patterns have shifted since the late 2000s. They talked about the “China tax” of time spent managing VPNs and proxies just to access vital tools, contrasted with the country’s rapid progress in areas like robotics, electric vehicles, and applied AI.
Long‑term residents noted that many outside analysts left China years ago and now underestimate how far local technology has advanced. At the same time, policy changes, platform blocking, and geopolitical tension have made international fundraising and cross‑border data flows much more complex. For the newer arrivals in the room, the message was that if you can adapt past the first few years, you may find deep roots and opportunities—but also that the environment will likely keep changing.
Other Resources
- Vercel v0 (v0.dev) – AI‑powered UI builder for web apps. One member uses it to quickly generate reasonable starting points for product designs when they do not have a dedicated designer on the team.
- Microsoft Reading Coach (support article) – Reading practice tool that gives pronunciation and fluency feedback. Members saw it as a strong reference point for designing child‑friendly reading apps, but noted that platform limits led some to build their own alternatives.
- Whisper (github.com/openai/whisper) – Open‑source speech‑to‑text model. Attendees discussed using it together with other tools to get more detailed timing and pronunciation scores for language‑learning projects.
- Montreal Forced Aligner (montreal-forced-aligner.readthedocs.io) – Toolkit for aligning audio with transcripts at the phoneme level. Members mentioned it as a way to turn raw speech recognition output into word‑ and sound‑level feedback for learners.
- Vertex AI (cloud.google.com/vertex-ai) – Google Cloud’s platform for building and running AI models, including image generation. One founder uses it to generate story illustrations with a more controllable style than the default look of other image tools.
- dots OCR (github.com/YuejianLiang/dots) – Open‑source OCR and layout model that segments PDF pages into text, images, and formulas. A member found it especially useful for extracting charts and equations reliably when processing long technical documents.
- Hugging Face (huggingface.co) – Platform for hosting and running machine learning models and demos. The group used it to deploy heavy OCR models on demand so they could process large document batches without maintaining their own servers.
- n8n (n8n.io) – Workflow automation tool for connecting APIs, models, and services. One attendee wired OCR results into n8n to build end‑to‑end pipelines that turn PDFs into structured markdown and images.
- NotebookLM (notebooklm.google) – Google’s research tool for building AI‑enhanced notebooks from personal documents. Members compared notes on how to reach it from China and how it fits into their research and reading workflows.
- Google AI Studio (ai.google.dev) – Web interface and tooling for experimenting with Google’s generative models. Some attendees have explored it as a way to prototype new agents and evaluation pipelines.
- Rokid smart glasses (global.rokid.com) – AR glasses with real‑time audio capture and display. A participant used them to subtitle live conversations and was impressed by how well the microphone array focused on the intended speaker in a busy office.
- Wikipedia and Grok (wikipedia.org, x.ai) – Long‑running human‑curated encyclopedia and a newer AI‑driven knowledge system. The group contrasted Wikipedia’s volunteer‑driven editing process with AI attempts to re‑build reference works and debated how each approach handles bias and misinformation.