
Ishiki Labs
Building the Future of Multimodal AI
About
Current multimodal models can see and hear, but they talk when they shouldn't, and they can't tell whether you're speaking to them or to someone else. We are building an AI that knows when to stay silent while still understanding what's going on in your conversation, so it can assist you in real time when you do need it. Our first version, fern-0.1, provides real-time expert opinions on demand, instant task delegation, and zero interruptions, all as fast as ChatGPT Voice and Gemini Live.
Founders
Founder
Co-founder & CTO of Ishiki Labs (W26). Previously worked on multimodal AI and Orion AR glasses at Meta, and on research infrastructure at Citadel Securities.
Founder
Co-founder and CEO at Ishiki Labs (W26). Previously a Research Scientist at Meta, first on the LLaMA team training multimodal LLMs and then in Reality Labs training a video assistant for smart glasses. PhD from Purdue University with 20+ publications at top conferences such as CVPR, NeurIPS, and ICASSP.
AI Research Report
Problem & Solution
Problem / Solution Report
Problem
Current multimodal chatbots can see and hear but often interrupt at inappropriate moments, while passive note‑taking tools capture data without acting. Teams in high‑stakes settings such as sales calls, demos, or live customer escalations need an assistant that remains continuously aware of the conversation and environment, yet exercises judgment—staying silent until its input adds value and then acting autonomously in the background.
Solution – Fern
Ishiki Labs’ product Fern is a real‑time multimodal AI that continuously listens and watches, maintains long‑horizon context, filters side speech, and provides selective attention. When addressed, it offers in‑the‑moment expert opinions, delegates tasks instantly, and executes multi‑step workflows at latency comparable to live voice assistants such as ChatGPT Voice and Gemini Live. The architecture combines an advanced orchestration layer for presence with model‑level improvements for attention and coherence.
Differentiation
Unlike conventional chatbots or transcription services, Fern is “presence‑first,” avoiding unnecessary interruptions while being ready to act. The founders’ background in Meta’s smart‑glasses assistants informs a focus on low‑latency, on‑device inference and robust context management, enabling use cases such as a “meeting intern,” a humanoid robot brain, or a home/industrial guardian that monitors and alerts with contextual understanding.
Early Wedge & Value
The company’s initial market focus is on mock sales‑call coaching and live productivity assistance, where timely, unobtrusive guidance can improve performance without causing distraction. As integrations expand into calendars, CRMs, and hardware, Fern can capture tasks, trigger follow‑ups, and automate data entry, creating compounding value.
Market & Competitors
Market and Competitors Report
Market Context & Trends
Enterprise interest in AI assistants is accelerating. MarketsandMarkets projects the conversational AI market to grow from USD 17.05 billion in 2025 to USD 49.8 billion by 2032 (CAGR ≈ 19.6%). Grand View Research forecasts strong growth in call‑center AI and intelligent virtual assistants (CAGRs ≈ 24%). The shift is moving from post‑hoc analytics toward proactive, in‑moment assistance—exactly the niche Fern targets.
Competitive Landscape
- Conversation Intelligence & Sales Coaching – Gong, Chorus (ZoomInfo), Clari Copilot, Revenue.io, Avoma, Fireflies.ai, Salesloft, Outreach, People.ai, etc. These platforms dominate sales‑call capture, transcription, analytics, and increasingly real‑time coaching. They have deep integration with CRM systems and large installed bases.
- Contact‑Center & Customer‑Service AI Stacks – Vendors such as Genesys, Five9, NICE, Talkdesk and hyperscaler offerings (AWS, Microsoft, Google) provide real‑time agent assistance, routing, and analytics. They represent both competition and potential channel partners for Fern’s “agentic AI” capabilities.
- Real‑Time Multimodal Assistants – Enterprise offerings from Microsoft Copilot, Kore.ai, and other platform players are moving toward live, proactive agents with multimodal input. Fern differentiates by being “presence‑first,” delivering selective‑attention, low‑latency video/audio understanding that stays silent until needed.
Competitive Advantages
Ishiki Labs leverages the founders’ Meta experience in smart‑glasses assistants and real‑time orchestration, enabling lower latency, longer‑horizon context, and robust multimodal perception. This gives Fern an edge in scenarios where timing and context are critical (e.g., live sales calls, high‑stakes meetings).
Traction Indicators
Gartner Peer Insights and other review aggregators list the above conversation‑intelligence vendors as the dominant players, underscoring the maturity of the competitive set. Ishiki’s early focus on sales‑call coaching provides a clear wedge to acquire initial customers before expanding into contact‑center, field‑ops, and hardware‑adjacent use cases.
Total Addressable Market
Quantitative TAM Report
Market Segmentation
Ishiki Labs targets the emerging “continuous‑presence multimodal assistant” niche, which overlaps with conversational AI, contact‑center AI, and intelligent virtual assistants. Because no single syndicated category captures this exact segment, TAM is estimated by triangulating across related markets.
Conversational AI – MarketsandMarkets projects the global conversational AI market at USD 17.05 billion in 2025, growing to USD 49.8 billion by 2032 (CAGR ≈ 19.6%). This includes chatbots, voice bots, and generative agents used in sales, marketing, and internal enterprise workflows.
Contact‑Center AI – Grand View Research estimates the call‑center AI market at USD 1.99 billion in 2024, reaching USD 7.08 billion by 2030 (CAGR ≈ 23.8%). A related segment, Contact‑Center Intelligence, is sized at USD 2.46 billion in 2023 and projected to USD 11.20 billion by 2030 (CAGR ≈ 24.3%).
Intelligent Virtual Assistants – Grand View Research reports the market at USD 3.07 billion in 2023, expected to reach USD 14.10 billion by 2030 (CAGR ≈ 24.3%).
Triangulated TAM
A conservative sum of the three 2030 figures yields roughly USD 32.4 billion (7.08 + 11.20 + 14.10). Including the broader conversational AI forecast (≈ USD 50 billion by 2032) suggests a total addressable opportunity in the range of USD 30–50 billion for real‑time, multimodal assistants as they become mainstream.
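The triangulation above is simple arithmetic over the cited forecasts; a minimal sketch reproducing it, with one sanity check that a quoted CAGR is consistent with its start and end figures (all numbers come from the report text, not an independent data source):

```python
# Sketch: reproduce the top-down TAM triangulation from the syndicated
# forecasts cited above. Figures are in USD billions, taken from the
# report text, not from an independent data source.

def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two market-size estimates."""
    return (end_value / start_value) ** (1 / years) - 1

# 2030 forecasts (USD billions) for the three adjacent segments.
segments_2030 = {
    "call_center_ai": 7.08,
    "contact_center_intelligence": 11.20,
    "intelligent_virtual_assistants": 14.10,
}

triangulated_tam = sum(segments_2030.values())
print(f"Triangulated 2030 TAM: USD {triangulated_tam:.2f}B")  # ~32.38

# Sanity-check one quoted CAGR: intelligent virtual assistants,
# USD 3.07B (2023) -> USD 14.10B (2030), i.e. a 7-year horizon.
iva_cagr = cagr(3.07, 14.10, 7)
print(f"Implied IVA CAGR: {iva_cagr:.1%}")  # ~24.3%, matching the report
```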
Methodology
The estimate combines top‑down market sizing from syndicated reports with a bottom‑up wedge analysis that starts with the sales‑enablement use case (a subset of conversation intelligence) and expands into adjacent verticals such as contact centers, field operations, and wearable/robotic interfaces. Pricing, seat‑based ARPU, and penetration assumptions would refine the Serviceable Obtainable Market once product‑market fit is confirmed.
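The bottom‑up wedge described above reduces to seats × ARPU × penetration. A sketch of that calculation follows; every input (seat count, per‑seat price, penetration rate) is a hypothetical placeholder chosen only to illustrate the calculation's shape, not a figure from the report:

```python
# Sketch of the bottom-up wedge analysis described above. All inputs
# (seat count, ARPU, penetration) are hypothetical placeholders, not
# figures from the report; they only show the shape of the estimate.

def serviceable_obtainable_market(seats: int,
                                  annual_arpu_usd: float,
                                  penetration: float) -> float:
    """SOM = addressable seats x seat-based ARPU x assumed penetration."""
    return seats * annual_arpu_usd * penetration

# Hypothetical: 5M sales-enablement seats, $600/seat/year, 2% penetration.
som = serviceable_obtainable_market(seats=5_000_000,
                                    annual_arpu_usd=600.0,
                                    penetration=0.02)
print(f"Illustrative SOM: USD {som / 1e6:.0f}M")  # 5M * 600 * 0.02 = $60M
```

Refining the real SOM would mean replacing these placeholders with measured seat counts per vertical and validated pricing once product‑market fit is confirmed.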
Founder Analysis
Founders and Background Report
Amit Yadav – Co‑founder & CEO
Amit holds a PhD in AI from Purdue University and previously served as a Research Scientist at Meta. At Meta he contributed to the LLaMA team and Reality Labs, focusing on multimodal large‑language models and smart‑glasses assistants. He has authored 20+ papers at top conferences such as CVPR, NeurIPS, and ICASSP and received several fellowships and awards. His deep research experience positions him to lead product and research on continuous‑presence multimodal AI.
Robert Xu – Co‑founder & CTO
Robert brings systems‑level expertise, having built orchestration and inference infrastructure for real‑time multimodal AI at Meta and later at Citadel Securities. He worked on the Orion AR glasses project and on low‑latency AI pipelines, giving him the engineering background required to make real‑time, on‑device assistants reliable.
Joint Experience
The two met as colleagues at Meta, collaborating on the Ray‑Ban Meta assistant and other AR initiatives. Their combined strengths—Amit’s model‑level research and Robert’s systems engineering—directly address the “seam” challenges of building human‑level multimodal assistants. The company is a Y Combinator Winter 2026 batch company and lists a direct contact email for the founders.