Pillar Guide · 10 min read

AI Voice Agents for Dubai Real Estate: Cost, Setup & ROI in 2026

Up to half of property enquiries in Dubai arrive after agents have gone home. AI voice agents (the kind that genuinely converse, not robocall scripts) now handle this gap reliably in English, Arabic, Hindi, and Urdu. This is the playbook we use to set them up, including the tech comparison, pricing, and the 2–3 week launch plan.

Published: 28 April 2026 · Last updated: 28 April 2026 · Author: Autometa AI

The after-hours problem in Dubai real estate

Property enquiries in Dubai do not respect office hours. Overseas buyers send WhatsApp messages at 2 AM Gulf time. Investors in Dubai call between 8 and 10 PM. Locals scroll Bayut after dinner. Across the 11 Dubai brokerages we have audited, somewhere between 30% and 55% of inbound enquiry volume arrives outside 9 AM – 7 PM.

The standard response — "we will call you back tomorrow" — costs money. A buyer who has just enquired about a Marina apartment is shopping that night. By the next morning they have called three other brokerages. By the end of the week, the lead is gone.

Until 2024, the only options were: hire a night-shift call centre (expensive, low quality, scripted), use IVR menus (every buyer hangs up), or accept the loss. AI voice agents are now a fourth option that actually works. They sound human enough that buyers do not hang up, they speak the languages your buyers actually speak, they qualify properly, and they hand off cleanly to a human agent the next morning with full context.

From the field: On a 2-week pilot at a Dubai Marina brokerage with 22 agents, the AI voice agent handled 147 calls between 7 PM and 8 AM. 31 of those callers had not enquired through any other channel, and 9 of those 31 booked a viewing within 48 hours. At an average commission per side of AED 28,000, that single fortnight paid for the entire setup three times over.

What an AI voice agent actually does

Stripping away the marketing, here is what happens in a typical 2-minute call:

  1. Pickup within 2 rings. Greeting in the buyer's likely language (defaults to English; switches to Arabic, Hindi, or Urdu on detection).
  2. Honest disclosure. "Hi, I'm Layla, the AI assistant for [Brokerage] — I can answer questions about our listings or schedule a viewing."
  3. Qualify in 4 questions. What property are you calling about? What is your budget range? When are you looking to move or invest? Are you a UAE resident or based abroad?
  4. Send relevant info during the call. The agent fires a WhatsApp message mid-call with 3 matching listings or the brochure for the listing they enquired about. The buyer's phone buzzes while they are still talking.
  5. Schedule a viewing or hand off. Offer two viewing slots over the next 48 hours. If accepted, write to the calendar. If not, log everything to the CRM and tag for human follow-up the next morning.
  6. Polite close. Confirm the WhatsApp number, summarise next steps, end the call.

What it does not do well, in our experience: complex price negotiation, mortgage advice, anything requiring judgement on whether to push back on a buyer's expectation. Hand those to a human.

Languages: English, Arabic, Hindi, and Urdu

Dubai's buyer pool is multilingual by default. In our experience, a voice agent that only speaks English handles roughly 60% of conversations cleanly. Adding Arabic, Hindi, and Urdu pushes that above 90%.

| Language | Voice quality (2026) | Best provider | Real-estate domain understanding |
|---|---|---|---|
| English (UAE accent) | Indistinguishable from human | ElevenLabs, Vapi (PlayHT) | Excellent |
| Modern Standard Arabic | Excellent | ElevenLabs, Vapi | Strong |
| Khaleeji (Gulf) Arabic | Strong | ElevenLabs (custom voice) | Good with prompt tuning |
| Hindi | Excellent | ElevenLabs, Bland | Strong |
| Urdu | Strong | ElevenLabs | Good |
| Russian | Good but accent slips on names | ElevenLabs | Acceptable |
| Tagalog | Improving | ElevenLabs | Weak |
| Mandarin | Acceptable for short calls | ElevenLabs, PlayHT | Acceptable |

Our default for a Dubai launch: English + Arabic + Hindi + Urdu live on day one, Russian and Tagalog added in month two if call volume justifies it.

The conversation flow for property enquiries

This is the system prompt structure we use as a starting point for every Dubai brokerage. It is opinionated — the loose, "let the LLM figure it out" style of conversation flow consistently underperforms a structured one for high-stakes calls.

Identity and disclosure (5 seconds)

Open with name, brokerage, AI disclosure, and the offer to help. No small talk. UAE buyers are direct.

Qualification (45–60 seconds)

Four questions in fixed order: Which property? Budget? Timeline? UAE resident or based abroad? Adapt phrasing based on language and tone of voice, but do not skip any of the four. Each answer goes into a structured CRM field.
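One way to keep the four questions in fixed order and land each answer in a structured field is a small record that the call logic fills in as it goes. This is a minimal sketch: the field names and the `next_question` helper are hypothetical and would need mapping to your CRM's actual schema.

```python
from dataclasses import dataclass, fields
from typing import Optional

# Field names are illustrative; map them to whatever your CRM actually uses.
@dataclass
class Qualification:
    property_ref: Optional[str] = None  # Which property?
    budget_range: Optional[str] = None  # Budget, kept as the caller phrased it
    timeline: Optional[str] = None      # When they want to move or invest
    residency: Optional[str] = None     # "uae_resident" or "abroad"

def next_question(q: Qualification) -> Optional[str]:
    """Return the first unanswered field in definition order, or None when done."""
    for f in fields(q):
        if getattr(q, f.name) is None:
            return f.name
    return None

call = Qualification(property_ref="Marina 2BR", budget_range="AED 2-2.5M")
print(next_question(call))  # timeline
```

Because the order comes from the dataclass definition, the agent can rephrase a question freely without ever skipping one.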

Information delivery (30–45 seconds)

Send matching property info via WhatsApp during the call. The voice agent says: "I am sending three matching options to your WhatsApp now — you should see them in a few seconds." This is the highest-conversion moment of the call.

Booking or escalation (30 seconds)

"Would you like to book a viewing this Saturday at 11 AM or 4 PM?" If yes, write the calendar entry. If hesitant, offer "Should I have one of our human agents call you tomorrow morning?" Never pressure.

Close (10 seconds)

Confirm WhatsApp number, summarise what will happen next, polite close.

Total typical call length: 2 minutes 10 seconds. Anything longer than 4 minutes is usually the agent failing to push toward a booking — tighten the script.
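The stage timings above can be checked with a few lines of arithmetic. The sketch below sums the per-stage budgets and flags calls that run past the 4-minute mark; the dictionary keys and function names are illustrative only.

```python
# Stage time budgets in seconds (min, max), taken from the flow above.
STAGE_BUDGETS = {
    "identity_and_disclosure": (5, 5),
    "qualification": (45, 60),
    "information_delivery": (30, 45),
    "booking_or_escalation": (30, 30),
    "close": (10, 10),
}
REVIEW_THRESHOLD_S = 4 * 60  # calls longer than 4 minutes get a script review

def planned_range(budgets):
    """Sum the minimum and maximum budgets across all stages."""
    lo = sum(b[0] for b in budgets.values())
    hi = sum(b[1] for b in budgets.values())
    return lo, hi

def needs_review(duration_s: float) -> bool:
    return duration_s > REVIEW_THRESHOLD_S

print(planned_range(STAGE_BUDGETS))          # (120, 150)
print(needs_review(130), needs_review(260))  # False True
```

The planned range of 120 to 150 seconds brackets the typical 2 minutes 10 seconds quoted above.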

Integration with property listings (Bayut, Property Finder)

The voice agent is only as good as its knowledge of your inventory. Three integration patterns in priority order:

Pattern A — Live from your CRM

Best when your CRM is the source of truth for inventory. The voice agent queries the CRM API in real time when a buyer asks about a specific property. Pros: always up to date. Cons: latency on the call (must keep query under 800ms).
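A simple way to enforce that latency ceiling is to give the lookup a hard time budget and fall back gracefully when it is exceeded. The sketch below uses only the Python standard library; `fast_crm` and `slow_crm` are hypothetical stand-ins for a real CRM client, and the fallback behaviour is an assumption rather than a recommendation from any vendor.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

CRM_BUDGET_S = 0.8  # the 800 ms ceiling from the text
_pool = ThreadPoolExecutor(max_workers=4)

def lookup_with_budget(fetch, listing_id, budget_s=CRM_BUDGET_S):
    """Run fetch(listing_id) but stop waiting after budget_s seconds.

    Returns the listing, or None so the agent can fall back to
    "I'll send the details to your WhatsApp" instead of going silent.
    """
    future = _pool.submit(fetch, listing_id)
    try:
        return future.result(timeout=budget_s)
    except FutureTimeout:
        future.cancel()  # best effort; an already-running thread is not interrupted
        return None

# Hypothetical stand-ins for a real CRM client.
def fast_crm(listing_id):
    return {"id": listing_id, "status": "available"}

def slow_crm(listing_id):
    time.sleep(1.0)  # simulates a CRM that answers too slowly
    return {"id": listing_id, "status": "available"}

print(lookup_with_budget(fast_crm, "L-101"))
print(lookup_with_budget(slow_crm, "L-102"))  # None
```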

Pattern B — Nightly sync to a vector database

Pull all active listings into a vector store (Pinecone, Supabase pgvector) every night. The voice agent runs a semantic search against that store during the call. Pros: fast, cheap, supports fuzzy queries ("apartments with a sea view in JBR under 4 million"). Cons: data can be up to 24 hours stale.
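To show the shape of that lookup, here is a toy in-memory version. It is a sketch under loud assumptions: `embed` is a bag-of-words stand-in for a real embedding model, the listings are invented, and a production setup would call the vector store's own query API instead. The point it illustrates is that the fuzzy part ("sea view in JBR") goes through similarity ranking while the budget is applied as a hard filter.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for a real embedding model: lower-cased word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical listings, as they might look after the nightly sync.
LISTINGS = [
    {"id": "A1", "price_aed": 3_600_000, "text": "2 bed apartment JBR full sea view"},
    {"id": "A2", "price_aed": 5_200_000, "text": "3 bed apartment JBR sea view high floor"},
    {"id": "A3", "price_aed": 2_100_000, "text": "1 bed apartment Business Bay canal view"},
]
INDEX = [(item, embed(item["text"])) for item in LISTINGS]

def search(query: str, max_price_aed: int, top_k: int = 3):
    """Rank by text similarity, with the budget applied as a hard filter."""
    q = embed(query)
    hits = [
        (cosine(q, vec), item)
        for item, vec in INDEX
        if item["price_aed"] <= max_price_aed
    ]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [item["id"] for score, item in hits[:top_k] if score > 0]

print(search("apartments with a sea view in JBR", max_price_aed=4_000_000))  # ['A1', 'A3']
```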

Pattern C — Bayut/Property Finder XML feed

For brokerages that publish primarily through portals, ingest the same XML feed they push to Bayut/PF. Pros: matches what the buyer saw on the portal. Cons: requires paid portal tier with feed access.
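Ingesting a feed like this is mostly a parsing job. The sketch below uses Python's standard `xml.etree.ElementTree`; the element names (`listing`, `reference`, `title`, `price`) are placeholders, since each portal defines its own schema and the real tags should be taken from the feed specification you are given.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed fragment. Real portal feeds use their own element names,
# so treat every tag below as a placeholder.
SAMPLE_FEED = """
<listings>
  <listing>
    <reference>MAR-2041</reference>
    <title>2BR in Dubai Marina</title>
    <price currency="AED">2450000</price>
  </listing>
  <listing>
    <reference>JBR-1187</reference>
    <title>3BR in JBR</title>
    <price currency="AED">5200000</price>
  </listing>
</listings>
"""

def parse_feed(xml_text: str) -> list[dict]:
    """Turn the feed into plain dicts the voice agent's tools can query."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for node in root.iter("listing"):
        rows.append({
            "reference": node.findtext("reference", default="").strip(),
            "title": node.findtext("title", default="").strip(),
            "price_aed": int(node.findtext("price", default="0")),
        })
    return rows

for row in parse_feed(SAMPLE_FEED):
    print(row)
```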

The voice agent should also know which listings are no longer available (sold, under offer, or withdrawn) and avoid offering them. The most expensive errors we have seen are voice agents booking viewings on units that sold the previous week.
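A guard like the following, run before any listing is offered or any viewing is booked, is one way to enforce this. The status strings are assumptions; use whatever values your CRM or feed actually emits.

```python
# Status values are illustrative; use whatever your CRM or feed emits.
BLOCKED_STATUSES = {"sold", "under_offer", "withdrawn"}

def can_offer(listing: dict) -> bool:
    """True only for listings with a known, non-blocked status.

    A missing status is treated as blocked: declining to offer a unit is
    cheaper than booking a viewing on one that is already gone.
    """
    status = listing.get("status")
    return status is not None and status not in BLOCKED_STATUSES

inventory = [
    {"id": "A1", "status": "available"},
    {"id": "A2", "status": "sold"},
    {"id": "A3", "status": "under_offer"},
    {"id": "A4"},  # status missing from the sync
]
print([item["id"] for item in inventory if can_offer(item)])  # ['A1']
```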

Tech comparison: Vapi vs Retell AI vs ElevenLabs vs Bland

The space moves fast. As of Q2 2026, four providers handle the bulk of production AI voice deployments:

| Provider | Per-minute cost | Voice quality | Latency | UAE phone numbers | Best for |
|---|---|---|---|---|---|
| Vapi | $0.10–$0.18 | Excellent (PlayHT, ElevenLabs) | ~600ms | Yes via Twilio | Most Dubai brokerages |
| Retell AI | $0.07–$0.16 | Excellent | ~500ms | Yes via Twilio | Cost-sensitive, simple flows |
| ElevenLabs ConvAI | $0.10–$0.22 | Best in class | ~800ms | Yes | Multilingual (Arabic, Hindi, Urdu) |
| Bland AI | $0.09 | Good | ~400ms | Yes | High-volume outbound dialing |

Our default for Dubai brokerages

Vapi + ElevenLabs voices. Vapi handles the orchestration (LLM, tools, telephony, transcription), and ElevenLabs provides the voice (especially for Arabic and Hindi, which are noticeably better there than on PlayHT). Twilio supplies the UAE phone number. This stack handles 95% of Dubai real estate use cases out of the box.

Use Bland instead if you need outbound dialing at scale (cold calling lists). Use Retell if budget is tight and you can live with English-only.

Cost breakdown

Real numbers from a 22-agent Dubai brokerage, monthly:

| Cost line | Rate / monthly cost | Notes |
|---|---|---|
| Vapi platform fee | $0–$10 | Free under 1,000 minutes, then minimal |
| LLM (GPT-4o-mini or Claude Haiku) | ~$0.02/min × 800 min | $16/month at typical volume |
| ElevenLabs voice synthesis | ~$0.06/min × 800 min | $48/month |
| Twilio UAE phone number | $1.15/month + $0.013/min | ~$12/month total |
| Transcription (Deepgram) | ~$0.005/min × 800 min | $4/month |
| Monthly run cost | $80–$130/mo | For 800 minutes of conversation |
| One-time setup | $1,800–$4,500 | Conversation design, integrations, testing |

The break-even maths: one recovered viewing that converts to a sale pays for far more than a month of run cost, because a single commission is many times the $80–$130 monthly bill. Most deployments we run reach that point within the first three weeks.
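The arithmetic behind the cost lines can be reproduced in a few lines. This sketch assumes the 800-minute volume and per-minute rates quoted above, the AED 28k average commission from the pilot, and the dirham's peg of 3.6725 to the US dollar. The itemised lines sum to roughly $80–$90, so the $130 upper bound quoted above presumably leaves headroom for busier months.

```python
MINUTES = 800  # monthly conversation volume from the cost breakdown

line_items = {
    "llm": 0.02 * MINUTES,
    "voice_synthesis": 0.06 * MINUTES,
    "twilio_number": 1.15 + 0.013 * MINUTES,
    "transcription": 0.005 * MINUTES,
}
usage_total = sum(line_items.values())
platform_fee_low, platform_fee_high = 0, 10
low = usage_total + platform_fee_low
high = usage_total + platform_fee_high

AED_PER_USD = 3.6725  # dirham peg to the US dollar
commission_usd = 28_000 / AED_PER_USD  # average commission per side from the pilot

print(f"itemised run cost: ${low:.2f} to ${high:.2f} per month")
print(f"one commission: about ${commission_usd:,.0f}")
print(f"months of run cost covered by one commission: about {commission_usd / high:.0f}")
```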

Setup timeline: 2–3 weeks from kickoff to live calls

This is the timeline we run on every voice agent deployment for a Dubai brokerage:

Week 1 — Voice, script, knowledge base

  • Choose voice (English, Arabic, Hindi, Urdu) — listen to 4 candidates per language, pick the warmest.
  • Write conversation flow (qualification questions, escalation rules, end-call wording).
  • Build knowledge base from current inventory (CRM API or nightly sync).
  • Define the no-go list: what the agent should refuse to discuss (price negotiation, legal advice, personal opinions on areas).

Week 2 — Integrations and testing

  • Wire up Twilio UAE number, Vapi orchestration, ElevenLabs voice.
  • Connect to CRM (write call logs and outcomes), WhatsApp (send listings during call), calendar (book viewings).
  • Internal testing — 50 simulated calls covering edge cases (off-plan, rentals, tenancy, mortgage queries).
  • Compliance review — TDRA recording disclosure, opt-out wording, data residency check.

Week 3 — Soft launch and tune

  • Route only after-hours calls (7 PM – 8 AM) to the agent. Daytime calls still go to humans.
  • Listen to every call for the first 50. Tune script for the 5–10 things you will inevitably get wrong.
  • Expand to weekend coverage if quality is good.
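The after-hours routing rule is easy to get wrong because the window wraps past midnight. A minimal sketch of the check, assuming the 7 PM to 8 AM window above and Gulf Standard Time as a fixed UTC+4 offset (the UAE does not observe daylight saving); the function name and return values are illustrative, and in practice the rule would live in your telephony routing configuration.

```python
from datetime import datetime, time, timedelta, timezone

GULF = timezone(timedelta(hours=4))  # Gulf Standard Time, no daylight saving
AI_START = time(19, 0)  # 7 PM
AI_END = time(8, 0)     # 8 AM

def route(call_time: datetime) -> str:
    """Return "ai" for after-hours calls and "human" for daytime calls."""
    local = call_time.astimezone(GULF).time()
    after_hours = local >= AI_START or local < AI_END  # window wraps past midnight
    return "ai" if after_hours else "human"

print(route(datetime(2026, 4, 28, 22, 30, tzinfo=GULF)))         # ai
print(route(datetime(2026, 4, 28, 11, 0, tzinfo=GULF)))          # human
print(route(datetime(2026, 4, 28, 23, 0, tzinfo=timezone.utc)))  # ai (3 AM in Dubai)
```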

Month 2+ — Optimisation

  • A/B test voices and scripts.
  • Add languages based on actual call demographics.
  • Consider expanding to outbound (callbacks, viewing reminders).

Compliance: TDRA, recording consent, data residency

Three regulatory points to get right for a Dubai deployment:

TDRA disclosure

UAE TDRA guidance on AI in customer service is still maturing, but the prevailing best practice is: disclose at the start of the call that the caller is speaking with an AI. We embed this in the opening line ("I'm Layla, the AI assistant for..."). This has not measurably hurt conversion in our deployments.

Call recording consent

Every call is recorded for quality and CRM logging. UAE law requires consent for recording. The opening line should include "this call may be recorded for quality" — stitched naturally into the greeting.
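Since both the AI disclosure and the recording notice live in the greeting, a small pre-launch check can confirm that neither gets lost when the script is edited or translated. This is a sketch only: the wording and the phrase list are illustrative, and it is a wording check, not a substitute for legal review.

```python
def opening_line(agent_name: str, brokerage: str) -> str:
    """Build a greeting that carries both disclosures."""
    return (
        f"Hi, I'm {agent_name}, the AI assistant for {brokerage}. "
        "This call may be recorded for quality. "
        "I can answer questions about our listings or schedule a viewing."
    )

# Phrases every English greeting variant must contain (illustrative wording).
REQUIRED_PHRASES = ("AI assistant", "may be recorded")

def missing_disclosures(greeting: str) -> list[str]:
    """Return any required phrase that is absent from the greeting."""
    lowered = greeting.lower()
    return [p for p in REQUIRED_PHRASES if p.lower() not in lowered]

greeting = opening_line("Layla", "Example Brokerage")
print(missing_disclosures(greeting))                           # []
print(missing_disclosures("Hi, this is Layla. How can I help?"))  # ['AI assistant', 'may be recorded']
```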

Data residency

Vapi, ElevenLabs, and Twilio all process call audio outside the UAE (typically US or EU). For brokerages with strict UAE data residency requirements (rare in independent brokerages, common in developer-affiliated brokerages working with government inventory), this is a problem. Workarounds: deploy on an AWS Middle East region (Bahrain, or the UAE region if data must stay in-country) with self-hosted Whisper for transcription, or use Etisalat's enterprise voice platform, which is much more expensive but UAE-resident.

When NOT to use an AI voice agent

Honest answer: AI voice agents are not the right fit for every brokerage. Skip the deployment if:

  • You have fewer than 50 inbound calls per month total. The setup cost will not recover.
  • Your average property price is over AED 50M (luxury / palace tier). Buyers at that level expect to speak to a human; AI feels off-brand.
  • You sell almost entirely through referral with no public phone number on listings. Voice agents work best for portal-driven inbound.
  • Your inventory is not in a CRM. The voice agent needs structured access to listings; if your inventory lives in a human agent's head, the voice agent has nothing to retrieve.

Frequently asked questions

What does an AI voice agent for real estate actually do?
It picks up calls (typically after-hours), greets in the buyer's language, asks qualifying questions, sends matching listings via WhatsApp during the call, and either schedules a viewing or hands off to a human agent the next morning.
Can an AI voice agent speak Arabic?
Yes. ElevenLabs and Vapi both support Modern Standard Arabic and Gulf dialects with realistic pronunciation. Hindi and Urdu support is also strong.
How much does an AI voice agent cost for a Dubai brokerage?
Per-minute infrastructure: $0.07–$0.22 depending on provider. A typical Dubai brokerage spends $80–$300/month on infrastructure, depending on call volume, plus $1,800–$4,500 in one-time setup.
Will buyers know they are talking to an AI?
Yes, and they should. Disclose at the start. In our deployments, transparency has not hurt conversion.
How long does setup take?
2–3 weeks for a focused build: Week 1 voice/script/knowledge base, Week 2 integrations, Week 3 soft launch on after-hours only.
Can the voice agent handle outbound calls (callbacks, reminders)?
Yes — particularly useful for viewing reminders ("your viewing tomorrow at 4 PM"), follow-up after a viewing, and reactivating dormant leads. Outbound has higher TDRA scrutiny — get explicit opt-in first.

Want a voice agent live in 2–3 weeks?

We build the conversation flow, choose and test the voice, integrate with your CRM and WhatsApp, get the TDRA disclosures right, and run the soft launch with you. Book a free 30-minute audit and we will scope the build and send a one-page plan.
