v0.4.0 — Multilingual Release

The open-source Indian address parser

Parse addresses in Hindi, Tamil, Telugu, Kannada, Bengali, and Malayalam. No API keys. No rate limits. 26,711 embedded pincodes. Zero dependencies.

$ pip install bharataddress
26,711
Embedded Pincodes
6
Indic Scripts
19K+
Addresses / Second
162
Tests Passing

Real numbers, not marketing

Evaluated on 200 real Indian addresses from the public gold set. Every number here is reproducible — run evaluate.py yourself.

Approach | Pincode F1 | State F1 | City F1 | District F1 | Speed | Cost
bharataddress | 0.995 | 0.971 | 0.959 | 0.965 | 19K addr/s | Free
6-digit regex | Format only | N/A | N/A | N/A | Fast | Free
India Post API | Lookup only | Lookup only | N/A | N/A | API latency | Free
Google Maps Geocoding | High | High | High | High | API latency | ~$5/1K req

The bharataddress numbers are computed on gold_200.jsonl. The other rows describe capability class, not head-to-head scores, because those tools solve different problems.
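To illustrate why the 6-digit regex row is marked "Format only", here is a minimal sketch. It is not part of bharataddress, and the name `find_pincode_candidates` is hypothetical. The pattern finds strings shaped like a pincode but cannot say whether that pincode is assigned or which state it belongs to.

```python
import re

# Indian pincodes are six digits and never start with 0.
# The lookarounds stop the pattern from matching inside longer digit runs
# such as phone numbers.
PINCODE_RE = re.compile(r"(?<!\d)[1-9]\d{5}(?!\d)")


def find_pincode_candidates(text: str) -> list[str]:
    """Return every substring that has the shape of a pincode."""
    return PINCODE_RE.findall(text)


print(find_pincode_candidates("42 MG Road, Bangalore 560001"))  # ['560001']
print(find_pincode_candidates("Call 9876543210"))               # []
# Shape matches, but the regex cannot tell whether this pincode exists.
print(find_pincode_candidates("Nowhere 999999"))                # ['999999']
```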

Pincode
0.995
State
0.971
District
0.965
City
0.959
Building Number
0.962
Landmark
0.918
Locality
0.796
Building Name
0.679
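How evaluate.py computes these scores is not shown on this page. As a rough sketch, assuming exact-match scoring per field (a prediction counts as correct only if it equals the gold value), a per-field F1 could be computed as below. The function name `field_f1` and the two sample records are illustrative, not taken from the gold set.

```python
def field_f1(gold: list[dict], pred: list[dict], field: str) -> float:
    """Exact-match F1 for one field over paired gold/predicted records."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        g_val, p_val = g.get(field), p.get(field)
        if p_val is not None and p_val == g_val:
            tp += 1
        else:
            if p_val is not None:
                fp += 1  # predicted a wrong or spurious value
            if g_val is not None:
                fn += 1  # missed a value that is present in gold
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up records for illustration only.
gold = [
    {"pincode": "560001", "city": "Bangalore"},
    {"pincode": "110005", "city": "Delhi"},
]
pred = [
    {"pincode": "560001", "city": "Bangalore"},
    {"pincode": "110005", "city": None},  # parser found no city
]

print(field_f1(gold, pred, "pincode"))         # 1.0
print(round(field_f1(gold, pred, "city"), 3))  # 0.667
```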

Why Indian addresses break every parser

36
States & Union Territories
28 states and 8 union territories, each with different address conventions, administrative structures, and local naming patterns.
22
Official Languages
Addresses written in scripts that most parsers can't read. bharataddress handles 6 of them natively.
409
Vernacular Terms
Marg, Salai, Rasta, Veedhi: regional terms for street, area, and building, each mapped to a structured field.
🔄
City Renames & Misspellings
Gurgaon/Gurugram, Bombay/Mumbai, Madras/Chennai — phonetic matching handles official renames and common typos.
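A vernacular term mapping can be pictured as a lookup table. The sketch below is illustrative only: the four terms come from the text above, but the table layout, the field name `street`, and the function `tag_street_terms` are assumptions, not the library's internals.

```python
# Hypothetical lookup table: vernacular term -> structured field.
VERNACULAR_TERMS = {
    "marg": "street",
    "salai": "street",
    "rasta": "street",
    "veedhi": "street",
}


def tag_street_terms(address: str) -> list[tuple[str, str]]:
    """Return (token, field) pairs for tokens found in the table."""
    tagged = []
    for token in address.replace(",", " ").split():
        field = VERNACULAR_TERMS.get(token.lower())
        if field:
            tagged.append((token, field))
    return tagged


print(tag_street_terms("12 Gandhi Marg, New Delhi"))  # [('Marg', 'street')]
print(tag_street_terms("Anna Salai, Chennai"))        # [('Salai', 'street')]
```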

Supported Scripts

Hindi
गांधी मार्ग, नई दिल्ली
हिन्दी
Tamil
அண்ணா சாலை, சென்னை
தமிழ்
Telugu
గాంధీ రోడ్, హైదరాబాద్
తెలుగు
Kannada
ಎಂ.ಜಿ. ರಸ್ತೆ, ಬೆಂಗಳೂರು
ಕನ್ನಡ
Bengali
পার্ক স্ট্রিট, কলকাতা
বাংলা
Malayalam
എം.ജി. റോഡ്, കൊച്ചി
മലയാളം
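Each of these scripts occupies its own Unicode block, so a parser can tell them apart before doing anything else. The sketch below shows one way to do that; `SCRIPT_RANGES` and `detect_script` are hypothetical names, and this is not necessarily how bharataddress detects scripts.

```python
# Unicode block ranges for the six scripts listed above.
SCRIPT_RANGES = {
    "Devanagari": (0x0900, 0x097F),  # the script used to write Hindi
    "Bengali": (0x0980, 0x09FF),
    "Tamil": (0x0B80, 0x0BFF),
    "Telugu": (0x0C00, 0x0C7F),
    "Kannada": (0x0C80, 0x0CFF),
    "Malayalam": (0x0D00, 0x0D7F),
}


def detect_script(text: str) -> str:
    """Return the script with the most characters in text, or 'Latin/other'."""
    counts = {name: 0 for name in SCRIPT_RANGES}
    for ch in text:
        cp = ord(ch)
        for name, (lo, hi) in SCRIPT_RANGES.items():
            if lo <= cp <= hi:
                counts[name] += 1
                break
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "Latin/other"


print(detect_script("गांधी मार्ग, नई दिल्ली"))  # Devanagari
print(detect_script("அண்ணா சாலை, சென்னை"))      # Tamil
print(detect_script("42 MG Road, Bangalore"))   # Latin/other
```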

Three lines to structured data

Parse a single address or process thousands. Get structured fields with confidence scores and coordinates.

Parse an address
from bharataddress import parse

result = parse("42 MG Road, Bangalore 560001")

# result.pincode     → '560001'
# result.state       → 'Karnataka'
# result.district    → 'Bangalore'
# result.city        → 'Bangalore'
# result.locality    → 'MG Road'
# result.building_number → '42'
# result.latitude    → 12.9766
# result.longitude   → 77.60195
# result.confidence  → 0.9

# As a dictionary:
print(result.to_dict())
Phonetic city matching
from bharataddress import parse
from bharataddress.phonetic import normalise, fuzzy_ratio

# Official city renames
normalise("Gurgaon") == normalise("Gurugram")
# → True

normalise("Bombay") == normalise("Mumbai")
# → True

# Fuzzy matching for misspellings
fuzzy_ratio("Bengaluru", "Bangalore")
# → 1.0

# Native script (requires bharataddress[indic])
result = parse(
    "गांधी मार्ग, दिल्ली 110005",
    transliterate=True
)
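For intuition about what a fuzzy ratio measures, here is a sketch using difflib from Python's standard library. This is a swapped-in technique for illustration, not how bharataddress.phonetic is implemented, and `simple_ratio` is a hypothetical name.

```python
from difflib import SequenceMatcher


def simple_ratio(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] based on matching character blocks."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


print(simple_ratio("Bangalore", "Bangalore"))           # 1.0
print(round(simple_ratio("Banglore", "Bangalore"), 3))  # 0.941
```

Plain character similarity like this catches typos, but it scores official renames such as Bombay/Mumbai poorly, which is why renames need an alias table or phonetic normalisation on top.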

Built for real-world use cases

From logistics to fintech, teams use bharataddress to handle the messy reality of Indian addresses.

🚚
Logistics
Normalise delivery addresses for route optimisation. Handle vernacular street names, missing pincodes, and non-standard formats.
🏦
Fintech KYC
Verify customer addresses during onboarding. Extract structured components for regulatory compliance and fraud detection.
🛒
E-commerce
Auto-fill city and state from pincode. Reduce checkout friction and eliminate address typos. One field replaces three.
🏛️
Government
Parse citizen addresses for welfare schemes, tax records, and identity verification across 28 states and 8 union territories.
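The e-commerce auto-fill idea can be sketched as a table lookup. The one-row table below is a stand-in for the embedded pincode data, with its values taken from the parse example on this page. `PINCODE_TABLE` and `autofill` are hypothetical names, not part of the bharataddress API.

```python
from typing import Optional

# Stand-in for the embedded pincode data; a real table would hold every pincode.
PINCODE_TABLE = {
    "560001": {"city": "Bangalore", "state": "Karnataka"},
}


def autofill(pincode: str) -> Optional[dict]:
    """Return city and state for a pincode, or None if it is unknown."""
    key = pincode.replace(" ", "").strip()
    return PINCODE_TABLE.get(key)


print(autofill("560 001"))  # {'city': 'Bangalore', 'state': 'Karnataka'}
print(autofill("123456"))   # None
```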
Commercial Support

Enterprise-grade support

Need SLA guarantees, custom language support, or on-premise deployment? We offer commercial support for mission-critical address parsing.

🛡️
SLA Support
24-hour response; 4 hours for critical issues
🌐
Custom Languages
Add support for new Indic scripts
On-Premise
Deploy within your infrastructure

Ready to parse Indian addresses?

Install in seconds. No API keys. No rate limits. MIT licensed.

$ pip install bharataddress