Parse addresses in Hindi, Tamil, Telugu, Kannada, Bengali, and Malayalam. No API keys. No rate limits. 26,711 embedded pincodes. Zero dependencies.
pip install bharataddress
Evaluated on 200 real Indian addresses from the public gold set. Every number here is reproducible — run evaluate.py yourself.
| Approach | Pincode F1 | State F1 | City F1 | District F1 | Speed | Cost |
|---|---|---|---|---|---|---|
| bharataddress | 0.995 | 0.971 | 0.959 | 0.965 | 19K addr/s | Free |
| 6-digit regex | Format only | N/A | N/A | N/A | Fast | Free |
| India Post API | Lookup only | Lookup only | N/A | N/A | API latency | Free |
| Google Maps Geocoding | High | High | High | High | API latency | ~$5/1K req |
The bharataddress numbers are measured on gold_200.jsonl. Competitor rows show capability class, not head-to-head scores, because those tools solve different problems.
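The per-field scores in the table can be read as exact-match F1. The sketch below shows one common way to compute such a score. The counting rule (a wrong value counts as both a false positive and a false negative), the `field_f1` name, and the sample values are assumptions for illustration; evaluate.py may define the metric differently.

```python
from typing import Optional, Sequence


def field_f1(gold: Sequence[Optional[str]], pred: Sequence[Optional[str]]) -> float:
    """Exact-match F1 for one field over aligned gold and predicted values.

    Assumed counting rule (hypothetical, not taken from evaluate.py):
      - predicted value equals gold value      -> true positive
      - predicted value present but wrong      -> false positive
      - gold value present but missed or wrong -> false negative
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if p is not None and p == g:
            tp += 1
        else:
            if p is not None:
                fp += 1
            if g is not None:
                fn += 1

    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up sample values, not rows from gold_200.jsonl.
gold_pincodes = ["560001", "110005", "400001", None]
pred_pincodes = ["560001", "110005", "400011", None]
score = field_f1(gold_pincodes, pred_pincodes)
print(f"Pincode F1: {score:.3f}")
```

Running the same function once per field (pincode, state, city, district) would give one column of the table each.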
Parse a single address or process thousands. Get structured fields with confidence scores and coordinates.
from bharataddress import parse

result = parse("42 MG Road, Bangalore 560001")
# result.pincode → '560001'
# result.state → 'Karnataka'
# result.district → 'Bangalore'
# result.city → 'Bangalore'
# result.locality → 'MG Road'
# result.building_number → '42'
# result.latitude → 12.9766
# result.longitude → 77.60195
# result.confidence → 0.9

# As a dictionary:
print(result.to_dict())
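For the "process thousands" case, a thin loop around `parse` is usually enough. The sketch below takes the parser as an argument so that it runs even without the package installed. `parse_batch`, the `min_confidence` threshold, and the stub parser are hypothetical names introduced here, and the code assumes `to_dict()` returns keys that match the attribute names shown above (such as `confidence`).

```python
from typing import Any, Callable, Dict, Iterable, List


def parse_batch(
    addresses: Iterable[str],
    parse_fn: Callable[[str], Any],
    min_confidence: float = 0.5,  # hypothetical review threshold
) -> List[Dict[str, Any]]:
    """Parse each address and flag low-confidence rows for manual review."""
    rows = []
    for address in addresses:
        record = dict(parse_fn(address).to_dict())
        record["input"] = address
        confidence = record.get("confidence") or 0.0
        record["needs_review"] = confidence < min_confidence
        rows.append(record)
    return rows


class _StubResult:
    """Stand-in for a parse result; exposes only to_dict()."""

    def __init__(self, fields: Dict[str, Any]) -> None:
        self._fields = fields

    def to_dict(self) -> Dict[str, Any]:
        return dict(self._fields)


def _stub_parse(address: str) -> _StubResult:
    # NOT the real parser: treats a trailing 6-digit token as the pincode.
    tokens = address.split()
    last = tokens[-1] if tokens else ""
    found = last.isdigit() and len(last) == 6
    return _StubResult(
        {"pincode": last if found else None, "confidence": 0.9 if found else 0.2}
    )


rows = parse_batch(
    ["42 MG Road, Bangalore 560001", "near temple, old town"],
    _stub_parse,
)
for row in rows:
    print(row["input"], "->", row["pincode"], "| review:", row["needs_review"])
```

With the package installed, the stub can be swapped for the real parser: `from bharataddress import parse`, then `parse_batch(addresses, parse)`.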
from bharataddress import parse
from bharataddress.phonetic import normalise, fuzzy_ratio

# Official city renames
normalise("Gurgaon") == normalise("Gurugram")  # → True
normalise("Bombay") == normalise("Mumbai")  # → True

# Fuzzy matching for misspellings
fuzzy_ratio("Bengaluru", "Bangalore")  # → 1.0

# Native script (requires bharataddress[indic])
result = parse(
    "गांधी मार्ग, दिल्ली 110005",
    transliterate=True
)
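The rename and fuzzy-matching behaviour shown above can be approximated with an alias table plus a string-similarity ratio. This is a minimal sketch using the standard library's `difflib`, not bharataddress's actual implementation. `ALIASES`, `canonical`, and `similarity` are hypothetical names, and the alias table is a small illustrative subset.

```python
from difflib import SequenceMatcher

# Old name -> current official name (illustrative subset only).
ALIASES = {
    "gurgaon": "gurugram",
    "bombay": "mumbai",
    "bangalore": "bengaluru",
    "madras": "chennai",
    "calcutta": "kolkata",
}


def canonical(name: str) -> str:
    """Lowercase, trim, and map a known old name to its current name."""
    key = name.strip().lower()
    return ALIASES.get(key, key)


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] computed after alias normalisation."""
    return SequenceMatcher(None, canonical(a), canonical(b)).ratio()


same_city = canonical("Gurgaon") == canonical("Gurugram")
renamed = similarity("Bengaluru", "Bangalore")  # identical after aliasing
typo = similarity("Mumbia", "Mumbai")           # close, but not identical
print(same_city, renamed, round(typo, 2))
```

Normalising before comparing means a renamed city scores as an exact match, while a genuine misspelling still scores high enough to be caught by a threshold.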
From logistics to fintech, teams use bharataddress to handle the messy reality of Indian addresses.
Need SLA guarantees, custom language support, or on-premise deployment? We offer commercial support for mission-critical address parsing.
Install in seconds. No API keys. No rate limits. MIT licensed.
pip install bharataddress