[Feature Request] Central Country/State/City database with Multilingual support

536 views
Skip to first unread message

Dmitry A.

unread,
Jan 18, 2010, 5:02:39 PM1/18/10
to In-Portal Development Team
Hey guys,


Have have came across major improvement and at the same time feature
that will benefit all users.

We need to refactor how we currently store and use Countries and
States so it's:

1. All data (countries/states/cities) and their translations are
stored in it's own table - independent from current Phrases table
(quite big) in order to escape table JOINs.

2. Add functionality to include States/Provinces and Cities for all
Countries world wide - not only US and Canada as we have now.

3. New Sections to Manage all of it (Countries/States/Cities),
including relations and translations.

4. Connect all 3 fields (country, state, city) to ALL forms that are
using these now + add auto-complete drop-down for the City field for
selected Country/State combination.

5. Create the ability to import States and Cities data for selected
Countries right in the In-Portal Admin via web-service running on In-
Portal.com. This is quite critical since initially we'll be shipping
only US and Canada States and Cities. While complete database is about
3 million records.


Let me know if I am missing anything here!


PS.

By th way, we are already in the process of working on In-Portal 5.1.0
which will include lots of good stuff.

You can checkout the list blow. While it's not final, I think it's
most of it, though

http://tracker.in-portal.org/search.php?project_id=1&sticky_issues=on&target_version=5.1.0&sortby=last_updated&dir=DESC&hide_status_id=90


DA.

Erik Snarski

unread,
Feb 5, 2010, 8:28:53 AM2/5/10
to In-Portal Development Team
With multilanguage tables there is such problem that at addition/
removal of languages from PHP is started ALTER TABLE SQL to add/remove
corresponding columns. If all cities will be in one table such SQL
will be executed very long (about10 minutes). For 10 minutes in the
majority of systems the server will stop PHP-script by timeout, and
this is inadmissible.

I offer:

1) Store cities not in one table, but in several. For example, on one
for everyone state and for everyone country (in case there are no
states at country). That is, in about 4500 tables with columns

CityId
Name - multilanguage field, required

And with table names:
1.1) cities_US_IL - cities of state Illinois
1.2) cities_SG - cities of country Singapore (in Singapore there are
no states)

2) Make non-standard unit "cities", adhered not to one table, but to
group of tables, and in a concrete case, depending on the parameters
(country, state), choosing one table from group and working with it in
the standard way.
3) Make changes in core unit "languages" and in multilanguage_helper,
3.1) that at addition/removal languages changes in multilingual tables
structure became by repeated start of a PHP-script with use
AjaxProgressBar (for example to process on 10 tables for one loop)
3.2) that the non-standard cities unit was considered, and all its
tables instead of one have been changed
4) Store countries in the separate table with structure

CountryId
Name - multilanguage field, required
Code - a 2-symbolical code, unique, required
Code3 - a 3-symbolical code

And create unit "countries" for manipulations with countries. At new
соuntry creation it is necessary to create the table for cities
automatically. At change of field Code - rename tables with cities of
given country.

5) Store states in the separate table with structure

StateId
Name - multilanguage field, required
Country - a 2-symbolical code, required, related to Country->Code
Code - a 2-symbolical code, required

Each Country + Code combination must be unique in this table.

And create unit "states" for manipulations with states. At new state
creation it is necessary to create automatically the table for cities.
At change of field Code - rename the table with cities of given state.

Dmitry A.

unread,
Feb 6, 2010, 2:11:28 PM2/6/10
to In-Portal Development Team
Hi Erik,


Thanks for your input and great ideas on implementation.

I personally think this is the right direction, but still needs to
fine tuning before we finalize and file this as a feature request.

Some of my comments follows here:

1. I believe it's a bit too MANY tables to have just for this
functionality. What if we start with listing actual numbers of
Countries, States and Cities we have so we evaluate and figure out the
amount of data we are dealing with. While I am sure you already have
this information it's still not full clear to us.

2. What's your take (opinion) on the Import part of this data. Do you
agree it's not really necessary to have all Ci

3. I was giving a second thought of having City names as Multilingual
field. Just imaging amount of text to be translated - I don't know if
this will ever happen at all. Can you agree that English version
probably won't be changed at all. Yes, I think it's possible for
Countries and I think we can actually do a trick and have Country
names by default entered in it's local translation (to be honest not
sure if it's that important and will matter too).


Please let me know your thoughts.

Others (EVERYONE) please also share you mind here!

DA.

Alexander Obuhovich

unread,
Feb 6, 2010, 2:25:15 PM2/6/10
to in-por...@googlegroups.com
Yes, I'm also interested in numbers and how much exactly are cities inside each country and each state.

--
You received this message because you are subscribed to the Google Groups "In-Portal Development Team" group.
To post to this group, send email to in-por...@googlegroups.com.
To unsubscribe from this group, send email to in-portal-de...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/in-portal-dev?hl=en.




--
Best Regards,

http://www.in-portal.com
http://www.alex-time.com

Phil

unread,
Feb 7, 2010, 4:05:55 PM2/7/10
to In-Portal Development Team
Hello,

I bring my tought to your discussion.

Since now, I never had a customer who needed all countries for
registration/shipping, and I've always deleted all countries except
the 2-3 they needed.

I'll let you define the best way to implement his in DB, my idea is to
be able to import .csv files, this way we will benefit of :

- adding/deleting countries easily without putting hands in DB

- lightening the DB when only few countries is needed (sometimes only
1 is requested for a national-wide business)

- giving to community updated files on in-portal.com, as well as new
country files (with more details)

I wait your opinion !

Phil.

> > in-portal-de...@googlegroups.com<in-portal-dev%2Bunsu...@googlegroups.com>

Erik Snarski

unread,
Feb 8, 2010, 10:41:59 AM2/8/10
to In-Portal Development Team
Lists of the countries and states with quantity of cities I send for
Alex and Dmitry by e-mail.

What does "too many tables" mean? As far as I know in MySQL there is
no restriction on quantity of tables in a database.

Yes, probably, it is possible to add in table "Countries" field
"NeedCities". If NeedCities = true, then system will manage tables
with cities of this Country. If NeedCities = false, then system will
work without requesting those tables.

I agree with Phil that CSV-import of cities is necessary. On the in-
portal.com site I suggest to make the catalog for downloading of
corresponding CSV-files.

Files I suggest to call using value "Locale" from "Configuration -
Language Packs" section. For example, USA cities in Russian will be in
a US-cities-ru-RU-locale.csv file. At import corresponding tables of
cities (if they still are not present) will be created, and also check
probably is necessary that the administrator could not import
secondary language translations before Primary Language translation is
imported. In CSV-files at such approach enough two fields

1) CityId - To attribute Secondary Translations to the cities created
during import Primary Translation.
2) Translation - The city name in "locale" language.

Erik Snarski

unread,
Feb 9, 2010, 3:58:52 AM2/9/10
to In-Portal Development Team
There is still such idea - not to include at all in In-Portal the
table with cities, and instead to make common web service which gives
out city suggestions in the necessary language, and in In-Portal to
make only references to this service.

Alexander Obuhovich

unread,
Feb 9, 2010, 4:12:41 AM2/9/10
to in-por...@googlegroups.com
We also have imagined such idea with Dmitry, but maybe we forgot to file it here.

--
You received this message because you are subscribed to the Google Groups "In-Portal Development Team" group.
To post to this group, send email to in-por...@googlegroups.com.
To unsubscribe from this group, send email to in-portal-de...@googlegroups.com.

For more options, visit this group at http://groups.google.com/group/in-portal-dev?hl=en.

Dmitry A.

unread,
Feb 11, 2010, 11:31:42 AM2/11/10
to In-Portal Development Team
Hi guys,

1. Let's describe that scenario with State and City here too so we can
fully evaluate this approach.

2. Erick if possible would you please list here the following:

a. Total Number of Countries.
b. Total Number of States/Provinces (sum all Counties).

- Also are there any Countries without State/Province (how many if you
know?).
- Also are there any State/Province without Cities (how many if you
know?).

Example in the list you sent:
AO Luanda 0
BD Khulna 0


c. Total Number of Cities (sum all Countries).

Thanks!

DA.

On Feb 9, 3:12 am, Alexander Obuhovich <aik.b...@gmail.com> wrote:
> We also have imagined such idea with Dmitry, but maybe we forgot to file it
> here.
>
>
>
>
>
> On Tue, Feb 9, 2010 at 10:58 AM, Erik Snarski <s...@jamajo.lv> wrote:
> > There is still such idea - not to include at all in In-Portal the
> > table with cities, and instead to make common web service which gives
> > out city suggestions in the necessary language, and in In-Portal to
> > make only references to this service.
>
> > --
> > You received this message because you are subscribed to the Google Groups
> > "In-Portal Development Team" group.
> > To post to this group, send email to in-por...@googlegroups.com.
> > To unsubscribe from this group, send email to

> > in-portal-de...@googlegroups.com<in-portal-dev%2Bunsubscribe@goog legroups.com>

Erik Snarski

unread,
Feb 12, 2010, 11:23:14 AM2/12/10
to In-Portal Development Team
Total Number of Countries = 242
Total Number of States/Provinces = 4258

There are countries without States/Provinces = 49
There are States/Provinces without Cities = 717

The scenario I imagine - we will make web service which gives out city
suggestions and we will set it's URL as default value in In-Portal
configuration. If our web service seems to someone not absolutely
convenient - he can download our web service's source code, put it on
the his own server, modernise (for example, to add translations of
cities into languages which are not present in our web service).

Erik Snarski

unread,
Feb 15, 2010, 8:55:28 AM2/15/10
to In-Portal Development Team
So,

1) in In-Portal project

1.1) Store countries in the separate table with structure


CountryId
Name - multilanguage field, required
Code - a 2-symbolical code, unique, required
Code3 - a 3-symbolical code

IsInUse - tityint(1) - switch on or off countries. If country is off,
then it is not shown in front-end dropdowns.


And create unit "countries" for manipulations with countries.


1.2) Store states in the separate table with structure


StateId
Name - multilanguage field, required
Country - a 2-symbolical code, required, related to Country->Code
Code - a 2-symbolical code, required


Each Country + Code combination must be unique in this table.


And create unit "states" for manipulations with states

1.3) In "Countries" and "States" sections make button "Import
Translation", supposing csv file upload.
CSV file name must have part, which identify translation language.
For example Russian language:
1.3.1) states - states-ru-RU-locale.csv
1.3.2) countries - countries-ru-RU-locale.csv

1.4) Add configuration variable "Cities Suggestion WEB service URL",
by default pointing on In-Portal site's web service. If this variable
is not empty, then on city field input suggestions will be loaded by
ajax.

1.5) Change all existing address input forms to use new countries/
states data structure


2) Make In-Portal Customization "City Suggestions WEB Service"

2.1) Store cities not in one table, but in several. For example, on


one
for everyone state and for everyone country (in case there are no
states at country). That is, in about 4500 tables with columns


CityId
Name - multilanguage field, required


And with table names:
2.1.1) cities_US_IL - cities of state Illinois
2.1.2) cities_SG - cities of country Singapore (in Singapore there
are
no states)


2.2) Make non-standard unit "cities", adhered not to one table, but


to
group of tables, and in a concrete case, depending on the parameters
(country, state), choosing one table from group and working with it
in
the standard way.

2.3) Make City Suggestions interface. Sample request to get cities of
Illinois in Russian, started with 'ch':
http://project.path/get_cites.html?locale=-ru-RU&country=US&state=IL&city=ch
Totally, there may be 4 parameters in such request:
2.3.1) locale - define language, required
2.3.2) country - define country, required
2.3.3) state - define state, required for countries, which have
states, not used for contries which have no states
2.3.4) city - symbols, with which city name starts, required/

Alexander Obuhovich

unread,
Feb 15, 2010, 9:33:09 AM2/15/10
to in-por...@googlegroups.com
Prefix names must be "country" instead of "countries", "state" instead of "states" and so on. Why link state to country by county iso code, when we already have it's ID. I propose we link by IDs when possible. Table naming seems ok, for country without states it would be "Cities_LVA_ANY" and with states "Cities_USA_IL". Also I propose to use 3-symbol ISO code instead of 2-symbol ISO code in table names.

About city suggestion:
Of course we shouldn't pass locale to url, because we already know it from current site language. Also this all should be and event, like OnSuggestCities that will return data in JSON format. Here also we should use 3-symbol ISO code, as we use it in all in-portal modules already.

Field "IsInUse" of course should be named "Enabled" or "Status" as usual in In-Portal.


1.4) We won't be loading city suggestions from other site, so maybe you thought about url to http://www.in-portal.com site to script, which will return that CSV file?

Erik Snarski

unread,
Feb 15, 2010, 10:41:49 AM2/15/10
to In-Portal Development Team
Sure, we may use 3-symbol codes instead of 2-symbol codes. In this
case additional time will be required for data convertation, because
in existing "cities" table 2-symbol codes are used now. Also, 2-symbol
codes are preferable, because they are used in site domains, so, if I
want to get all "russians" from PortalUser table, I need know only one
code - "RU", to write SQL

SELECT Email
FROM inp_PortalUser WHERE Country='RU'
OR RIGHT(Email,3) = '.ru'

In other case, there additional time required to find additional 3-
symbol ISO code.


Why relation between states and countries by country code is
preferable. This is so because in address fields are stored country
codes, not country IDs. For example, in PortalUser record we can see
"USA", not 225, and then, if we need get all states for this country,
relation between states AND countries by ID means excessive join in
SQL request. Also, it is more convenient to work directly with DB data
(for example, in PMA) when fields contain reasonable values, not more
abstract codes.


1.4) yes, by default there must be URL to www.in-portal.com site, but,
in some In-Portal customization may be special conditions and then
www.in-portal.com site's service may be replaced to fit them.
Developer of this customization may create another cities suggestion
service, install it on other domain and join it by change this one
paremeter in In-Portal configuration.

Alexander Obuhovich

unread,
Feb 16, 2010, 10:15:04 AM2/16/10
to in-por...@googlegroups.com
All except 1st paragraph ok. User countries are stored using 3-symbol ISO code on all projects, where user can specify country, no cities database should be converted to use 3-symbol country code.

Dmitry Andrejev

unread,
Feb 24, 2010, 1:17:54 PM2/24/10
to in-por...@googlegroups.com
Hey mates,

Here is a quick list of Variables and basic process flow we are looking to have with this functionality:

1. Country/State/City Suggestion service.

To enable this we need a variable - I suggest to have variable named = EnableAddressSuggestions ( translation - "Enable Address Suggestions (ie. Registration form)" )

System will check whether to use current (OLD fashion) style or NEW on User Registration when process goes as user selects:

- Country
- State/Province
- City (suggestions pulled from In-Portal web service)
- Zip
- Street and so on

On form submission:

a. OLD style -- validating State/Country match after submissions. Note: no requests to In-Portal web-service here.

b. NEW style -- no validation as user gets suggestions. Additionally will be getting GPSLon and GPSLat of the picked City (along with the City names) and stored in PortalUser table (3 New fields are added GPSLon, GPSLat, GPSForce - in case if prev. two values should NOT be overwritten by automatic Google Map service described below)


2. In-Portal  Suggestion Service.

We'll need 2 new variable for this:

a. InPortalWebServiceApiUrl - text box for URL (ie. http://api.in-portal.org/web-service.php )
b. InPortalWebServiceApiKey - text box with 10 digits key - number only (for indexing speed)

We'll start with simple - In-Portal web-service will do City suggestions, but later will be adding more functionality to this. Key will be generated under In-Portal user account so we can make sure we don't get spamed from outside.

Example of Request:


Example of Successfull Return: PLAIN format with City Names + their GPSLon, GPSLat.

NOTE: Please consider may be should do this as SOAP server/client if it will: SPEED UP the process and allow us to MORE THINGS effectively in the future.



3. Google Address Mapping Service.

This service will allow to submit address to Google and get exact GPSLon and GPSLat based on FULL address. If enabled will overwrite previously saved values with new (for correct prefix) of Lat and Lon turned by Google.

This service will be depended on Google Maps API Key. We already have a variable storying it.


Some of my final notes are that it's critical to have this suggestion functionality:

a. EASY pluggable for ANY prefix/unit (not just User Registration process). 
b. Admin Interfaces should be properly working everywhere.


Please review and post you notes here.

Once we all agree on the plan of action we'll be creating a Feature Request in Issue Tracker for this!

DA.

PS. Huh - lots of good work here mates!



On Feb 16, 9:15 am, Alexander Obuhovich <aik.b...@gmail.com> wrote:
> All except 1st paragraph ok. User countries are stored using 3-symbol ISO
> code on all projects, where user can specify country, no cities database
> should be converted to use 3-symbol country code.
> On Mon, Feb 15, 2010 at 5:41 PM, Erik Snarski <s...@jamajo.lv> wrote:
> > Sure, we may use 3-symbol codes instead of 2-symbol codes. In this
> > case additional time will be required for data convertation, because
> > in existing "cities" table 2-symbol codes are used now. Also, 2-symbol
> > codes are preferable, because they are used in site domains, so, if I
> > want to get all "russians" from PortalUser table, I need know only one
> > code - "RU", to write SQL
> > SELECT Email
> >        FROM inp_PortalUser WHERE Country='RU'
> >        OR      RIGHT(Email,3) = '.ru'
> > In other case, there additional time required to find additional 3-
> > symbol ISO code.
> > Why relation between states and countries by country code is
> > preferable. This is so because in address fields are stored country
> > codes, not country IDs. For example, in PortalUser record we can see
> > "USA", not 225, and then, if we need get all states for this country,
> > relation between states AND countries by ID means excessive join in
> > SQL request. Also, it is more convenient to work directly with DB data
> > (for example, in PMA) when fields contain reasonable values, not more
> > abstract codes.
> > 1.4) yes, by default there must be URL towww.in-portal.comsite, but,
> > in some In-Portal customization may be special conditions and then
> >www.in-portal.comsite's service may be replaced to fit them.
> > Developer of this customization may create another cities suggestion
> > service, install it on other domain and join it by change this one
> > paremeter in In-Portal configuration.
> --
> Best Regards,
> http://www.in-portal.comhttp://www.alex-time.com

RegiststrationWithSuggestions.png

Erik Snarski

unread,
Feb 25, 2010, 11:50:43 AM2/25/10
to In-Portal Development Team
I guess, it is impossible to use SOAP for suggestuions, because those
will be taken by javascript/ajax, not from PHP code. Team Leader Alex
also recomends suggestions output as JSON array and this is
incompatible with SOAP, as I understand.

Alexander Obuhovich

unread,
Feb 25, 2010, 12:55:51 PM2/25/10
to in-por...@googlegroups.com
Not completely right. We could ask php code on our site (http://www.site.com)  to get SOAP suggestions from http://www.in-portal.com, then cache them locally in database and then return result as JSON to form, who requested that on our site. It's like our site acts as proxy for that all stuff.

Erik Snarski

unread,
Feb 25, 2010, 11:43:05 PM2/25/10
to In-Portal Development Team
And that proxy-solution with excessive cache development means "SPEED
UP the process" which goal is marked by Team Leader Dmitry?

Alexander Obuhovich

unread,
Feb 26, 2010, 2:24:12 AM2/26/10
to in-por...@googlegroups.com
No excessive cache. I just think, that city auto-complete should no go to http://www.in-portal.com for each form. With all that you've forgot original idea: administrator import only needed cities and auto-complete goes from out site, not in-portal.com at all.


On Fri, Feb 26, 2010 at 6:43 AM, Erik Snarski <sn...@jamajo.lv> wrote:
And that proxy-solution with excessive cache development means "SPEED
UP the process" which goal is marked by Team Leader Dmitry?



Erik Snarski

unread,
Feb 26, 2010, 3:46:32 AM2/26/10
to In-Portal Development Team
I do not forget original idea, and I remember also that idea is
unacceptable because of possible quantity and size of necessary tables.

Alexander Obuhovich

unread,
Feb 26, 2010, 8:04:33 AM2/26/10
to in-por...@googlegroups.com
Record count in database doesn't matter here, but total table size on disk in MB matters, because hosting takes money for hard drive space used. For example how much does table with all cities of all us states weight?


On Fri, Feb 26, 2010 at 10:46 AM, Erik Snarski <sn...@jamajo.lv> wrote:
I do not forget original idea, and I remember also that idea is
unacceptable because of possible quantity and size of necessary tables.



Dmitry Andrejev

unread,
Feb 26, 2010, 11:11:38 AM2/26/10
to in-por...@googlegroups.com
Hi Erik, Alex,


Yes, I think it's a good idea to have ability to store results locally so we don't get overwhelmed by traffic.

I suggest that we do both things - Store/Cache Results returned from In-Portal.com API and skip Cache results.


What you think?

DA.

Erik Snarski

unread,
Feb 26, 2010, 11:51:47 AM2/26/10
to In-Portal Development Team
Table with all USA cities (without multilanguage fields) may have
about - 15 MB size
All cities table on resumark (without multilanguage fields) - size is
275.2 MB

My opinion is - cache is not required, because cities suggestions SQL
use indexes and are fast.

Phil

unread,
Feb 27, 2010, 5:01:00 PM2/27/10
to In-Portal Development Team
Another important point about lightening DB: even if you have
unlimited space for your DB, when it comes to backup, we all prefer a
tiny gz.

Also, when you perform a request on a big table, few thousandth of a
second multiplied by "n" requests would create response lag for all DB
requests, just for a city name, which is a non-critical information
that users all over the world have entered by themselves the past 20
years.

I hope you don't take it personnaly Eric, but I stick on the idea to
import only needed countries, or to use suggestion service as
described.

Dmitry, the Google Address Mapping Service is an amazing start to
enhance the shipping engine.
For example, In-Commerce could be able to provide, in future release
(or custom dev), the distance needed for each delivery, plus complete
stats about distances, sorted for orders/deliveryman/total amount...
With this kind of feature, In-Portal could be of interest for bigger
projects, isn't it?

Dmitry Andrejev

unread,
Mar 25, 2010, 2:51:07 AM3/25/10
to in-por...@googlegroups.com
Erik, Alex,

We need to coordinate this task since I know BOTH of you are working on it's different parts.

Alex, did you have a chance to explain to Erik your part on this task?

Hi Phil, we'll come back to part with Google Map Services later in this discussion.


DA.
Reply all
Reply to author
Forward
0 new messages