o POSSIBLE SOLUTION(S)

+ First we set up a data set. We enter the {Rank, Population} data.

Input := 

pdata = {{1,7262700},{2,3259340},{3,3009530},{4,1728910},
{5,1642900}, {6,1086220},{7,1015190},{8,1003520},
{9,914350},{10,894070}, {11,752800},{12,749000},
{13,719820},{20,566030},{30,429550}};

+ We plot the {Rank, Population} data.

Input := 

popPlot = ListPlot[pdata,PlotStyle->{PointSize[.02]}]
Output =

-Graphics-

+ First we attack the problem as an exponential decay function, i.e.
P = K Exp[ - a r] where P is the population, r is the rank, and a and K are parameters to be determined.

- This looks like an exponential decay and if we take the logarithm of both sides of the equation (linearizing we hope), say to base e, of P component of each data point and plot log(P) vs. r we should get a linear relationship between the transformed or linearized data. The transformed equation will look like this: log(P) = log(K) - a*r.

Input := 

data = Table[{pdata[[i]][[1]],Log[pdata[[i]][[2]]]},
		{i,1,Length[pdata]}]//N;

- We examine a plot of the logged data to see if it looks like it could be linear, i.e. a possible relationship such as log(P) = log(K) - a*r.

Input := 

pLogPlot = ListPlot[data,PlotStyle->{PointSize[.02]}]
Output =

-Graphics-

- This logging does not appear to effect a change to a linear function so we try another model.

+ Now we attack the problem as a hyperbola function, i.e. P = K/ r^(a) where P is the population, r is the rank, and a and K are parameters to be determined.

- If we log both variables, P and r, we should get a linear relationship between the logged data, i.e., log(P) = log(K) - a*log(r).

Input := 

pLogData = Table[{Log[pdata[[i]][[1]]],Log[pdata[[i]][[2]]]},
		{i,1,Length[pdata]}]//N;

- A plot of the logged data looks like it could be linear, i.e. a possible relationship such as log(P) = log(K) - a*log(r).

Input := 

pLogPlot = ListPlot[pLogData,PlotStyle->{PointSize[.02]}]
Output =

-Graphics-

- We determine the line which fits the data best.

Input := 

pLog[r_] = Fit[pLogData,{1,r},r]
Output =

15.6286 - 0.839496 r

- And plot it through the logged data for a good feel.

Input := 

pLinePlot = Plot[pLog[r],{r,0,4}];
Input := 

Show[pLogPlot,pLinePlot]
Output =

-Graphics-

- This looks quite good and so we now we take the relationship as a logarithmic one and solve for the independent variable, p = f(r).

Input := 

eq  = Log[p] == pLog[Log[r]]; 
Input := 

sol = Solve[eq,p];

- Hence we have our functional relationship:

Input := 

P[r_] = p/.sol[[1]]
Output =

          15.6286 - 0.839496 Log[r]
1. 2.71828

- We can simplify this using our understanding of Logs in Base e and the fact that 2.71828 "is" e to the following form, P = PS(r) = K/ r^(a).

Input := 

PS[r_] = 2.71828^15.6286/r^(0.83946)
Output =

          6
6.12929 10
-----------
  0.83946
 r

- And then plot the fitted curve to the data.

Input := 

pf = Plot[PS[r],{r,1,30}]
Output =

-Graphics-

- It appears to be a pretty good fit.

Input := 

Show[popPlot,pf]
Output =

-Graphics-

- The model decreases more smoothly than the data, e.g. the 3 and 4 ranked cities are about the same and the 6, 7, and 8 ranked cities are about the same.

+ Finally we use a direct non-linear least squares approach to minimizing the sum of the square errors between the function of the two parameters k and a and the actual data. We enter the function model first.

Input := 

P[r_,k_,a_] = k/r^a;

- We form our sum of square errors as a function of the two parameters k and a which are to be found.

Input := 

SS[k_,a_] = 
  Sum[(pdata[[i]][[2]] - P[pdata[[i]][[1]],k,a])^2,
  		{i,1,Length[pdata]}];

- We examine this function of two variables over a wide range of variables for k and a to see reasonable set of starting values for the FindMinimum command.

Input := 

ContourPlot[SS[k,a],{k,6000000,8000000},{a,.5,1.5}]
Output =

-ContourGraphics-

- Thus reasonable starting values for k are near 7,000,000 and for a near 0.9. We can now use the power of the FindMinimum command to determine the values of parameters k and a which minimize the sum of square errors (SS):

Input := 

FindMinimum[SS[k,a],{k,7000000},{a,.9}]
Output =

           11             6
{6.48172 10  , {k -> 7. 10 , a -> 0.933854}}

- We try several other starting values for k and a.

Input := 

FindMinimum[SS[k,a],{k,7500000},{a,.9}]
Output =

          11                  6
{6.2081 10  , {k -> 7.15708 10 , a -> 0.948962}}
Input := 

FindMinimum[SS[k,a],{k,6000000},{a,.9}]
Output =

          11                  6
{6.2081 10  , {k -> 7.15708 10 , a -> 0.948962}}
Input := 

FindMinimum[SS[k,a],{k,7500000},{a,.8}]
Output =

          11                  6
{6.2081 10  , {k -> 7.15708 10 , a -> 0.948962}}
Input := 

sol = FindMinimum[SS[k,a],{k,6500000},{a,.9}]
Output =

          11                  6
{6.2081 10  , {k -> 7.15708 10 , a -> 0.948962}}

- And we accept the values of k = 7.15708 10^6 and a = 0.948962 to obtain a fitted function which we call p1[r].

Input := 

p1[r_] = P[r,k,a]/.sol[[2]]
Output =

          6
7.15708 10
-----------
  0.948962
 r

- We compare this to our previously fitted function from our linearization approach above:

Input := 

PS[r]
Output =

          6
6.12929 10
-----------
  0.83946
 r

- In fact we plot the newly obtained function, p1(r).

Input := 

pf1 = Plot[p1[r],{r,1,30}]
Output =

-Graphics-

- And we plot both functions, PS(r) and p1(r), over the data. We see that both seem to do quite well, but the direct least squares approach (p1(r)) predicts the Rank 1 city better than the linearized form, PS(r).

Input := 

pdata[[1]]
Output =

{1, 7262700}
Input := 

p1[1]
Output =

          6
7.15708 10
Input := 

PS[1]
Output =

          6
6.12929 10

- We look at a plot of both functions, PS(r) and p1(r), over the data.

Input := 

Show[popPlot,pf,pf1]
Output =

-Graphics-

- We look at the function resulting from linearized approach.

Input := 

Show[popPlot,pf]
Output =

-Graphics-

- We look at the function resulting from non-linear least square approach.

Input := 

Show[popPlot,pf1]
Output =

-Graphics-

- Finally we see that the non-linearized approach is a bit low in predicting the population of the smaller cities from the rank.