problem with reading data

babu9708011 .

Feb 24, 2014, 2:12:58 AM
to am...@googlegroups.com
Hi All:

I am trying to solve a problem using the following AMPL read command in my run file:


read {t in Y..X,l in Y..X}S ["dis",t,l]<Distance.txt;

Distance.txt is a 7000 by 7000 coefficient matrix. I would like to solve part of my big problem by changing the values of Y and X. For example, I chose Y=1 and X=200 to solve the 200x200 coefficient matrix in the top left corner of the big matrix. Surprisingly, if I choose values of X and Y far greater than 7000, CPLEX is still able to solve those instances (though those values do not exist). Any suggestion to fix this error will be highly appreciated.

Thanks.

Noor

fbahr

Feb 24, 2014, 12:46:34 PM
to am...@googlegroups.com
First of all:

read {t in Y..X, l in Y..X} S["dis",t,l] < Distance.txt;

just means:

read (X-Y+1)^2 elements from Distance.txt, starting at the current file cursor position.

For instance:

param Y;
param X;
set I := {"dis"};
set T := Y..X;
set L := Y..X;
param S {I, T, L};

let Y :=  1;
let X := 10;

read {t in Y..X, l in Y..X} S["dis",t,l] < Distance.txt; # 10x10 matrix

display S;

S [dis,*,*]
:    1   2   3   4   5   6   7   8   9   10    :=
1    1   2   3   4   5   6   7   8   9   10
2    1   2   3   4   5   6   7   8   9   10
3    1   2   3   4   5   6   7   8   9   10
4    1   2   3   4   5   6   7   8   9   10
5    1   2   3   4   5   6   7   8   9   10
6    1   2   3   4   5   6   7   8   9   10
7    1   2   3   4   5   6   7   8   9   10
8    1   2   3   4   5   6   7   8   9   10
9    1   2   3   4   5   6   7   8   9   10
10   1   2   3   4   5   6   7   8   9   10
;

close Distance.txt; # reset cursor
reset data S;

let Y := 1;
let X := 6;

read {t in Y..X, l in Y..X} S["dis",t,l] < Distance.txt;

display S;

S [dis,*,*]
:   1    2   3    4   5    6     :=
1   1    2   3    4   5    6
2   7    8   9   10   1    2
3   3    4   5    6   7    8
4   9   10   1    2   3    4
5   5    6   7    8   9   10
6   1    2   3    4   5    6
;

close Distance.txt;
reset data S;

let Y :=  7;
let X := 16; # X > 10, but: X - Y < 10

read {t in Y..X, l in Y..X} S["dis",t,l] < Distance.txt;

display S;

S [dis,*,*]
:    7   8   9  10  11  12  13  14  15   16    :=
7    1   2   3   4   5   6   7   8   9   10
8    1   2   3   4   5   6   7   8   9   10
9    1   2   3   4   5   6   7   8   9   10
10   1   2   3   4   5   6   7   8   9   10
11   1   2   3   4   5   6   7   8   9   10
12   1   2   3   4   5   6   7   8   9   10
13   1   2   3   4   5   6   7   8   9   10
14   1   2   3   4   5   6   7   8   9   10
15   1   2   3   4   5   6   7   8   9   10
16   1   2   3   4   5   6   7   8   9   10
;

close Distance.txt;
reset data S;

let Y :=  1;
let X := 20; # X > 10 and X - Y > 10

read {t in Y..X, l in Y..X} S["dis",t,l] < Distance.txt;

Error at _cmdno 19 executing "read" command
(file test.mod, line 39, offset 517):
    "read" command error: expected a number


display S;

S [dis,*,*] (tr)
:    1    2    3    4    5     :=
1     1    1    1    1    1
2     2    2    2    2    2
3     3    3    3    3    3
4     4    4    4    4    4
5     5    5    5    5    5
6     6    6    6    6    6
7     7    7    7    7    7
8     8    8    8    8    8
9     9    9    9    9    9
10   10   10   10   10   10
11    1    1    1    1    1
12    2    2    2    2    2
13    3    3    3    3    3
14    4    4    4    4    4
15    5    5    5    5    5
16    6    6    6    6    6
17    7    7    7    7    7
18    8    8    8    8    8
19    9    9    9    9    9
20   10   10   10   10   10
;

I.e., in your case:

- Y=1, X=200 doesn't actually read a 200x200 submatrix, but the first 200^2 = 40,000 elements of the file, i.e. roughly the first 5.7 lines of the bigger 7000x7000 matrix.

- As long as X - Y < 7000, AMPL will [try to] read (X-Y+1)^2 elements from Distance.txt [and succeed if the number of yet unread elements is sufficient].

- If X - Y >= 7000, AMPL gives an error and S has "missing values".
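
To see why the wrapped rows appear, the sequential behaviour described in this post can be imitated outside AMPL. The following Python sketch is only an illustration, not AMPL's implementation: the helper `sequential_read` and the in-memory token list standing in for Distance.txt are assumptions made for the example.

```python
def sequential_read(tokens, y, x):
    """Assign tokens, in file order, to S[t, l] for t and l in y..x."""
    stream = iter(tokens)
    s = {}
    for t in range(y, x + 1):
        for l in range(y, x + 1):
            s[t, l] = next(stream)  # StopIteration here means the file ran out
    return s


# Stand-in for a 10x10 Distance.txt in which every line reads 1 2 ... 10.
file_tokens = [col for _ in range(10) for col in range(1, 11)]

s = sequential_read(file_tokens, 1, 6)
rows = [[s[t, l] for l in range(1, 7)] for t in range(1, 7)]
for row in rows:
    print(row)

# Number of 7000-element lines consumed by a 200x200 read.
lines_consumed = 200 * 200 / 7000
print(round(lines_consumed, 2))
```

The second and later rows wrap around in the same way as the 6x6 display earlier in this post, because line breaks in the file play no role in where a value ends up.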

--fbahr

babu9708011 .

Feb 24, 2014, 7:31:24 PM
to am...@googlegroups.com
Thanks for your reply. I am still having difficulty with the above issue. My plan is to solve the entire matrix part by part using the following loop:

model large.mod;
data large.dat;
param S{J,Y..X,Y..X};

let Y :=  1;
let X := 200;

for{1..70}{
  read {t in Y..X,l in Y..X}S ["dis",t,l]<Distance.txt; # Distance is a 7000x7000 (unformatted data) text file

  solve;

  close Distance.txt;
  reset data S;

  let X :=X+100;
  let Y :=Y+100;
}

If I run the above loop, it gives me the same objective values with the same run-time statistics.

I need to solve my problem for 1 to 200, 101 to 300, 201 to 400 and so on. So, how can I write my read command so that it reads only the portion of the text file that corresponds to the values of Y and X?

Many thanks for your help.




fbahr

Feb 25, 2014, 10:42:02 AM
to am...@googlegroups.com
Since, as explained before, data is read sequentially [from left to right, ignoring line breaks], you (also) need to deal with those elements of Distance.txt you're not actually interested in.

Let's say, e.g., you wanted to extract a 4x4 submatrix of a 10x10 matrix stored in Distance.txt...


1  2  3  4  5  6  7  8  9  10
1  2  3  4  5  6  7  8  9  10
1  2  3  4  5  6  7  8  9  10
1  2  3  4  5  6  7  8  9  10
1  2  3  4  5  6  7  8  9  10
1  2  3  4  5  6  7  8  9  10
1  2  3  4  5  6  7  8  9  10
1  2  3  4  5  6  7  8  9  10
1  2  3  4  5  6  7  8  9  10
1  2  3  4  5  6  7  8  9  10

...then there's [afaik] no way to read the components of the submatrix (marked red in the original post) without also reading the components surrounding them (marked green).

It's not very elegant, but ...you could try sth. like this:

param drop;

for {i in 1..4} {
  let Y := 2 * (i - 1) + 1;
  let X := Y + 3;

  for {t in 1..X} {
    if (t < Y) then
      read {1..10} drop < Distance.txt;
    else
      read {1..Y-1} drop, {l in Y..X} S["dis",t,l], {X+1..10} drop < Distance.txt;
  }

  display S;

  reset data S;
  close Distance.txt;
}


which deals w/ components that have to be omitted (i.e., the "green" ones) by "storing" them in an aux. parameter "drop", and just keeps the "red" ones (in S).

Output:

S [dis,*,*]
:   1   2   3   4    :=

1   1   2   3   4
2   1   2   3   4
3   1   2   3   4
4   1   2   3   4
;

S [dis,*,*]
:   3   4   5   6    :=

3   3   4   5   6
4   3   4   5   6
5   3   4   5   6
6   3   4   5   6
;

S [dis,*,*]
:   5   6   7   8    :=

5   5   6   7   8
6   5   6   7   8
7   5   6   7   8
8   5   6   7   8
;

S [dis,*,*]
:    7   8   9   10    :=
7    7   8   9   10
8    7   8   9   10
9    7   8   9   10
10   7   8   9   10
;
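
The drop-and-keep idea can also be sketched in Python for readers who want to check the indexing by hand. This is a hypothetical stand-in, not AMPL code: `read_window` and the token list are assumptions, and the list again models a 10x10 file whose lines all read 1 to 10.

```python
def read_window(tokens, n, y, x):
    """Scan an n-column file in order, keeping only rows and columns y..x."""
    stream = iter(tokens)
    s = {}
    for t in range(1, x + 1):          # rows after x are never touched
        for l in range(1, n + 1):
            value = next(stream)       # every value up to row x is consumed
            if t >= y and y <= l <= x:
                s[t, l] = value        # kept entry
            # anything else is discarded, like the "drop" parameter
    return s


file_tokens = [col for _ in range(10) for col in range(1, 11)]

for i in range(1, 5):
    y = 2 * (i - 1) + 1
    x = y + 3
    window = read_window(file_tokens, 10, y, x)
    print(y, x, [window[y, l] for l in range(y, x + 1)])
```

The cost is visible in the loops: a window ending at row x still consumes x full rows of the file, so later windows take longer to read.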


--fbahr

Robert Fourer

Feb 26, 2014, 12:28:03 PM
to am...@googlegroups.com

I think the cleanest approach would be to read in all the data first:

   param Xmax = 7000;
   param S {J, 1..Xmax, 1..Xmax};

   read {t in 1..Xmax, l in 1..Xmax} S["dis",t,l] <Distance.txt;

Then define X and Y to be appropriate integers in the range 1..Xmax:

   param Y integer >= 1;
   param X integer > Y, <= Xmax;

Then define your model in terms of X and Y and index sets Y..X, and run your loop:

   let Y := 1;
   let X := 200;

   for {1..68} {
      solve;
      let X := X+100;
      let Y := Y+100;
   }

If you don't have enough memory to hold all of your data, then you should turn to the previous suggestion to read the file multiple times but discard some of what is read.
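
As a cross-check on the loop bounds, the overlapping windows can be enumerated in a few lines of Python. This sketch assumes windows of width 200 shifted by 100, starting at 1 and limited to 7000 rows and columns; the names `XMAX`, `WIDTH`, `STEP` and `windows` are made up for the example.

```python
XMAX = 7000
WIDTH = 200
STEP = 100

# (Y, X) pairs: 1..200, 101..300, ... while X stays within the matrix.
windows = [(y, y + WIDTH - 1) for y in range(1, XMAX - WIDTH + 2, STEP)]

print(len(windows))
print(windows[0], windows[1], windows[-1])
```

Under these assumptions the last window is 6801..7000, and the length of the list (69) is the number of solves needed to cover the whole matrix.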

 

Bob Fourer

4...@ampl.com

 

 


babu9708011 .

Feb 26, 2014, 4:21:18 PM
to am...@googlegroups.com
Dear Professor Fourer:

Thanks a lot for your reply. I tried your approach first, but ran out of memory and AMPL exited. I then implemented the other approach (thanks, fbahr), but after some time it also runs into a memory problem, as shown below:

Error at _cmdno 577674 executing "read" command
(file cpff.run, line 43, offset 881):

        Too much memory used -- 103493363336 bytes; couldn't get 32792 more.
Highest address used = 103524425880.

Could you please advise me how I can discard some of what was read earlier?

Thanks.

Noor

fbahr

Feb 28, 2014, 5:32:28 AM
to am...@googlegroups.com
I can confirm this behavior...

...and -- unless I'm missing something very obvious [which is always an option] -- I'm inclined to call this a bug in AMPL's "read" command.

I've tested the code I had proposed [adjusted for 200 x 200 submatrices] on a 7000 x 7000 elements data set [which sums up to sth. about >225 MB], and AMPL exits after 18 iterations with

Error at _cmdno 81634 executing "read" command
(file d.txt, line 16, offset 231):

        Too much memory used -- 2107621656 bytes; couldn't get 32776 more.


Since [iterations * lines/iteration * elements/line * memory/element (in bytes [assuming 'long']) :=] ( sum {i in 2..19} i ) * 100 * 7000 * 8 == 1.058 gigabytes, my best guess [without knowledge of the internal data structure's memory overhead] is: the "read" command doesn't free the memory used for an element read from an input file after assigning it to a param.
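
The estimate can be reproduced with a few lines of Python. The sketch assumes, as the post does, 8 bytes per element and 7000 elements per line, and that iteration i of the 200 x 200 run reads 100 * (i + 1) lines; the variable names are illustrative only.

```python
BYTES_PER_ELEMENT = 8
ELEMENTS_PER_LINE = 7000

# Iterations 1..18 read 200, 300, ..., 1900 lines, i.e. 100 * (2 + 3 + ... + 19).
lines_read = 100 * sum(range(2, 20))
bytes_read = lines_read * ELEMENTS_PER_LINE * BYTES_PER_ELEMENT

print(lines_read)
print(bytes_read)
print(bytes_read / 1e9)
```

The result, 1,058,400,000 bytes, is roughly half of the 2,107,621,656 bytes reported in the error message, which fits the caveat that the per-element overhead of the internal data structure is unknown.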

--fbahr




Robert Fourer

Mar 1, 2014, 3:14:32 PM
to am...@googlegroups.com

It's not clear to me why you define

   param S {J, 1..Xmax, 1..Xmax};

rather than

   param S {1..Xmax, 1..Xmax};

There are already 49 million numbers in {1..Xmax, 1..Xmax}, so if J has more than the one element "dis" then I can see how an excessive number of parameter values might be created.  On the other hand, if J has only one element then I don't see why S should need to be indexed over it.  How do you actually define and use J in your model?

Bob Fourer

am...@googlegroups.com


Victor

Mar 2, 2014, 11:04:57 AM
to am...@googlegroups.com
Hi Noor,

I've tried to reproduce the issue using fbahr's code, but the amount of memory consumed by the AMPL process was fairly small and didn't increase over time. Could you provide a test case that reproduces the problem and let us know which version of AMPL you use?

Here's the code that I used:

param Y;
param X;
param N := 7000;
param S{{'dis'}, Y..X, Y..X};

param drop;

for {i in 1..100} {
  let Y := 2 * (i - 1) + 1;
  let X := Y + 200;
  
  display i, Y, X;

  for {t in 1..X} {
    if (t < Y) then
      read {1..N} drop < Distance.txt;
    else
      read {1..Y-1} drop, {l in Y..X} S["dis",t,l], {X+1..N} drop < Distance.txt;
  }

  display S;

  reset data S;
  close Distance.txt;
}

HTH,
Victor

Noor

Mar 1, 2014, 11:03:39 PM
to am...@googlegroups.com
In my model J has 11 elements and I am using 11 read commands to read all S.


fbahr

Mar 2, 2014, 2:41:32 PM
to am...@googlegroups.com
So,

I gave it another shot [on a different PC[1], though] -- using both an early February release of AMPL as well as the latest one ...and -- of course -- things worked just fine this time.

The approach I'd suggested [with let Y := 100 * (i - 1) + 1; let X := Y + 199;] is annoyingly slow, though.

Hence, if Bob Fourer's approach actually doesn't work for you [@Noor]... you might try this one:

Extract the 200 x 200 element submatrices (you're interested in) using, for instance, AWK[2]:

With AWK in [the right] place...

-- in Windows's cmd.exe:

for /l %y in (1,100,6801) do ( ^
gawk -v row=%y "NR >= row && NR < row+200 { print }" Distance.txt | ^
gawk -v col=%y "{ for (i = col; i < col+200; i++) printf(\"%s%s\", $i, (i==col+199) ? \"\r\n\" : OFS) }" > ^
Distance%y.txt )


-- or in Linux's bash [untested]:

# Single quotes keep bash from expanding $i before gawk sees it.
for y in {1..6801..100}
do
  gawk -v row="$y" 'NR >= row && NR < row+200 { print }' Distance.txt |
  gawk -v col="$y" '{ for (i = col; i < col+200; i++) printf("%s%s", $i, (i==col+199) ? "\n" : OFS) }' > "Distance$y.txt"
done


...and, then, let AMPL iterate over the generated Distance<Y>.txt files:

for {i in 1..69} {
  let Y := 100 * (i - 1) + 1;
  let X := Y + 199;

  read {t in Y..X, l in Y..X} S["dis",t,l] < ('Distance' & Y & '.txt');
  close ('Distance' & Y & '.txt');

  solve;

  reset data S;
}
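
Where gawk isn't at hand, the same extraction could be done with a short Python script. The sketch below is an assumption-laden equivalent, not part of the suggestion above: `extract_block` is a hypothetical helper, and it is demonstrated on a small 10 x 10 sample in a temporary directory rather than on the real Distance.txt.

```python
import os
import tempfile


def extract_block(src_path, dst_path, start, size):
    """Write rows and columns start..start+size-1 (1-based) of src to dst."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line_no, line in enumerate(src, start=1):
            if line_no < start:
                continue                      # rows above the block
            if line_no >= start + size:
                break                         # rows below the block
            fields = line.split()
            dst.write(" ".join(fields[start - 1:start - 1 + size]) + "\n")


with tempfile.TemporaryDirectory() as tmp:
    # Sample input: 10 lines, each reading 1 2 ... 10.
    src_path = os.path.join(tmp, "Distance.txt")
    with open(src_path, "w") as f:
        for _ in range(10):
            f.write(" ".join(str(c) for c in range(1, 11)) + "\n")

    # 4x4 blocks shifted by 2, each file named after its first row.
    blocks = {}
    for y in range(1, 8, 2):
        dst_path = os.path.join(tmp, "Distance%d.txt" % y)
        extract_block(src_path, dst_path, y, 4)
        with open(dst_path) as f:
            blocks[y] = f.read().splitlines()

print(blocks[3])
```

For the real data, the call would presumably be `extract_block("Distance.txt", "Distance%d.txt" % y, y, 200)` for y = 1, 101, ..., 6801.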


--fbahr

[1]: RAM on both systems is equally... well, poor [2 GB].
[2]: http://www.gnu.org/software/gawk/ -- Windows binary: http://gnuwin32.sourceforge.net/packages/gawk.htm

Robert Fourer

Mar 4, 2014, 4:15:12 PM
to am...@googlegroups.com

As I understand it, you are reading 11 x 7000 x 7000 = 539,000,000 values into parameter S.  If you really have 103,524,425,880 bytes on your computer, as your previous error message suggests, then I wouldn't expect "read" to run out of memory.  But how much memory do you have?

You could consider extracting just the 70 different 200 x 200 matrices into separate files Distance1.txt, Distance2.txt, ... using a simple program outside of AMPL, and then reading those files in successive passes through the "for" loop.  For example:

   for {k in 1..70} {
      read {t in 1..200, l in 1..200} S["dis",t,l] < ('Distance' & k & '.txt');
      solve;
      close ('Distance' & k & '.txt');
   }

The "close" command isn't normally necessary, but you might need it to prevent the operating system from running out of file descriptors due to the large number of files open.

 

Bob Fourer

am...@googlegroups.com

 

 


Pratim Vakish

Apr 17, 2014, 11:18:49 PM
to am...@googlegroups.com

Hello,

I have a number n of continuous variables y[i] that can take any value in [0,1].
I know that in the optimal solution at least m (<n) will take value 0.

How can I introduce a linear constraint that explicitly requires at least m of the n variables y[i] to be equal to 0?

Any suggestion?
Pratim

Robert Fourer

Apr 20, 2014, 9:07:44 PM
to am...@googlegroups.com
You can define corresponding binary variables z[i], with constraints y[i] <=
z[i] to force y[i] to zero whenever z[i] is zero, and a constraint

sum {i in 1..n} (1-z[i]) >= m

to require that at least m of the z[i] variables are zero. I don't know of
a way to do it without introducing some class of "logical" variables like
z[i], however.
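
A brute-force check on a tiny instance can make the logic of this formulation concrete. The Python sketch below is illustrative only: n = 4 and m = 2 are arbitrary choices, and for each feasible binary vector z it counts how many y[i] are forced to zero by y[i] <= z[i] (given y[i] >= 0).

```python
from itertools import product

n, m = 4, 2

min_forced_zeros = n
for z in product((0, 1), repeat=n):
    if sum(1 - zi for zi in z) < m:
        continue  # violates sum (1 - z[i]) >= m
    # With 0 <= y[i] <= z[i], every z[i] = 0 pins y[i] to 0.
    forced_zeros = sum(1 for zi in z if zi == 0)
    min_forced_zeros = min(min_forced_zeros, forced_zeros)

print(min_forced_zeros)
```

Every feasible z leaves at least m of the y[i] at zero, while the remaining y[i] stay free in [0, 1].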

Bob Fourer
am...@googlegroups.com
