sample size error in model output


Samira

Jul 24, 2018, 1:22:44 PM
to Biogeme

Dear all,

I'm running a simple multinomial logit model in pythonbiogeme. The sample size is about 3000, but the output HTML reports a sample size of 1, and the model is therefore unidentified. I have checked the data, and all 3000 rows are present in the text file.

I have attached screenshots of the text file, the output file, and the syntax I am using.

I would be grateful if you could point me to the possible source of the problem.

All the best,
Samira
screenshot2.png
Screenshot 1.png
Screenshot 3.png

Bierlaire Michel

Jul 24, 2018, 2:04:19 PM
to samira...@gmail.com, Bierlaire Michel, Biogeme
Biogeme uses blank spaces and tabs as separators. Sometimes these characters are not coded properly in text files. This happens a lot, on Windows in particular. Try exporting the file in another format.
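
One way to look for badly coded separators is to scan the data file for any character that is neither printable ASCII nor a plain tab. The Python sketch below is only an illustration and not part of Biogeme: the function name `find_suspicious_chars` and the sample string are made up for this example, and it assumes the file can be decoded as text.

```python
import unicodedata


def find_suspicious_chars(text):
    """Return (line, column, code point, name) for every character that is
    neither printable ASCII nor a plain tab."""
    findings = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            # Plain tabs and printable ASCII (space through tilde) are fine.
            if ch == "\t" or " " <= ch <= "~":
                continue
            # Control characters have no Unicode name, so supply a fallback.
            name = unicodedata.name(ch, "UNNAMED CONTROL CHARACTER")
            findings.append((line_no, col, f"U+{ord(ch):04X}", name))
    return findings


# Made-up sample: a byte order mark, a non-breaking space used as a
# separator, and Windows-style carriage returns.
sample = "\ufeffID\tchoice\r\n1\u00a02\r\n"
for finding in find_suspicious_chars(sample):
    print(finding)
```

To run this on a real file, read it with `open(path, encoding="utf-8", newline="").read()` so that carriage returns are preserved. Whether a given character actually trips Biogeme is not confirmed here; the list only shows where to look.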




Samira

Jul 26, 2018, 1:24:33 PM
to Biogeme
Dear Michel,

Thank you for the information. I finally resolved the issue by shortening the variable names. It seems that when the variable names are too long, saving an .xls file as a tab-delimited text file creates more than one row for the headers.

All the best,
Samira

Samira

Jul 26, 2018, 8:36:23 PM
to Biogeme
Sorry for the multiple posts, but in another attempt I got this syntax error:

samira@samira-VirtualBox:~/Downloads/lasttest$ pythonbiogeme lastscript reduced.dat
This is biogeme (pythonbiogeme) 2.6a
Warning: [02:25:40]bioMain.cc:108  pythonbiogeme lastscript reduced.dat
[02:25:40]bioParameters.cc:399  Parameter documentation generated: pythonparam.html
[02:25:40]bioModelParser.cc:109  lastscript.py exists
Traceback (most recent call last):
  File "/home/samira/Downloads/lasttest/lastscript.py", line 10, in <module>
    from headers import *
  File "/home/samira/Downloads/lasttest/headers.py", line 3
    ID=Variable('ID')
      ^
SyntaxError: invalid character in identifier

Warning: [02:25:41]bioModelParser.cc:142  Error: Failed to load lastscript
Warning: [02:25:41]bioMain.cc:169  Error: Failed to load lastscript
Warning: [02:25:41]bioMain.cc:100  Error: Failed to load lastscript
Warning: [02:25:41]pybiogeme.cc:27  Error: Failed to load lastscript
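
Python raises "SyntaxError: invalid character in identifier" when a name contains a character that is not allowed in identifiers, possibly an invisible one such as a byte order mark. Assuming headers.py is built from the first row of the data file (an assumption, not confirmed in this thread), a sketch like the following lists the column names that would not be valid plain-ASCII Python identifiers. The function name `invalid_header_names` and the sample header are hypothetical, and `str.isascii` needs Python 3.7 or later.

```python
import re


def invalid_header_names(header_line):
    """Return the column names that are not plain-ASCII Python identifiers."""
    # Split on runs of blanks and tabs only, so that an invisible character
    # stays attached to its name instead of being treated as a separator.
    names = re.split(r"[ \t]+", header_line.strip(" \t\r\n"))
    return [name for name in names
            if not (name.isascii() and name.isidentifier())]


# Made-up header row whose first name starts with a byte order mark (U+FEFF).
sample_header = "\ufeffID\tchoice\ttimecar"
for name in invalid_header_names(sample_header):
    # ascii() escapes invisible characters so they show up in the terminal.
    print(ascii(name))
```

If the first name is reported, the invisible character most likely sits at the very start of the data file, and re-saving the file without it would be the thing to try.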

Thank you,
Samira


Samira

Jul 26, 2018, 8:38:45 PM
to Biogeme

Dear Michel,

Thank you for the tip. However, I have tried saving the file with several other extensions (.dat, .tsv) and the problem persists. What format would you suggest? Is there any way to convert the .xls into a tab-delimited format other than .txt? I would be grateful for any tips or suggestions.

I also tried exporting the file from SPSS in .dat format, and when running the model I received the following warning:


Warning:  Index 0.310589 not valid in expression Elem({1: Alternative specific constant - Car + Travel time (hours) * timecar + Travel Cost * costcar},{2: Alternative specific constant - Transit + Travel time (hours) * timetransit + .....( 3 ) )}[choice][0]


Does this mean it is reading another column as the choice? Is this still a data file issue?
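
A value such as 0.310589 in place of an alternative ID suggests that the column being read as the choice holds something else. A quick way to test this outside Biogeme is to list the rows whose choice value is not one of the expected alternative IDs. The sketch below is an illustration under stated assumptions: it assumes the column is named `choice` as in the warning, that fields are separated by blanks or tabs, and that the valid IDs are passed in by the user. The function name and the sample data, including the ID set {1, 2, 3}, are made up.

```python
import re

SEPARATOR = re.compile(r"[ \t]+")


def invalid_choice_rows(text, column="choice", valid_ids=frozenset({1, 2, 3})):
    """Return (line number, raw value) for rows whose choice is not a valid ID."""
    rows = text.splitlines()
    header = SEPARATOR.split(rows[0].strip())
    idx = header.index(column)  # raises ValueError if the column is missing
    bad = []
    for line_no, line in enumerate(rows[1:], start=2):
        if not line.strip():
            continue  # ignore blank lines
        fields = SEPARATOR.split(line.strip())
        value = fields[idx] if idx < len(fields) else None
        try:
            ok = value is not None and float(value) in valid_ids
        except ValueError:
            ok = False  # the field is not numeric at all
        if not ok:
            bad.append((line_no, value))
    return bad


# Made-up data: the second data row has a travel time where the choice belongs.
sample = "ID\tchoice\ttimecar\n1\t2\t0.5\n2\t0.310589\t3\n"
print(invalid_choice_rows(sample))
```

If many rows are reported, the columns are probably shifted relative to the header, which points back to the data file rather than to the model specification.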

Thank you.


All the best,
Samira

jz1...@nyu.edu

Aug 27, 2018, 3:07:01 PM
to Biogeme

Hi Samira, 
I got the same error previously and was able to solve it as follows:

1) Run your CSV data file through biopreparedata.py (http://biogeme.epfl.ch/utilities.html#BIOPREPAREDATA).

2) Open the output file (biogeme_filename.csv) in Notepad (I used the Windows Notepad) and save it as a .dat file.

3) Check the .dat file with biocheckdata.py.

4) If you don't get any errors, you are good to go!

Hope it helps. 
 

jz1...@nyu.edu

Aug 28, 2018, 3:10:27 AM
to Biogeme
Hi Samira,

Were you able to solve this error? I'm getting the same problem. Please let me know. Thanks. 

Samira Ramezani

Aug 28, 2018, 11:28:28 AM
to jz1...@nyu.edu, Biogeme
Hi,

Yes, I did. I noticed that the problem was the long variable names. When the variable names are too long, converting the data file into a text file creates more than one row for the headers. This does not show up in Notepad++ but is noticeable when opening the file with an older version of Notepad. I just shortened the variable names and the problem was resolved.
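
A header that has been broken over several rows can also be spotted without relying on a particular editor, by counting the fields on each line and reporting the lines that differ from the most common count. This is a minimal sketch, not a Biogeme utility: the function name and the sample text are made up, and it assumes fields are separated by blanks or tabs.

```python
import re
from collections import Counter

SEPARATOR = re.compile(r"[ \t]+")


def lines_with_odd_field_count(text):
    """Return (line number, field count) for lines whose field count
    differs from the most common count in the file."""
    counts = []
    # splitlines() also breaks on a lone carriage return, which some
    # editors display as part of the same line.
    for line_no, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip(" \t")
        if stripped:
            counts.append((line_no, len(SEPARATOR.split(stripped))))
    if not counts:
        return []
    expected = Counter(n for _, n in counts).most_common(1)[0][0]
    return [(line_no, n) for line_no, n in counts if n != expected]


# Made-up sample: the three-column header is split over the first two lines.
sample = "ID\tchoice\ntimecar\n1\t2\t0.5\n2\t1\t0.7\n3\t1\t0.2\n"
print(lines_with_odd_field_count(sample))
```

An empty result means every non-blank line has the same number of fields; lines reported near the top of the file are the ones to inspect for a wrapped header.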

Hope it helps.

All the best,
Samira
