> There are about 3000 unique origin-destination (OD) pairs. Each pair has multiple unique routes, identified by 'ChoiceNum'. Each route consists of a feasible bus journey plus an access and an egress leg using one of the four combinations of walking and biking, identified by 'AEID'. For each OD pair, I had an estimate of travel demand via public transport. That total demand was split over all routes, first by access/egress mode combination and then, where multiple routes with the same access/egress mode combination are available, using the proportions of tap in/out card data for the first leg of the route.
>
> Thus, for the example OD pair 339-1426, there was a demand of 179 travellers, 177 of whom chose route 1 (with mode 2, walk-bike, as A/E) and 2 of whom chose route 2 (with mode 4, bike-bike, as A/E). The number of rows for each traveller is then equal to the number of options available to them. All routes are unique to their OD pair, and all travellers can pick any of the routes specific to their OD pair. I have removed cases where only one route existed, or where the average number of travellers over all routes in the pair is less than 50 (an arbitrary threshold, for now). One potential issue is that each OD pair has a unique choice set, ranging from cases where only two routes with the same access/egress combination are used to cases where, say, five routes are available for each of the four A/E combinations. However, I think that I have removed that complexity from my simplified model, which only looks at one pair and removes AEID completely.
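The filtering rule described above (drop OD pairs with a single route, or with fewer than 50 travellers per route on average) could be sketched roughly as follows. This is only an illustration under assumptions: the column names `OD` and `Demand` and the pair labels `A-B` and `C-D` are hypothetical, and only `ChoiceNum` and the 339-1426 figures come from the message.

```python
import pandas as pd

# Toy route table; "OD" and "Demand" are assumed column names.
routes = pd.DataFrame({
    "OD": ["339-1426", "339-1426", "A-B", "C-D", "C-D"],
    "ChoiceNum": [1, 2, 1, 1, 2],
    "Demand": [177, 2, 300, 30, 20],
})

# Per OD pair: number of routes and mean travellers per route.
stats = routes.groupby("OD")["Demand"].agg(n_routes="count", mean_demand="mean")

# Keep pairs with more than one route and a mean demand of at least 50.
keep = stats[(stats["n_routes"] > 1) & (stats["mean_demand"] >= 50)].index
filtered = routes[routes["OD"].isin(keep)]
```

In this toy table only the pair 339-1426 survives: `A-B` has a single route, and `C-D` averages 25 travellers per route.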
>
> I have tried the model with and without alternative-specific constants, with no difference in results.
> In the original route output file, each route was listed once with an estimated time value for each component. When I duplicated the rows based on travel demand, I generated the time values for each traveller from a lognormal distribution, using the time value from the 'parent' row as the median. I do not think that there can be any collinearity between the time variables, and even when I run the model using just one of the variables, I get the same issue. I have also tried fixing each of the Beta values in turn so that it is not estimated by Biogeme, but this did not change anything either.
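For reference, drawing per-traveller times from a lognormal distribution with a given median might look like the sketch below. The median of a lognormal is exp(mu), so the 'parent' time enters as mu = log(parent_time). The function name `sample_times`, the spread `sigma=0.25`, and the parent time of 12.0 are assumptions for illustration; the message does not state which values were used.

```python
import numpy as np

def sample_times(parent_time, n_travellers, sigma=0.25, seed=0):
    """Draw per-traveller times from a lognormal whose median is parent_time.

    sigma (spread on the log scale) is an assumed value, not from the message.
    """
    rng = np.random.default_rng(seed)
    # For a lognormal, median = exp(mu), hence mu = log(parent_time).
    return rng.lognormal(mean=np.log(parent_time), sigma=sigma, size=n_travellers)

# 177 travellers on one route with a hypothetical parent time of 12 minutes.
times = sample_times(parent_time=12.0, n_travellers=177)
```

With a large sample, `np.median` of the draws should sit close to the parent value, which is a quick check that the parameterisation is the intended one.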
>
> I split access and egress time into two columns based on their AEID, and thought that maybe the 0 values for the unused mode could be the issue, but this complexity was also removed from my simplified model.
>
> Finally, in this version of the model, Choice is binary, where 1 represents the chosen alternative. I also have an exact copy of the file where Choice is instead the ID from ChoiceNum that the given traveller chose. The AI assistant got stuck in a loop of suggesting the opposite format each time, but trying both produced the same error.
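Both formats described above keep one row per traveller per alternative (long format). As far as I understand it, Biogeme works with one row per observation (wide format), with the attributes of every alternative in separate columns and a single column holding the chosen alternative's ID. A minimal sketch of that reshaping with pandas follows; `TravellerID`, `BusTime`, and `Chosen` are hypothetical column names, while `ChoiceNum` and `Choice` are taken from the message.

```python
import pandas as pd

# Toy long-format data: one row per traveller per available route.
long_df = pd.DataFrame({
    "TravellerID": [1, 1, 2, 2],
    "ChoiceNum": [1, 2, 1, 2],
    "BusTime": [10.0, 14.0, 11.0, 13.5],
    "Choice": [1, 0, 0, 1],
})

# One column per alternative for the attribute (BusTime_1, BusTime_2, ...).
wide = long_df.pivot(index="TravellerID", columns="ChoiceNum", values="BusTime")
wide.columns = [f"BusTime_{alt}" for alt in wide.columns]

# ID of the chosen alternative for each traveller, aligned on TravellerID.
chosen = long_df.loc[long_df["Choice"] == 1].set_index("TravellerID")["ChoiceNum"]
wide["Chosen"] = chosen
wide = wide.reset_index()
```

Each row of `wide` is then one traveller, so the utility of route 1 can refer to `BusTime_1` and that of route 2 to `BusTime_2`, with `Chosen` as the choice variable.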
>
> I would really appreciate any insight into what might be going wrong, whether there is a fundamental issue with my dataset that I am unable to see or an error in my code that I am missing. I am happy to answer any questions or to provide the code I am using for the full dataset if that would help.
>
> Thank you for your time and assistance,
> Kaden Herner
Michel Bierlaire
Transport and Mobility Laboratory
School of Architecture, Civil and Environmental Engineering
EPFL - Ecole Polytechnique Fédérale de Lausanne
http://transp-or.epfl.ch
http://people.epfl.ch/michel.bierlaire