Posterior Predictive Checks, Error: positional indexers are out-of-bounds


lena Pollerhoff

Mar 9, 2023, 5:25:06 AM
to hddm-users

Dear HDDM users/team, 

 

I am currently trying to run posterior predictive checks for two regression models. We fitted a model for a prosocial decision-making task, but because of a mixed design, we fitted it separately for a sample of younger adults and a sample of older adults.

This means we ran the same model twice: once including only younger adults, and once including only older adults.

 

Now I want to run the PPC, which is no problem for the younger-adults model.

But every time I try to run the PPC for the older-adults model (exact same code, same model, only different data) I get the following error message:

 

ppc_data = hddm.utils.post_pred_gen(m12OA_reg)

Traceback (most recent call last):

 

  File "/Users/lena/opt/anaconda3/envs/py36/lib/python3.6/site-packages/pandas/core/indexing.py", line 1469, in _get_list_axis

    return self.obj._take_with_is_copy(key, axis=axis)

 

  File "/Users/lena/opt/anaconda3/envs/py36/lib/python3.6/site-packages/pandas/core/generic.py", line 3363, in _take_with_is_copy

    result = self.take(indices=indices, axis=axis)

 

  File "/Users/lena/opt/anaconda3/envs/py36/lib/python3.6/site-packages/pandas/core/generic.py", line 3351, in take

    indices, axis=self._get_block_manager_axis(axis), verify=True

 

  File "/Users/lena/opt/anaconda3/envs/py36/lib/python3.6/site-packages/pandas/core/internals/managers.py", line 1449, in take

    indexer = maybe_convert_indices(indexer, n)

 

  File "/Users/lena/opt/anaconda3/envs/py36/lib/python3.6/site-packages/pandas/core/indexers.py", line 250, in maybe_convert_indices

    raise IndexError("indices are out-of-bounds")

 

IndexError: indices are out-of-bounds

 

 

The above exception was the direct cause of the following exception:

 

Traceback (most recent call last):

 

  File "<ipython-input-6-a454363c6e94>", line 1, in <module>

    ppc_data = hddm.utils.post_pred_gen(m12OA_reg)

 

  File "/Users/lena/opt/anaconda3/envs/py36/lib/python3.6/site-packages/kabuki/analyze.py", line 328, in post_pred_gen

    for name, data in iter_data:

 

  File "/Users/lena/opt/anaconda3/envs/py36/lib/python3.6/site-packages/kabuki/analyze.py", line 324, in <genexpr>

    iter_data = ((name, model.data.iloc[obs['node'].value.index]) for name, obs in model.iter_observeds())

 

  File "/Users/lena/opt/anaconda3/envs/py36/lib/python3.6/site-packages/pandas/core/indexing.py", line 879, in __getitem__

    return self._getitem_axis(maybe_callable, axis=axis)

 

  File "/Users/lena/opt/anaconda3/envs/py36/lib/python3.6/site-packages/pandas/core/indexing.py", line 1487, in _getitem_axis

    return self._get_list_axis(key, axis=axis)

 

  File "/Users/lena/opt/anaconda3/envs/py36/lib/python3.6/site-packages/pandas/core/indexing.py", line 1472, in _get_list_axis

    raise IndexError("positional indexers are out-of-bounds") from err

 

IndexError: positional indexers are out-of-bounds

 

For your information, I am working on a MacBook, using Anaconda and Spyder (5.0.5), with Python 3.6.13 and HDDM 0.8.0. However, a colleague also tried to run the PPC on a Windows PC (Spyder 3.8, HDDM 0.8.0) and got the same error message.


I’ll send you a Google Drive link including the m12OA_reg traces, to replicate the error message: https://drive.google.com/file/d/1JXBCBsVlfnAz7FqqW69Rvv49I0XKmxuI/view?usp=sharing

 

I would really appreciate your help, as we were not able to solve the problem!

 

Best wishes and thanks in advance!

Lena 

lena Pollerhoff

Apr 3, 2023, 3:56:40 AM
to hddm-users
Hello, could anyone help us with this specific problem?

Best
Lena

Alexander Fengler

Apr 4, 2023, 11:28:23 PM
to hddm-users
Hi Lena, 

I will look into this tomorrow and report back.

Best,
Alex

Alexander Fengler

Apr 5, 2023, 4:47:36 PM
to hddm-users
Hi Lena, 

I tried to open the file you linked to, but it doesn't seem to have the right encoding.
(Clicking the link also triggers a download straight away.)
Could you share the Python scripts and, if possible, some representative data in a folder, and share that folder instead?

Best,
Alex

lena Pollerhoff

Apr 6, 2023, 4:37:45 AM
to hddm-...@googlegroups.com
Dear Alex,

I am sorry for the inconvenience! Could you check whether the current link works for you? There should be four files in the folder: example data (from only three older participants, because of the large file size: 180 trials each), the Python script, and the model files.


Thank you so much for helping us out!
Best
Lena 


Alexis Pérez

May 12, 2023, 1:16:03 PM
to hddm-users
Hi,

I am running into the same problem using the HDDM Docker image. Were you able to find the origin of this problem?

Thanks!

Alex

lena Pollerhoff

Jun 22, 2023, 2:46:03 AM
to hddm-users
Hey,

we finally found a solution to our problem by changing iloc to loc (https://groups.google.com/g/hddm-users/c/Is6AM7eN0fo).
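For anyone wondering why swapping `.iloc` for `.loc` matters here: `.iloc` indexes by integer *position*, while `.loc` indexes by *label*. When a subset of the data keeps its original index labels, passing those labels to `.iloc` runs past the end of the frame. A minimal pandas sketch (the DataFrame below is illustrative, not HDDM data) reproducing the same error:

```python
import pandas as pd

# A small frame whose index holds the original row labels of a larger
# dataset (e.g. one participant's trials), not positions 0..n-1.
data = pd.DataFrame({'rt': [0.61, 0.84, 0.72]}, index=[180, 181, 182])

# .iloc treats the labels as positions -> out of bounds for a 3-row frame
try:
    data.iloc[[180, 181, 182]]
except IndexError as err:
    print(err)  # "positional indexers are out-of-bounds"

# .loc treats them as index labels -> selects the intended rows
print(data.loc[[180, 181, 182]])
```

In the PPC code, `obs['node'].value.index` holds the original row labels of the model data, which is why `.loc` is the right indexer there.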
Below you'll find my code. Hopefully it helps people with similar problems:

# Define the necessary functions (patched copies of kabuki.analyze.post_pred_gen
# and its helpers, with .iloc replaced by .loc)
import numpy as np
import pandas as pd
import pymc as pm


def _parents_to_random_posterior_sample(bottom_node, pos=None):
    """Walks through parents and sets them to pos sample."""
    for i, parent in enumerate(bottom_node.extended_parents):
        if not isinstance(parent, pm.Node):  # Skip non-stochastic nodes
            continue

        if pos is None:
            # Set to a random posterior position
            pos = np.random.randint(0, len(parent.trace()))

        assert len(parent.trace()) > pos, "pos larger than posterior sample size"
        parent.value = parent.trace()[pos]

def _post_pred_generate(bottom_node, samples=500, data=None, append_data=False):
    """Generate posterior predictive data from a single observed node."""
    datasets = []

    # Sample and generate stats
    for sample in range(samples):
        _parents_to_random_posterior_sample(bottom_node)
        # Generate data from bottom node
        sampled_data = bottom_node.random()
        if append_data and data is not None:
            sampled_data = sampled_data.join(data.reset_index(), lsuffix='_sampled')
        datasets.append(sampled_data)

    return datasets


def post_pred_gen(model, groupby=None, samples=500, append_data=False, progress_bar=True):
    """Run posterior predictive check on a model.

    :Arguments:
        model : kabuki.Hierarchical
            Kabuki model over which to compute the ppc on.

    :Optional:
        samples : int
            How many samples to generate for each node.
        groupby : list
            Alternative grouping of the data. If not supplied, uses splitting
            of the model (as provided by depends_on).
        append_data : bool (default=False)
            Whether to append the observed data of each node to the replications.
        progress_bar : bool (default=True)
            Display progress bar.

    :Returns:
        Hierarchical pandas.DataFrame with multiple sampled RT data sets.
        1st level: wfpt node
        2nd level: posterior predictive sample
        3rd level: original data index

    :See also:
        post_pred_stats
    """
    import pymc.progressbar as pbar

    # Progress bar
    if progress_bar:
        n_iter = len(model.get_observeds())
        bar = pbar.progress_bar(n_iter)
        bar_iter = 0
    else:
        print("Sampling...")

    if groupby is None:
        # The original kabuki code used .iloc here, which fails when the data
        # index holds labels rather than positions; .loc fixes that.
        iter_data = ((name, model.data.loc[obs['node'].value.index])
                     for name, obs in model.iter_observeds())
    else:
        iter_data = model.data.groupby(groupby)

    results = {}

    for name, data in iter_data:
        node = model.get_data_nodes(data.index)

        if progress_bar:
            bar_iter += 1
            bar.update(bar_iter)

        if node is None or not hasattr(node, 'random'):
            continue  # Skip non-observed nodes

        # Generate posterior predictive data from the node
        datasets = _post_pred_generate(node, samples=samples, data=data,
                                       append_data=append_data)
        results[name] = pd.concat(datasets, names=['sample'],
                                  keys=list(range(len(datasets))))

    # Concatenate the results
    ppc_data = pd.concat(results, names=['node'])

    return ppc_data

   
# Generate posterior predictive data
ppc_data = post_pred_gen(m12OA_reg)  

# Perform posterior predictive checks
ppc_stats = hddm.utils.post_pred_stats(data, ppc_data)
print(ppc_stats.head())
