Hi there,
which I tested and it works well, but I couldn't get the same methods to work in Cython (given my limited knowledge of the matter).
We plan to iterate over the rows in a for loop, but we can't figure out how to access, check, or print the values of each row.
How could we, for example, print a single row (or multiple rows, if you prefer) of "date, name, age, weight" from the passed Arrow table inside Cython? Could you give us an example, please?
Python code:
===============================================
import pandas as pd
import pyarrow as pa  # needed for pa.Table below
import prophet.cython.arrow.myarrow as myarrow

df = pd.DataFrame({
    'date': pd.date_range(start='2020-01-01 00:00:00', periods=3, freq='1min'),
    'name': ['jack', 'tim', 'frank'],
    'age': [32, 25, 65],
    'weight': [66.46, 84.11, 71.52]
})
table = pa.Table.from_pandas(df)
myarrow.iterate_table(table)  # This is where the Arrow table is passed from Python to Cython
===============================================
Cython code:
===============================================
from __future__ import print_function
cimport pyarrow
from pyarrow.lib cimport *

def iterate_table(obj):
    cdef int num_columns = 0
    cdef int num_rows = 0
    cdef:
        shared_ptr[CTable] table = pyarrow_unwrap_table(obj)
        shared_ptr[CChunkedArray] array
        shared_ptr[CArray] chunk
        shared_ptr[CArrayData] data

    if table.get() == NULL:
        raise TypeError("not a table...")

    num_columns = table.get().num_columns()
    num_rows = table.get().num_rows()
    print("num_columns: ", num_columns)  # prints 4 as expected
    print("num_rows: ", num_rows)  # prints 3 as expected

    array = table.get().column(2)  # column index 2 is 'age'
    chunk = array.get().chunk(0)
    data = chunk.get().data()
    print("chunk length: ", chunk.get().length())  # prints 3 as expected
    print("data length: ", data.get().length)  # prints 3 as expected
===============================================
Best regards,
Neon