Paging limitation above 300000 records


Gowsalai M

Jun 29, 2021, 1:59:53 AM
to gocql
Hi,

We are trying to fetch all the records from a Cassandra table that contains more than five lakh (500,000) records.
We are following the paging logic from this page: https://pkg.go.dev/github.com/gocql/gocql#hdr-Paging

Ours is a multi-node Cassandra cluster.
We are not able to fetch more than 3.2 lakh (320,000) records with a page size of 5000 (we tried 500 and 1000 as well).

Is there any limitation in the gocql driver?

Regards,
Gowsalai

Martin Sucha

Jun 29, 2021, 2:34:07 AM
to Gowsalai M, gocql
Hi Gowsalai,

There is not any limitation in the driver that I'm aware of. Do you use the built-in iterator or do you set the page state manually? Do you receive any errors?

Martin



Gowsalai M

Jun 29, 2021, 7:26:51 AM
to gocql

Hi Martin,

We use the built-in iterator. The table actually has 460,000 records, but at most we are able to fetch only 323,000 of them.
No errors observed.

Version Info:

sm@cqlsh:sm> show version
[cqlsh 5.0.1 | Cassandra 4.0-beta2 | CQL spec 3.4.5 | Native protocol v4]
sm@cqlsh:sm>

Code Snippet:

    var pageState []byte
    var count int //TODO: remove after testing

    log.Info("pageState:", pageState)

    query := c.Session.Query("select * from candidate").PageState(nil).PageSize(5000)

    for {
        iter := query.Iter()

        tmplist, err := iter.SliceMap()
        if len(tmplist) == 0 {
            log.Info("No data to fetch, break for loop")
            break
        } else if err != nil {
            log.Info("Error fetching record:", err)
            return ErrDBnotAccessible
        }

        count += len(tmplist)
        nextPageState := iter.PageState()
        if len(nextPageState) == 0 {
            log.Info("No more pages found, break the loop ")
            break
        } else {
            //iterate over more pages
            iter = query.PageState(nextPageState).Iter()
        }
    }

Regards,
Gowsalai

Martin Sucha

Jun 29, 2021, 9:40:46 AM
to Gowsalai M, gocql
I think the above should work. Does another client (like cqlsh, the Python client, etc.) return the correct number of rows? Please try enabling the gocql_debug build tag when building. This will cause gocql to log verbosely.
Alternatively, you could record the network traffic, for example with tcpdump or Wireshark, and check it for any anomaly.

M.

Gowsalai M

Jun 29, 2021, 12:07:29 PM
to gocql
Yes Martin, we have an issue when we retrieve using the cqlsh client as well:

Error from server: code=1300 [Replica(s) failed to execute read] message="Operation failed - received 0 responses and 1 failures: READ_TOO_MANY_TOMBSTONES from /10.254.2.129:7000" info={'failures': 1, 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}

Does this error mean that all the remaining records are tombstones?

Martin Sucha

Jun 30, 2021, 2:12:55 AM
to Gowsalai M, gocql
On Tue, Jun 29, 2021 at 6:07 PM Gowsalai M <gows...@gmail.com> wrote:
> Yes Martin, we have an issue when we retrieve using the cqlsh client as well.
>
> Error from server: code=1300 [Replica(s) failed to execute read] message="Operation failed - received 0 responses and 1 failures: READ_TOO_MANY_TOMBSTONES from /10.254.2.129:7000" info={'failures': 1, 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}

Ah, I see now why your Go program didn't print the error message:

    tmplist, err := iter.SliceMap()
    if len(tmplist) == 0 {
        log.Info("No data to fetch, break for loop")
        break
    } else if err != nil {
        log.Info("Error fetching record:", err)
        return ErrDBnotAccessible
    }

You should check the error before checking the returned list, as the list is empty in case of error:

    tmplist, err := iter.SliceMap()
    if err != nil {
        log.Info("Error fetching record:", err)
        return ErrDBnotAccessible
    } else if len(tmplist) == 0 {
        log.Info("No data to fetch, break for loop")
        break
    }

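
For reference, the whole paging loop with the error checked first might look roughly like the sketch below. It is only an illustration under stated assumptions: the names fetchPage, countAllRows and fakePages are hypothetical and not part of gocql, and the page request is hidden behind a function so the example runs without a Cassandra cluster. The comment on fetchPage shows how the gocql calls from the snippet above could plausibly be plugged in.

```go
package main

import (
	"errors"
	"fmt"
)

// Row has the same shape as one element of the slice returned by Iter.SliceMap.
type Row = map[string]interface{}

// fetchPage is a hypothetical abstraction over one page request. It returns the
// rows of the page, the state token for the next page (empty when there are no
// more pages), and any error.
//
// With gocql it could wrap the calls from the snippet above, for example:
//
//	iter := query.PageState(state).Iter()
//	rows, err := iter.SliceMap()
//	return rows, iter.PageState(), err
type fetchPage func(state []byte) (rows []Row, next []byte, err error)

// countAllRows walks every page and returns how many rows were seen.
// The error is checked before the rows, so a server-side failure is reported
// instead of being mistaken for "no more data".
func countAllRows(fetch fetchPage) (int, error) {
	var state []byte
	count := 0
	for {
		rows, next, err := fetch(state)
		if err != nil {
			return count, fmt.Errorf("fetching page after %d rows: %w", count, err)
		}
		count += len(rows)
		// Stop on an empty page state rather than on an empty page.
		if len(next) == 0 {
			return count, nil
		}
		state = next
	}
}

// fakePages simulates a paged result set for demonstration only. sizes holds
// the row count of each page; the page with index failAt returns an error
// (use -1 for no failure).
func fakePages(sizes []int, failAt int) fetchPage {
	return func(state []byte) ([]Row, []byte, error) {
		i := 0
		if len(state) > 0 {
			i = int(state[0])
		}
		if i == failAt {
			return nil, nil, errors.New("READ_TOO_MANY_TOMBSTONES (simulated)")
		}
		rows := make([]Row, sizes[i])
		if i+1 < len(sizes) {
			return rows, []byte{byte(i + 1)}, nil
		}
		return rows, nil, nil
	}
}

func main() {
	// All pages succeed.
	n, err := countAllRows(fakePages([]int{3, 3, 2}, -1))
	fmt.Println(n, err)

	// The third page fails: the error surfaces along with the partial count.
	n, err = countAllRows(fakePages([]int{3, 3, 2}, 2))
	fmt.Println(n, err)
}
```

In this sketch the loop ends only when the page state is empty or an error occurs, so a failed read such as the one reported by cqlsh would be returned to the caller together with the number of rows fetched so far.
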
> Does this error mean that all the remaining records are tombstones??

I can't help you with this one, I've never encountered this error before. It is a server error, so please check the documentation for Cassandra.

Martin
 