Using Pandra to Select Data with Regex?


Nuris Kandar Musthafa

Jul 14, 2010, 6:19:07 AM
to pandr...@googlegroups.com
Hi,
From the example at http://wiki.github.com/mjpearson/Pandra/examples
I got this:
// By pluggable 'Clause'

$c = new PandraColumnFamily();
$c['username'] = 'myuser';
$c['homeAddress'] = ' MY HOUSE ';
$c['phone'] = '987654231';
$c['mobile'] = '011465987';
$c['workAddress'] = ' MY WORK ';

// regex extraction of column references ending in 'address'
// (i.e. homeAddress and workAddress)
$q = new PandraQuery();
$addresses = $c[$q->Regex('/address$/i')];
foreach ($addresses as $addressColumn) {
  echo "QUERIED PATH : ".$addressColumn->value."<br>";
}

But my question is: how can I use a regex in the Pandra classes to filter by the value of the records we want to display?

This is the method from my class:

public function getSuperColumn($key_id, $column_family, $search = '', $start = 'column', $output = "array")
{
    $this->extendClass('PandraSuperColumnFamily', $key_id, $this->cassa_keyspace, $column_family);
    $ColumnFamily = $this->PandraSuperColumnFamily;
    //echo var_dump($ColumnFamily);
    if ($this->getLimit() != NULL) {
        $ColumnFamily->start($start)->limit($this->getLimit())->load();
    } else {
        $ColumnFamily->load();
    }

    if ($output == "array") {
        return $ColumnFamily->toArray();
    } else {
        return $ColumnFamily->toJSON();
    }
}

I've added a $search parameter to my function; it will be used to filter by the value of my records. But the example only shows how to use a regex to filter on the column names (the array index).
I have no idea how to filter by the value of the records that will be displayed.
In SQL, the filter would look like this: "SELECT * FROM table WHERE column LIKE '%value%';"

Thanks.

Regards

Nuris
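
Editor's note: the SQL LIKE '%value%' filter described above corresponds to a "contains" regex. The sketch below shows one way to build such a pattern from a plain search term. The helper name containsPattern is hypothetical and not part of Pandra, and the case-insensitive /i flag is an assumption (drop it for case-sensitive matching).

```php
<?php
// Hypothetical helper, not part of Pandra: build a "contains" pattern
// roughly equivalent to SQL LIKE '%term%'.
function containsPattern(string $term): string
{
    // preg_quote() escapes regex metacharacters in the term;
    // '/' is passed as well because it is the delimiter used here.
    return '/' . preg_quote($term, '/') . '/i';
}

$pattern = containsPattern('Season 3');

// The title value from the sample data matches; an unrelated value does not.
var_dump(preg_match($pattern, 'pretty ones - Season 3 Episode 7') === 1);
var_dump(preg_match($pattern, 'badapple') === 1);
```

The resulting string can be passed as the $search argument wherever preg_match($search, ...) is used.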

mjpearson

Jul 14, 2010, 5:17:03 PM
to pandra-dev
Hi Nuris,

There's not really a way to do that from Pandra right now, mostly
because I haven't figured out a clean notation for it across the
permutations of (super) column families and (super) columns. If you
have a suggestion for this API, let me know!

The simplest way to implement it in your class for the time being
would be something like :

public function getSuperColumn($key_id, $column_family, $search = '',
                               $start = 'column', $output = "array") {
    $this->extendClass('PandraSuperColumnFamily', $key_id,
                       $this->cassa_keyspace, $column_family);
    $ColumnFamily = $this->PandraSuperColumnFamily;

    if ($this->getLimit() != NULL) {
        $ColumnFamily->start($start)->limit($this->getLimit())->load();
        foreach ($ColumnFamily as $SuperName => &$SuperColumn) {
            foreach ($SuperColumn as $ColumnName => $Column) {
                // Drop columns which do not match
                if (!preg_match($search, $Column->value)) {
                    $SuperColumn->destroyColumns($ColumnName);
                }
            }
            // If the supercolumn is empty, drop it from the column family
            if (empty($SuperColumn)) {
                $ColumnFamily->destroyColumns($SuperName);
            }
        }
    } else {
        $ColumnFamily->load();
    }
    return $ColumnFamily;
}

... There's no way of matching by value in Cassandra so unfortunately
this is relegated to a purely client side implementation.

-michael
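
Editor's note: since matching by value has to happen on the client, one option is to filter the plain array form of the data. The sketch below assumes the loaded data has the shape [superColumnName => [columnName => value, ...], ...]; whether toArray() returns exactly this shape is an assumption. The helper filterSuperColumns is hypothetical, and the second sample row is made up for illustration. Unlike the per-column approach above, it keeps a whole super column when any of its values matches, which is closer to the SQL WHERE ... LIKE behaviour asked for.

```php
<?php
// Hypothetical helper, not part of Pandra: keep only the super columns in
// which at least one column value matches $pattern.
function filterSuperColumns(array $rows, string $pattern): array
{
    // array_filter() preserves the super column names (the array keys).
    return array_filter($rows, function (array $columns) use ($pattern) {
        foreach ($columns as $value) {
            if (preg_match($pattern, (string) $value) === 1) {
                return true;  // one matching value is enough to keep the record
            }
        }
        return false;
    });
}

// Illustrative data only; 'sc2' is invented for the example.
$rows = [
    'sc1' => ['title' => 'pretty ones - Season 3 Episode 7', 'lastposter' => 'badapple'],
    'sc2' => ['title' => 'another thread', 'lastposter' => 'mindy22'],
];

print_r(filterSuperColumns($rows, '/season 3/i'));
```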


Nuris Kandar Musthafa

Jul 15, 2010, 1:51:04 AM
to pandr...@googlegroups.com
Hi,
Thanks for the response, Michael.

After adding the script below to my class,

foreach ($ColumnFamily as $SuperName => &$SuperColumn) {
    //$this->debug($SuperColumn);
    foreach ($SuperColumn as $ColumnName => $Column) {
        $this->debug($ColumnName);
        // Drop columns which do not match
        if (!preg_match($search, $Column->value)) $SuperColumn->destroyColumns($ColumnName);
        // If the supercolumn is empty, drop it from the column family
        if (empty($SuperColumn)) $ColumnFamily->destroyColumns($SuperName);
    }
}

I get this error:

Fatal error: An iterator cannot be used with foreach by reference in /var/www/kaskuscassa/system/application/libraries/Cassandra_interface.php on line 124

So I changed it to:

foreach ($ColumnFamily as $SuperName => $SuperColumn) {
    //$this->debug($SuperColumn);
    foreach ($SuperColumn as $ColumnName => $Column) {
        // Drop columns which do not match
        if (!preg_match($search, $Column->value)) $ColumnFamily[$SuperName]->destroyColumns($ColumnName);
        // If the supercolumn is empty, drop it from the column family
        if (empty($SuperColumn)) $ColumnFamily->destroyColumns($SuperName);
    }
}

The script above runs without errors, but the filter doesn't work, because it only checks and destroys the first column of $SuperColumn.
If I add print_r($ColumnName), it prints only the first column name (attach), even though the values of the other columns also need to be checked to find the valid records.

Below is the data in Cassandra, as returned by cassandra-cli:
cassandra> get Keyspace.Thread['data1']
=>(super_column=373239313930,
     (column=attach, value=0, timestamp=1279012442)
     (column=dateline, value=1200422706, timestamp=1279012442)
     (column=deletedcount, value=0, timestamp=1279012442)
     (column=firstpostid, value=24423582, timestamp=1279012442)
     (column=forumid, value=12, timestamp=1279012442)
     (column=hiddencount, value=0, timestamp=1279012442)
     (column=iconid, value=0, timestamp=1279012442)
     (column=lastpost, value=1204600367, timestamp=1279012442)
     (column=lastposter, value=badapple, timestamp=1279012442)
     (column=lastpostid, value=27121298, timestamp=1279012442)
     (column=node, value=3, timestamp=1279012442)
     (column=notes, value=, timestamp=1279012442)
     (column=open, value=0, timestamp=1279012442)
     (column=pollid, value=0, timestamp=1279012442)
     (column=postuserid, value=317331, timestamp=1279012442)
     (column=postusername, value=mindy22, timestamp=1279012442)
     (column=prefixid, value=, timestamp=1279012442)
     (column=replycount, value=8, timestamp=1279012442)
     (column=similar, value=, timestamp=1279012442)
     (column=sticky, value=0, timestamp=1279012442)
     (column=threadid, value=729190, timestamp=1279012442)
     (column=title, value=pretty ones - Season 3 Episode 7, timestamp=1279012442)

Thanks,

Regards,

Nuris
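
Editor's note: the thread does not confirm why only the first column is visited. One possible cause, and this is an assumption, is that removing columns while the same container is being iterated disturbs its iterator. A common way to rule that out is to collect the names to drop in a first pass and remove them only after the loop has finished. The sketch below uses PHP's built-in ArrayObject as a stand-in for a loaded super column, with values taken from the sample data; with Pandra the removal step would presumably be the destroyColumns() call used in the thread. Note also that foreach by value is enough here, because PHP objects are handled by reference-like handles, and that empty() on an object is always false, so a different emptiness check would be needed for the super column.

```php
<?php
// Stand-in for a loaded super column. ArrayObject is used only because it is
// traversed through an iterator, as the Pandra containers are.
$superColumn = new ArrayObject([
    'attach'     => '0',
    'lastposter' => 'badapple',
    'title'      => 'pretty ones - Season 3 Episode 7',
]);
$search = '/season/i';

// Pass 1: read only, remembering which column names fail the match.
$toDrop = [];
foreach ($superColumn as $columnName => $value) {
    if (preg_match($search, $value) !== 1) {
        $toDrop[] = $columnName;
    }
}

// Pass 2: mutate the container once iteration over it is finished.
foreach ($toDrop as $columnName) {
    unset($superColumn[$columnName]);
}

print_r($superColumn->getArrayCopy());
```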
