how to blank down within a record

466 views
Skip to first unread message

r...@rel8ed.to

unread,
Mar 30, 2016, 6:09:37 PM3/30/16
to OpenRefine

I view it in record mode, I want to blank down the duplicated value in each record. Anyone knows how to do it? If I just simply use blank down function, it will blank down all value of 'Amalgamation' except the first one. 


Thanks for your time in advance.


Tom Morris

unread,
Mar 30, 2016, 6:11:38 PM3/30/16
to openr...@googlegroups.com
Try switching to Row mode and see if that helps.

Tom


--
You received this message because you are subscribed to the Google Groups "OpenRefine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openrefine+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Owen Stephens

unread,
Apr 18, 2016, 5:07:34 AM4/18/16
to OpenRefine
Unfortunately this is how blank down works - it just keeps blanking in the column until it hits a new value, it doesn't care about records.

There are some other approaches that might do what you want, but to know exactly you'd need to say more about whether it is possible for a single record to have multiple values in the 'Amalgamation' column, and if so what behaviour you want in that case.

If there will never be multiple values in the record, or if you only need the list of values for the record but don't need to keep the values in their specific rows you can try:

In the 'Amalgamation' column use 'Join Multi-value cells', choosing a join character that won't appear in the column - e.g. a pipe | symbol
The first row in each record should now have a cell containing all the Amalgamation values
In the 'Amalgamation' column do a cell transformation to split the cell into an array based on the join character you chose, and then use the uniques function to get rid of duplicate values, then use join to create a string. e.g.:

value.split("|").uniques().join("|") 

Given the data you have shown in your screenshot, this would work for this data. However, if you had multiple values in that column within the same record, you'd have multiple values in that cell.

If you have multiple values in the column within the same record and want to preserve their position in the right row, I think this is possible, but you need to something a little more complex. Let us know if the above is adequate, and if not, what the issues are and I'll try to help further.

Owen

Pauline

unread,
May 3, 2016, 10:37:47 AM5/3/16
to OpenRefine
Hello,

  Recently I also had to blank down within records.  Since I ultimately want to delete the blank down empty rows, so what I ended up doing was to change to row mode.  Using the record number column (assuming every record has this number), I faceted it by blank and then All > remove matching rows.  If you have the same situation, that will get rid of the rows of duplicated values within the records, but retain the row with the record number.

Pauline

r...@rel8ed.to

unread,
May 25, 2016, 3:04:14 PM5/25/16
to OpenRefine
This really works, I think sometimes we need to figure out how to bypass the problem. Thanks for your help.

Zachary Schoenberger

unread,
Dec 9, 2016, 12:56:01 PM12/9/16
to OpenRefine
 
I also recently had this problem. There are not many solutions online. So here is another simple solution.
Transform cells with:

 value + row.record.index

You can then blank down your column within records and not between, as each record will be differentiated by the value of its own index. Simply use

value.replace(/\d+$/,'')

or any other number of solutions to remove the trailing digits.



Thad Guidry

unread,
Dec 9, 2016, 1:35:45 PM12/9/16
to OpenRefine
Ah, that's actually a pretty slick way to handle some use cases, Zachary !  Good Job !
KISS : Keep It Simple Stupid

-Thad

Reply all
Reply to author
Forward
0 new messages