Duplicate and increment lines

epanda

Dec 24, 2009, 5:49:43 AM
to vim_use
Hi,

Goal : multiply some parts of lines which contain factors

foo(2)bar(9)

Should become

foo1bar1
foo1bar2
foo1bar3
foo1bar4
foo1bar5
foo1bar6
foo1bar7
foo1bar8
foo1bar9
foo2bar1
foo2bar2
foo2bar3
foo2bar4
foo2bar5
foo2bar6
foo2bar7
foo2bar8
foo2bar9


I have written a function that receives (foo,2,bar,9), but it takes 0.0008
seconds.
In fact I have more than 2000 lines in this format (foo(2)bar(9))
or in one of these other possible forms:

foobar
foo(X)bar
foobar(Y)
foo(X)bar(Y)

foo(X)bar(Y)foo(Z)
foobar(Y)foo(Z)
foo(X)barfoo(Z)
foo(X)bar(Y)foo
foobar(Y)foo


Is there a faster way to do this than with something like
g/\([^(]\)\((\d\+)\)....../\=myFunc(submatch(1) etc...)/

Thanks


Tim Chase

Dec 24, 2009, 9:14:51 AM
to vim...@googlegroups.com
> Goal : multiply some parts of lines which contain factors
>
> foo(2)bar(9)
>
> Should become
>
> foo1bar1
> foo1bar2
...
> foo1bar9
> foo2bar1
> foo2bar2
...

> foo2bar9
>
> I have done a func that receive (foo,2,bar,9) but it take 0.0008
> second.

Given that regardless of how the function is implemented, it will have
polynomial time characteristics, 0.0008 seconds isn't bad at all. Given
that you haven't posted the body of your function, it's a little hard to
sniff for hot-spots. If it's something you only have to perform once,
it may be best to just let it run (albeit slowly), rather than spend
hours optimizing something for a one-time run. However, if it's
something repeatable, you might already be close to optimal performance.
Otherwise, I'd write a little python (or if you're a perl/awk/ruby
programmer, use one of those) script to do the dirty work for you.
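
To put a rough number on that: each input line expands to as many output lines as the product of its counts, so no implementation can do less work than writing that many lines. A minimal Python sketch, assuming counts are always written as `(N)` and a part without a count contributes a factor of 1 (the helper name `expanded_line_count` is made up for illustration):

```python
import re
from math import prod  # Python 3.8+

COUNT = re.compile(r'\((\d+)\)')

def expanded_line_count(line):
    """Number of output lines one input line expands to (hypothetical helper)."""
    # Assumption: a part with no (N) contributes a factor of 1,
    # so a line without any count stays a single line.
    return prod(int(n) for n in COUNT.findall(line))
```

Summing this over the 2000+ input lines gives the size of the result, which is a fair lower bound to compare any timing against.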

> In fact I have more than 2000 lines with this format : foo(2)bar(9)
> or other possible cases
>
> foobar
> foo(X)bar
> foobar(Y)
> foo(X)bar(Y)
>
> foo(X)bar(Y)foo(Z)
> foobar(Y)foo(Z)
> foo(X)barfoo(Z)

You haven't mentioned what to do with missing X or Y values, and haven't
detailed Z...An empty value could be interpreted as 0 or 1, and Z could
be an additional combinatoric factor. And you don't give a decent way
of recognizing the various parts...is "foobar" just one instance, or is
that "foo" followed by an implicit count, followed by a "bar"? And in
the Z lines, does it repeat the same match as before? Can there be more
than 3 number instances? Are "foo" and "bar" regular expressions or are
they constant text?
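
One way to pin these questions down is to write the tokenizer first and look at what it yields for each of the listed cases. A rough Python sketch, assuming "foo" and "bar" are literal text containing no parentheses and that any number of `(N)` groups may appear; `parse_parts` is a name invented here, and a missing count is reported as `None` so the 0-or-1 decision stays visible:

```python
import re

# A run of non-parenthesis text, optionally followed by a "(N)" count.
PART = re.compile(r'([^()]*)(?:\((\d+)\))?')

def parse_parts(line):
    """Split a line into (text, count) pairs; count is None when absent."""
    parts = []
    for text, count in PART.findall(line):
        if text or count:  # skip the empty match at the end of the line
            parts.append((text, int(count) if count else None))
    return parts
```

With this, `foo(2)barfoo(3)` comes out as two parts, `('foo', 2)` and `('barfoo', 3)`, which shows that "bar" and the second "foo" cannot be told apart without more information.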

There are a lot of details you omit which makes it hard to provide a
solution. Again. You've been asked multiple times in other threads to
pose the problem in its entirety which includes expected inputs *and*
the expected outputs.

> Have you got another way to improve instead of doing
> g/\([^(]\)\((\d\+)\)....../\=myFunc(submatch(1) etc...)/

Well, again, you've not posted the actual code you're using since
submatch(n) is only available within a :s command (which you likely
intended) and not a :g command (which you typed), and your function
syntax doesn't compile, so it's hard to even guess what your function does.

Please answer all of the above questions regarding the problem
space...only then can we begin to help you. Though given that it sounds
like you have a working-but-slow solution, just use that one. Or maybe
even post its code so we have something to work from to try and infer
"correct" behavior.

I was going to hammer out a quick python script to tackle the problem,
but the problem is too ill-defined to get anywhere.

-tim


epanda

Dec 24, 2009, 10:12:30 AM
to vim_use

If I didn't post the code, there is a reason for that. Give me your email
and I will tell you.


""""""""""""""""""""""""""""""""""""""""""""""""""""
"
" Function Multiply
" ------------
" Description: this function is called for each line read that
" contains (N), in order to duplicate the line N times + iteration
"
" Return value: it returns the data correctly structured
" in XML format (handling of opening/closing tags and grouping of
" several properties of the same device_type)
"
""""""""""""""""""""""""""""""""""""""""""""""""""""
function! Multiply(chemin,nbChemin,nom,nbNom,lastPart)

  let cheminReconstitue = ''

  " XML tag

  " attributes
  let categorie = 'categorie'
  let type = 'type'


  " case where only the path has no number
  if a:nbChemin == "" && a:nbNom != ""

    try
      let nbRepetChemin = 1
      let nbRepetNom = a:nbNom
      let part1ARepeter = a:chemin . a:nom
      let part2ARepeter = ""
    catch /.*/
      return " [DBG] : " . getline(".") . ' => ' . "case where only the path has no number"
    endtry

  " case that is never reached because of the input regexp
  elseif a:nbChemin != "" && a:nbNom == ""

    try
      let nbRepetChemin = a:nbChemin
      let nbRepetNom = 1
      let part1ARepeter = a:chemin
      let part2ARepeter = a:nom
    catch /.*/
      return " [DBG] : " . getline(".") . ' => ' . "case where only the name has no number"
    endtry

    "return "DBG : " . getline(".") . ' => ' . 'Case that must never be reached'

  " case where neither the path nor the name has a number
  elseif a:nbChemin == "" && a:nbNom == ""

    "return "DBG : " . getline(".") . ' => ' . a:chemin . a:nom . ' case where there is no number'
    " handles internal ref, category and type
    let chemin = a:chemin . a:nom
    return a:chemin . a:nom . "\n"

  " case where both the path and the name have a number
  else

    let nbRepetChemin = a:nbChemin
    let nbRepetNom = a:nbNom
    let part1ARepeter = a:chemin
    let part2ARepeter = a:nom
    " return "DBG : " . getline(".") . ' => ' . a:chemin . ' ' . a:nbChemin . ' ' . a:nom . ' ' . a:nbNom
  endif

  " duplication of the lines
  let idxRepetChemin = 1
  let idxRepetNom = 1

  while idxRepetChemin <= nbRepetChemin
    while idxRepetNom <= nbRepetNom

      if nbRepetChemin > 1 && nbRepetNom > 1
        let cheminReconstitue .= part1ARepeter . idxRepetChemin . part2ARepeter . '_' . idxRepetNom
      elseif (nbRepetChemin > 1) && (nbRepetNom == 1)
        let cheminReconstitue .= part1ARepeter . idxRepetChemin . part2ARepeter
      else
        let cheminReconstitue .= part1ARepeter . part2ARepeter
      endif

      " handles internal ref, category and type

      " in a :s \= expression, "\n" produces a line break
      let cheminReconstitue .= a:lastPart . "\n"
      let idxRepetNom+=1

    endwhile
    "
    let cheminReconstitue .= "\n"
    let idxRepetNom=1
    let idxRepetChemin+=1
  endwhile


  return cheminReconstitue

endfunction

function! MultiplyFactorisedLines()

  let startConditionnalTime = reltime()
  %s/^\([^(]*\)\((\(\d\+\))\)\?\([^(]*\)\((\(\d\+\))\)\?\(.*$\)/\=Multiply(submatch(1),submatch(3),submatch(4),submatch(6),submatch(7))/g
  echo "Multiply time: taken by the conditional: " . reltimestr(reltime(startConditionnalTime)) . " second(s)"

  " remove the empty lines left behind by the substitution
  g/^$/d

endfunc

Tim Chase

Dec 24, 2009, 2:42:15 PM
to vim...@googlegroups.com
epanda wrote:

>
> On Dec 24, 15:14, Tim Chase <v...@tim.thechases.com> wrote:
>>> Goal : multiply some parts of lines which contain factors
>>> foo(2)bar(9)
>>> Should become
>>> foo1bar1
>>> foo1bar2
>> ...
>>> foo1bar9
>>> foo2bar1
>>> foo2bar2
>> ...
>>> foo2bar9

For something close, you can use the following python code that is
pretty speedy and should handle arbitrary counts of repeats, and pad in
the optionally missing bits with "1"

###########################################
import re

r = re.compile(r'\((\d+)\)')

def printall(leader, bits):
    # bits alternates text and count: ['foo', '2', 'bar', '9', '']
    if bits and bits[0].strip():
        text = bits[0]
        try:
            times = int(bits[1])
        except (IndexError, ValueError):
            # no count follows this piece of text: treat it as 1
            times = 1
        for i in range(times):
            printall(
                '%s%s%i' % (leader, text, i+1),
                bits[2:]
            )
    else:
        print(leader)

with open('epanda.txt') as f:
    for line in f:
        line = line.rstrip('\n')
        if r.search(line):
            printall('', r.split(line))
        else:
            print(line)

###########################################

You can tweak the behavior for those lines that have no count so they
behave as you want.
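
For instance, if parts without a count should be copied through unnumbered (so that `foobar(3)aaa` gives `foobar1aaa` rather than `foobar1aaa1`), something along these lines might do. It is only a sketch in Python 3 built on `itertools.product`; the function name `expand` and that choice of behavior are assumptions, not something the original poster has confirmed:

```python
import re
from itertools import product

COUNT = re.compile(r'\((\d+)\)')

def expand(line):
    """Return the list of expanded lines for one input line."""
    bits = COUNT.split(line)            # e.g. ['foo', '2', 'bar', '9', '']
    texts, counts = bits[0::2], bits[1::2]
    ranges = [range(1, int(n) + 1) for n in counts]
    out = []
    for combo in product(*ranges):      # every combination of indices
        # Pair each counted text with its index; zip stops before the
        # trailing text, which is appended without a number.
        pieces = [t + str(i) for t, i in zip(texts, combo)]
        out.append(''.join(pieces) + texts[-1])
    return out
```

A line with no count at all comes back unchanged as a single-element list, so the same function can be applied to every line of the file.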

-tim

Dominique Pellé

Dec 24, 2009, 4:30:01 PM
to vim...@googlegroups.com
epanda wrote:

> Hi,
>
> Goal : multiply some parts of lines which contain factors
>
> foo(2)bar(9)
>
> Should become
>
> foo1bar1
> foo1bar2
> foo1bar3
> foo1bar4
> foo1bar5
> foo1bar6
> foo1bar7
> foo1bar8
> foo1bar9
> foo2bar1
> foo2bar2
> foo2bar3
> foo2bar4
> foo2bar5
> foo2bar6
> foo2bar7
> foo2bar8
> foo2bar9

Hi, here is a solution in Perl:

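#!/usr/bin/perl -wn
# Expand every "(N)" in a line into the indices 1..N, recursively.
sub expand {
    # $p: prefix built so far, $s: remaining text to expand
    my ($p, $s) = @_;
    if ($s =~ /\((\d+)\)/) {
        # $w: text before the count, $c: the count, $r: the rest
        my ($w, $c, $r) = ($`, $1, $');
        for my $i (1 .. $c) {
            expand("$p$w$i", $r);
        }
    } else {
        # no count left: print the finished line once
        print "$p$s\n";
    }
}
chomp;
expand '', $_;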


Examples:

$ cat test.txt
foo(2)bar(9)
foobar(3)aaa

$ ./epanda.pl test.txt


foo1bar1
foo1bar2
foo1bar3
foo1bar4
foo1bar5
foo1bar6
foo1bar7
foo1bar8
foo1bar9
foo2bar1
foo2bar2
foo2bar3
foo2bar4
foo2bar5
foo2bar6
foo2bar7
foo2bar8
foo2bar9

foobar1aaa
foobar2aaa
foobar3aaa

-- Dominique

epanda

Dec 27, 2009, 11:50:39 PM
to vim_use
Thank you both for everything.
