Borland
compiler
(bcc32.exe) Python 2.5 + Psyco
================================
1 1
2 2
13 13
5 5
9 9
59078 59078
1000 1000
321000 321000
=============================
0.016 0.0150001049042 <--- exec times
=============================
----------------------------------------------------
My test input file:
----------------------------------------------------
8
a aa
abbaz
CCCCC ABABA
fgjtjdjjdjdjjryyryryududdisisisidfjfjnfnfgngnnghhhhhhhdeggrgtfgjtjdjjdjdjjryyryryududdisisisidfjfjnfnfgngnnghhhhhhhdeggrgtfgjtjdjjdjdjjryyryryududdisisisidfjfjnfnfgngnnghhhhhhhdeggrgtfgjtjdjjdjdjjryyryryududdisisisidfjfjnfnfgngnnghhhhhhhdeggrgtfgjtjdjjdjdjjryyryryududdisisisidfjfjnfnfgngnnghhhhhhhdeggrgtfgjtjdjjdjdjjryyryryududdisisisidfjfjnfnfgngnnghhhhhhhdeggrgtfgjtjdjjdjdjjryyryryududdisisisidfjfjnfnfgngnnghhhhhhhdeggrgtfgjtjdjjdjdjjryyryryududdisisisidfjfjnfnfgngnnghhhhhhhdeggrgtfgjtjdjjdjdjjryyryryududdisisisidfjfjnfnfgngnnghhhhhhhdeggrgtfgjtjdjjdjdjjryyryryududdisisisidfjfjnfnfgngnnghhhhhhhdeggrgtfgjtjdjjdjdjjryyryryududdisisisidfjfjnfnfgngnnghhhhhhhdeggrgtfgjtjdjjdjdjjryyryryududdisisisidfjfjnfnfgngnnghhhhhhhdeggrgtfgjtjdjjdjdjjryyryryududdisisisidfjfjnfnfgngnnghhhhhhhdeggrgtfgjtjdjjdjdjjryyryryududdisisisidfjfjnfnfgngnnghhhhhhhdeggrgtfgjtjdjjdjdjjryyryryududdisisisidfjfjnfnfgngnnghhhhhhhdeggrgtfgjtjdjjdjdjjryyryryududdisisisidfjfjnfnfgngnnghhhhhhhdeggrgtfgjtjdjjdjdjjryyryryudud
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbcccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
=====================================================
import psyco
psyco.full()
def foo(s):
n = len(s)
s = s + ' '
a = [[] for i in xrange(128)]
ans = 0
for i in xrange(n - 1, -1, -1):
lev = 0
for st in xrange(len(a[ord(s[i])]) - 1, -1, -1):
j = a[ord(s[i])][st]
if (n - j <= lev):
break
if (s[j + lev] != s[i + lev]):
continue
if (s[j + lev / 2] != s[i + lev / 2]):
continue
k = 0
while (s[j + k] == s[i + k]):
k += 1
if (k > lev):
lev = k
a[ord(s[i])] += [i]
ans += n - i - lev
return ans
from time import time
t = time()
import sys
sys.stdin = open('D:/88.txt', 'rt')
f = sys.stdin.read().split()
sys.stdin.close()
z = open('D:/99.txt', 'wt')
for tc in range(int(f[0])):
s = f[tc + 1]
print >> z, foo(s)
print >> z, time() - t
z.close()
========================================================
import itertools
def subs(s):
return len(set(itertools.chain(
s[i:j]
for i in xrange(len(s))
for j in xrange(i, len(s)+1)))) - 1
from time import time
t = time()
import sys
sys.stdin = open('D:/88.txt', 'rt')
f = sys.stdin.read().split()
sys.stdin.close()
z = open('D:/99.txt', 'wt')
for tc in range(int(f[0])):
s = f[tc + 1]
print >> z, subs(s)
There's no need of a chain here, so you can rewrite this:
import itertools
def subs(s):
return len(set(itertools.chain(
s[i:j]
for i in xrange(len(s))
for j in xrange(i, len(s)+1)))) - 1
as:
def subs(s):
return len(set(s[i:j] for i in xrange(len(s))
for j in xrange(i, len(s)+1))) - 1
Psyco helps a little (about 10% on my PC) if you use it like this:
from time import time
import psyco; psyco.full()
def main():
t = time()
fin = open("input.txt")
fout = open("output.txt", "w")
fin.next()
for line in fin:
r = set()
s = line.rstrip()
len_s = len(s)
for i in xrange(len_s):
for j in xrange(i, len_s + 1):
r.add(s[i : j])
print >> fout, len(r) - 1
fout.close()
print time() - t
main()
Bye,
bearophile
Cute. You can replace:
a[ord(s[i])] += [i]
With:
a[ord(s[i])].append(i)
If you want to micro-optimize this with Psyco may be a little faster:
(lev >> 1)
Than:
lev // 2
But to increase speed it's usually better to reduce memory
allocations. So you can try to find a way to pull this out of the
function:
a = [[] for i in xrange(128)]
And to avoid a true append:
a[ord(s[i])] += [i]
(There are many ways to avoid an append. One of them is to have an
already allocated list/array and just bump forward an index variable
that starts from 0. This is worth doing especially when you use
Psyco).
Bye,
bearophile
> =============================
> 0.016 0.0150001049042 <--- exec times
Maybe in your C++ code there's something that can be improved, this is
a 1:1 translation to D (V.1) language (using dlibs) and it's about 2.2
times faster than the Psyco version:
http://codepad.org/MQLj0ydB
Using a smarter usage of memory (that is avoiding all or most memory
allocations inside all loops), the performance difference will surely
grow.
Bye,
bearophile
Very interesting. Thanks.
D code looks pretty neat. Btw D lang is among accepted langs there.
Even if Py by 4x *slower* -- it's still a perfect Ok for it (C# will
be
much (much) slower than Python).
Micro-optimizations.
Of course, the best optimization would be to implement Suffix Tree:
http://en.wikipedia.org/wiki/Trie
Currently I hardly understand/know/read/etc its core idea. My algo is
plainly stupid as soldier muddy boots.
My C++ code:
<BEGIN>
#include <stdlib.h>
#include <stdio.h>
//#include <string.h>
//#include <ctype.h>
//#include <math.h>
#include <time.h>
#include <iostream>
#include <vector>
#include <string>
using namespace std;
int main() {
clock_t start_time = clock();
freopen("88.txt", "rt", stdin);
freopen("99.txt", "wt", stdout);
int tcs;
string s;
cin >> tcs;
while (tcs-- > 0) {
cin >> s;
int n = s.size();
s = s + ' ';
vector< vector<int> > a(128);
int ans = 0;
for (int i = n - 1; i >= 0; --i) {
int lev = 0;
for (int st = (int)a[s[i]].size() - 1; st >= 0; --st) {
int j = a[s[i]][st];
if (n - j <= lev) break;
if (s[j + lev] != s[i + lev]) continue;
if (s[j + lev / 2] != s[i + lev / 2]) continue;
int k = 0;
while (s[j + k] == s[i + k]) ++k;
if (k > lev) lev = k;
}
a[s[i]].push_back(i);
ans += n - i - lev;
}
cout << ans << endl;
}
cout << (clock() - start_time) / CLOCKS_PER_SEC << endl;
return 0;
}
<END>
Looks not very friendly appealing :-)
===========================================
from random import choice
s = ''
for i in xrange(10000):
s += choice(('a','b','c','d','e','f'))
===========================================
C++: ~0.28s
Python: ~0.48s
PS
I suspect that building of Suffix Tree would
be a big exec.time-consuming overhead
> I suspect that building of Suffix Tree would
> be a big exec.time-consuming overhead
In C/D/C++ there are ways to allocate memory in smarter ways, using
pools, arenas, stacks, freelists, etc. With that, using a compact
encoding of tree nodes (there are many different tricks to reduce
space used by a suffix tree), you can go fast enough. Probably a basic
implementation too will suffice in many cases :-)
Anyway, you may try a pure-Python2.x implementation:
http://suffixtree.googlecode.com/files/suffixtree-0.1.py
If you find time & will to try to use that (or similar code) to
improve your Python solution to the problem 99, you can show us the
code here...
Bye,
bearophile
Ouch, Bearie... 214 lines of code are there.
Ok.. I tested it.
Firstly I replaced all "print " with "pass##print "
(aiming to avoid intensive printing from inside of the code).
Then I left in its final part only building of suffix trees.
==================================================================
...
...
...
from time import time
t = time()
import sys
sys.stdin = open('D:/88.txt', 'rt')
f = sys.stdin.read().split()
sys.stdin.close()
z = open('D:/99.txt', 'wt')
for tc in range(int(f[0])):
s = f[tc + 1]
test_str = s + '$'
POSITIVE_INFINITY = len(test_str) - 1
suffix_tree = SuffixTree(test_str)
print >> z, 'len(s) =', len(s)
print >> z, 'time =', time() - t
z.close()
============================================================
Output:
============================================================
len(s) = 1000
len(s) = 1000
len(s) = 10000
time = 0.641000032425
============================================================
0.64s > 0.48s (of my algo)
I.e. the code can't help in my very special and narrow case.
But of course it is worthy to be studied and memoized.
E.g.:
from huge string "s" we built its Suffix Tree.
Now we are given arbitrary string "ss".
Task: to find the largest its prefix occured in "s",
traversing its Suffix Tree.