Hello,
FYI: I've reworked the import procedure for the massive RSSSF
(Rec.Sport.Soccer Statistics Foundation) archive data. What's new?
1) HTML to .txt converter
I've created an HTML to .txt converter for the RSSSF table pages.
The script fetches an RSSSF table page, for example
rsssf.com/tablese/eng2015.html, and turns it into plain text, e.g.:
## England 2014/15
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Premier League
Cup Tournaments
Championship
Division 1
Division 2
Conference
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
#### Premier League
Final Table:
1.Chelsea 38 26 9 3 73-32 87 Champions
2.Manchester City 38 24 7 7 83-38 79
...
See (real-world) examples in the /tables folder of each rsssf repo [1].
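For illustration only, here is a minimal Python sketch of the HTML-to-text idea. It is not the actual converter script: the names `TableTextExtractor` and `html_to_text` are made up for this example, and mapping `<h1>`..`<h6>` to `#` prefixes is an assumption based on the sample output above. Fetching the page is left out; the function works on an HTML string you already have.

```python
from html.parser import HTMLParser


class TableTextExtractor(HTMLParser):
    """Collects page text and marks <h1>..<h6> headings with '#' prefixes.

    Hypothetical helper for illustration; not part of the rsssf tooling.
    """

    SKIPPED = {"script", "style"}
    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1
        elif tag in self.HEADINGS:
            # Assumption: heading level N becomes N '#' characters.
            self.parts.append("\n" + "#" * int(tag[1]) + " ")
        elif tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in self.HEADINGS:
            self.parts.append("\n")

    def handle_data(self, data):
        # Keep text as-is (including <pre> layout), except script/style bodies.
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return the plain text of an HTML string, one trailing newline."""
    parser = TableTextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    # Drop trailing whitespace per line; leading spaces are kept for tables.
    lines = [line.rstrip() for line in text.splitlines()]
    return "\n".join(lines).strip() + "\n"
```

Calling `html_to_text(page_source)` on a downloaded page returns a single string that can be written to a .txt file. Leading spaces inside each line are preserved so that column-aligned tables stay readable.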
2) Match Schedule Finder / Extractor
Next, I've created a script that splits the all-in-one table
docs into one file per match schedule, ready to use for
importing into the football.db, e.g.:
Round 1
[Aug 16]
Arsenal 2-1 Crystal P
Leicester 2-2 Everton
Manchester U 1-2 Swansea
QPR 0-1 Hull
Stoke 0-1 Aston Villa
West Bromwich 2-2 Sunderland
West Ham 0-1 Tottenham
[Aug 17]
Liverpool 2-1 Southampton
Newcastle 0-2 Manchester C
[Aug 18]
Burnley 1-3 Chelsea
Again, see (real-world) examples in the season folders (/2015-16,
/2014-15, and so on) of each rsssf repo.
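To show how such a schedule file can be read back, here is a small Python sketch that turns the format above into records. It is only an assumption about the line shapes seen in the sample ("Round N", "[Mon DD]", "Home X-Y Away"); `parse_schedule` and the regex names are hypothetical and not part of the actual football.db importer.

```python
import re

# Line shapes assumed from the sample schedule above.
ROUND_RE = re.compile(r"^Round\s+(\d+)$")
DATE_RE = re.compile(r"^\[(\w{3} \d{1,2})\]$")
MATCH_RE = re.compile(r"^(.+?)\s+(\d+)-(\d+)\s+(.+)$")


def parse_schedule(text):
    """Return a list of match dicts from a plain-text schedule."""
    matches = []
    round_no = None
    date = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = ROUND_RE.match(line)
        if m:
            round_no = int(m.group(1))
            date = None  # a new round starts without a known date
            continue
        m = DATE_RE.match(line)
        if m:
            date = m.group(1)
            continue
        m = MATCH_RE.match(line)
        if m:
            matches.append({
                "round": round_no,
                "date": date,
                "home": m.group(1),
                "away": m.group(4),
                "score": (int(m.group(2)), int(m.group(3))),
            })
        # Lines matching none of the patterns are ignored in this sketch.
    return matches
```

Each match carries the most recent round and date seen above it, so the flat text maps onto one record per fixture. Team names are kept exactly as written (for example "Crystal P"); resolving abbreviations to full club names would be a separate step.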
Questions? Comments? Welcome. Cheers.
[1] github.com/rsssf