Can I recover data from scraped tasks?


ecom4...@gmail.com

Aug 10, 2020, 9:16:08 PM
to pyspider-users
My laptop switched off suddenly, and I had the results of almost one week of scraping stored locally, possibly a lot of data.

This is the first error:

================================
Error: Could not create web server listening on port 25555
Process Process-2:
Traceback (most recent call last):
  File "/Users/USER/anaconda3/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/Users/USER/anaconda3/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/USER/anaconda3/lib/python3.6/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/USER/anaconda3/lib/python3.6/site-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/Users/USER/anaconda3/lib/python3.6/site-packages/pyspider/run.py", line 299, in result_worker
    result_worker = ResultWorker(resultdb=g.resultdb, inqueue=g.processor2result)
  File "/Users/USER/anaconda3/lib/python3.6/site-packages/pyspider/libs/utils.py", line 355, in __getattr__
    return ret.__get__(self, ObjectDict)
  File "/Users/USER/anaconda3/lib/python3.6/site-packages/pyspider/libs/utils.py", line 342, in __get__
    return self.getter()
  File "/Users/USER/anaconda3/lib/python3.6/site-packages/pyspider/run.py", line 127, in <lambda>
    db, kwargs['data_path'], db[:-2])))
  File "/Users/USER/anaconda3/lib/python3.6/site-packages/pyspider/database/__init__.py", line 44, in connect_database
    db = _connect_database(url)
  File "/Users/USER/anaconda3/lib/python3.6/site-packages/pyspider/database/__init__.py", line 67, in _connect_database
    return _connect_sqlite(parsed,dbtype)
  File "/Users/USER/anaconda3/lib/python3.6/site-packages/pyspider/database/__init__.py", line 140, in _connect_sqlite
    return ResultDB(path)
  File "/Users/USER/anaconda3/lib/python3.6/site-packages/pyspider/database/sqlite/resultdb.py", line 25, in __init__
    self._list_project()
  File "/Users/USER/anaconda3/lib/python3.6/site-packages/pyspider/database/sqlite/sqlitebase.py", line 53, in _list_project
    where='type = "table"'):
  File "/Users/USER/anaconda3/lib/python3.6/site-packages/pyspider/database/basedb.py", line 54, in _select
    for row in self._execute(sql_query, where_values):
  File "/Users/USER/anaconda3/lib/python3.6/site-packages/pyspider/database/basedb.py", line 37, in _execute
    dbcur.execute(sql_query, values)
sqlite3.DatabaseError: database disk image is malformed
================================
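The traceback ends in sqlite3.DatabaseError, raised while pyspider opens its SQLite result database. A minimal sketch for checking which of the .db files is actually damaged, using only the standard sqlite3 module, is below. The function name check_sqlite is hypothetical, and it is safest to run it against a copy of the data folder, not the original.

```python
import sqlite3


def check_sqlite(path):
    """Return the messages from PRAGMA integrity_check, or the error text.

    Note: sqlite3.connect() creates an empty file if `path` does not exist,
    so pass the path of an existing file (ideally a backup copy).
    """
    try:
        conn = sqlite3.connect(path)
        try:
            rows = conn.execute("PRAGMA integrity_check").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError as exc:
        # A badly damaged file can fail before the check even runs.
        return ["error: %s" % exc]
    return [row[0] for row in rows]
```

A healthy file reports a single "ok" row; anything else points at the file that needs repair.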

I found this online and tried to use it to recover the database: https://segmentfault.com/q/1010000010521078

But now when I run pyspider all, I get:

================================
Traceback (most recent call last):
  File "/Users/USER/anaconda3/bin/pyspider", line 8, in <module>
    sys.exit(main())
  File "/Users/USER/anaconda3/lib/python3.6/site-packages/pyspider/run.py", line 754, in main
    cli()
  File "/Users/USER/anaconda3/lib/python3.6/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/USER/anaconda3/lib/python3.6/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/USER/anaconda3/lib/python3.6/site-packages/click/core.py", line 1256, in invoke
    Command.invoke(self, ctx)
  File "/Users/USER/anaconda3/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/USER/anaconda3/lib/python3.6/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/USER/anaconda3/lib/python3.6/site-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/Users/USER/anaconda3/lib/python3.6/site-packages/pyspider/run.py", line 125, in cli
    os.mkdir(kwargs['data_path'])
PermissionError: [Errno 13] Permission denied: './data'
================================
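The second traceback fails in os.mkdir on './data', which is relative to the directory pyspider was started from. A small sketch for checking whether that directory would accept a data folder is below. The helper name can_use_data_dir is hypothetical, and the check is only an approximation of what the operating system will allow.

```python
import os


def can_use_data_dir(base="."):
    """Return True if a 'data' folder under `base` exists and is writable,
    or if `base` itself is writable so the folder could be created."""
    data = os.path.join(base, "data")
    if os.path.isdir(data):
        return os.access(data, os.W_OK | os.X_OK)
    return os.access(base, os.W_OK | os.X_OK)
```

If this returns False for the current directory, starting pyspider from a directory owned by the current user is one thing to try.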


Can I by any chance recover some of my data?

Any help would be very welcome.


Thanks.

ecom4...@gmail.com

Aug 10, 2020, 10:56:09 PM
to pyspider-users


Thank you.

I kept the "data" folder (almost 1 GB) safe: I copied it to another location and renamed the original.

I could not figure out how to fix the SQLite database, so I am reinstalling from scratch.

Would it be as simple as putting the data folder with result.db, task.db, etc. back in its place?

Should I do anything to the files first?

ecom4...@gmail.com

Aug 11, 2020, 1:23:23 AM
to pyspider-users
I followed the tutorial for the files one by one,

and the error is only in result.db.

I converted it into SQL by following this:

I deleted the rollback and the beginning by hand, though I am on a Mac. Now I am figuring out how to convert it back into a .db file and recover the data as CSV.
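The manual steps described here (dump to SQL, delete the rollback, load into a fresh database) can be sketched in Python as follows. This assumes the dump text was produced by something like the sqlite3 command-line shell's .dump command, which ends a dump of a damaged file with a ROLLBACK line. The function name rebuild_from_dump is hypothetical, and the line-based filter is a simplification: it would also drop a data line that happens to start with one of the skipped words.

```python
import sqlite3

# Transaction-control lines that are dropped and replaced by our own
# BEGIN/COMMIT pair, so the recovered rows are actually committed.
SKIP_PREFIXES = ("BEGIN TRANSACTION", "ROLLBACK", "COMMIT")


def rebuild_from_dump(dump_sql, new_db_path):
    """Load SQL dump text into a new SQLite file, ignoring the trailing
    ROLLBACK that a dump of a damaged database ends with."""
    kept = [
        line for line in dump_sql.splitlines()
        if not line.strip().upper().startswith(SKIP_PREFIXES)
    ]
    script = "BEGIN;\n" + "\n".join(kept) + "\nCOMMIT;\n"
    conn = sqlite3.connect(new_db_path)
    try:
        conn.executescript(script)
    finally:
        conn.close()
```

The rebuilt file only contains the rows the dump managed to read, so some results from the damaged pages may still be missing.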

options:

But now there is a NEW ISSUE.

I cannot get rid of this:

Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555

every time I start pyspider all.

I tried several methods, including killing the process, but a new process takes its place each time,

again and again.

Should I uninstall everything?
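The repeated message suggests that something is already listening on port 25555 when pyspider starts. As far as I know, that port is the default for pyspider's PhantomJS fetcher, so a leftover PhantomJS process from the crashed run is a plausible cause, but that is an assumption. A minimal sketch for checking whether a local port is taken, with the hypothetical helper name port_in_use:

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
```

Calling port_in_use(25555) before starting pyspider shows whether an old process is still holding the port.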

ecom4...@gmail.com

Aug 11, 2020, 11:11:41 AM
to pyspider-users


Hallelujah!

This one worked:

Fixing result.db with the steps in the link, then copying it back into the data/ folder.

I ran pyspider and could recover the results as CSV (the format I am using).
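If the web UI had not come back, the results could in principle be exported straight from the repaired file. The sketch below assumes pyspider's SQLite result layout as I understand it: one table per project (named like resultdb_<project>) with taskid, url and a JSON-encoded result column. Treat the table and column names as assumptions and check them against the actual file; export_results is a hypothetical helper.

```python
import csv
import json
import sqlite3


def export_results(db_path, table, csv_path):
    """Write the rows of one result table to a CSV file; return the row count."""
    conn = sqlite3.connect(db_path)
    try:
        # Table names cannot be bound as parameters, so it is quoted here.
        query = 'SELECT taskid, url, result FROM "%s"' % table
        rows = conn.execute(query).fetchall()
    finally:
        conn.close()

    records = []
    for taskid, url, raw in rows:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        data = json.loads(raw)
        if not isinstance(data, dict):
            data = {"result": data}  # keep non-dict results in one column
        records.append(dict(data, taskid=taskid, url=url))

    extra = sorted({key for rec in records for key in rec} - {"taskid", "url"})
    with open(csv_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=["taskid", "url"] + extra)
        writer.writeheader()
        writer.writerows(records)
    return len(records)
```

Fields missing from a given result are left empty in that row.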

This error still comes and goes:

Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555

But it will go away sooner or later. My biggest worry is over.

Thanks for listening and for the guidance.


On Monday, August 10, 2020 at 9:16:08 PM UTC-4, ecom4...@gmail.com wrote: