I encounter the following error when I try to run the crawler a second time:
Traceback (most recent call last):
File "/home/sadaf/store_crawler/stores_crawler/d/dookcollection.py", line 401, in <module>
asyncio.run(main())
File "/usr/lib/python3.11/asyncio/runners.py", line 190, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/home/sadaf/store_crawler/stores_crawler/d/dookcollection.py", line 377, in main
request_queue = await RequestQueue.open(name="dookcollection")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sadaf/store_crawler/store_crawler_venv/lib/python3.11/site-packages/crawlee/storages/_request_queue.py", line 165, in open
return await open_storage(
^^^^^^^^^^^^^^^^^^^
File "/home/sadaf/store_crawler/store_crawler_venv/lib/python3.11/site-packages/crawlee/storages/_creation_management.py", line 170, in open_storage
storage_info = await resource_collection_client.get_or_create(name=name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sadaf/store_crawler/store_crawler_venv/lib/python3.11/site-packages/crawlee/storage_clients/_memory/_request_queue_collection_client.py", line 35, in get_or_create
resource_client = await get_or_create_inner(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sadaf/store_crawler/store_crawler_venv/lib/python3.11/site-packages/crawlee/storage_clients/_memory/_creation_management.py", line 143, in get_or_create_inner
found = find_or_create_client_by_id_or_name_inner(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sadaf/store_crawler/store_crawler_venv/lib/python3.11/site-packages/crawlee/storage_clients/_memory/_creation_management.py", line 102, in find_or_create_client_by_id_or_name_inner
storage_path = _determine_storage_path(resource_client_class, memory_storage_client, id, name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sadaf/store_crawler/store_crawler_venv/lib/python3.11/site-packages/crawlee/storage_clients/_memory/_creation_management.py", line 412, in _determine_storage_path
metadata = json.load(metadata_file)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/json/__init__.py", line 293, in load
return loads(fp.read(),
^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
I removed the related directory under storage/request_queues and re-ran the crawler, but I still get the same error.
I would appreciate any help. Thanks!
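Not part of the original report, but since the traceback shows json.load failing on a storage metadata file, a stdlib-only diagnostic sketch like the one below can locate the offending file (the related metadata may live outside the one deleted directory, e.g. under other storages). The helper name find_corrupt_json and the default ./storage path are assumptions, not part of Crawlee's API:

```python
import json
from pathlib import Path


def find_corrupt_json(storage_dir: str = "./storage") -> list[Path]:
    """Return every .json file under storage_dir that fails to parse.

    A zero-byte or truncated metadata file would raise exactly the
    JSONDecodeError ("Expecting value: line 1 column 1") seen above.
    """
    corrupt = []
    for path in Path(storage_dir).rglob("*.json"):
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            corrupt.append(path)
    return corrupt


if __name__ == "__main__":
    for path in find_corrupt_json():
        print(f"unparseable: {path} ({path.stat().st_size} bytes)")
```

Deleting (or fixing) whichever file this reports should let RequestQueue.open recreate its metadata on the next run.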
Package version
crawlee==0.5.0
Hi @sadaffatollahy and thank you for your interest in Crawlee! Could you please provide a short script that reproduces the issue you encountered? It would help us greatly in diagnosing the problem.
async def main() -> None:
    # Open or create a named request queue
    request_queue = await RequestQueue.open(name="dookcollection")

    # Initialize the crawler with the named request queue
    crawler = BeautifulSoupCrawler(
        max_requests_per_crawl=100,
        request_handler=router,
        request_manager=request_queue,
    )

    # Start the crawler with the initial URL
    await crawler.run(
        ["https://dookcollection.ir/"],
    )

    # Export the entire dataset to a JSON file.
    await crawler.export_data_json(
        path="./storage/results_dookcollection.json",
        dataset_name="dookcollection",
        ensure_ascii=False,
    )
This is the main function of my crawler code. The error occurs when it opens the RequestQueue.