Handling Python RuntimeErrors: what to do and how to fix them
Admin 2022-08-27 Qunying Technical News
This article walks through two Python RuntimeErrors and how to fix each one. The first is raised by multiprocessing when a new process is started from module-level code:
RuntimeError:
    An attempt has been made to start a new process before the
    current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.
The script below raises this error when child processes are not started with fork (for example on Windows), because the pool is created at module level:

import multiprocessing as mp
import re
import time
from urllib.parse import urljoin
from urllib.request import urlopen

from bs4 import BeautifulSoup

base_url = "https://morvanzhou.github.io/"

# crawl: download a page and return its HTML
def crawl(url):
    response = urlopen(url)
    time.sleep(0.1)
    return response.read().decode()

# parse: extract the title, the internal links, and the page URL
def parse(html):
    soup = BeautifulSoup(html, 'html.parser')
    urls = soup.find_all('a', {"href": re.compile('^/.+?/$')})
    title = soup.find('h1').get_text().strip()
    page_urls = set([urljoin(base_url, url['href']) for url in urls])
    url = soup.find('meta', {'property': "og:url"})['content']
    return title, page_urls, url

unseen = set([base_url])
seen = set()
restricted_crawl = True

pool = mp.Pool(4)                       # created at import time: this is the problem
count, t1 = 1, time.time()
while len(unseen) != 0:                 # URLs still left to visit
    if restricted_crawl and len(seen) > 20:
        break
    print('\nDistributed Crawling...')
    crawl_jobs = [pool.apply_async(crawl, args=(url,)) for url in unseen]
    htmls = [j.get() for j in crawl_jobs]       # request connection
    print('\nDistributed Parsing...')
    parse_jobs = [pool.apply_async(parse, args=(html,)) for html in htmls]
    results = [j.get() for j in parse_jobs]     # parse html
    print('\nAnalysing...')
    seen.update(unseen)                 # mark the crawled URLs as seen
    unseen.clear()                      # nothing unseen
    for title, page_urls, url in results:
        print(count, title, url)
        count += 1
        unseen.update(page_urls - seen) # get new URLs to crawl
print('Total time: %.1f s' % (time.time()-t1))  # 16 s !!!

The corrected version moves the crawling loop into a main() function and calls it only under the main-module guard:

import multiprocessing as mp
import re
import time
from urllib.parse import urljoin
from urllib.request import urlopen

from bs4 import BeautifulSoup

base_url = "https://morvanzhou.github.io/"

# crawl: download a page and return its HTML
def crawl(url):
    response = urlopen(url)
    time.sleep(0.1)
    return response.read().decode()

# parse: extract the title, the internal links, and the page URL
def parse(html):
    soup = BeautifulSoup(html, 'html.parser')
    urls = soup.find_all('a', {"href": re.compile('^/.+?/$')})
    title = soup.find('h1').get_text().strip()
    page_urls = set([urljoin(base_url, url['href']) for url in urls])
    url = soup.find('meta', {'property': "og:url"})['content']
    return title, page_urls, url

def main():
    unseen = set([base_url])
    seen = set()
    restricted_crawl = True

    pool = mp.Pool(4)
    count, t1 = 1, time.time()
    while len(unseen) != 0:                 # URLs still left to visit
        if restricted_crawl and len(seen) > 20:
            break
        print('\nDistributed Crawling...')
        crawl_jobs = [pool.apply_async(crawl, args=(url,)) for url in unseen]
        htmls = [j.get() for j in crawl_jobs]       # request connection
        print('\nDistributed Parsing...')
        parse_jobs = [pool.apply_async(parse, args=(html,)) for html in htmls]
        results = [j.get() for j in parse_jobs]     # parse html
        print('\nAnalysing...')
        seen.update(unseen)                 # mark the crawled URLs as seen
        unseen.clear()                      # nothing unseen
        for title, page_urls, url in results:
            print(count, title, url)
            count += 1
            unseen.update(page_urls - seen) # get new URLs to crawl
    print('Total time: %.1f s' % (time.time()-t1))  # 16 s !!!

if __name__ == '__main__':
    main()

In short, wrap the code you want to run in a function, then add
    if __name__ == '__main__': main()
and this line resolves the problem.
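
To see the pattern without the crawler, here is a minimal sketch. The names square and main are illustrative and do not come from the original script. It assumes only the standard library, and it prints the start method in use, since the error message above points at start methods other than fork.

```python
import multiprocessing as mp


def square(x):
    """Work done in a child process; defined at module level so it can be pickled."""
    return x * x


def main():
    # With a start method other than "fork", child processes import this
    # file again, so the pool must not be created at import time.
    print("start method:", mp.get_start_method())
    with mp.Pool(2) as pool:
        jobs = [pool.apply_async(square, args=(n,)) for n in range(5)]
        print([j.get() for j in jobs])


if __name__ == '__main__':
    main()
```

Moving the two lines under the guard to module level (calling main() unconditionally) would reproduce the bootstrapping error on platforms where children re-import the main module.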
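
The second error is of the type: RuntimeError: fails to pass a sanity check due to a bug in the windows runtime
1. The installed Python and numpy versions do not work together. In my case, Python 3.9 with numpy 1.19.4 produced this error.
2. numpy 1.19.4 has this problem with many current Python versions.
To fix it in PyCharm, go to File -> Settings -> Project: pycharmProjects -> Project Interpreter and downgrade numpy there.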
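1. Open the Project Interpreter page.
2. Double-click numpy in the package list to change its version.
3. Tick the checkbox that enables choosing a version, then install the lower version you need.
Once that is done, rerun the program.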
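
Outside PyCharm, the same check can be sketched in a few lines of Python. This is an illustration rather than part of the original fix: the helper names are hypothetical, and the pip command in the comment is an assumed equivalent of the PyCharm steps above. The version is read from package metadata (Python 3.8+) instead of importing numpy, because the import itself is what raises the error.

```python
from importlib import metadata

AFFECTED_VERSION = "1.19.4"  # the numpy release named in the text above


def is_affected(version):
    """Return True if the version string is the release named above."""
    return version.strip() == AFFECTED_VERSION


def installed_numpy_version():
    """Return the installed numpy version string, or None if numpy is absent."""
    try:
        return metadata.version("numpy")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_numpy_version()
    if version is None:
        print("numpy is not installed in this interpreter")
    elif is_affected(version):
        # Assumed command-line equivalent of the PyCharm downgrade:
        #   python -m pip install "numpy<1.19.4"
        print("numpy", version, "is the release discussed above; consider downgrading")
    else:
        print("numpy", version, "is not the release discussed above")
```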