Discussion Details - Hydro

Python
Python Web Crawler
admin LV 2 SU @ 2024-6-14 15:27:19

pip install requests -i https://pypi.tuna.tsinghua.edu.cn/simple

Fetching chapter URLs

import requests

path = "https://www.biqooge.com"
url = "https://www.biqooge.com/3_3319/"

r = requests.get(url)
r.encoding = r.apparent_encoding
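# The text variable holds the HTML source of the page
text = r.text
# Each chapter appears in the source as:
#   <dd><a href="/3_3319/33248097.html">第1785章 破界龙影</a></dd>
# Goal: pull out the relative link (/3_3319/33248097.html) and store it in a variable

# Use a regular expression to match every chapter entry directly
import re
# hrefs is a list of (relative_link, chapter_title) tuples
hrefs = re.findall(r'<dd><a href="(.*?)">(.*?)</a></dd>', text)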
print(hrefs)

for href in hrefs:
    print(href[1] + ":", path + href[0])
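
The matching step can be tried offline, without hitting the site. Below is a minimal sketch that runs the same pattern against a hard-coded snippet. The first entry in SAMPLE_HTML is the line quoted in the comments above; the second entry, the names SAMPLE_HTML, CHAPTER_LINK, and extract_chapters, and the use of urljoin in place of string concatenation are my own assumptions for illustration, not part of the original post.

```python
import re
from urllib.parse import urljoin

# Sample markup: the first entry is the line quoted in the post,
# the second is a made-up placeholder used only for illustration.
SAMPLE_HTML = (
    '<dd><a href="/3_3319/33248097.html">第1785章 破界龙影</a></dd>\n'
    '<dd><a href="/3_3319/00000001.html">Example chapter</a></dd>\n'
)

# Same pattern as in the post: group 1 is the link, group 2 is the title.
CHAPTER_LINK = re.compile(r'<dd><a href="(.*?)">(.*?)</a></dd>')


def extract_chapters(html, base_url):
    """Return a list of (title, absolute_url) pairs found in html."""
    return [
        (title, urljoin(base_url, href))
        for href, title in CHAPTER_LINK.findall(html)
    ]


if __name__ == "__main__":
    for title, link in extract_chapters(SAMPLE_HTML, "https://www.biqooge.com"):
        print(title + ":", link)
```

urljoin resolves a relative link against the base address, so it should keep working if the site ever emits absolute links or the base URL gains a trailing slash, where plain concatenation would produce a malformed address.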
