71 – Using Beautiful Soup's node selectors to get node information

How to use Beautiful Soup's node selectors to read a node's name, attributes, and text

from bs4 import BeautifulSoup
html = '''
<html>
<head>
    <title>获取节点信息</title>
</head>
<body>
<div>
    <ul>
        <li class="item1" value1="1234" value2="hello world"><a href="https://www.xxx.com">ruochen</a></li>
        <li class="item2"><a href="https://www.xxx.com">若尘</a></li>
    </ul>
    <button id="button1">确定</button>
    <ul>
        <li class="item3"><a href="https://www.taobao.com">淘宝</a></li>
        <li class="item4"><a href="https://www.microsoft">微软</a></li>
        <li class="item5"><a href="https://www.google.com">谷歌</a></li>
    </ul>
</div>
</body>
</html>
'''

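# Parse the document with the lxml parser (requires the lxml package).
soup = BeautifulSoup(html, 'lxml')

# Attribute-style access (soup.title, soup.li, soup.a) returns the first matching node.
print(soup.title.name)  # tag name
print(soup.title.text)  # text content

# attrs is a dict of all attributes; class is multi-valued, so its value is a list.
print(soup.li.attrs)
print(soup.li.attrs['value2'])
print(soup.li['value1'])  # shorthand for soup.li.attrs['value1']

print(soup.a['href'])
print(soup.a.string)  # the node's single child string
print(soup.a.text)    # all descendant text, concatenated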

Output:

title
获取节点信息
{'class': ['item1'], 'value1': '1234', 'value2': 'hello world'}
hello world
1234
https://www.xxx.com
ruochen
ruochen
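
In the example above, `.string` and `.text` print the same value, and every attribute that is looked up exists. The sketch below shows where these accessors differ. It is a minimal illustration, not part of the original example: the trimmed-down HTML, the `<b>` tag, the `(author)` suffix, and the `'n/a'` default are made up for the demonstration, and it uses the built-in `html.parser` instead of `lxml` so that it runs without extra packages.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down markup: the second <a> has two children.
html = '''
<ul>
    <li class="item1" value1="1234"><a href="https://www.xxx.com">ruochen</a></li>
    <li class="item2"><a href="https://www.xxx.com"><b>若尘</b> (author)</a></li>
</ul>
'''

# 'html.parser' ships with Python; the example above uses 'lxml' instead.
soup = BeautifulSoup(html, 'html.parser')

# soup.li only reaches the first <li>; find_all returns every match.
items = soup.find_all('li')
first, second = items

# Subscript access raises KeyError when the attribute is missing;
# .get() returns None (or a supplied default), like dict.get().
value1 = first['value1']
missing = second.get('value1')
fallback = second.get('value1', 'n/a')

# .string is None when a node has more than one child;
# .text concatenates every descendant string.
first_string = first.a.string
second_string = second.a.string
second_text = second.a.text

print(value1, missing, fallback)
print(first_string, second_string, second_text)
```

When a node may contain nested tags, `.text` is the safer choice for reading its content, while `.get()` avoids an exception for attributes that only some nodes carry.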
