python

python使用lxml加载 html---xpath

首先确定安装了lxml。
然后按照以下代码去使用

#-*-coding=utf-8-*-

__author__ = 'rocchen'

from lxml import html

from lxml import etree

import urllib2



def lxml_test():

    url="http://www.caixunzz.com"

    req=urllib2.Request(url=url)

    resp=urllib2.urlopen(req)

    #print resp.read()



    tree=etree.HTML(resp.read())

    href=tree.xpath('//a[@class="label"]/@href')

    #print href.tag

    for i in href:

        #print html.tostring(i)

        #print type(i)

        print i



    print type(href)



lxml_test()

使用urllib2读取了网页内容，然后导入到lxml，为的就是使用xpath这个方便的函数。比单纯使用beautifulsoup要方便的多。（个人认为）

0

2016-06-23

0 个评论

要回复文章请先登录或注册

python使用lxml加载 html---xpath

0 个评论

发起人

推荐内容

python使用lxml加载 html---xpath

0 个评论

发起人

推荐内容

相关问题