Example: Fetching Web Page Content with Python

The code is as follows:

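import socket

def open_tcp_socket(remotehost, servicename):
    # Create a TCP socket and connect to the port registered for the service name
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    portnumber = socket.getservbyname(servicename, 'tcp')
    s.connect((remotehost, portnumber))
    return s

mysocket = open_tcp_socket('www.taobao.com', 'http')
# Send a minimal HTTP/1.0 GET request; with HTTP/1.0 the server closes the
# connection after responding, which lets the receive loop below terminate
mysocket.sendall(b'GET / HTTP/1.0\r\nHost: www.taobao.com\r\n\r\n')
chunks = []
while True:
    data = mysocket.recv(1024)
    if data:
        chunks.append(data)
    else:
        break
mysocket.close()
# For GBK-encoded pages the bytes must be decoded as GBK. Decode once at the
# end so a multi-byte character split across two recv() calls is not corrupted
print(b''.join(chunks).decode('gbk', errors='replace'))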
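
The socket loop above yields the raw HTTP response, status line and headers included, and it assumes the page is GBK-encoded. As a rough sketch of how the headers could be separated from the body and the charset taken from the `Content-Type` header when one is present, something like the following may help. The function name `split_and_decode` and the sample response are hypothetical and are not part of the original code. The sketch also assumes the body is neither compressed nor sent with chunked transfer encoding.

```python
def split_and_decode(raw, default_charset='gbk'):
    """Split a raw HTTP response into (header_text, body_text).

    Hypothetical helper: falls back to default_charset when the
    Content-Type header names no charset or names an unknown one.
    """
    # Headers and body are separated by the first blank line
    head, _, body = raw.partition(b'\r\n\r\n')
    header_text = head.decode('iso-8859-1')

    charset = default_charset
    # Skip the status line, then look for "Content-Type: ...; charset=..."
    for line in header_text.split('\r\n')[1:]:
        name, _, value = line.partition(':')
        if name.strip().lower() == 'content-type':
            for param in value.split(';')[1:]:
                key, _, val = param.partition('=')
                if key.strip().lower() == 'charset' and val.strip():
                    charset = val.strip().strip('"\'')

    try:
        body_text = body.decode(charset, errors='replace')
    except LookupError:
        # Unknown charset name: fall back to the default
        body_text = body.decode(default_charset, errors='replace')
    return header_text, body_text


# Made-up response bytes standing in for what the recv() loop would collect
sample = (b'HTTP/1.0 200 OK\r\n'
          b'Content-Type: text/html; charset=GBK\r\n'
          b'\r\n' + '<p>你好</p>'.encode('gbk'))
headers, body = split_and_decode(sample)
print(body)
```

With the socket example, the bytes collected from `recv()` would be joined and passed in as `raw` in place of `sample`.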
