Python 2.7 - Getting Inner Nested Tag Data with BeautifulSoup


[Image: nest of tags]

I want the information in the inner tag, but my code keeps returning an empty result. Code:

import requests
from bs4 import BeautifulSoup

url = "http://www.krak.dk/cafe/s%c3%b8g.cs?consumer=suggest?search_word=cafe"
r = requests.get(url)

soup = BeautifulSoup(r.content, 'html.parser')

# Collect every <ol class="hit-list"> element and print its text
gendata = soup.find_all("ol", {"class": "hit-list"})
print gendata
for infox in gendata:
    print infox.text

What am I missing?

The HTML is broken, so you need a different parser. You can use lxml if you have it installed:

soup = BeautifulSoup(r.content, 'lxml')

Or use html5lib:

soup = BeautifulSoup(r.content, 'html5lib')

lxml depends on libxml, while html5lib can be installed with pip.

In [9]: url = "http://www.krak.dk/cafe/s%c3%b8g.cs?consumer=suggest?search_word=cafe"

In [10]: r = requests.get(url)

In [11]: soup = BeautifulSoup(r.content, 'html.parser')

In [12]: len(soup.find_all("ol", {"class": "hit-list"}))
Out[12]: 0

In [13]: soup = BeautifulSoup(r.content, 'lxml')

In [14]: len(soup.find_all("ol", {"class": "hit-list"}))
Out[14]: 1

In [15]: soup = BeautifulSoup(r.content, 'html5lib')

In [16]: len(soup.find_all("ol", {"class": "hit-list"}))
Out[16]: 1
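If you want to repeat that comparison without retyping it, something like the following might work. It is only a sketch: the helper name count_matches_per_parser and the inline sample markup are made up for illustration, and a parser that is not installed is reported as None rather than raising. It is written so that it also runs on Python 3.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def count_matches_per_parser(markup, name, attrs,
                             parsers=("html.parser", "lxml", "html5lib")):
    """Return {parser name: number of matches}, or None if the parser is missing.

    Hypothetical helper, not part of BeautifulSoup itself.
    """
    counts = {}
    for parser in parsers:
        try:
            soup = BeautifulSoup(markup, parser)
        except FeatureNotFound:
            # bs4 raises FeatureNotFound when the requested parser is not installed
            counts[parser] = None
            continue
        counts[parser] = len(soup.find_all(name, attrs))
    return counts


# Made-up, well-formed sample; the real page content is not reproduced here.
sample = '<html><body><ol class="hit-list"><li>Example entry</li></ol></body></html>'
counts = count_matches_per_parser(sample, "ol", {"class": "hit-list"})
print(counts)
```

On the real page you would pass r.content instead of the sample string. A parser that reports 0 while the others report 1 is the one choking on the broken markup.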

There is only one hit-list, so you can use find in place of find_all; you can also look it up by id with soup.find(id="hit-list"). If you run the HTML through W3C's HTML validator, you can see there are lots of issues.
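As a rough illustration of the three lookups, here is a small sketch run against an inline snippet. The snippet is an assumption: it only mimics an ol element carrying the hit-list class and id, and the nested span elements and cafe names are invented, not taken from the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page: one <ol> with the "hit-list"
# class and id, and made-up entries nested inside it.
SAMPLE_HTML = """
<div>
  <ol id="hit-list" class="hit-list">
    <li><span class="name">Cafe A</span></li>
    <li><span class="name">Cafe B</span></li>
  </ol>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

by_class_all = soup.find_all("ol", {"class": "hit-list"})  # list of matches
by_class_one = soup.find("ol", {"class": "hit-list"})      # first match or None
by_id = soup.find(id="hit-list")                           # lookup by id attribute

# find() returns None when nothing matches, so check before drilling down.
names = []
if by_id is not None:
    names = [span.get_text(strip=True)
             for span in by_id.find_all("span", {"class": "name"})]

print(len(by_class_all))
print(names)
```

Since find returns a single element rather than a list, you can call find_all on the result directly to reach the inner tags.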

